# Kubernetes Quick Deployment
I. Software Environment
The system and software versions used in this deployment are listed below:
- Operating system: CentOS Linux release 7.6.1810 (Core)
- Kernel: 5.4.221-1
- containerd: v1.6.5
- nerdctl: 0.23.0
- buildkit: 0.10.5
- Kubernetes
  - Client Version: v1.25.3
  - Kustomize Version: v4.5.7
  - Server Version: v1.22.2
II. Node Planning
The lab environment runs on Alibaba Cloud; the instance specification and configuration are as follows:
Region | Instance Type | OS | CPU | Memory | Disk | Billing | Cost |
---|---|---|---|---|---|---|---|
Hohhot | ecs.s6-c1m4.large | CentOS 7.6 64-bit | 2 vCPU | 8 GiB | ESSD 40 GiB (2280 IOPS) | Spot instance | 0.081 |
Hostname plan:
Role | Hostname | IP |
---|---|---|
master | k8s-master01 | 172.16.0.15 |
node | k8s-worker02 | 172.16.0.16 |
node | k8s-worker03 | 172.16.0.14 |
III. Network Planning
Network | CIDR |
---|---|
Node subnet | 172.31.112.0/20 |
Service subnet | 10.96.0.0/12 |
Pod subnet | 10.244.0.0/16 |
IV. Environment Initialization
1. Set the hostname (run the matching command on each node)
$ hostnamectl set-hostname k8s-master01
$ hostnamectl set-hostname k8s-worker02
$ hostnamectl set-hostname k8s-worker03
2. Configure hosts resolution
$ cat /etc/hosts
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
172.16.0.15 k8s-master01
172.16.0.16 k8s-worker02
172.16.0.14 k8s-worker03
Verify resolution
$ ping -c1 k8s-master01 && ping -c1 k8s-worker02 && ping -c1 k8s-worker03
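The hosts entries above can also be generated and appended with a small idempotent loop. This is a minimal sketch: `NODES` mirrors the node plan, and `HOSTS_FILE` is an illustrative stand-in pointing at a temp file; on a real node it would be /etc/hosts.

```shell
# NODES mirrors the node plan above; HOSTS_FILE is a temp stand-in for /etc/hosts
NODES="172.16.0.15 k8s-master01
172.16.0.16 k8s-worker02
172.16.0.14 k8s-worker03"
HOSTS_FILE="$(mktemp)"

add_hosts_entries() {
  printf '%s\n' "$NODES" | while read -r ip name; do
    # only append entries whose hostname is not present yet
    grep -qw "$name" "$HOSTS_FILE" || printf '%s %s\n' "$ip" "$name" >> "$HOSTS_FILE"
  done
}

add_hosts_entries
add_hosts_entries   # a second run adds nothing, so it is safe to re-run
cat "$HOSTS_FILE"
```

Because the loop checks for an existing entry first, the same script can be pushed to every node repeatedly without duplicating lines.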
3. Install dependency packages
Base system dependencies
$ yum -y install iotop iftop dstat sysstat nc nmap htop screen conntrack ntpdate ntp ipvsadm ipset chrony jq iptables-services curl wget vim net-tools git tree lsof
containerd dependency (CentOS 7 ships an older libseccomp, so install 2.5 from the CentOS 8 Stream repository)
$ rpm -ivh http://rpmfind.net/linux/centos/8-stream/BaseOS/x86_64/os/Packages/libseccomp-2.5.1-1.el8.x86_64.rpm
4. Flush firewall rules and disable SELinux
$ systemctl stop firewalld && systemctl disable firewalld
$ yum -y install iptables-services && systemctl start iptables && systemctl enable iptables
$ iptables -F && /usr/libexec/iptables/iptables.init save
$ setenforce 0 && sed -i 's/^SELINUX=.*/SELINUX=disabled/g' /etc/selinux/config
5. Disable swap
$ swapoff -a && sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
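Before touching the real /etc/fstab, the sed expression can be exercised against a sample copy. A sketch, with illustrative fstab content in a temp file:

```shell
# FSTAB is a temp stand-in for /etc/fstab
FSTAB="$(mktemp)"
cat > "$FSTAB" << 'EOF'
/dev/mapper/centos-root /     xfs   defaults 0 0
/dev/mapper/centos-swap swap  swap  defaults 0 0
EOF

# same expression as above: comment out any line mounting swap
sed -i '/ swap / s/^\(.*\)$/#\1/g' "$FSTAB"
cat "$FSTAB"
```

Only the swap line gains a leading `#`; the root filesystem line is untouched.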
6. Tune kernel parameters for Kubernetes
$ cat > /etc/sysctl.d/kubernetes.conf << EOF
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
net.ipv4.ip_forward=1
vm.swappiness=0
vm.overcommit_memory=1
vm.panic_on_oom=0
fs.inotify.max_user_instances=8192
fs.inotify.max_user_watches=1048576
fs.file-max=52706963
fs.nr_open=52706963
net.ipv6.conf.all.disable_ipv6=1
net.netfilter.nf_conntrack_max=2310720
# the keepalive parameters below mitigate idle timeouts on long-lived connections in ipvs mode
net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 10
net.ipv4.tcp_keepalive_time = 600
EOF
Load the kernel modules
$ modprobe overlay
$ modprobe ip_conntrack
$ modprobe nf_conntrack
$ modprobe br_netfilter
$ modprobe ip_vs
$ modprobe ip_vs_sh
$ modprobe ip_vs_rr
$ modprobe ip_vs_wrr
Apply the kernel parameters
$ sysctl -p /etc/sysctl.d/kubernetes.conf
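Each key the file sets can afterwards be compared with its running value via `sysctl -n <key>`. Extracting the key list from the conf file is a one-liner; the sketch below runs it against a trimmed stand-in file rather than the real /etc/sysctl.d/kubernetes.conf:

```shell
# SYSCTL_CONF is a trimmed stand-in for /etc/sysctl.d/kubernetes.conf
SYSCTL_CONF="$(mktemp)"
cat > "$SYSCTL_CONF" << 'EOF'
net.ipv4.ip_forward=1
vm.swappiness=0
# keepalive tuning
net.ipv4.tcp_keepalive_time = 600
EOF

# drop comment/blank lines, keep only the key names
grep -Ev '^[[:space:]]*(#|$)' "$SYSCTL_CONF" | sed -E 's/[[:space:]]*=.*//'
```

Piping each printed key through `sysctl -n` on a node then shows whether the value actually took effect.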
Verify that the modules are loaded (on the 5.4 kernel, nf_conntrack_ipv4 is merged into nf_conntrack)
$ lsmod | grep -e ip_vs -e nf_conntrack -e br_netfilter
Persist the modules so they load at boot
$ cat > /lib/modules-load.d/k8s.conf << EOF
ip_conntrack
nf_conntrack
br_netfilter
ip_vs
ip_vs_sh
ip_vs_rr
ip_vs_wrr
overlay
EOF
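The module list only has to live in one place: the same file systemd reads at boot can drive the initial modprobe run. A sketch, using a temp copy of the file and echoing instead of executing so it runs unprivileged:

```shell
# MODULES_CONF stands in for /lib/modules-load.d/k8s.conf;
# drop the `echo` to actually load the modules (requires root)
MODULES_CONF="$(mktemp)"
printf '%s\n' ip_conntrack nf_conntrack br_netfilter ip_vs ip_vs_sh ip_vs_rr ip_vs_wrr overlay > "$MODULES_CONF"

while read -r mod; do
  echo modprobe "$mod"
done < "$MODULES_CONF"
```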
7. Set the system time zone
$ timedatectl set-timezone Asia/Shanghai && timedatectl set-local-rtc 0
8. Start the time-synchronization service
$ systemctl enable chronyd --now
$ chronyc sources
$ date
9. Configure the journald logging service
$ mkdir /etc/systemd/journald.conf.d/
$ cat > /etc/systemd/journald.conf.d/99-prophet.conf << EOF
[Journal]
Storage=persistent
Compress=yes
SyncIntervalSec=5m
RateLimitInterval=30s
RateLimitBurst=1000
SystemMaxUse=10G
SystemMaxFileSize=200M
MaxRetentionSec=2week
ForwardToSyslog=no
EOF
$ systemctl restart systemd-journald
10. Disable NUMA
$ cp /etc/default/grub{,.bak}
# append numa=off to the GRUB_CMDLINE_LINUX line
$ sed -r -i 's/GRUB_CMDLINE_LINUX="(.*)"/GRUB_CMDLINE_LINUX="\1 numa=off"/g' /etc/default/grub
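Note that the sed above appends `numa=off` every time it runs; guarding it with a grep keeps the change idempotent. A sketch against a sample grub file (`GRUB_FILE` is a temp stand-in for /etc/default/grub, and the kernel cmdline content is illustrative):

```shell
GRUB_FILE="$(mktemp)"
echo 'GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet"' > "$GRUB_FILE"

append_numa_off() {
  # only rewrite the line if numa=off is not there yet
  grep -q 'numa=off' "$GRUB_FILE" || \
    sed -r -i 's/GRUB_CMDLINE_LINUX="(.*)"/GRUB_CMDLINE_LINUX="\1 numa=off"/g' "$GRUB_FILE"
}

append_numa_off
append_numa_off   # second run is a no-op
cat "$GRUB_FILE"
```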
11. Stop unneeded services
$ systemctl stop postfix && systemctl disable postfix
12. Upgrade the kernel
Install the ELRepo Yum repository
$ rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-4.el7.elrepo.noarch.rpm
$ yum --enablerepo=elrepo-kernel install -y kernel-lt
$ grub2-set-default 0 && grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the node
$ reboot
Verify the kernel version
$ uname -r
5.4.221-1.el7.elrepo.x86_64
V. CRI Deployment
1. Deploy buildkit
Create the buildkit directory
$ mkdir -p /usr/local/buildkit
Download the buildkit binary package
$ wget http://download.yo-yo.fun/k8s/buildkit-v0.10.5.linux-amd64.tar.gz
Extract and install
$ tar xf buildkit-v0.10.5.linux-amd64.tar.gz -C /usr/local/buildkit
Configure the buildkit environment (single quotes so that $PATH expands at login, not when the line is written)
$ echo 'export PATH=$PATH:/usr/local/buildkit/bin' >> /etc/profile
$ . /etc/profile
Create the buildkit systemd unit
$ cat > /etc/systemd/system/buildkit.service << EOF
[Unit]
Description=BuildKit
Documentation=https://github.com/moby/buildkit
[Service]
ExecStart=/usr/local/buildkit/bin/buildkitd --oci-worker=false --containerd-worker=true
Restart=always
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
Reload the service configuration
$ systemctl daemon-reload
$ systemctl enable buildkit --now
During deployment the buildkit service may show as failed. This is because buildkitd tries to connect to containerd at startup and exits when the connection fails; the unit sets a restart interval, so there is still a small chance of seeing this state. Once containerd is deployed and running, the service recovers automatically.
2. Deploy nerdctl
Download the nerdctl binary package
$ wget http://download.yo-yo.fun/k8s/nerdctl-0.23.0-linux-amd64.tar.gz
Extract the nerdctl client tool
$ tar xf nerdctl-0.23.0-linux-amd64.tar.gz -C /usr/local/bin
3. Deploy containerd
Download the containerd binary package
$ wget http://download.yo-yo.fun/k8s/cri-containerd-cni-1.6.5-linux-amd64.tar.gz
Extract the containerd binary package
$ tar xf cri-containerd-cni-1.6.5-linux-amd64.tar.gz -C /
Configure the containerd environment
$ echo 'export PATH=$PATH:/usr/local/bin:/usr/local/sbin' >> /etc/profile
Create the containerd configuration directory
$ mkdir /etc/containerd
Generate the containerd configuration
$ cat > /etc/containerd/config.toml << EOF
disabled_plugins = []
imports = []
# -999 keeps containerd near the bottom of the OOM killer's candidate list
# (-1000 would exempt the process entirely)
oom_score = -999
plugin_dir = ""
required_plugins = []
# root directory for persistent data
# (snapshots, content, metadata, and data for the various plugins)
root = "/var/lib/containerd"
# state directory for ephemeral data
# (sockets, pids, mount points, runtime state, and plugin data that does not need to persist)
state = "/run/containerd"
temp = ""
version = 2
[cgroup]
path = ""
[debug]
address = ""
format = ""
gid = 0
level = ""
uid = 0
[grpc]
address = "/run/containerd/containerd.sock"
gid = 0
max_recv_message_size = 16777216
max_send_message_size = 16777216
tcp_address = ""
tcp_tls_ca = ""
tcp_tls_cert = ""
tcp_tls_key = ""
uid = 0
[metrics]
address = ""
grpc_histogram = false
[plugins]
[plugins."io.containerd.gc.v1.scheduler"]
deletion_threshold = 0
mutation_threshold = 100
pause_threshold = 0.02
schedule_delay = "0s"
startup_delay = "100ms"
[plugins."io.containerd.grpc.v1.cri"]
device_ownership_from_security_context = false
disable_apparmor = false
disable_cgroup = false
disable_hugetlb_controller = true
disable_proc_mount = false
disable_tcp_service = true
enable_selinux = false
enable_tls_streaming = false
enable_unprivileged_icmp = false
enable_unprivileged_ports = false
ignore_image_defined_volumes = false
max_concurrent_downloads = 3
max_container_log_line_size = 16384
netns_mounts_under_state_dir = false
restrict_oom_score_adj = false
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
selinux_category_range = 1024
stats_collect_period = 10
stream_idle_timeout = "4h0m0s"
stream_server_address = "127.0.0.1"
stream_server_port = "0"
systemd_cgroup = false
tolerate_missing_hugetlb_controller = true
unset_seccomp_profile = ""
[plugins."io.containerd.grpc.v1.cri".cni]
# directory for CNI plugin binaries
bin_dir = "/opt/cni/bin"
# directory for CNI plugin configuration files
conf_dir = "/etc/cni/net.d"
conf_template = ""
ip_pref = ""
max_conf_num = 1
[plugins."io.containerd.grpc.v1.cri".containerd]
default_runtime_name = "runc"
disable_snapshot_annotations = true
discard_unpacked_layers = false
ignore_rdt_not_enabled_errors = false
no_pivot = false
snapshotter = "overlayfs"
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.default_runtime.options]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
BinaryName = ""
CriuImagePath = ""
CriuPath = ""
CriuWorkPath = ""
IoGid = 0
IoUid = 0
NoNewKeyring = false
NoPivotRoot = false
Root = ""
ShimCgroup = ""
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
base_runtime_spec = ""
cni_conf_dir = ""
cni_max_conf_num = 0
container_annotations = []
pod_annotations = []
privileged_without_host_devices = false
runtime_engine = ""
runtime_path = ""
runtime_root = ""
runtime_type = ""
[plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime.options]
[plugins."io.containerd.grpc.v1.cri".image_decryption]
key_model = "node"
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = ""
[plugins."io.containerd.grpc.v1.cri".registry.auths]
[plugins."io.containerd.grpc.v1.cri".registry.configs]
[plugins."io.containerd.grpc.v1.cri".registry.headers]
# registry mirror configuration
[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
endpoint = ["https://hz6lwzkb.mirror.aliyuncs.com"]
[plugins."io.containerd.grpc.v1.cri".registry.mirrors."k8s.gcr.io"]
endpoint = ["https://registry.aliyuncs.com/google_containers/"]
[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
tls_cert_file = ""
tls_key_file = ""
[plugins."io.containerd.internal.v1.opt"]
path = "/opt/containerd"
[plugins."io.containerd.internal.v1.restart"]
interval = "10s"
[plugins."io.containerd.internal.v1.tracing"]
sampling_ratio = 1.0
service_name = "containerd"
[plugins."io.containerd.metadata.v1.bolt"]
content_sharing_policy = "shared"
[plugins."io.containerd.monitor.v1.cgroups"]
no_prometheus = false
[plugins."io.containerd.runtime.v1.linux"]
no_shim = false
runtime = "runc"
runtime_root = ""
shim = "containerd-shim"
shim_debug = false
[plugins."io.containerd.runtime.v2.task"]
platforms = ["linux/amd64"]
sched_core = false
[plugins."io.containerd.service.v1.diff-service"]
default = ["walking"]
[plugins."io.containerd.service.v1.tasks-service"]
rdt_config_file = ""
[plugins."io.containerd.snapshotter.v1.aufs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.btrfs"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.devmapper"]
async_remove = false
base_image_size = ""
discard_blocks = false
fs_options = ""
fs_type = ""
pool_name = ""
root_path = ""
[plugins."io.containerd.snapshotter.v1.native"]
root_path = ""
[plugins."io.containerd.snapshotter.v1.overlayfs"]
root_path = ""
upperdir_label = false
[plugins."io.containerd.snapshotter.v1.zfs"]
root_path = ""
[plugins."io.containerd.tracing.processor.v1.otlp"]
endpoint = ""
insecure = false
protocol = ""
[proxy_plugins]
[stream_processors]
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar"]
accepts = ["application/vnd.oci.image.layer.v1.tar+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar"
[stream_processors."io.containerd.ocicrypt.decoder.v1.tar.gzip"]
accepts = ["application/vnd.oci.image.layer.v1.tar+gzip+encrypted"]
args = ["--decryption-keys-path", "/etc/containerd/ocicrypt/keys"]
env = ["OCICRYPT_KEYPROVIDER_CONFIG=/etc/containerd/ocicrypt/ocicrypt_keyprovider.conf"]
path = "ctd-decoder"
returns = "application/vnd.oci.image.layer.v1.tar+gzip"
[timeouts]
"io.containerd.timeout.bolt.open" = "0s"
"io.containerd.timeout.shim.cleanup" = "5s"
"io.containerd.timeout.shim.load" = "5s"
"io.containerd.timeout.shim.shutdown" = "3s"
"io.containerd.timeout.task.state" = "2s"
[ttrpc]
address = ""
gid = 0
uid = 0
EOF
Start the containerd service
$ systemctl enable containerd --now
Verify the CRI components
$ systemctl status containerd
$ systemctl status buildkit
$ nerdctl version
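Two settings in the generated config.toml are worth spot-checking, since they cause most kubeadm trouble later: the cgroup driver (`SystemdCgroup`, which must match the kubelet's `cgroupDriver: systemd`) and the pause image (`sandbox_image`). A sketch of the check, run here against a two-line sample instead of the real /etc/containerd/config.toml:

```shell
# CONFIG stands in for /etc/containerd/config.toml
CONFIG="$(mktemp)"
cat > "$CONFIG" << 'EOF'
sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.6"
SystemdCgroup = true
EOF

grep -E 'SystemdCgroup|sandbox_image' "$CONFIG"
```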
VI. Cluster Deployment
1. Add the Kubernetes Yum repository
Generate the repo file
$ cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
baseurl = http://mirrors.cloud.aliyuncs.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled = 1
gpgcheck = 0
name = Kubernetes Repository
repo_gpgcheck = 0
EOF
Refresh the cache
$ yum clean all
$ yum makecache fast
2. Install the Kubernetes base components
$ yum install -y kubelet-1.22.2 kubeadm-1.22.2 kubectl-1.22.2 --disableexcludes=kubernetes
Verify the installation
$ kubeadm version
3. Start the kubelet service
$ systemctl enable kubelet --now
At this point the kubelet log shows error-level messages saying that its configuration file cannot be found. This is expected: kubeadm generates the file automatically during cluster initialization.
4. Generate the kubeadm init configuration
The configuration kubeadm generates by default contains many options; we keep only the most important ones that meet our needs, chiefly:
- localAPIEndpoint
  - advertiseAddress: the node address the master advertises
  - bindPort: the node port the master exposes
- nodeRegistration
  - criSocket: the Unix socket path of containerd
- mode: the kube-proxy mode
- imageRepository: the registry to pull Kubernetes images from
- kubernetesVersion: the Kubernetes version
- networking
  - serviceSubnet: the Service CIDR
  - podSubnet: the Pod CIDR
The full configuration:
$ cat > /root/kubeadm.yaml << EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.0.15 # internal IP of the master node
  bindPort: 6443
nodeRegistration:
  criSocket: /run/containerd/containerd.sock # containerd Unix socket path
  imagePullPolicy: IfNotPresent
  taints: # taint the master so it is excluded from scheduling
    - effect: "NoSchedule"
      key: "node-role.kubernetes.io/master"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs # kube-proxy mode
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/12
  podSubnet: 10.244.0.0/16 # pod subnet
scheduler: {}
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
failSwapOn: false
EOF
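A quick sanity check after editing: the file should still contain all four YAML documents (InitConfiguration, KubeProxyConfiguration, ClusterConfiguration, KubeletConfiguration). Counting `kind:` lines is a cheap approximation; `KUBEADM_CFG` below is a minimal stand-in for /root/kubeadm.yaml:

```shell
KUBEADM_CFG="$(mktemp)"
cat > "$KUBEADM_CFG" << 'EOF'
kind: InitConfiguration
---
kind: KubeProxyConfiguration
---
kind: ClusterConfiguration
---
kind: KubeletConfiguration
EOF

grep -c '^kind:' "$KUBEADM_CFG"   # 4 documents expected
```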
5. Pull the Kubernetes cluster component images
$ kubeadm config images pull --config /root/kubeadm.yaml
6. Run the cluster initialization
$ kubeadm init --config /root/kubeadm.yaml
The initialization goes through the following main phases:
- preflight checks and image pulls
- generating the kubelet configuration and starting kubelet
- generating the certificates
- generating the configuration for each component
- initializing the components
- leader election
- generating the cluster discovery token and CA certificate hash
- deploying the CoreDNS and kube-proxy add-ons
When initialization finishes, the output includes the command for joining the remaining nodes to the cluster:
kubeadm join 172.31.117.180:6443 --token v7c44j.c1zwjs3ge5r9s3d6 --discovery-token-ca-cert-hash sha256:29c24b369db0b1139bff4c9b11b2a0520f578bd7259eb70e224ca611c2d8fee7
If the join information is lost, it can be regenerated with
kubeadm token create --print-join-command
7. Set up kubectl on the master node
$ mkdir -p $HOME/.kube
$ sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
At this point, coredns is very likely stuck in the ContainerCreating state. The cause lies in the CNI configuration, and there are two main possibilities:
- the container mistakenly used the CNI configuration file shipped with containerd
- the flannel image is still being pulled, so /etc/cni/net.d/10-flannel.conflist (the flannel CNI plugin configuration file) has not been generated yet
Switching to the flannel CNI configuration, or waiting for the flannel image pull to finish, resolves it.
8. Deploy the flannel network plugin
Install the flannel plugin
$ wget https://raw.fastgit.org/flannel-io/flannel/master/Documentation/kube-flannel.yml
$ kubectl apply -f kube-flannel.yml
9. Disable the default containerd CNI configuration
At this point the CNI configuration directory contains two files:
- /etc/cni/net.d/10-containerd-net.conflist
- /etc/cni/net.d/10-flannel.conflist
The former is containerd's default configuration: its subnet is 10.88.0.0/16, the CNI plugin type is bridge, and the bridge device is named cni0. Containers on a bridge network cannot communicate across hosts; cross-host communication needs another CNI plugin, such as the Flannel we installed above.
So, next we rename the file to make sure kubelet uses the flannel configuration:
$ mv /etc/cni/net.d/10-containerd-net.conflist /etc/cni/net.d/10-containerd-net.conflist.bak ; ifconfig cni0 down ; ip link delete cni0
PS: the cni0 interface may not exist yet (mainly on worker nodes), in which case the delete reports a failure; this does not matter.
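Written defensively, the clean-up can be re-run on any node regardless of whether the file or the cni0 interface exists. A sketch, operating on a temp directory instead of the real /etc/cni/net.d:

```shell
# CNI_DIR stands in for /etc/cni/net.d
CNI_DIR="$(mktemp -d)"
touch "$CNI_DIR/10-containerd-net.conflist"

disable_containerd_cni() {
  conf="$CNI_DIR/10-containerd-net.conflist"
  # rename only if the file is still there
  [ -f "$conf" ] && mv "$conf" "$conf.bak"
  # remove cni0 only if it exists (it is usually absent on workers)
  ip link show cni0 > /dev/null 2>&1 && ip link delete cni0
  return 0
}

disable_containerd_cni
ls "$CNI_DIR"
```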
Restart the services
$ systemctl daemon-reload
$ systemctl restart containerd kubelet
Verify the services
$ systemctl status containerd kubelet
10. Rebuild coredns
The coredns pods hung at startup need to be recreated before they can run normally:
$ kubectl rollout restart deployment/coredns -n kube-system
After redeployment, their IPs come from the flannel subnet:
$ kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-flannel kube-flannel-ds-22zvs 1/1 Running 0 123m 172.16.0.16 k8s-worker02 <none> <none>
kube-flannel kube-flannel-ds-5tvk4 1/1 Running 0 123m 172.16.0.14 k8s-worker03 <none> <none>
kube-flannel kube-flannel-ds-xgq8w 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
kube-system coredns-79dbd6459d-2tlgq 1/1 Running 0 123m 10.244.1.2 k8s-worker03 <none> <none>
kube-system coredns-79dbd6459d-wv4bb 1/1 Running 0 123m 10.244.1.3 k8s-worker03 <none> <none>
kube-system etcd-k8s-master01 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
kube-system kube-apiserver-k8s-master01 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
kube-system kube-controller-manager-k8s-master01 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
kube-system kube-proxy-258cg 1/1 Running 0 123m 172.16.0.14 k8s-worker03 <none> <none>
kube-system kube-proxy-96plz 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
kube-system kube-proxy-kb88w 1/1 Running 0 123m 172.16.0.16 k8s-worker02 <none> <none>
kube-system kube-scheduler-k8s-master01 1/1 Running 0 123m 172.16.0.15 k8s-master01 <none> <none>
11. Create a demo resource
app.yml:
apiVersion: apps/v1 # API version
kind: Deployment # API object kind
metadata:
  name: nginx-deploy
  labels:
    chapter: first-app
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 2 # number of Pod replicas
  template: # Pod template
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:1.7.9
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 80
Create the resource and check the pods
$ kubectl apply -f app.yml
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx-deploy-5d59d67564-7trf2 1/1 Running 0 46s
nginx-deploy-5d59d67564-8sd8k 1/1 Running 0 46s
VII. Troubleshooting
1. node "master" not found
Investigation traced the problem to the init configuration generated by kubeadm; after trimming the file down to only the core cluster-creation settings, the problem disappeared.
2. could not add IP address to "cni0": Permission denied
Running kubectl describe pod coredns-XXXX -n kube-system
shows the error message
coredns 10.88.0.1 could not add IP address to "cni0": permission denied
Investigation showed that coredns was using the CNI configuration file shipped with containerd. The fix is simple:
$ mv /etc/cni/net.d/10-containerd-net.conflist /etc/cni/net.d/10-containerd-net.conflist.bak ; \
ifconfig cni0 down ; \
ip link delete cni0
The command may complain that cni0 does not exist; that is fine, it is created automatically once Pod containers are created later.
Afterwards, recreate the Pods managed by the coredns deployment:
$ kubectl rollout restart deployment/coredns -n kube-system
Check whether the coredns pod IPs are now in the configured subnet
$ kubectl get pods -A -o wide
3. scheduler and controller-manager Unhealthy
After the cluster is created, checking component status shows scheduler and controller-manager as Unhealthy, with this error:
$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Unhealthy Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager Unhealthy Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0 Healthy {"health":"true"}
The cause is that the static Pod manifests of these two services set the port to 0 by default:
$ cd /etc/kubernetes/manifests && grep -P "port=0" ./kube*
./kube-controller-manager.yaml: - --port=0
./kube-scheduler.yaml: - --port=0
The fix is simply to comment the flag out:
$ sed -r -i 's/- --port=0/# - --port=0/g' /etc/kubernetes/manifests/kube-*
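The substitution can be exercised against a sample manifest fragment before touching the live files. A sketch (`MANIFEST` is a temp stand-in; the real targets are in /etc/kubernetes/manifests):

```shell
MANIFEST="$(mktemp)"
cat > "$MANIFEST" << 'EOF'
    - --leader-elect=true
    - --port=0
EOF

# same substitution as above: comment the flag out
sed -r -i 's/- --port=0/# - --port=0/g' "$MANIFEST"
cat "$MANIFEST"
```

The `--port=0` item becomes a YAML comment while the other flags are untouched.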
Restart kubelet
$ systemctl restart kubelet
Check the component status again
$ kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy {"health":"true"}
Note the warning: judging whether the cluster is healthy from component status is a deprecated approach. The command below performs a more detailed readiness check:
$ kubectl get --raw='/readyz?verbose'
On a healthy cluster, the output looks like this:
[+]ping ok
[+]log ok
[+]etcd ok
[+]informer-sync ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]shutdown ok
readyz check passed
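For scripting, the verbose output can be reduced to a pass/fail: failed checks are prefixed with `[-]` instead of `[+]`. A sketch that counts them in a captured sample (`READYZ_OUT` stands in for live `kubectl get --raw='/readyz?verbose'` output):

```shell
READYZ_OUT="$(mktemp)"
cat > "$READYZ_OUT" << 'EOF'
[+]ping ok
[+]log ok
[+]etcd ok
readyz check passed
EOF

# count checks that did NOT pass
failed=$(grep -c '^\[-\]' "$READYZ_OUT" || true)
echo "failed checks: $failed"
```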
VIII. One-Click Deployment
To simplify the deployment process and make it faster, I did two things:
- built a custom machine image
- built an Ansible role
On the first point, the custom image: the original plan was to build the system image in a pipeline with Terraform's alicloud_ecs_image_pipeline resource, but its build_content argument has two limitations: the content cannot exceed 16 KB, and it can contain at most 127 commands.
build_content
- (Optional, ForceNew) The content of the image template. The content cannot be greater than 16 KB in size, and can contain up to 127 commands.
This means the Terraform HCL cannot hold much build logic; the complex parts have to be split out elsewhere, for example into shell or Python scripts. After weighing the options I chose Ansible, mainly because it offers better reusability and control.
I split the whole deployment into three roles:
- initial: node initialization, e.g. installing base packages, upgrading the kernel, setting firewall rules and kernel parameters
- cri: deploying the Containerd container runtime
- kubernetes: initializing the cluster and automatically handling the common issues, e.g. rebuilding coredns and fixing the unhealthy scheduler and controller-manager status
The role layout:
☁ ansible-k8s-role tree -L 3
.
├── alicloud.py # Alibaba Cloud dynamic inventory
├── aliyun-ecs-sdk.py # OpenAPI script for quickly creating Alibaba Cloud ECS resources
├── instance_ids.txt
├── inventory # static inventory, used for testing
├── meta
│ └── main.yml
├── README.md
├── roles
│ ├── cri # cri role
│ │ ├── files
│ │ ├── handlers
│ │ ├── setup.yml # cri role entry point
│ │ ├── tasks
│ │ ├── templates
│ │ └── vars
│ ├── initial # initial role
│ │ ├── files
│ │ ├── handlers
│ │ ├── setup.yml # initial role entry point
│ │ ├── tasks
│ │ ├── tests
│ │ └── vars
│ └── kubernetes # kubernetes role
│ ├── files
│ ├── handlers
│ ├── setup.yml # kubernetes role entry point
│ ├── tasks
│ ├── templates
│ ├── tests
│ └── vars
└── setup.yml # top-level entry point
21 directories, 11 files
The three roles are combined into one umbrella ansible-k8s-role role, with four execution entry points:
- ./setup.yml: the top-level entry point; runs every role, or a specific set of plays via --tags
- initial/setup.yml: entry point for running the initialization role alone; also accepts --tags for a specific set of plays
- cri/setup.yml: entry point for running the container-runtime role alone; also accepts --tags for a specific set of plays
- kubernetes/setup.yml: entry point for running the cluster-deployment role alone; also accepts --tags for a specific set of plays
Usage example
Create the ECS resources
# create the ecs resources via the OpenAPI helper script
$ python aliyun-ecs-sdk.py apply
Success. Instance creation succeed.
InstanceIds: i-hp3ab8zcejtil91hq0pz, i-hp3ab8zcejtil91hq0q0, i-hp3ab8zcejtil91hq0q1
Instance boot successfully: i-hp3ab8zcejtil91hq0q0 node00002 39.104.172.31 172.16.0.50
Instance boot successfully: i-hp3ab8zcejtil91hq0pz node00001 39.104.54.13 172.16.0.49
Instance boot successfully: i-hp3ab8zcejtil91hq0q1 node00003 39.104.77.199 172.16.0.48
Instances all boot successfully
Done.
Check the hostname settings
$ grep "K8S_MASTER_INTERNAL_ADVERTISE_ADDRESS" roles/kubernetes/vars/main.yml
K8S_MASTER_INTERNAL_ADVERTISE_ADDRESS: "172.16.0.49"
$ cat roles/initial/files/hosts
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
172.16.0.50 k8s-worker02 node00002 i-hp3ab8zcejtil91hq0q0
172.16.0.49 k8s-master01 node00001 i-hp3ab8zcejtil91hq0pz
172.16.0.48 k8s-worker03 node00003 i-hp3ab8zcejtil91hq0q1
创建集群
☁ ansible-k8s-role ap -i alicloud.py --tags=hosts_resolve,install_kubernetes_pkg,pull_kube_images,build_k8s_cluster setup.yml [WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details PLAY [系统初始化] *************** TASK [initial : include_tasks] *************** included: /prodata/scripts/ansibleLearn/ansible-k8s-role/roles/initial/tasks/set_hosts_resolve.yml for i_hp3ab8zcejtil91hq0q0, i_hp3ab8zcejtil91hq0pz, i_hp3ab8zcejtil91hq0q1 TASK [initial : gather_facts] *************** ok: [i_hp3ab8zcejtil91hq0q1] ok: [i_hp3ab8zcejtil91hq0pz] ok: [i_hp3ab8zcejtil91hq0q0] TASK [initial : 设置主机名解析] *************** changed: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0q0] changed: [i_hp3ab8zcejtil91hq0pz] TASK [initial : 修改主机名:k8s-master01] *************** skipping: [i_hp3ab8zcejtil91hq0q0] skipping: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0pz] TASK [initial : 修改主机名:k8s-worker02] *************** skipping: [i_hp3ab8zcejtil91hq0pz] skipping: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0q0] TASK [initial : 修改主机名:k8s-worker03] *************** skipping: [i_hp3ab8zcejtil91hq0q0] skipping: [i_hp3ab8zcejtil91hq0pz] changed: [i_hp3ab8zcejtil91hq0q1] PLAY [部署 Container Runtime] *************** PLAY [部署 Kubernetes 集群] *************** TASK [kubernetes : include_tasks] *************** included: /prodata/scripts/ansibleLearn/ansible-k8s-role/roles/kubernetes/tasks/install_kubernetes_pkg.yml for i_hp3ab8zcejtil91hq0q0, i_hp3ab8zcejtil91hq0pz, i_hp3ab8zcejtil91hq0q1 TASK [kubernetes : 添加 Kubernetes Yum 仓库] *************** ok: [i_hp3ab8zcejtil91hq0q1] ok: [i_hp3ab8zcejtil91hq0pz] ok: [i_hp3ab8zcejtil91hq0q0] TASK [kubernetes : 清理 Yum 缓存] *************** changed: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0q0] changed: [i_hp3ab8zcejtil91hq0pz] TASK [kubernetes : 安装 Kubernetes 组件] *************** ok: [i_hp3ab8zcejtil91hq0pz] => (item=kubelet-1.22.2) ok: [i_hp3ab8zcejtil91hq0q1] => (item=kubelet-1.22.2) ok: 
[i_hp3ab8zcejtil91hq0q0] => (item=kubelet-1.22.2) ok: [i_hp3ab8zcejtil91hq0q1] => (item=kubeadm-1.22.2) ok: [i_hp3ab8zcejtil91hq0pz] => (item=kubeadm-1.22.2) ok: [i_hp3ab8zcejtil91hq0q0] => (item=kubeadm-1.22.2) ok: [i_hp3ab8zcejtil91hq0q1] => (item=kubectl-1.22.2) ok: [i_hp3ab8zcejtil91hq0pz] => (item=kubectl-1.22.2) ok: [i_hp3ab8zcejtil91hq0q0] => (item=kubectl-1.22.2) TASK [kubernetes : 启动并设置 kubelet 自启] *************** ok: [i_hp3ab8zcejtil91hq0q1] ok: [i_hp3ab8zcejtil91hq0pz] ok: [i_hp3ab8zcejtil91hq0q0] TASK [kubernetes : include_tasks] *************** included: /prodata/scripts/ansibleLearn/ansible-k8s-role/roles/kubernetes/tasks/pull_kube_images.yml for i_hp3ab8zcejtil91hq0q0, i_hp3ab8zcejtil91hq0pz, i_hp3ab8zcejtil91hq0q1 TASK [kubernetes : 生成 kubeadm 集群初始化配置] *************** changed: [i_hp3ab8zcejtil91hq0pz] changed: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0q0] TASK [kubernetes : 拉取 Kubernetes 集群组件镜像] *************** changed: [i_hp3ab8zcejtil91hq0pz] changed: [i_hp3ab8zcejtil91hq0q1] changed: [i_hp3ab8zcejtil91hq0q0] TASK [kubernetes : 拉取其他镜像] *************** changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/library/busybox:latest) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/library/busybox:latest) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/library/busybox:latest) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/prom/node-exporter:v1.5.0) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/prom/node-exporter:v1.5.0) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/prom/node-exporter:v1.5.0) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/grafana/grafana:9.4.7) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/grafana/grafana:9.4.7) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/grafana/grafana:9.4.7) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/prom/prometheus:v2.35.0) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/prom/prometheus:v2.35.0) changed: [i_hp3ab8zcejtil91hq0q0] => 
(item=docker.io/prom/prometheus:v2.35.0) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/victoriametrics/vmstorage:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/victoriametrics/vmstorage:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/victoriametrics/vmstorage:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/victoriametrics/vmselect:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/victoriametrics/vmselect:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/victoriametrics/vmselect:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/victoriametrics/vminsert:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/victoriametrics/vminsert:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/victoriametrics/vminsert:v1.77.0-cluster) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/victoriametrics/vmagent:v1.77.0) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/victoriametrics/vmagent:v1.77.0) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/victoriametrics/vmagent:v1.77.0) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/victoriametrics/vmalert:v1.77.0) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/victoriametrics/vmalert:v1.77.0) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/victoriametrics/vmalert:v1.77.0) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/prom/alertmanager:v0.25.0) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/prom/alertmanager:v0.25.0) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/prom/alertmanager:v0.25.0) changed: [i_hp3ab8zcejtil91hq0q1] => (item=docker.io/lotusching/promoter:latest) changed: [i_hp3ab8zcejtil91hq0pz] => (item=docker.io/lotusching/promoter:latest) changed: [i_hp3ab8zcejtil91hq0q0] => (item=docker.io/lotusching/promoter:latest) TASK [kubernetes : include_tasks] *************** included: 
```
/prodata/scripts/ansibleLearn/ansible-k8s-role/roles/kubernetes/tasks/build_kubernetes_cluster.yml for i_hp3ab8zcejtil91hq0q0, i_hp3ab8zcejtil91hq0pz, i_hp3ab8zcejtil91hq0q1

TASK [kubernetes : gather_facts] ***************
ok: [i_hp3ab8zcejtil91hq0q0]
ok: [i_hp3ab8zcejtil91hq0q1]
ok: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 生成 kubeadm 集群初始化配置] ***************
ok: [i_hp3ab8zcejtil91hq0q0]
ok: [i_hp3ab8zcejtil91hq0q1]
ok: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 执行 Kubernetes 集群初始化] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 获取 join 信息] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 拉取 join 脚本到主控机] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : worker 节点加入集群] ***************
skipping: [i_hp3ab8zcejtil91hq0pz]
changed: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0q0]

TASK [kubernetes : Master 节点基础配置] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 下载 flannel 网络插件清单文件] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 部署 flannel 网络插件] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 禁用默认 containerd cni 配置] ***************
changed: [i_hp3ab8zcejtil91hq0pz]
fatal: [i_hp3ab8zcejtil91hq0q0]: FAILED! => {"changed": true, "cmd": "mv /etc/cni/net.d/10-containerd-net.conflist /etc/cni/net.d/10-containerd-net.conflist.bak ;\nifconfig cni0 down ;\nip link delete cni0\n", "delta": "0:00:00.136080", "end": "2023-04-19 02:42:35.185140", "msg": "non-zero return code", "rc": 1, "start": "2023-04-19 02:42:35.049060", "stderr": "cni0: ERROR while getting interface flags: No such device\nCannot find device \"cni0\"", "stderr_lines": ["cni0: ERROR while getting interface flags: No such device", "Cannot find device \"cni0\""], "stdout": "", "stdout_lines": []}
...ignoring
fatal: [i_hp3ab8zcejtil91hq0q1]: FAILED! => {"changed": true, "cmd": "mv /etc/cni/net.d/10-containerd-net.conflist /etc/cni/net.d/10-containerd-net.conflist.bak ;\nifconfig cni0 down ;\nip link delete cni0\n", "delta": "0:00:00.137528", "end": "2023-04-19 02:42:35.272023", "msg": "non-zero return code", "rc": 1, "start": "2023-04-19 02:42:35.134495", "stderr": "cni0: ERROR while getting interface flags: No such device\nCannot find device \"cni0\"", "stderr_lines": ["cni0: ERROR while getting interface flags: No such device", "Cannot find device \"cni0\""], "stdout": "", "stdout_lines": []}
...ignoring

TASK [kubernetes : Flush handlers] ***************

TASK [kubernetes : Flush handlers] ***************

TASK [kubernetes : Flush handlers] ***************

RUNNING HANDLER [kubernetes : daemon-reload] ***************
ok: [i_hp3ab8zcejtil91hq0pz]

RUNNING HANDLER [kubernetes : kubelet-restart] ***************
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 重建 coredns 容器] ***************
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1]

TASK [kubernetes : 重建 coredns 容器] ***************
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 重建 coredns 容器] ***************
changed: [i_hp3ab8zcejtil91hq0pz]

TASK [kubernetes : 修复 scheduler control-manager 端口配置问题] ***************
skipping: [i_hp3ab8zcejtil91hq0q0] => (item=/etc/kubernetes/manifests/kube-scheduler.yaml)
skipping: [i_hp3ab8zcejtil91hq0q0] => (item=/etc/kubernetes/manifests/kube-controller-manager.yaml)
skipping: [i_hp3ab8zcejtil91hq0q0]
skipping: [i_hp3ab8zcejtil91hq0q1] => (item=/etc/kubernetes/manifests/kube-scheduler.yaml)
skipping: [i_hp3ab8zcejtil91hq0q1] => (item=/etc/kubernetes/manifests/kube-controller-manager.yaml)
skipping: [i_hp3ab8zcejtil91hq0q1]
changed: [i_hp3ab8zcejtil91hq0pz] => (item=/etc/kubernetes/manifests/kube-scheduler.yaml)
changed: [i_hp3ab8zcejtil91hq0pz] => (item=/etc/kubernetes/manifests/kube-controller-manager.yaml)

RUNNING HANDLER [kubernetes : daemon-reload] ***************
ok: [i_hp3ab8zcejtil91hq0pz]

RUNNING HANDLER [kubernetes : kubelet-restart] ***************
changed: [i_hp3ab8zcejtil91hq0pz]

PLAY RECAP ***************
i_hp3ab8zcejtil91hq0pz : ok=30 changed=18 unreachable=0 failed=0 skipped=3  rescued=0 ignored=0
i_hp3ab8zcejtil91hq0q0 : ok=18 changed=8  unreachable=0 failed=0 skipped=10 rescued=0 ignored=1
i_hp3ab8zcejtil91hq0q1 : ok=18 changed=8  unreachable=0 failed=0 skipped=10 rescued=0 ignored=1
```
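The only failures in the run are the cni0 cleanups on the two worker nodes, and both are ignored (`...ignoring` in the task, `ignored=1` in the recap): the task unconditionally moves the default containerd CNI config and deletes `cni0`, but on freshly joined workers that interface was never created, so the commands return `rc=1`. An idempotent rewrite of the same commands (a sketch based on the `cmd` shown in the log above, not the role's actual task) avoids the spurious failure:

```shell
# Idempotent variant of the cni cleanup: only act on the config file and
# the cni0 interface when they actually exist, so nodes without them
# no longer report rc=1.
conflist=/etc/cni/net.d/10-containerd-net.conflist
[ -f "$conflist" ] && mv "$conflist" "$conflist.bak"
if ip link show cni0 >/dev/null 2>&1; then
    ip link set cni0 down   # `ip link set ... down` replaces the legacy `ifconfig cni0 down`
    ip link delete cni0
fi
echo "cni cleanup done"     # marker so the step's completion is visible in the play log
```

In the Ansible task itself, the same effect could also be had by guarding with a `stat` result or a `failed_when` condition instead of relying on `ignore_errors`.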
Verify the cluster
```shell
# Fetch the kube-config from the master and point it at the cluster's DNS name
$ ansible -i alicloud.py 'i-hp3ab8zcejtil91hq0pz' -m fetch -a "src=/root/.kube/config dest=/root/.kube/config flat=true"; \
  sed -ri "s/server:.*/server: https:\/\/k8s-master001.yo-yo.fun:6443/g" ~/.kube/config
[WARNING]: Invalid characters were found in group names but not replaced, use -vvvv to see details
i_hp3ab8zcejtil91hq0pz | CHANGED => {
    "changed": true,
    "checksum": "5559d856fb153561e4b2a135d0183f64c5f6c0bc",
    "dest": "/root/.kube/config",
    "md5sum": "eb9e6d80f98276df6a93b37845dc4465",
    "remote_checksum": "5559d856fb153561e4b2a135d0183f64c5f6c0bc",
    "remote_md5sum": null
}

$ kc get cs; kc get node; kc get pods -A; kc get --raw='/readyz?verbose'
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
NAME           STATUS   ROLES                  AGE     VERSION
k8s-master01   Ready    control-plane,master   3m8s    v1.22.2
k8s-worker02   Ready    <none>                 2m43s   v1.22.2
k8s-worker03   Ready    <none>                 2m43s   v1.22.2
NAMESPACE      NAME                                   READY   STATUS    RESTARTS        AGE
kube-flannel   kube-flannel-ds-4kh9c                  1/1     Running   0               2m40s
kube-flannel   kube-flannel-ds-q65jv                  1/1     Running   0               2m40s
kube-flannel   kube-flannel-ds-zcmzc                  1/1     Running   0               2m40s
kube-system    coredns-96ddf9bfb-dqxqs                1/1     Running   0               2m32s
kube-system    coredns-96ddf9bfb-sxdhc                1/1     Running   0               2m32s
kube-system    etcd-k8s-master01                      1/1     Running   0               3m
kube-system    kube-apiserver-k8s-master01            1/1     Running   0               3m
kube-system    kube-controller-manager-k8s-master01   1/1     Running   2 (2m21s ago)   2m17s
kube-system    kube-proxy-5zgbb                       1/1     Running   0               2m43s
kube-system    kube-proxy-ldlsj                       1/1     Running   0               2m43s
kube-system    kube-proxy-sb65m                       1/1     Running   0               2m52s
kube-system    kube-scheduler-k8s-master01            1/1     Running   2 (2m21s ago)   2m17s
[+]ping ok
[+]log ok
[+]etcd ok
[+]informer-sync ok
[+]poststarthook/start-kube-apiserver-admission-initializer ok
[+]poststarthook/generic-apiserver-start-informers ok
[+]poststarthook/priority-and-fairness-config-consumer ok
[+]poststarthook/priority-and-fairness-filter ok
[+]poststarthook/start-apiextensions-informers ok
[+]poststarthook/start-apiextensions-controllers ok
[+]poststarthook/crd-informer-synced ok
[+]poststarthook/bootstrap-controller ok
[+]poststarthook/rbac/bootstrap-roles ok
[+]poststarthook/scheduling/bootstrap-system-priority-classes ok
[+]poststarthook/priority-and-fairness-config-producer ok
[+]poststarthook/start-cluster-authentication-info-controller ok
[+]poststarthook/aggregator-reload-proxy-client-cert ok
[+]poststarthook/start-kube-aggregator-informers ok
[+]poststarthook/apiservice-registration-controller ok
[+]poststarthook/apiservice-status-available-controller ok
[+]poststarthook/kube-apiserver-autoregistration ok
[+]autoregister-completion ok
[+]poststarthook/apiservice-openapi-controller ok
[+]shutdown ok
readyz check passed
```
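Beyond eyeballing the tables above, pod health can be checked mechanically, which is handy if this verification step later moves into the automation. The `awk` filter below is my own sketch, not part of the role; it reads `kubectl get pods -A --no-headers` style output on stdin, where column 4 is STATUS:

```shell
# check-pods: exit non-zero and name the offenders if any pod is not
# in Running or Completed state; otherwise print a healthy summary.
awk '$4 != "Running" && $4 != "Completed" { bad++; print "NOT READY:", $2 }
     END { if (bad) exit 1; print "all pods healthy" }'
```

Usage: `kubectl get pods -A --no-headers | awk '...'` (or with the `kc` alias used above), e.g. as a post-deploy gate in the playbook.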
OK, the role works as expected. The next step is to fold the `initial`, `cri`, and image-pull functionality into the Terraform image-build workflow.
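For the image-pull part, upstream kubeadm already provides `kubeadm config images pull`, which fetches the control-plane images through the configured CRI. The wrapper below is a hypothetical sketch (script name and guard are mine) of what such a provisioning step in the image-build pipeline could run, so instances created from the baked image skip the pull during `kubeadm init`/`join`:

```shell
#!/bin/sh
# bake-images.sh -- hypothetical image-bake step: pre-pull the Kubernetes
# control-plane images for the version deployed in this document.
set -e
if ! command -v kubeadm >/dev/null 2>&1; then
    # Guard for base images where the kubeadm/CRI layer is not installed yet.
    echo "kubeadm not installed, skipping"
    exit 0
fi
kubeadm config images pull --kubernetes-version v1.22.2
```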
…