kubeadm is the official Kubernetes tool for quickly installing a cluster, and it is updated in step with every Kubernetes release. As of 1.14 its main features are GA, but it still does not ship high availability out of the box; if you need HA you have to build it yourself, usually with Keepalived. For the principles behind making the Kubernetes master components highly available, see this earlier post.
Our production environment was also built with kubeadm. The cluster has run without problems, but there are pitfalls: by default the certificates kubeadm creates are valid for only one year. There are ways around this, but the process is tedious. Starting with 1.15, `kubeadm upgrade` renews the certificates automatically when upgrading the cluster.
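On 1.15 the expiration dates can also be inspected with kubeadm itself (the subcommand still lives under `alpha` in this release):

```bash
# Show when each cluster certificate expires (kubeadm 1.15)
kubeadm alpha certs check-expiration
```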
Our production clusters have gone from hand-rolled binary installs, to SaltStack, and finally to kubeadm. One more gripe about building Kubernetes clusters: Kubernetes is powerful, but quickly standing up a cluster is not that simple, and compared with how fast Kubernetes itself iterates, official support for cluster installation feels thin. That is why so many third-party installers exist, for example Kops: if you want a Kubernetes cluster on AWS, it is strongly recommended. We have bootstrapped several production clusters with it and it has been very hands-off.
The installation process mainly follows https://github.com/cookeem/kubeadm-ha.
Differences from the original author's setup:
- The `timeout` of the nginx TCP reverse proxy needs to be set fairly long, otherwise kubectl operations fail with `error: unexpected EOF`, and similar errors show up in the kube-controller-manager logs. This matters: set the timeout as long as your situation allows (see the sketch after this list).
- kube-proxy runs in ipvs mode.
- Calico uses the latest version.
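For illustration, here is a hypothetical excerpt in the spirit of the repo's `nginx-lb.conf.tpl`. The directives are standard ones from nginx's stream proxy module, but the values and upstream layout are assumptions, not the template's actual contents:

```nginx
stream {
    upstream apiserver {
        server 192.168.122.81:6443;   # kadm01
        server 192.168.122.82:6443;   # kadm02
        server 192.168.122.83:6443;   # kadm03
    }
    server {
        listen 16443;
        proxy_connect_timeout 10s;
        # A long proxy_timeout keeps long-lived connections (kubectl watches,
        # controller-manager leases) from being cut off with "unexpected EOF".
        proxy_timeout 24h;
        proxy_pass apiserver;
    }
}
```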
Environment Preparation
Initialize six CentOS 7 instances with the prepared KVM + Cobbler environment:
```bash
install.sh kadm01 192.168.122.81 30G 4096 8 CentOS-7.3-x86_64
install.sh kadm02 192.168.122.82 30G 4096 8 CentOS-7.3-x86_64
install.sh kadm03 192.168.122.83 30G 4096 8 CentOS-7.3-x86_64
install.sh knode01 192.168.122.91 80G 16384 16 CentOS-7.3-x86_64
install.sh knode02 192.168.122.92 80G 16384 16 CentOS-7.3-x86_64
install.sh knode03 192.168.122.93 80G 16384 16 CentOS-7.3-x86_64
for i in kadm01 kadm02 kadm03 knode01 knode02 knode03;do snapshot-create-as.sh $i system-init;done;
```
Node roles: 3 masters + 3 workers
```
192.168.122.84 kmaster #VIP
192.168.122.81 kadm01
192.168.122.82 kadm02
192.168.122.83 kadm03
192.168.122.91 knode01
192.168.122.92 knode02
192.168.122.93 knode03
```
Use the Aliyun yum mirrors
```bash
cat > /etc/yum.repos.d/CentOS7.repo << "EOF"
[base]
name=CentOS-$releasever - Base - mirrors.aliyun.com
failovermethod=priority
baseurl=http://mirrors.aliyun.com/centos/$releasever/os/$basearch/
http://mirrors.aliyuncs.com/centos/$releasever/os/$basearch/
http://mirrors.cloud.aliyuncs.com/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
#released updates
[updates]
name=CentOS-$releasever - Updates - mirrors.aliyun.com
failovermethod=priority
baseurl=http://mirrors.aliyun.com/centos/$releasever/updates/$basearch/
http://mirrors.aliyuncs.com/centos/$releasever/updates/$basearch/
http://mirrors.cloud.aliyuncs.com/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras - mirrors.aliyun.com
failovermethod=priority
baseurl=http://mirrors.aliyun.com/centos/$releasever/extras/$basearch/
http://mirrors.aliyuncs.com/centos/$releasever/extras/$basearch/
http://mirrors.cloud.aliyuncs.com/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus - mirrors.aliyun.com
failovermethod=priority
baseurl=http://mirrors.aliyun.com/centos/$releasever/centosplus/$basearch/
http://mirrors.aliyuncs.com/centos/$releasever/centosplus/$basearch/
http://mirrors.cloud.aliyuncs.com/centos/$releasever/centosplus/$basearch/
gpgcheck=1
enabled=0
gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
#contrib - packages by Centos Users
[contrib]
name=CentOS-$releasever - Contrib - mirrors.aliyun.com
failovermethod=priority
baseurl=http://mirrors.aliyun.com/centos/$releasever/contrib/$basearch/
http://mirrors.aliyuncs.com/centos/$releasever/contrib/$basearch/
http://mirrors.cloud.aliyuncs.com/centos/$releasever/contrib/$basearch/
gpgcheck=1
enabled=0
gpgkey=http://mirrors.aliyun.com/centos/RPM-GPG-KEY-CentOS-7
EOF
```
System update and kernel upgrade
```bash
# Update the system
yum update -y
# Add the ELRepo yum repository for newer kernels
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh https://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# List the newest available kernel; here it is 5.4.1-1.el7.elrepo
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
# Install the new kernel 5.4.1-1.el7.elrepo
yum --disablerepo="*" --enablerepo="elrepo-kernel" install -y kernel-ml-5.4.1-1.el7.elrepo
# Make the new kernel the default boot entry
grub2-set-default 0
# Reboot
reboot
# Verify the kernel version
uname -a
Linux kadm01 5.4.1-1.el7.elrepo.x86_64 #1 SMP Fri Nov 29 10:21:13 EST 2019 x86_64 x86_64 x86_64 GNU/Linux
# Check the OS release
cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
```
The kernel is upgraded because some newer, better Docker features depend on it; for example, the Overlay2 storage driver requires a kernel newer than 4.0.
Disable the firewalld firewall
```bash
systemctl stop firewalld
systemctl disable firewalld
```
Disable SELinux:
```bash
# Disable temporarily
setenforce 0
# Disable permanently
vim /etc/selinux/config
SELINUX=disabled
```
Allow bridged traffic to pass through iptables
```bash
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
# Apply the settings
sysctl --system
```
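One caveat: these sysctls only exist once the `br_netfilter` module is loaded, so load it first and persist it across reboots:

```bash
# The bridge-nf sysctls are provided by the br_netfilter kernel module
modprobe br_netfilter
# Have systemd load it at every boot
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
sysctl --system
```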
Disable swap
```bash
swapoff -a
# Comment out the swap entry in fstab
vim /etc/fstab
#/dev/mapper/centos-swap swap swap defaults 0 0
```
hosts file
```bash
cat <<EOF >> /etc/hosts
192.168.122.84 kmaster
192.168.122.81 kadm01
192.168.122.82 kadm02
192.168.122.83 kadm03
192.168.122.91 knode01
192.168.122.92 knode02
192.168.122.93 knode03
EOF
```
All of the steps above must be performed on every node.
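Since the steps are identical on all six machines, they can be scripted; a minimal sketch, assuming root SSH access and that the steps above have been collected into a hypothetical `node-init.sh`:

```bash
# Run the shared initialization steps on every node over SSH
for h in kadm01 kadm02 kadm03 knode01 knode02 knode03; do
  ssh root@$h 'bash -s' < node-init.sh
done
```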
Install Software
Docker
```bash
# Install the yum management utilities
yum install -y yum-utils
# Add the Aliyun docker-ce repository
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
# List available docker-ce versions; the latest is 19.x, which Kubernetes
# does not support yet, so install 18.x
yum list docker-ce --showduplicates
# Install docker-ce
yum install -y docker-ce-3:18.09.9-3.el7.x86_64
# Enable and start the docker service
systemctl enable docker && systemctl start docker
```
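Not part of the original setup, but the kubeadm documentation recommends running Docker with the systemd cgroup driver, together with the overlay2 storage driver the new kernel enables; a sketch:

```bash
# Recommended Docker daemon settings for kubeadm (assumption: these were
# not configured in the original post)
mkdir -p /etc/docker
cat > /etc/docker/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "storage-driver": "overlay2"
}
EOF
systemctl restart docker
```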
Kubernetes
```bash
# Configure the Kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
# List available kubelet versions
yum search kubelet --showduplicates
# Install the Kubernetes components
kubeVer=1.15.5-0
for i in kubectl kubelet kubeadm;do yum install -y $i-${kubeVer};done;
# Enable and start kubelet
systemctl enable kubelet && systemctl start kubelet
# Pin the component versions so yum update does not upgrade them
yum install -y yum-plugin-versionlock
yum versionlock docker-ce kubeadm kubectl kubelet
```
keepalived [master nodes only]
```bash
# Install keepalived and the LVS management tool
yum install -y keepalived ipvsadm
```
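Since kube-proxy will run in ipvs mode, the ipvs kernel modules should be loaded as well; a sketch (on kernels older than 4.19 the conntrack module is `nf_conntrack_ipv4` rather than `nf_conntrack`):

```bash
# Load the modules kube-proxy's ipvs mode relies on
for mod in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack; do
  modprobe -- $mod
done
# And have systemd load them at every boot
cat > /etc/modules-load.d/ipvs.conf <<'EOF'
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
EOF
```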
docker-compose [master nodes only]
```bash
curl -L https://github.com/docker/compose/releases/download/1.24.1/docker-compose-`uname -s`-`uname -m` -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
```
Prepare Configuration Files
```bash
git clone https://github.com/xnile/kubernetes-bootstrap.git
# The main configuration files:
├── create-config.sh          # script that generates all the config files below
├── kubeadm-config.yaml.tpl   # kubeadm init configuration template
├── calico
│   └── calico.yaml.tpl       # Calico network component template
├── keepalived
│   ├── check_apiserver.sh    # keepalived health-check script
│   └── keepalived.conf.tpl   # keepalived configuration template
└── nginx-lb
    ├── docker-compose.yaml   # config for running nginx-lb via docker-compose
    ├── nginx-lb.conf.tpl     # nginx-lb configuration template
    └── nginx-lb.yaml         # config for running nginx-lb as a kubelet static pod
```
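The keepalived failover hinges on `check_apiserver.sh`: keepalived runs it periodically and demotes the node when the local apiserver stops answering, so the VIP moves to another master. A hypothetical sketch of such a check (the real script is in the repo):

```bash
#!/bin/bash
# Exit non-zero when the local apiserver's health endpoint does not answer,
# so keepalived lowers this node's priority and the VIP fails over
curl --silent --fail --insecure --max-time 2 \
  https://127.0.0.1:6443/healthz -o /dev/null || exit 1
```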
Generate the files needed for cluster initialization
```bash
cd kubernetes-bootstrap
vim create-config.sh
# master keepalived virtual ip address
export K8SHA_VIP=192.168.122.84
# master01 ip address
export K8SHA_IP1=192.168.122.81
# master02 ip address
export K8SHA_IP2=192.168.122.82
# master03 ip address
export K8SHA_IP3=192.168.122.83
# master keepalived virtual ip hostname
export K8SHA_VHOST=kmaster
# master01 hostname
export K8SHA_HOST1=kadm01
# master02 hostname
export K8SHA_HOST2=kadm02
# master03 hostname
export K8SHA_HOST3=kadm03
# master01 network interface name
export K8SHA_NETINF1=eth0
# master02 network interface name
export K8SHA_NETINF2=eth0
# master03 network interface name
export K8SHA_NETINF3=eth0
# keepalived auth_pass config
export K8SHA_KEEPALIVED_AUTH=412f7dc3bfed32194d1600c483e10ad1d
# calico reachable ip address
export K8SHA_CALICO_REACHABLE_IP=192.168.122.1
# kubernetes CIDR pod subnet
export K8SHA_CIDR=10.1.0.0/16
# kubernetes CIDR SVC subnet
export K8SSVC_CIDR=10.254.0.0/16
```
Run the `create-config.sh` script to generate the configuration files.
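The exported variables live inside the script itself, so after editing it the invocation is simply (assuming you are in the repo root):

```bash
cd /root/kubernetes-bootstrap
./create-config.sh
```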
Configuration file inventory. After running `create-config.sh`, the following files are generated automatically:
- kubeadm-config.yaml: the kubeadm init configuration, in the repository root `./`
- keepalived: the keepalived configuration, placed in `/etc/keepalived` on each master node
- nginx-lb: the nginx-lb load balancer configuration, placed in `/root/nginx-lb` on each master node
- calico.yaml: the Calico deployment manifest, in the repository's `./calico` directory
Start the Load Balancer
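A minimal sketch of this step, assuming the generated files were distributed to each master as listed above: bring up nginx-lb with docker-compose and start keepalived so the VIP 192.168.122.84 comes alive on one of the masters.

```bash
# On every master node: start the nginx-lb TCP proxy and keepalived
cd /root/nginx-lb && docker-compose up -d
systemctl enable keepalived && systemctl start keepalived
# The VIP should now be bound on exactly one master
ip addr show eth0 | grep 192.168.122.84
```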
Initialize the Cluster
It is recommended to pre-pull the images Kubernetes needs first.
```bash
[root@kadm01 ~]# kubeadm --kubernetes-version=v1.15.6 config images list
k8s.gcr.io/kube-apiserver:v1.15.6
k8s.gcr.io/kube-controller-manager:v1.15.6
k8s.gcr.io/kube-scheduler:v1.15.6
k8s.gcr.io/kube-proxy:v1.15.6
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.3.10
k8s.gcr.io/coredns:1.3.1
# For well-known reasons, the default k8s.gcr.io images cannot be pulled from
# inside China. Thanks to Azure for providing a mirror.
[root@kadm01 ~]# kubeadm --kubernetes-version=v1.15.6 config images pull --image-repository gcr.azk8s.cn/google_containers
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-apiserver:v1.15.6
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-controller-manager:v1.15.6
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-scheduler:v1.15.6
[config/images] Pulled gcr.azk8s.cn/google_containers/kube-proxy:v1.15.6
[config/images] Pulled gcr.azk8s.cn/google_containers/pause:3.1
[config/images] Pulled gcr.azk8s.cn/google_containers/etcd:3.3.10
[config/images] Pulled gcr.azk8s.cn/google_containers/coredns:1.3.1
```
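If `kubeadm-config.yaml` does not already point `imageRepository` at the mirror, the pulled images can be retagged back to the k8s.gcr.io names kubeadm expects; a sketch:

```bash
# Retag the mirrored images to their default k8s.gcr.io names
for img in $(kubeadm --kubernetes-version=v1.15.6 config images list); do
  mirror=$(echo "$img" | sed 's#^k8s.gcr.io#gcr.azk8s.cn/google_containers#')
  docker tag "$mirror" "$img"
done
```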
Initialize the first master node
```bash
# Run the following on kadm01 to initialize the first master node; it takes a few minutes
kubeadm init --config=/root/kubernetes-bootstrap/kubeadm-config.yaml --upload-certs
```
```bash
# Very important: save the following output. It contains two commands that
# join nodes to the cluster, one as a master and one as a worker.
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of the control-plane node running the following command on each as root:
kubeadm join 192.168.122.84:16443 --token fv9auf.0h5vxtr05xyo15ur \
--discovery-token-ca-cert-hash sha256:07c7d42a86b8b501a2aa872fcb214dda7804258fee52f861288fcf85ca3c63ea \
--control-plane --certificate-key beaa9381e52b4001da8713de6fe1f81f526d19b9ae3ecd8ec970476770bca490
Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.122.84:16443 --token fv9auf.0h5vxtr05xyo15ur \
--discovery-token-ca-cert-hash sha256:07c7d42a86b8b501a2aa872fcb214dda7804258fee52f861288fcf85ca3c63ea
```
Configure the kubectl config file as the output above instructs:
```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```
Access the cluster
```bash
[root@kadm01 kubernetes-bootstrap]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kadm01 NotReady master 4m10s v1.15.5
```
The node is still NotReady because Calico has not been deployed yet.
Install the Calico Network Component
Calico's images are hosted on Docker Hub, so they can be pulled without a proxy, just somewhat slowly.
```bash
[root@kadm01 kubernetes-bootstrap]# kubectl apply -f /root/kubernetes-bootstrap/calico/calico.yaml
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.apps/calico-node created
serviceaccount/calico-node created
deployment.apps/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
# Check whether the calico pods come up properly
[root@kadm01 kubernetes-bootstrap]# kubectl -n kube-system get pods
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-845589bf5f-4pgcf 0/1 Running 0 27m
calico-node-zj46f 1/1 Running 0 27m
coredns-cf8fb6d7f-22dbb 1/1 Running 0 34m
coredns-cf8fb6d7f-t8lxm 1/1 Running 0 34m
etcd-kadm01 1/1 Running 0 33m
kube-apiserver-kadm01 1/1 Running 0 33m
kube-controller-manager-kadm01 1/1 Running 1 33m
kube-proxy-tmb96 1/1 Running 0 34m
kube-scheduler-kadm01 1/1 Running 1 33m
# Once everything is up, verify that the node is now Ready
[root@kadm01 kubernetes-bootstrap]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kadm01 Ready master 33m v1.15.5
```
Join the Other Master Nodes to the Control Plane
After kubeadm initializes the first master node, it prints two join commands; the first one joins a node to the cluster as a master.
Run the following command on the second and third master nodes, kadm02 and kadm03, to join them to the cluster as masters.
```bash
# Run this on the other master nodes, kadm02 and kadm03, to join the control plane
kubeadm join 192.168.122.84:16443 --token fv9auf.0h5vxtr05xyo15ur \
--discovery-token-ca-cert-hash sha256:07c7d42a86b8b501a2aa872fcb214dda7804258fee52f861288fcf85ca3c63ea \
--control-plane --certificate-key beaa9381e52b4001da8713de6fe1f81f526d19b9ae3ecd8ec970476770bca490
```
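As the init output warns, the uploaded certificates are deleted after two hours. If a master is joined later than that, re-upload them first to get a fresh `--certificate-key`:

```bash
# Run on an existing master; prints a new key to pass via --certificate-key
kubeadm init phase upload-certs --upload-certs
```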
Check the cluster status
```bash
[root@kadm01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kadm01 Ready master 12h v1.15.5
kadm02 Ready master 11h v1.15.5
kadm03 Ready master 11h v1.15.5
```
Add Worker Nodes
The second command printed after initializing the first master joins a node to the cluster as a worker.
Run the following on the worker nodes knode01 through knode03 to join them to the cluster as workers.
```bash
# Run this on each worker node to join the cluster as a worker
kubeadm join 192.168.122.84:16443 --token fv9auf.0h5vxtr05xyo15ur \
--discovery-token-ca-cert-hash sha256:07c7d42a86b8b501a2aa872fcb214dda7804258fee52f861288fcf85ca3c63ea
# Check the cluster node status
[root@kadm01 ~]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
kadm01 Ready master 11h v1.15.5
kadm02 Ready master 10h v1.15.5
kadm03 Ready master 10h v1.15.5
knode01 Ready <none> 10h v1.15.5
knode02 Ready <none> 10h v1.15.5
knode03 Ready <none> 10h v1.15.5
```
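Bootstrap tokens expire after 24 hours by default, so if a worker is added later, generate a fresh join command on any master:

```bash
# Prints a complete 'kubeadm join ...' command with a new token
kubeadm token create --print-join-command
```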
Label the worker nodes
```bash
for i in knode01 knode02 knode03;do kubectl label node $i node-role.kubernetes.io/node=;done
```
kubectl auto-completion
```bash
yum install bash-completion -y
echo "source <(kubectl completion bash)">> /root/.bashrc
echo "source /usr/share/bash-completion/bash_completion">>/root/.bashrc
source /root/.bashrc
```
Debugging
If `kubeadm join` hangs, add the `--v=2` flag to print verbose output.
View kubeadm's default init and join parameters
```bash
kubeadm config print init-defaults
kubeadm config print join-defaults
```
References
https://github.com/cookeem/kubeadm-ha
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/
https://kubernetes.io/blog/2018/12/04/production-ready-kubernetes-cluster-creation-with-kubeadm/
https://medium.com/@dominik.tornow/kubernetes-high-availability-d2c9cbbdd864