Kubernetes High Availability in Practice

Published by 彭楷淳 on 2021-02-18

Common Environment Configuration


Node Configuration

Hostname              IP               Role    OS                   CPU / Memory     Disk
kubernetes-master-01  192.168.141.150  Master  Ubuntu Server 18.04  2 cores / 2 GB   20 GB
kubernetes-master-02  192.168.141.151  Master  Ubuntu Server 18.04  2 cores / 2 GB   20 GB
kubernetes-master-03  192.168.141.152  Master  Ubuntu Server 18.04  2 cores / 2 GB   20 GB
kubernetes-node-01    192.168.141.160  Node    Ubuntu Server 18.04  2 cores / 4 GB   20 GB
kubernetes-node-02    192.168.141.161  Node    Ubuntu Server 18.04  2 cores / 4 GB   20 GB
kubernetes-node-03    192.168.141.162  Node    Ubuntu Server 18.04  2 cores / 4 GB   20 GB
Kubernetes VIP        192.168.141.200  -       -                    -                -

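Operating System Configuration

Important: complete the following steps while building the VMware image, so that you do not have to repeat them on every machine.

Disable swap

$ swapoff -a

Keep swap from being enabled at boot

Comment out every swap entry (the lines whose filesystem type is swap) in /etc/fstab.

$ vi /etc/fstab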
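
If you would rather not edit the file by hand, the same change can be scripted. The following is a minimal sketch, not part of the original procedure: the function name comment_swap_entries is made up for illustration, and it assumes swap entries contain a whitespace-delimited "swap" field, as they do in a standard fstab. It keeps a backup copy with a .bak suffix.

```shell
# comment_swap_entries FILE
# Prefix every line that is not already commented and that contains a
# whitespace-delimited "swap" field with "#". Other lines stay unchanged.
# A backup of the original file is written to FILE.bak.
comment_swap_entries() {
    sed -i.bak -E '/^[[:space:]]*#/! s/^(.*[[:space:]]swap[[:space:]].*)$/#\1/' "$1"
}
```

As root, this could be called as comment_swap_entries /etc/fstab. Review the result before rebooting.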

Disable the firewall

$ ufw disable

Configure DNS

Uncomment the DNS line and set a DNS server such as 114.114.114.114, then reboot the machine.

$ vi /etc/systemd/resolved.conf
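
For an image build it can be handy to script this edit as well. A minimal sketch, assuming the file still contains the stock "#DNS=" line (or an existing "DNS=" line). The function name set_resolved_dns is hypothetical.

```shell
# set_resolved_dns FILE ADDRESS
# Uncomment (if needed) and set the DNS= line in a resolved.conf style file.
# The pattern is anchored at the start of the line, so "#FallbackDNS=" is
# not touched. A backup is written to FILE.bak.
set_resolved_dns() {
    sed -i.bak -E "s/^#?DNS=.*/DNS=$2/" "$1"
}
```

As root: set_resolved_dns /etc/systemd/resolved.conf 114.114.114.114, followed by the reboot mentioned above.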

Install Docker

# Refresh the package index
$ sudo apt-get update
# Install the required dependencies
$ sudo apt-get -y install apt-transport-https ca-certificates curl software-properties-common
# Add the GPG key
$ curl -fsSL http://mirrors.aliyun.com/docker-ce/linux/ubuntu/gpg | sudo apt-key add -
# Add the package repository
$ sudo add-apt-repository "deb [arch=amd64] http://mirrors.aliyun.com/docker-ce/linux/ubuntu $(lsb_release -cs) stable"
# Refresh the package index again
$ sudo apt-get -y update
# Install Docker CE
$ sudo apt-get -y install docker-ce

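Configure a Docker Registry Mirror

Public registry mirrors in mainland China can be slow, so replace the address below with your own Alibaba Cloud mirror address, for example https://yourself.mirror.aliyuncs.com. You can find it in the Alibaba Cloud console under 容器镜像服务 > 镜像加速器 (Container Registry > Registry Mirror).

Write the following to /etc/docker/daemon.json (create the file if it does not exist), then restart Docker so that the change takes effect.

{
  "registry-mirrors": [
    "https://registry.docker-cn.com"
  ]
}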

Install kubeadm, kubelet, and kubectl

# Install the system tools
$ apt-get update && apt-get install -y apt-transport-https

# Add the GPG key
$ curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -

# Write the package source. Note: our release codename is bionic, but the Alibaba Cloud mirror does not support it yet, so we keep using xenial from 16.04
$ cat << EOF >/etc/apt/sources.list.d/kubernetes.list
> deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
> EOF

# Install
$ apt-get update && apt-get install -y kubelet kubeadm kubectl

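Synchronize Time

Set the time zone

$ dpkg-reconfigure tzdata

Choose Asia, then Shanghai.

Synchronize the clock

# Install ntpdate
$ apt-get install ntpdate

# Synchronize the system time with network time (cn.pool.ntp.org is a public NTP pool located in China)
$ ntpdate cn.pool.ntp.org

# Write the system time to the hardware clock
$ hwclock --systohc

Verify the time

$ date

# Sample output (check that it matches the actual current time)
Sun Jun 2 22:02:35 CST 2019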

Configure IPVS

# Install the system tools
$ apt-get install -y ipset ipvsadm

# Create the script that loads the IPVS modules
$ mkdir -p /etc/sysconfig/modules/
$ vi /etc/sysconfig/modules/ipvs.modules

# Enter the following content
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4

# Run the script. Note: it has to be run again after every reboot
$ chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4

# The command prints output like the following
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 147456 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack_ipv4 16384 3
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4
nf_conntrack 131072 8 xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink,ip_vs
libcrc32c 16384 4 nf_conntrack,nf_nat,raid456,ip_vs
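
The script above has to be rerun after each reboot. One possible alternative, which the original setup does not use, is to let systemd load the modules at boot: systemd-modules-load reads files in /etc/modules-load.d/ that list one module name per line. A minimal sketch, where the function name and the target file name ipvs.conf are assumptions:

```shell
# write_ipvs_modules_conf FILE
# Write the five kernel modules used in this guide, one per line, in the
# format expected by systemd-modules-load (modules-load.d).
write_ipvs_modules_conf() {
    printf '%s\n' ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4 > "$1"
}
```

As root: write_ipvs_modules_conf /etc/modules-load.d/ipvs.conf. The module list matches the Ubuntu 18.04 kernel used here. Newer kernels may name the connection tracking module differently.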

Configure Kernel Parameters

# Edit the parameter file
$ vi /etc/sysctl.d/k8s.conf

# Enter the following content
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.swappiness=0

# Apply the parameters
$ sysctl --system

# Output of applying the parameters (look for the section that starts with "Applying /etc/sysctl.d/k8s.conf")
* Applying /etc/sysctl.d/10-console-messages.conf ...
kernel.printk = 4 4 1 7
* Applying /etc/sysctl.d/10-ipv6-privacy.conf ...
* Applying /etc/sysctl.d/10-kernel-hardening.conf ...
kernel.kptr_restrict = 1
* Applying /etc/sysctl.d/10-link-restrictions.conf ...
fs.protected_hardlinks = 1
fs.protected_symlinks = 1
* Applying /etc/sysctl.d/10-lxd-inotify.conf ...
fs.inotify.max_user_instances = 1024
* Applying /etc/sysctl.d/10-magic-sysrq.conf ...
kernel.sysrq = 176
* Applying /etc/sysctl.d/10-network-security.conf ...
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.tcp_syncookies = 1
* Applying /etc/sysctl.d/10-ptrace.conf ...
kernel.yama.ptrace_scope = 1
* Applying /etc/sysctl.d/10-zeropage.conf ...
vm.mmap_min_addr = 65536
* Applying /usr/lib/sysctl.d/50-default.conf ...
net.ipv4.conf.all.promote_secondaries = 1
net.core.default_qdisc = fq_codel
* Applying /etc/sysctl.d/99-sysctl.conf ...
* Applying /etc/sysctl.d/k8s.conf ...
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
* Applying /etc/sysctl.conf ...

Modify cloud.cfg

$ vi /etc/cloud/cloud.cfg

# This setting defaults to false; change it to true
preserve_hostname: true

Per-Node Configuration


Important: configure the IP address and hostname individually on each Master and Node.

Configure the IP Address

Edit the configuration file

$ vi /etc/netplan/50-cloud-init.yaml

Change the content as follows

network:
  ethernets:
    ens33:
      # My Masters use 150 - 152 and my Nodes use 160 - 162
      addresses: [192.168.141.150/24]
      gateway4: 192.168.141.2
      nameservers:
        addresses: [192.168.141.2]
  version: 2

Run netplan apply to make the configuration take effect.

Configure the Hostname

# Change the hostname
$ hostnamectl set-hostname kubernetes-master-01

# Configure hosts
$ cat >> /etc/hosts << EOF
192.168.141.150 kubernetes-master-01
EOF

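Install HAProxy + Keepalived


Overview

A Kubernetes Master node runs the following components:

  • kube-apiserver: the single entry point for resource operations; it also provides authentication, authorization, access control, and API registration and discovery
  • kube-scheduler: schedules resources, placing Pods on suitable machines according to the configured scheduling policy
  • kube-controller-manager: maintains cluster state, for example fault detection, automatic scaling, and rolling updates
  • etcd: a distributed key-value store built on Raft by CoreOS, usable for service discovery, shared configuration, and consistency guarantees (such as database leader election and distributed locks)

kube-scheduler and kube-controller-manager can run in cluster mode: leader election picks one working process, and the other processes stay blocked on standby.

kube-apiserver can run as multiple instances, but the other components need a single address to reach it. Deploying a highly available Kubernetes cluster in this chapter comes down to using HAProxy + Keepalived to provide that address.

The idea is to use HAProxy + Keepalived to expose kube-apiserver through a virtual IP, which gives both high availability and load balancing. In detail:

  • Keepalived provides the virtual IP (VIP) through which kube-apiserver is served
  • HAProxy listens on the Keepalived VIP
  • Nodes that run Keepalived and HAProxy are called LB (load balancer) nodes
  • Keepalived runs in a one-primary, multiple-standby mode, so at least two LB nodes are required
  • While running, Keepalived periodically checks the state of the local HAProxy process. If it detects that HAProxy is unhealthy, it triggers a new election and the VIP moves to the newly elected primary, which keeps the VIP highly available
  • All components (kubectl, apiserver, controller-manager, scheduler, and so on) reach kube-apiserver through the VIP and port 6444, on which HAProxy listens. Note: kube-apiserver listens on port 6443 by default, so to avoid a conflict we set the HAProxy port to 6444, and every other component sends its apiserver requests to that port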
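
To make the HAProxy half of this design concrete, here is a hedged sketch of what a TCP pass-through configuration for three apiservers could look like. It is not the configuration shipped in the wise2c/haproxy-k8s image used below, whose internals are not shown here. The function name render_haproxy_cfg, the section names, and the timeout values are all assumptions chosen for illustration.

```shell
# render_haproxy_cfg LISTEN_PORT BACKEND_PORT MASTER_IP...
# Print a minimal HAProxy configuration that accepts TCP connections on
# LISTEN_PORT and balances them round-robin across the given Master IPs
# on BACKEND_PORT, with health checks enabled on every backend server.
render_haproxy_cfg() {
    listen_port=$1
    backend_port=$2
    shift 2
    cat <<EOF
defaults
    mode tcp
    timeout connect 5s
    timeout client 1m
    timeout server 1m

frontend kube-apiserver
    bind *:${listen_port}
    default_backend kube-masters

backend kube-masters
    balance roundrobin
EOF
    i=1
    for ip in "$@"; do
        echo "    server master-${i} ${ip}:${backend_port} check"
        i=$((i + 1))
    done
}
```

With the addresses from the node table, render_haproxy_cfg 6444 6443 192.168.141.150 192.168.141.151 192.168.141.152 prints a frontend bound to port 6444 and three backend servers on port 6443. TCP mode is used because the apiserver terminates TLS itself.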

Create the HAProxy Startup Script

Run this step on kubernetes-master-01.

$ mkdir -p /usr/local/kubernetes/lb
$ vi /usr/local/kubernetes/lb/start-haproxy.sh

# Enter the following content
#!/bin/bash
# Change these to your own Master addresses
MasterIP1=192.168.141.150
MasterIP2=192.168.141.151
MasterIP3=192.168.141.152
# Default kube-apiserver port; no need to change it
MasterPort=6443

# Publish the container's HAProxy port 6444
docker run -d --restart=always --name HAProxy-K8S -p 6444:6444 \
        -e MasterIP1=$MasterIP1 \
        -e MasterIP2=$MasterIP2 \
        -e MasterIP3=$MasterIP3 \
        -e MasterPort=$MasterPort \
        wise2c/haproxy-k8s

# Make the script executable
$ chmod +x /usr/local/kubernetes/lb/start-haproxy.sh

Create the Keepalived Startup Script

Run this step on kubernetes-master-01.

$ mkdir -p /usr/local/kubernetes/lb
$ vi /usr/local/kubernetes/lb/start-keepalived.sh

# Enter the following content
#!/bin/bash
# Change this to your own virtual IP address
VIRTUAL_IP=192.168.141.200
# Name of the network interface that carries the virtual IP
INTERFACE=ens33
# Netmask (prefix length) of the virtual IP
NETMASK_BIT=24
# Port exposed by HAProxy, which forwards to kube-apiserver on port 6443
CHECK_PORT=6444
# Router ID
RID=10
# Virtual router ID
VRID=160
# IPv4 multicast address, 224.0.0.18 by default
MCAST_GROUP=224.0.0.18

docker run -itd --restart=always --name=Keepalived-K8S \
        --net=host --cap-add=NET_ADMIN \
        -e VIRTUAL_IP=$VIRTUAL_IP \
        -e INTERFACE=$INTERFACE \
        -e CHECK_PORT=$CHECK_PORT \
        -e RID=$RID \
        -e VRID=$VRID \
        -e NETMASK_BIT=$NETMASK_BIT \
        -e MCAST_GROUP=$MCAST_GROUP \
        wise2c/keepalived-k8s

# Make the script executable
$ chmod +x /usr/local/kubernetes/lb/start-keepalived.sh

Copy the Scripts to the Other Masters

On kubernetes-master-02 and kubernetes-master-03, create the working directory

$ mkdir -p /usr/local/kubernetes/lb

Copy the scripts from kubernetes-master-01 to the other Masters

$ cd /usr/local/kubernetes/lb
$ scp start-haproxy.sh start-keepalived.sh 192.168.141.151:/usr/local/kubernetes/lb
$ scp start-haproxy.sh start-keepalived.sh 192.168.141.152:/usr/local/kubernetes/lb

Start the containers on each of the three Masters (run the scripts)

$ sh /usr/local/kubernetes/lb/start-haproxy.sh && sh /usr/local/kubernetes/lb/start-keepalived.sh

Verify That It Worked

List the containers

$ docker ps

# Sample output
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f50df479ecae wise2c/keepalived-k8s "/usr/bin/keepalived…" About an hour ago Up About an hour Keepalived-K8S
75066a7ed2fb wise2c/haproxy-k8s "/docker-entrypoint.…" About an hour ago Up About an hour 0.0.0.0:6444->6444/tcp HAProxy-K8S

Check the virtual IP bound to the network interface

$ ip a | grep ens33

# Sample output
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
inet 192.168.141.151/24 brd 192.168.141.255 scope global ens33
inet 192.168.141.200/24 scope global secondary ens33

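Important: Keepalived probes port 6444, on which HAProxy listens. If the probe fails, Keepalived concludes that the local HAProxy process is unhealthy and moves the VIP to another node. As a result, a failure of either the local Keepalived container or the local HAProxy container causes the VIP to move to another node.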

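Deploy the Kubernetes Cluster


Initialize the Master

Create the working directory and export the configuration file

# Create the working directory and switch to it
$ mkdir -p /usr/local/kubernetes/cluster
$ cd /usr/local/kubernetes/cluster

# Export the default configuration into the working directory
$ kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml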

Edit the configuration file

apiVersion: kubeadm.k8s.io/v1beta1
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  # Change to the IP of the primary Master
  advertiseAddress: 192.168.141.150
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: kubernetes-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
# Keepalived VIP address and HAProxy port
controlPlaneEndpoint: "192.168.141.200:6444"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
# Google registries are not reachable from mainland China, so use the Alibaba Cloud mirror
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
# Set the version number
kubernetesVersion: v1.14.2
networking:
  dnsDomain: cluster.local
  # Set to Calico's default network
  podSubnet: "192.168.0.0/16"
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
# Enable IPVS mode
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs

Run kubeadm init

# Initialize the control plane
$ kubeadm init --config=kubeadm.yml --experimental-upload-certs | tee kubeadm-init.log

# Configure kubectl
$ mkdir -p $HOME/.kube
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ chown $(id -u):$(id -g) $HOME/.kube/config

# Verify that it worked
$ kubectl get node

Install the Network Plugin

# Install Calico
$ kubectl apply -f https://docs.projectcalico.org/v3.7/manifests/calico.yaml

# Verify that the installation succeeded
$ watch kubectl get pods --all-namespaces

Join the Master Nodes

Take the join command from kubeadm-init.log and run it on kubernetes-master-02 and kubernetes-master-03 to add them as Masters.

# Example command
$ kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:56d53268517c132ae81c868ce99c44be797148fb2923e59b49d73c99782ff21f \
        --experimental-control-plane --certificate-key c4d1525b6cce4b69c11c18919328c826f92e660e040a46f5159431d5ff0545bd

Join the Worker Nodes

Take the join command from kubeadm-init.log and run it on kubernetes-node-01 through kubernetes-node-03 to add them as Nodes.

# Example command
$ kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:56d53268517c132ae81c868ce99c44be797148fb2923e59b49d73c99782ff21f

Verify the Cluster State

List the Nodes

$ kubectl get nodes -o wide

List the Pods

$ kubectl -n kube-system get pod -o wide

List the Services

$ kubectl -n kube-system get svc

Verify IPVS: the kube-proxy log should contain a line like "server_others.go:176] Using ipvs Proxier."

$ kubectl -n kube-system logs -f <kube-proxy pod name>

List the proxy rules

$ ipvsadm -ln

Show the effective configuration

$ kubectl -n kube-system get cm kubeadm-config -oyaml

Check the etcd cluster

$ kubectl -n kube-system exec etcd-kubernetes-master-01 -- etcdctl \
        --endpoints=https://192.168.141.150:2379 \
        --ca-file=/etc/kubernetes/pki/etcd/ca.crt \
        --cert-file=/etc/kubernetes/pki/etcd/server.crt \
        --key-file=/etc/kubernetes/pki/etcd/server.key cluster-health

# Sample output
member 1dfaf07371bb0cb6 is healthy: got healthy result from https://192.168.141.152:2379
member 2da85730b52fbeb2 is healthy: got healthy result from https://192.168.141.150:2379
member 6a3153eb4faaaffa is healthy: got healthy result from https://192.168.141.151:2379
cluster is healthy

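Verify High Availability

Important: Keepalived needs at least 2 standby nodes, so test high availability with at least a 1 primary + 2 standby setup; otherwise you may run into unexpected problems.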
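
There is a second reason to test with three Masters, beyond Keepalived itself. Each Master in this setup also runs an etcd member (see the etcd cluster check in the previous section), and an etcd cluster only keeps accepting writes while a majority of its members is up. The arithmetic can be sketched as follows. The function names are made up for illustration.

```shell
# etcd_quorum N: number of members that must be up in an N-member cluster
etcd_quorum() {
    echo $(( $1 / 2 + 1 ))
}

# etcd_fault_tolerance N: number of members that may fail while a majority remains
etcd_fault_tolerance() {
    echo $(( ($1 - 1) / 2 ))
}

etcd_quorum 3            # prints 2
etcd_fault_tolerance 3   # prints 1
etcd_fault_tolerance 2   # prints 0
```

With three Masters one machine can be shut down safely. With only two, losing either one already costs the majority.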

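Shut down any one of the Master machines

$ shutdown -h now

On any remaining Master, check the Node status

$ kubectl get node

# Sample output. The test succeeded if only the machine that was shut down is NotReady and the others are normal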
NAME STATUS ROLES AGE VERSION
kubernetes-master-01 NotReady master 18m v1.14.2
kubernetes-master-02 Ready master 17m v1.14.2
kubernetes-master-03 Ready master 16m v1.14.2

Check that the VIP has moved

$ ip a | grep ens33

# Sample output
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
inet 192.168.141.151/24 brd 192.168.141.255 scope global ens33
inet 192.168.141.200/24 scope global secondary ens33

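Fixing Nodes That Cannot Join


Problem Description

After a Node joins the cluster through kubeadm join, every kubectl command becomes unusable (it blocks and never returns a result). Running kubeadm reset on the Node takes it out of the cluster again. Back on the Master, watch kubectl get pods --all-namespaces then shows an error: the coredns-xxx-xxx pods are in the CrashLoopBackOff state.

Solution

The error clearly points to a network problem, and the only network plugin installed so far is Calico, so is Calico the cause? With that question in mind, go back to the Calico documentation and read its notes again: https://docs.projectcalico.org/v3.7/getting-started/kubernetes/

Its Quickstart contains two passages marked as special notes.

The first note says, in essence: after kubeadm is installed, do not power off the machine; continue with the remaining installation steps. In other words, the Kubernetes installation should be done in one go without interruption (though that is not the main point here) (* ̄rǒ ̄)

The second note says, in essence: if your network lies within 192.168.0.0/16, you must choose a different Pod network. As it happens, our network range (my virtual machines use 192.168.141.0/24) overlaps with that range (ノへ ̄、). I had overlooked this because the single-node cluster I built earlier worked fine ♪(^∇^*)

So the problem occurs because the virtual machines' IP range overlaps with Calico's default network. To fix it, change Calico's network (changing the virtual machines' network works too). In other words, everything has to be reinstalled o (一︿一 +) o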
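
A quick way to catch this kind of clash before installing is to test the two ranges for overlap. The sketch below is an illustration only. The function names are made up, it handles IPv4 only, and it relies on the fact that two CIDR blocks overlap exactly when they agree on the shorter of their two prefixes.

```shell
# ip_to_int A.B.C.D: print the IPv4 address as a 32-bit integer
ip_to_int() {
    IFS=. read -r a b c d <<EOF
$1
EOF
    echo $(( ($a << 24) | ($b << 16) | ($c << 8) | $d ))
}

# cidr_overlaps CIDR1 CIDR2: exit status 0 if the two IPv4 ranges overlap
cidr_overlaps() {
    ip1=$(ip_to_int "${1%/*}"); len1=${1#*/}
    ip2=$(ip_to_int "${2%/*}"); len2=${2#*/}
    # Compare both network addresses under the shorter prefix length
    if [ "$len1" -lt "$len2" ]; then len=$len1; else len=$len2; fi
    mask=$(( (4294967295 << (32 - len)) & 4294967295 ))
    [ $(( ip1 & mask )) -eq $(( ip2 & mask )) ]
}

# Calico default vs. the VM network used in this guide
cidr_overlaps 192.168.0.0/16 192.168.141.0/24 && echo "overlap"      # prints overlap
# Replacement pod network chosen below vs. the VM network
cidr_overlaps 10.244.0.0/16 192.168.141.0/24 || echo "no overlap"    # prints no overlap
```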

Reinstall by following the standard steps below.

Reset Kubernetes

$ kubeadm reset

# Sample output
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
W0604 01:55:28.517280 22688 reset.go:234] [reset] No kubeadm config, using etcd pod spec to get data directory
[reset] No etcd config found. Assuming external etcd
[reset] Please manually reset etcd to prevent further issues
[reset] Stopping the kubelet service
[reset] unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of stateful directories: [/var/lib/kubelet /etc/cni/net.d /var/lib/dockershim /var/run/kubernetes]
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually.
For example:
iptables -F && iptables -t nat -F && iptables -t mangle -F && iptables -X

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

Delete the kubectl configuration

$ rm -fr ~/.kube/

Enable IPVS

$ modprobe -- ip_vs
$ modprobe -- ip_vs_rr
$ modprobe -- ip_vs_wrr
$ modprobe -- ip_vs_sh
$ modprobe -- nf_conntrack_ipv4

Export and edit the configuration file

$ kubeadm config print init-defaults --kubeconfig ClusterConfiguration > kubeadm.yml

Edit the configuration file as follows

apiVersion: kubeadm.k8s.io/v1beta1
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.141.150
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: kubernetes-master-01
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta1
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "192.168.141.200:6444"
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.14.2
networking:
  dnsDomain: cluster.local
  # The key change: replace Calico's default network with one that does not overlap the VM network (this is Flannel's default network)
  podSubnet: "10.244.0.0/16"
  serviceSubnet: 10.96.0.0/12
scheduler: {}
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  SupportIPVSProxyMode: true
mode: ipvs

Run kubeadm init

$ kubeadm init --config=kubeadm.yml --experimental-upload-certs | tee kubeadm-init.log

# Sample output
[init] Using Kubernetes version: v1.14.2
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes-master-01 localhost] and IPs [192.168.141.150 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes-master-01 localhost] and IPs [192.168.141.150 127.0.0.1 ::1]
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes-master-01 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.141.150 192.168.141.200]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 24.507568 seconds
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.14" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in ConfigMap "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
a662b8364666f82c93cc5cd4fb4fabb623bbe9afdb182da353ac40f1752dfa4a
[mark-control-plane] Marking the node kubernetes-master-01 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kubernetes-master-01 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:2ea8c138021fb1e184a24ed2a81c16c92f9f25c635c73918b1402df98f9c8aad \
--experimental-control-plane --certificate-key a662b8364666f82c93cc5cd4fb4fabb623bbe9afdb182da353ac40f1752dfa4a

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --experimental-upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
--discovery-token-ca-cert-hash sha256:2ea8c138021fb1e184a24ed2a81c16c92f9f25c635c73918b1402df98f9c8aad

Configure kubectl

# Configure kubectl
$ mkdir -p $HOME/.kube
$ cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
$ chown $(id -u):$(id -g) $HOME/.kube/config

# Verify that it worked
$ kubectl get node

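Download and Edit the Calico Manifest

$ wget https://docs.projectcalico.org/v3.7/manifests/calico.yaml
$ vi calico.yaml

On line 611, change 192.168.0.0/16 to 10.244.0.0/16. The following vi commands help you find it quickly:

  • Show line numbers: :set number
  • Search: /text to find, then press lowercase n for the next match or uppercase N for the previous match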
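
The same substitution can be made without opening an editor. A minimal sketch, assuming the default pool address appears in the downloaded manifest literally as 192.168.0.0/16. The function name set_calico_pool_cidr is hypothetical. A backup is kept with a .bak suffix.

```shell
# set_calico_pool_cidr FILE NEW_CIDR
# Replace every literal occurrence of Calico's default pool 192.168.0.0/16
# in FILE with NEW_CIDR. "#" is used as the sed delimiter because the CIDR
# notation itself contains "/".
set_calico_pool_cidr() {
    sed -i.bak "s#192\.168\.0\.0/16#$2#g" "$1"
}
```

For example: set_calico_pool_cidr calico.yaml 10.244.0.0/16, then check the result with grep -n 10.244.0.0/16 calico.yaml before applying the manifest.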

Install Calico

$ kubectl apply -f calico.yaml

# Sample output
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
daemonset.extensions/calico-node created
serviceaccount/calico-node created
deployment.extensions/calico-kube-controllers created
serviceaccount/calico-kube-controllers created

Join the Master Nodes

# Example command. Do not forget to join both standby Masters
$ kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:2ea8c138021fb1e184a24ed2a81c16c92f9f25c635c73918b1402df98f9c8aad \
        --experimental-control-plane --certificate-key a662b8364666f82c93cc5cd4fb4fabb623bbe9afdb182da353ac40f1752dfa4a

# Sample output
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes-master-02 kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 192.168.141.151 192.168.141.200]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [kubernetes-master-02 localhost] and IPs [192.168.141.151 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [kubernetes-master-02 localhost] and IPs [192.168.141.151 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Checking that the etcd cluster is healthy
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[etcd] Announced new etcd member joining to the existing etcd cluster
[etcd] Wrote Static Pod manifest for a local etcd member to "/etc/kubernetes/manifests/etcd.yaml"
[etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s
[upload-config] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[mark-control-plane] Marking the node kubernetes-master-02 as control-plane by adding the label "node-role.kubernetes.io/master=''"
[mark-control-plane] Marking the node kubernetes-master-02 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.

Join the Worker Nodes

# Example command
$ kubeadm join 192.168.141.200:6444 --token abcdef.0123456789abcdef \
        --discovery-token-ca-cert-hash sha256:2ea8c138021fb1e184a24ed2a81c16c92f9f25c635c73918b1402df98f9c8aad

# Sample output
[preflight] Running pre-flight checks
[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.14" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.

Verify That the Cluster Works

$ kubectl get node

# Sample output. The Node has come online successfully ━━( ̄ー ̄*|||━━
NAME STATUS ROLES AGE VERSION
kubernetes-master-01 Ready master 19m v1.14.2
kubernetes-master-02 Ready master 4m46s v1.14.2
kubernetes-master-03 Ready master 3m23s v1.14.2
kubernetes-node-01 Ready <none> 74s v1.14.2

$ watch kubectl get pods --all-namespaces

# Sample output. coredns is now running normally as well
Every 2.0s: kubectl get pods --all-namespaces kubernetes-master-01: Tue Jun 4 02:31:43 2019

NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-8646dd497f-hz5xp 1/1 Running 0 9m9s
kube-system calico-node-2z892 1/1 Running 0 9m9s
kube-system calico-node-fljxv 1/1 Running 0 6m39s
kube-system calico-node-vprlw 1/1 Running 0 5m16s
kube-system calico-node-xvqcx 1/1 Running 0 3m7s
kube-system coredns-8686dcc4fd-5ndjm 1/1 Running 0 21m
kube-system coredns-8686dcc4fd-zxtql 1/1 Running 0 21m
kube-system etcd-kubernetes-master-01 1/1 Running 0 20m
kube-system etcd-kubernetes-master-02 1/1 Running 0 6m37s
kube-system etcd-kubernetes-master-03 1/1 Running 0 5m14s
kube-system kube-apiserver-kubernetes-master-01 1/1 Running 0 20m
kube-system kube-apiserver-kubernetes-master-02 1/1 Running 0 6m37s
kube-system kube-apiserver-kubernetes-master-03 1/1 Running 0 5m14s
kube-system kube-controller-manager-kubernetes-master-01 1/1 Running 1 20m
kube-system kube-controller-manager-kubernetes-master-02 1/1 Running 0 6m37s
kube-system kube-controller-manager-kubernetes-master-03 1/1 Running 0 5m14s
kube-system kube-proxy-68jqr 1/1 Running 0 3m7s
kube-system kube-proxy-69bnn 1/1 Running 0 6m39s
kube-system kube-proxy-vvhp5 1/1 Running 0 5m16s
kube-system kube-proxy-ws6wx 1/1 Running 0 21m
kube-system kube-scheduler-kubernetes-master-01 1/1 Running 1 20m
kube-system kube-scheduler-kubernetes-master-02 1/1 Running 0 6m37s
kube-system kube-scheduler-kubernetes-master-03 1/1 Running 0 5m14s

With that, the Kubernetes high-availability cluster is fully deployed.

