
Installing a Kubernetes Cluster with kubeasz

1 Node Planning

Because the machines are fairly low-spec, a minimal layout is used; every node is 2 CPU / 2 GB RAM.

| Role | Count | Notes |
| --- | --- | --- |
| etcd node | 1 | an etcd cluster needs an odd number of members (1, 3, 5, ...); usually co-located on the master nodes |
| master node | 1 | a highly available cluster needs at least 2 master nodes |
| node | 2 | nodes that run application workloads; raise machine specs / add nodes as needed |

image-20220425162300417

For a production environment, the following layout can serve as a reference:

| Type | Notes |
| --- | --- |
| ansible * 2 | k8s deployment servers; can be shared with other servers |
| k8s master * 3 | k8s control plane, made highly available behind a single VIP in active/standby mode |
| harbor * 2 | highly available image registry servers |
| etcd * 3 | servers that store the k8s cluster data |
| haproxy * 2 | highly available etcd proxy servers |
| node * (2 - N) | servers that actually run containers; at least two for a highly available environment |

2 Linux Kernel Upgrade

Many features of k8s, docker, cilium, and similar components require a relatively new Linux kernel, so it is worth upgrading the kernel before deploying the cluster; on CentOS 7 and Ubuntu 16.04 the kernel can be upgraded quite easily.

CentOS7

The ELRepo repository for Red Hat Enterprise Linux (https://www.elrepo.org) mainly provides hardware drivers (graphics, network, sound cards, etc.) and kernel upgrade resources, and is compatible with CentOS 7 kernel upgrades. Following the site's instructions, import the elrepo public key, install the latest elrepo release, and then upgrade the kernel step by step (the long-term support kernel kernel-lt is used as the example):

# import the public key
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# install ELRepo
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
# load the elrepo-kernel metadata
yum --disablerepo=\* --enablerepo=elrepo-kernel repolist
# list the available kernel packages
yum --disablerepo=\* --enablerepo=elrepo-kernel list kernel*
# install the long-term support kernel
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt.x86_64
# remove the old kernel tools packages
yum remove kernel-tools-libs.x86_64 kernel-tools.x86_64 -y
# install the new kernel tools packages
yum --disablerepo=\* --enablerepo=elrepo-kernel install -y kernel-lt-tools.x86_64

# check the default boot order
awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux (4.4.183-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-c52097a1078c403da03b8eddeac5080b) 7 (Core)
# entries are numbered from 0 and the new kernel is inserted at the top
# (here the new 4.4 kernel is at 0 and the old 3.10 kernel is at 1), so select 0
grub2-set-default 0
# reboot and verify

On CentOS 7.9 the first entry returned is the rescue-mode kernel, so entry 1 should be chosen as the default boot entry; be careful not to pick the wrong one:

[root@node-1 ~]# awk -F\' '$1=="menuentry " {print $2}' /etc/grub2.cfg
CentOS Linux 7 Rescue f59f5a8d46be4d74b70ef1356f76733e (5.4.190-1.el7.elrepo.x86_64)
CentOS Linux (5.4.190-1.el7.elrepo.x86_64) 7 (Core)
CentOS Linux (3.10.0-1160.62.1.el7.x86_64) 7 (Core)
CentOS Linux (3.10.0-1160.el7.x86_64) 7 (Core)
CentOS Linux (0-rescue-a8d0483a1b314a91ad7252d9baf67dcc) 7 (Core)
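
For example (a sketch: the menu title is taken from the listing above, and grub2-set-default also accepts the numeric index):

grub2-set-default 'CentOS Linux (5.4.190-1.el7.elrepo.x86_64) 7 (Core)'
grub2-editenv list        # confirm saved_entry points at the new kernel
reboot
# after the reboot
uname -r                  # should print 5.4.190-1.el7.elrepo.x86_64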

3 Basic Environment Preparation

Install Python

yum update
# install python
yum install python -y

Install Ansible on the deployment node

# note: pip 21.0 dropped support for python2 and python3.5, so install it as follows
# To install pip for Python 2.7 install it from https://bootstrap.pypa.io/2.7/ :
curl -O https://bootstrap.pypa.io/pip/2.7/get-pip.py
python get-pip.py
python -m pip install --upgrade "pip < 21.0"

# install ansible with pip (use the Aliyun PyPI mirror if downloads are slow in China)
pip install ansible -i https://mirrors.aliyun.com/pypi/simple/
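
The install can be sanity-checked before moving on, for example:

# confirm ansible landed in the python2 environment
ansible --version
python -m pip show ansible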

Configure passwordless SSH login on the ansible control node

# Ed25519, the more secure algorithm
ssh-keygen -t ed25519 -N '' -f ~/.ssh/id_ed25519
# or the traditional RSA algorithm
ssh-keygen -t rsa -b 2048 -N '' -f ~/.ssh/id_rsa

ssh-copy-id $IPs # $IPs is every node address, including this host; answer yes and enter the root password when prompted

# create a python symlink on every node
ssh $IPs ln -s /usr/bin/python3 /usr/bin/python

Appendix:

Public key distribution script

#!/bin/bash
# target host list
IP=""
for node in ${IP}; do
  sshpass -p 123456 ssh-copy-id ${node} -o StrictHostKeyChecking=no
  if [ $? -eq 0 ]; then
    echo "${node} key copied successfully"
  else
    echo "failed to copy key to ${node}"
  fi
done
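
Usage sketch (sshpass must be installed first; the IPs are just the nodes planned above, and copy-key.sh is a hypothetical filename for the script):

yum install -y sshpass
# edit the IP variable inside the script first, e.g.
#   IP="192.168.68.148 192.168.68.149 192.168.68.150 192.168.68.152"
bash copy-key.sh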

Docker certificate sync script

#!/bin/bash
# target host list
IP=""
for node in ${IP}; do
  sshpass -p 123456 ssh-copy-id ${node} -o StrictHostKeyChecking=no
  if [ $? -eq 0 ]; then
    echo "$node key copy complete"
    echo "prepare init..."
    ssh $node "mkdir /etc/docker/certs.d/harbor.qintianjun.local -pv"
    echo "create harbor cert directory complete!"
    scp /etc/docker/certs.d/harbor.qintianjun.local/harbor-ca.crt $node:/etc/docker/certs.d/harbor.qintianjun.local/harbor-ca.crt
    echo 'copy harbor cert success!'
    scp -r /root/.docker $node:/root/
    echo 'copy harbor authentication complete!'
  else
    echo "copy $node key failed!"
  fi
done

4 Orchestrate the k8s Installation from the Deployment Node

# download the ezdown helper script; kubeasz 3.0.0 is used as the example
export release=3.0.0
wget https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
chmod +x ./ezdown
# use the helper script to download everything
./ezdown -D
[root@master kubeasz]# ./ezdown -D
2022-04-25 16:56:55 INFO Action begin: download_all
2022-04-25 16:56:55 INFO downloading docker binaries, version 19.03.14
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 59.5M 100 59.5M 0 0 726k 0 0:01:23 0:01:23 --:--:-- 691k
Unit docker.service could not be found.
2022-04-25 16:58:23 DEBUG generate docker service file
2022-04-25 16:58:23 DEBUG generate docker config: /etc/docker/daemon.json
2022-04-25 16:58:23 DEBUG prepare register mirror for CN
2022-04-25 16:58:23 DEBUG turn off selinux in CentOS/Redhat
2022-04-25 16:58:23 INFO clean iptable rules
2022-04-25 16:58:23 DEBUG enable and start docker
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /etc/systemd/system/docker.service.
2022-04-25 16:58:28 INFO downloading kubeasz: 3.0.0
2022-04-25 16:58:28 DEBUG run a temporary container
Unable to find image 'easzlab/kubeasz:3.0.0' locally
3.0.0: Pulling from easzlab/kubeasz
31603596830f: Pull complete
9d61d7ab563b: Pull complete
Digest: sha256:360b34a44abf290b778f328d98ba4cc4ce1d2eecc3b159f41cb5662e2bb36ab3
Status: Downloaded newer image for easzlab/kubeasz:3.0.0
73c15a2aff051a1f2c70bdd05d3c7ab004a5a320ded9c189284f80771a6ccca5
2022-04-25 16:59:27 DEBUG cp kubeasz code from the temporary container
2022-04-25 16:59:28 DEBUG stop&remove temporary container
temp_easz
2022-04-25 16:59:28 INFO downloading kubernetes: v1.20.2 binaries
v1.20.2: Pulling from easzlab/kubeasz-k8s-bin
31603596830f: Already exists
10f46e710c09: Pull complete
Digest: sha256:30facff185abd42773b976e1138bc8ff9565daab93f2949b7b2ad2d95f1f9fbd
Status: Downloaded newer image for easzlab/kubeasz-k8s-bin:v1.20.2
docker.io/easzlab/kubeasz-k8s-bin:v1.20.2
2022-04-25 17:00:36 DEBUG run a temporary container
0ef119992b244be79ef21ae634c05d95dc7b5a762b4e0ef4fc00ac2d26320736
2022-04-25 17:00:37 DEBUG cp k8s binaries
2022-04-25 17:00:40 DEBUG stop&remove temporary container
temp_k8s_bin
2022-04-25 17:00:40 INFO downloading extral binaries kubeasz-ext-bin:0.8.1
0.8.1: Pulling from easzlab/kubeasz-ext-bin
31603596830f: Already exists
f7e9399587c7: Pull complete
Digest: sha256:9b811cddce5a74cd150cca3dee088c3e6879c5178e1c35ddbedbc2b4b7a50a21
Status: Downloaded newer image for easzlab/kubeasz-ext-bin:0.8.1
docker.io/easzlab/kubeasz-ext-bin:0.8.1
2022-04-25 17:02:02 DEBUG run a temporary container
745e9a48176cb87c7e88dbbd2093e2e2e633aeb55a5085097ea08fbdfa474898
2022-04-25 17:02:02 DEBUG cp extral binaries
2022-04-25 17:02:04 DEBUG stop&remove temporary container
temp_ext_bin
2022-04-25 17:02:04 INFO downloading offline images
v3.15.3: Pulling from calico/cni
1ff8efc68ede: Pull complete
dbf74493f8ac: Pull complete
6a02335af7ae: Pull complete
a9d90ecd95cb: Pull complete
269efe44f16b: Pull complete
d94997f3700d: Pull complete
8c7602656f2e: Pull complete
34fcbf8be9e7: Pull complete
Digest: sha256:519e5c74c3c801ee337ca49b95b47153e01fd02b7d2797c601aeda48dc6367ff
Status: Downloaded newer image for calico/cni:v3.15.3
docker.io/calico/cni:v3.15.3
v3.15.3: Pulling from calico/pod2daemon-flexvol
3fb48f9dffa9: Pull complete
a820112abeeb: Pull complete
10d8d066ec17: Pull complete
217b4fd6d612: Pull complete
06c30d5e085d: Pull complete
ca0fd3d60e05: Pull complete
a1c12287b32b: Pull complete
Digest: sha256:cec7a31b08ab5f9b1ed14053b91fd08be83f58ddba0577e9dabd8b150a51233f
Status: Downloaded newer image for calico/pod2daemon-flexvol:v3.15.3
docker.io/calico/pod2daemon-flexvol:v3.15.3
v3.15.3: Pulling from calico/kube-controllers
22d9887128f5: Pull complete
8824e2076f71: Pull complete
8b26373ef5b7: Pull complete
Digest: sha256:b88f0923b02090efcd13a2750da781622b6936df72d6c19885fcb2939647247b
Status: Downloaded newer image for calico/kube-controllers:v3.15.3
docker.io/calico/kube-controllers:v3.15.3
v3.15.3: Pulling from calico/node
0a63a759fe25: Pull complete
9d6c79b335fa: Pull complete
0c7b599aaa59: Pull complete
641ec2b3d585: Pull complete
682bbf5a5743: Pull complete
b7275bfed8bc: Pull complete
f9c5a243b960: Pull complete
eafb01686242: Pull complete
3a8a3042bbc5: Pull complete
e4fa8d582cf2: Pull complete
6ff16d4df057: Pull complete
8b0afdee71db: Pull complete
aa370255d6dd: Pull complete
Digest: sha256:1d674438fd05bd63162d9c7b732d51ed201ee7f6331458074e3639f4437e34b1
Status: Downloaded newer image for calico/node:v3.15.3
docker.io/calico/node:v3.15.3
1.7.1: Pulling from coredns/coredns
c6568d217a00: Pull complete
f68152bf8486: Pull complete
Digest: sha256:4a6e0769130686518325b21b0c1d0688b54e7c79244d48e1b15634e98e40c6ef
Status: Downloaded newer image for coredns/coredns:1.7.1
docker.io/coredns/coredns:1.7.1
1.16.0: Pulling from easzlab/k8s-dns-node-cache
e5a8c1ed6cf1: Pull complete
f275df365c13: Pull complete
6a2802bb94f4: Pull complete
cb3853c52da4: Pull complete
db342cbe4b1c: Pull complete
9a72dd095a53: Pull complete
645d255d3285: Pull complete
932f414bd1e7: Pull complete
Digest: sha256:248c29f0f3106a6f55f7c686521ae6f85966f3c9eed10bf8b68cdc6049b46196
Status: Downloaded newer image for easzlab/k8s-dns-node-cache:1.16.0
docker.io/easzlab/k8s-dns-node-cache:1.16.0
v2.1.0: Pulling from kubernetesui/dashboard
a16055c5c364: Pull complete
035b5e7ced27: Pull complete
Digest: sha256:7f80b5ba141bead69c4fee8661464857af300d7d7ed0274cf7beecedc00322e6
Status: Downloaded newer image for kubernetesui/dashboard:v2.1.0
docker.io/kubernetesui/dashboard:v2.1.0
v0.13.0-amd64: Pulling from easzlab/flannel
df20fa9351a1: Pull complete
0fbfec51320e: Pull complete
734a6c0a0c59: Pull complete
41745b624d5f: Pull complete
feca50c5fe05: Pull complete
071b96dd834b: Pull complete
5154c0aa012a: Pull complete
Digest: sha256:34860ea294a018d392e61936f19a7862d5e92039d196cac9176da14b2bbd0fe3
Status: Downloaded newer image for easzlab/flannel:v0.13.0-amd64
docker.io/easzlab/flannel:v0.13.0-amd64
v1.0.6: Pulling from kubernetesui/metrics-scraper
47a33a630fb7: Pull complete
62498b3018cb: Pull complete
Digest: sha256:1f977343873ed0e2efd4916a6b2f3075f310ff6fe42ee098f54fc58aa7a28ab7
Status: Downloaded newer image for kubernetesui/metrics-scraper:v1.0.6
docker.io/kubernetesui/metrics-scraper:v1.0.6
v0.3.6: Pulling from mirrorgooglecontainers/metrics-server-amd64
e8d8785a314f: Pull complete
b2f4b24bed0d: Pull complete
Digest: sha256:c9c4e95068b51d6b33a9dccc61875df07dc650abbf4ac1a19d58b4628f89288b
Status: Downloaded newer image for mirrorgooglecontainers/metrics-server-amd64:v0.3.6
docker.io/mirrorgooglecontainers/metrics-server-amd64:v0.3.6
3.2: Pulling from easzlab/pause-amd64
c74f8866df09: Pull complete
Digest: sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108
Status: Downloaded newer image for easzlab/pause-amd64:3.2
docker.io/easzlab/pause-amd64:3.2
3.0.0: Pulling from easzlab/kubeasz
Digest: sha256:360b34a44abf290b778f328d98ba4cc4ce1d2eecc3b159f41cb5662e2bb36ab3
Status: Image is up to date for easzlab/kubeasz:3.0.0
docker.io/easzlab/kubeasz:3.0.0
2022-04-25 17:08:59 INFO Action successed: download_all

After the script above finishes successfully, everything (the kubeasz code, binaries, and offline images) has been placed under /etc/kubeasz:

[root@master kubeasz]# docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
easzlab/kubeasz 3.0.0 38e4e73f8189 14 months ago 350MB
easzlab/kubeasz-k8s-bin v1.20.2 753cf45d89f6 15 months ago 473MB
easzlab/kubeasz-ext-bin 0.8.1 e5471f766400 16 months ago 373MB
kubernetesui/dashboard v2.1.0 9a07b5b4bfac 16 months ago 226MB
easzlab/k8s-dns-node-cache 1.16.0 90f9d984ec9a 17 months ago 121MB
kubernetesui/metrics-scraper v1.0.6 48d79e554db6 18 months ago 34.5MB
easzlab/flannel v0.13.0-amd64 e708f4bb69e3 18 months ago 57.2MB
coredns/coredns 1.7.1 0a6cfbf7b0b6 19 months ago 42.4MB
calico/node v3.15.3 d45bf977dfbf 19 months ago 262MB
calico/pod2daemon-flexvol v3.15.3 963564fb95ed 19 months ago 22.8MB
calico/cni v3.15.3 ca5564c06ea0 19 months ago 110MB
calico/kube-controllers v3.15.3 0cb2976cbb7d 19 months ago 52.9MB
easzlab/pause-amd64 3.2 80d28bedfe5d 2 years ago 683kB
mirrorgooglecontainers/metrics-server-amd64 v0.3.6 9dd718864ce6 2 years ago 39.9MB
[root@master kubeasz]# ll /etc/kubeasz/
总用量 80
-rw-rw-r--. 1 root root 10284 2月 1 2021 ansible.cfg
drwxr-xr-x. 3 root root 4096 4月 25 17:02 bin
drwxrwxr-x. 8 root root 92 2月 2 2021 docs
drwxr-xr-x. 2 root root 4096 4月 25 17:08 down
drwxrwxr-x. 2 root root 70 2月 2 2021 example
-rwxrwxr-x. 1 root root 22436 2月 1 2021 ezctl
-rwxrwxr-x. 1 root root 13660 2月 1 2021 ezdown
drwxrwxr-x. 10 root root 145 2月 2 2021 manifests
drwxrwxr-x. 2 root root 4096 2月 2 2021 pics
drwxrwxr-x. 2 root root 4096 2月 2 2021 playbooks
-rw-rw-r--. 1 root root 5640 2月 1 2021 README.md
drwxrwxr-x. 22 root root 4096 2月 2 2021 roles
drwxrwxr-x. 2 root root 48 2月 2 2021 tools

Create a cluster configuration instance

ezctl new k8s-01
2021-01-19 10:48:23 DEBUG generate custom cluster files in /etc/kubeasz/clusters/k8s-01
2021-01-19 10:48:23 DEBUG set version of common plugins
2021-01-19 10:48:23 DEBUG cluster k8s-01: files successfully created.
2021-01-19 10:48:23 INFO next steps 1: to config '/etc/kubeasz/clusters/k8s-01/hosts'
2021-01-19 10:48:23 INFO next steps 2: to config '/etc/kubeasz/clusters/k8s-01/config.yml'

Then, as prompted, configure '/etc/kubeasz/clusters/k8s-01/hosts' and '/etc/kubeasz/clusters/k8s-01/config.yml': edit the hosts file and the main cluster-level options according to the node plan above; the remaining cluster component options can be adjusted in config.yml.

[root@manager kubeasz]# cat clusters/k8s-01/config.yml
############################
# prepare
############################
# 可选离线安装系统软件包 (offline|online)
INSTALL_SOURCE: "online"

# 可选进行系统安全加固 github.com/dev-sec/ansible-collection-hardening
OS_HARDEN: false

# 设置时间源服务器【重要:集群内机器时间必须同步】
ntp_servers:
- "ntp1.aliyun.com"
- "time1.cloud.tencent.com"
- "0.cn.pool.ntp.org"

# 设置允许内部时间同步的网络段,比如"10.0.0.0/8",默认全部允许
local_network: "0.0.0.0/0"


############################
# role:deploy
############################
# default: ca will expire in 100 years
# default: certs issued by the ca will expire in 50 years
CA_EXPIRY: "876000h"
CERT_EXPIRY: "438000h"

# kubeconfig 配置参数
CLUSTER_NAME: "cluster1"
CONTEXT_NAME: "context-{{ CLUSTER_NAME }}"


############################
# role:runtime [containerd,docker]
############################
# ------------------------------------------- containerd
# [.]启用容器仓库镜像
ENABLE_MIRROR_REGISTRY: true

# [containerd]基础容器镜像
SANDBOX_IMAGE: "easzlab/pause-amd64:3.2"

# [containerd]容器持久化存储目录
CONTAINERD_STORAGE_DIR: "/var/lib/containerd"

# ------------------------------------------- docker
# [docker]容器存储目录
DOCKER_STORAGE_DIR: "/var/lib/docker"

# [docker]开启Restful API
ENABLE_REMOTE_API: false

# [docker]信任的HTTP仓库
INSECURE_REG: '["127.0.0.1/8", "harbor.magedu.local"]'


############################
# role:kube-master
############################
# k8s 集群 master 节点证书配置,可以添加多个ip和域名(比如增加公网ip和域名)
MASTER_CERT_HOSTS:
- "192.168.68.152"
#- "k8s.test.io"
#- "www.test.com"

# node 节点上 pod 网段掩码长度(决定每个节点最多能分配的pod ip地址)
# 如果flannel 使用 --kube-subnet-mgr 参数,那么它将读取该设置为每个节点分配pod网段
# https://github.com/coreos/flannel/issues/847
NODE_CIDR_LEN: 24


############################
# role:kube-node
############################
# Kubelet 根目录
KUBELET_ROOT_DIR: "/var/lib/kubelet"

# node节点最大pod 数
MAX_PODS: 110

# 配置为kube组件(kubelet,kube-proxy,dockerd等)预留的资源量
# 数值设置详见templates/kubelet-config.yaml.j2
KUBE_RESERVED_ENABLED: "yes"

# k8s 官方不建议草率开启 system-reserved, 除非你基于长期监控,了解系统的资源占用状况;
# 并且随着系统运行时间,需要适当增加资源预留,数值设置详见templates/kubelet-config.yaml.j2
# 系统预留设置基于 4c/8g 虚机,最小化安装系统服务,如果使用高性能物理机可以适当增加预留
# 另外,集群安装时候apiserver等资源占用会短时较大,建议至少预留1g内存
SYS_RESERVED_ENABLED: "no"

# haproxy balance mode
BALANCE_ALG: "roundrobin"


############################
# role:network [flannel,calico,cilium,kube-ovn,kube-router]
############################
# ------------------------------------------- flannel
# [flannel]设置flannel 后端"host-gw","vxlan"
FLANNEL_BACKEND: "vxlan"
DIRECT_ROUTING: false

# [flannel] flanneld_image: "quay.io/coreos/flannel:v0.10.0-amd64"
flannelVer: "v0.13.0-amd64"
flanneld_image: "easzlab/flannel:{{ flannelVer }}"

# [flannel]离线镜像tar包
flannel_offline: "flannel_{{ flannelVer }}.tar"

# ------------------------------------------- calico
# [calico]设置 CALICO_IPV4POOL_IPIP=“off”,可以提高网络性能,条件限制详见 docs/setup/calico.md
CALICO_IPV4POOL_IPIP: "Always"

# [calico]设置 calico-node使用的host IP,bgp邻居通过该地址建立,可手工指定也可以自动发现
IP_AUTODETECTION_METHOD: "can-reach={{ groups['kube_master'][0] }}"

# [calico]设置calico 网络 backend: brid, vxlan, none
CALICO_NETWORKING_BACKEND: "brid"

# [calico]更新支持calico 版本: [v3.3.x] [v3.4.x] [v3.8.x] [v3.15.x]
calico_ver: "v3.15.3"

# [calico]calico 主版本
calico_ver_main: "{{ calico_ver.split('.')[0] }}.{{ calico_ver.split('.')[1] }}"

# [calico]离线镜像tar包
calico_offline: "calico_{{ calico_ver }}.tar"

# ------------------------------------------- cilium
# [cilium]CILIUM_ETCD_OPERATOR 创建的 etcd 集群节点数 1,3,5,7...
ETCD_CLUSTER_SIZE: 1

# [cilium]镜像版本
cilium_ver: "v1.4.1"

# [cilium]离线镜像tar包
cilium_offline: "cilium_{{ cilium_ver }}.tar"

# ------------------------------------------- kube-ovn
# [kube-ovn]选择 OVN DB and OVN Control Plane 节点,默认为第一个master节点
OVN_DB_NODE: "{{ groups['kube_master'][0] }}"

# [kube-ovn]离线镜像tar包
kube_ovn_ver: "v1.5.3"
kube_ovn_offline: "kube_ovn_{{ kube_ovn_ver }}.tar"

# ------------------------------------------- kube-router
# [kube-router]公有云上存在限制,一般需要始终开启 ipinip;自有环境可以设置为 "subnet"
OVERLAY_TYPE: "full"

# [kube-router]NetworkPolicy 支持开关
FIREWALL_ENABLE: "true"

# [kube-router]kube-router 镜像版本
kube_router_ver: "v0.3.1"
busybox_ver: "1.28.4"

# [kube-router]kube-router 离线镜像tar包
kuberouter_offline: "kube-router_{{ kube_router_ver }}.tar"
busybox_offline: "busybox_{{ busybox_ver }}.tar"


############################
# role:cluster-addon
############################
# coredns 自动安装
dns_install: "no"
corednsVer: "1.7.1"
ENABLE_LOCAL_DNS_CACHE: false
dnsNodeCacheVer: "1.16.0"
# 设置 local dns cache 地址
LOCAL_DNS_CACHE: "169.254.20.10"

# metric server 自动安装
metricsserver_install: "no"
metricsVer: "v0.3.6"

# dashboard 自动安装
dashboard_install: "yes"
dashboardVer: "v2.1.0"
dashboardMetricsScraperVer: "v1.0.6"

# ingress 自动安装
ingress_install: "no"
ingress_backend: "traefik"
traefik_chart_ver: "9.12.3"

# prometheus 自动安装
prom_install: "no"
prom_namespace: "monitor"
prom_chart_ver: "12.10.6"

############################
# role:harbor
############################
# harbor version,完整版本号
HARBOR_VER: "v1.9.4"
[root@manager k8s-01]# cat hosts
# 'etcd' cluster should have odd member(s) (1,3,5,...)
[etcd]
192.168.68.148

# master node(s)
[kube_master]
192.168.68.152

# work node(s)
[kube_node]
192.168.68.149
192.168.68.150

# [optional] harbor server, a private docker registry
# 'NEW_INSTALL': 'yes' to install a harbor server; 'no' to integrate with existed one
# 'SELF_SIGNED_CERT': 'no' you need put files of certificates named harbor.pem and harbor-key.pem in directory 'down'
[harbor]
#192.168.68.8 HARBOR_DOMAIN="harbor.yourdomain.com" NEW_INSTALL=no SELF_SIGNED_CERT=yes

# [optional] loadbalance for accessing k8s from outside
[ex_lb]
#192.168.68.6 LB_ROLE=backup EX_APISERVER_VIP=192.168.68.250 EX_APISERVER_PORT=8443
#192.168.68.7 LB_ROLE=master EX_APISERVER_VIP=192.168.68.250 EX_APISERVER_PORT=8443

# [optional] ntp server for the cluster
[chrony]
#192.168.68.1

[all:vars]
# --------- Main Variables ---------------
# Cluster container-runtime supported: docker, containerd
CONTAINER_RUNTIME="docker"

# Network plugins supported: calico, flannel, kube-router, cilium, kube-ovn
CLUSTER_NETWORK="flannel"

# Service proxy mode of kube-proxy: 'iptables' or 'ipvs'
PROXY_MODE="ipvs"

# K8S Service CIDR, not overlap with node(host) networking
SERVICE_CIDR="10.68.0.0/16"

# Cluster CIDR (Pod CIDR), not overlap with node(host) networking
CLUSTER_CIDR="172.20.0.0/16"

# NodePort Range
NODE_PORT_RANGE="30000-32767"

# Cluster DNS Domain
CLUSTER_DNS_DOMAIN="cluster.local."

# -------- Additional Variables (don't change the default value right now) ---
# Binaries Directory
bin_dir="/opt/kube/bin"

# Deploy Directory (kubeasz workspace)
base_dir="/etc/kubeasz"

# Directory for a specific cluster
cluster_dir="{{ base_dir }}/clusters/k8s-01"

# CA and other components cert/key Directory
ca_dir="/etc/kubernetes/ssl"

Start the installation. If you are not familiar with the cluster installation flow, read the installation steps on the project home page first, then install step by step and verify each step:

# one-shot install
ezctl setup k8s-01 all

# or install step by step; run ezctl help setup for the step-by-step usage
# ezctl setup k8s-01 01 # preparation
# ezctl setup k8s-01 02 # install etcd
# ezctl setup k8s-01 03 # install the container runtime
# ezctl setup k8s-01 04 # install the master nodes
# ezctl setup k8s-01 05 # install the worker nodes
# ezctl setup k8s-01 06 # install the cluster network
# ezctl setup k8s-01 07 # install the cluster add-ons
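
For example, after step 02 the etcd cluster can be checked roughly like this (a sketch assuming kubeasz defaults from the hosts file above: binaries in /opt/kube/bin, certs in /etc/kubernetes/ssl; the cert file names may differ slightly between kubeasz versions):

# run on an etcd node; IPs come from the [etcd] group in the hosts file
export NODE_IPS="192.168.68.148"
for ip in ${NODE_IPS}; do
  ETCDCTL_API=3 /opt/kube/bin/etcdctl \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/kubernetes/ssl/etcd.pem \
    --key=/etc/kubernetes/ssl/etcd-key.pem \
    endpoint health
done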

Verify the installation:

Run on the master:

[root@master ~]# kubectl get node
NAME STATUS ROLES AGE VERSION
192.168.68.149 Ready node 2m35s v1.20.2
192.168.68.150 Ready node 2m35s v1.20.2
192.168.68.152 Ready,SchedulingDisabled master 4m17s v1.20.2

You can see that node-1, node-2, and the master are all running.

Create a pod for testing:

[root@master ~]# kubectl run net-test1 --image=centos:7.9.2009 sleep 36000
pod/net-test1 created

image-20220427144607918

You can see the pod status change from ContainerCreating to Running; create net-test2 in the same way:

image-20220427144744408

The pod on node1 has IP 172.20.1.2 and the one on node2 has 172.20.2.2; from the node information you can see the two pods run on different nodes. Entering the net-test1 container and pinging net-test2 succeeds:

image-20220427145334903

Pinging an external IP address also succeeds:

image-20220427145424739
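
The commands behind these screenshots look roughly like this (the pod IPs come from kubectl get pod -o wide; 223.5.5.5 is just an example external IP):

kubectl run net-test2 --image=centos:7.9.2009 sleep 36000
kubectl get pod -o wide                               # note each pod's IP and node
kubectl exec -it net-test1 -- ping -c 3 172.20.2.2    # pod-to-pod across nodes
kubectl exec -it net-test1 -- ping -c 3 223.5.5.5     # external IP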

At this point, pinging an external domain name still fails; CoreDNS needs to be installed to resolve domain names.

5 Install CoreDNS

Kubernetes currently uses the CoreDNS component to resolve service names in the cluster to their corresponding IP addresses.

Documentation: official website

GitHub: repository

Here CoreDNS is installed from the YAML manifests shipped with Kubernetes; download the Kubernetes release tarball: https://dl.k8s.io/v1.23.5/kubernetes.tar.gz

Extract it after the download finishes:
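
Roughly (the extracted directory name may differ depending on how the archive was packed; the screenshots below use kubernetes-1.23.5/):

wget https://dl.k8s.io/v1.23.5/kubernetes.tar.gz
tar -xzf kubernetes.tar.gz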

image-20220427190117913

Enter the kubernetes-1.23.5/cluster/addons/dns/coredns directory:

image-20220427190217669

You can see the YAML files for CoreDNS; copy coredns.yaml.base to coredns.yaml and open it:
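
A minimal sketch of the copy, plus where to read the two values that need to match this cluster (the kube-dns Service clusterIP and the cluster DNS domain); run where kubectl and the kubeasz directory are available:

cp coredns.yaml.base coredns.yaml
# the clusterIP to use is the nameserver already injected into pods
kubectl exec net-test1 -- cat /etc/resolv.conf
# the cluster DNS domain configured by kubeasz
grep CLUSTER_DNS_DOMAIN /etc/kubeasz/clusters/k8s-01/hosts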

[root@manager coredns]# cat coredns.yaml
# __MACHINE_GENERATED_WARNING__
apiVersion: v1
kind: ServiceAccount
metadata:
name: coredns
namespace: kube-system
labels:
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: Reconcile
name: system:coredns
rules:
- apiGroups:
- ""
resources:
- endpoints
- services
- pods
- namespaces
verbs:
- list
- watch
- apiGroups:
- ""
resources:
- nodes
verbs:
- get
- apiGroups:
- discovery.k8s.io
resources:
- endpointslices
verbs:
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
annotations:
rbac.authorization.kubernetes.io/autoupdate: "true"
labels:
kubernetes.io/bootstrapping: rbac-defaults
addonmanager.kubernetes.io/mode: EnsureExists
name: system:coredns
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:coredns
subjects:
- kind: ServiceAccount
name: coredns
namespace: kube-system
---
apiVersion: v1
kind: ConfigMap
metadata:
name: coredns
namespace: kube-system
labels:
addonmanager.kubernetes.io/mode: EnsureExists
data:
Corefile: |
.:53 {
errors
health {
lameduck 5s
}
ready
kubernetes magedu.local in-addr.arpa ip6.arpa { # change this to match the domain in the kubeasz hosts file
pods insecure
fallthrough in-addr.arpa ip6.arpa
ttl 30
}
prometheus :9153
forward . /etc/resolv.conf {
max_concurrent 1000
}
cache 30
loop
reload
loadbalance
}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: coredns
namespace: kube-system
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
# replicas: not specified here:
# 1. In order to make Addon Manager do not reconcile this replicas parameter.
# 2. Default is 1.
# 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
strategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
selector:
matchLabels:
k8s-app: kube-dns
template:
metadata:
labels:
k8s-app: kube-dns
spec:
securityContext:
seccompProfile:
type: RuntimeDefault
priorityClassName: system-cluster-critical
serviceAccountName: coredns
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 100
podAffinityTerm:
labelSelector:
matchExpressions:
- key: k8s-app
operator: In
values: ["kube-dns"]
topologyKey: kubernetes.io/hostname
tolerations:
- key: "CriticalAddonsOnly"
operator: "Exists"
nodeSelector:
kubernetes.io/os: linux
containers:
- name: coredns
image: k8s.gcr.io/coredns/coredns:v1.8.6
imagePullPolicy: IfNotPresent
resources:
limits:
memory: 200Mi # raise the memory limit to 200Mi; adjust to your machine specs
requests:
cpu: 100m
memory: 70Mi
args: [ "-conf", "/etc/coredns/Corefile" ]
volumeMounts:
- name: config-volume
mountPath: /etc/coredns
readOnly: true
ports:
- containerPort: 53
name: dns
protocol: UDP
- containerPort: 53
name: dns-tcp
protocol: TCP
- containerPort: 9153
name: metrics
protocol: TCP
livenessProbe:
httpGet:
path: /health
port: 8080
scheme: HTTP
initialDelaySeconds: 60
timeoutSeconds: 5
successThreshold: 1
failureThreshold: 5
readinessProbe:
httpGet:
path: /ready
port: 8181
scheme: HTTP
securityContext:
allowPrivilegeEscalation: false
capabilities:
add:
- NET_BIND_SERVICE
drop:
- all
readOnlyRootFilesystem: true
dnsPolicy: Default
volumes:
- name: config-volume
configMap:
name: coredns
items:
- key: Corefile
path: Corefile
---
apiVersion: v1
kind: Service
metadata:
name: kube-dns
namespace: kube-system
annotations:
prometheus.io/port: "9153"
prometheus.io/scrape: "true"
labels:
k8s-app: kube-dns
kubernetes.io/cluster-service: "true"
addonmanager.kubernetes.io/mode: Reconcile
kubernetes.io/name: "CoreDNS"
spec:
selector:
k8s-app: kube-dns
clusterIP: 10.68.0.2 # change this to match the cluster; if unsure, check the nameserver entry in /etc/resolv.conf inside any pod
ports:
- name: dns
port: 53
protocol: UDP
- name: dns-tcp
port: 53
protocol: TCP
- name: metrics
port: 9153
protocol: TCP

After the edits, create the corresponding resources with kubectl apply:

[root@master ~]# kubectl apply -f coredns.yaml
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created

Check the pods and you can see the CoreDNS pod has been created:

[root@master ~]# kubectl get pod -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default net-test1 1/1 Running 0 22h
default net-test2 1/1 Running 0 22h
kube-system coredns-58b74b57c7-58plj 1/1 Running 0 57s # coreDNS
kube-system kube-flannel-ds-amd64-mncm2 1/1 Running 0 22h
kube-system kube-flannel-ds-amd64-rchqv 1/1 Running 0 22h
kube-system kube-flannel-ds-amd64-zgcm6 1/1 Running 0 22h

Ping an external domain name from inside the container again; it now resolves correctly:

image-20220427192132613
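
The same check from the command line, for example (any public domain will do):

kubectl get svc -n kube-system kube-dns                  # the kube-dns Service created above
kubectl exec -it net-test1 -- ping -c 3 www.baidu.com    # resolves via CoreDNS -> /etc/resolv.conf forwarder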

CoreDNS is a critical service: if it goes down, no domain name in the cluster can be resolved, so it is normally run with multiple replicas for high availability:

image-20220427192805185

The simplest way to run multiple CoreDNS replicas is to change the replicas field in the YAML file and kubectl apply it again, or edit the Deployment in place:

kubectl edit deployment coredns -n kube-system:

image-20220427195338929

image-20220427195423142

You can see the CoreDNS replica count is now 2.

image-20220427195612493
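
An equivalent one-liner, instead of editing the Deployment by hand (not what the screenshots show, just an alternative):

kubectl -n kube-system scale deployment coredns --replicas=2
kubectl -n kube-system get deployment coredns   # expect READY 2/2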

Explanation of selected YAML parameters

data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes magedu.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
  • errors: errors are logged to standard output.

  • health: CoreDNS health is reported at http://localhost:8080/health.

  • ready: an HTTP endpoint on port 8181 that returns 200 OK once all plugins that can signal readiness are ready.

  • kubernetes: CoreDNS answers DNS queries based on the IPs of Kubernetes Services and Pods. You can read more on the CoreDNS website. The ttl option customizes the TTL of responses; the default is 5 seconds, the minimum is 0 and the maximum is 3600. Setting the TTL to 0 disables caching of DNS records.

    The pods insecure option exists for backward compatibility with kube-dns. You can use pods verified, which returns an A record only if a Pod with a matching IP exists in the same namespace, or pods disabled if you do not use Pod records.

  • forward: any query that is not within the Kubernetes cluster domain is forwarded to the predefined resolvers (/etc/resolv.conf).

  • cache: enables the frontend cache.

  • loop: detects simple forwarding loops and aborts the CoreDNS process if one is found.

  • reload: allows automatic reloading of a changed Corefile. After editing the ConfigMap, allow about two minutes for the change to take effect.

  • loadbalance: a round-robin DNS load balancer that randomizes the order of A, AAAA, and MX records in the answer.

You can change the default CoreDNS behavior by modifying the ConfigMap; see the Kubernetes documentation for details.

You can also append a local authoritative DNS server: when CoreDNS cannot resolve special domains such as internal intranet names, the requests can be forwarded to that server. Append the following to the configuration above:

myserver.online {
    forward . 172.16.16.16:53 # IP of the local DNS server, e.g. a bind9 instance
}

6 Install Dashboard

Install it with the YAML manifest provided upstream; see https://github.com/kubernetes/dashboard/releases

After installing with kubectl apply, check the status:

[root@master ~]# kubectl get pod -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
default net-test1 1/1 Running 2 2d20h 172.20.1.8 192.168.68.150 <none> <none>
default net-test2 1/1 Running 2 2d20h 172.20.2.8 192.168.68.149 <none> <none>
kube-system coredns-58b74b57c7-kv4lw 1/1 Running 2 121m 172.20.1.9 192.168.68.150 <none> <none>
kube-system kube-flannel-ds-amd64-mncm2 1/1 Running 1 2d20h 192.168.68.152 192.168.68.152 <none> <none>
kube-system kube-flannel-ds-amd64-rchqv 1/1 Running 2 2d20h 192.168.68.150 192.168.68.150 <none> <none>
kube-system kube-flannel-ds-amd64-zgcm6 1/1 Running 2 2d20h 192.168.68.149 192.168.68.149 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-5b8896d7fc-2frf2 1/1 Running 0 107s 172.20.1.10 192.168.68.150 <none> <none>
kubernetes-dashboard kubernetes-dashboard-897c7599f-tvscb 1/1 Running 0 107s 172.20.2.9 192.168.68.149 <none> <none>
[root@master ~]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.68.0.1 <none> 443/TCP 2d21h
kube-system kube-dns ClusterIP 10.68.0.2 <none> 53/UDP,53/TCP,9153/TCP 46h
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.68.203.30 <none> 8000/TCP 3m56s
kubernetes-dashboard kubernetes-dashboard ClusterIP 10.68.183.139 <none> 443/TCP 3m57s

The dashboard Service listens on port 443 but is not exposed externally, so the YAML file needs to be modified to expose it for user access. Open the YAML file and find the following section:

image-20220429185550382

Add type: NodePort and nodePort: 30004 to expose the service on port 30004 of the hosts, then check the services again:

[root@master ~]# kubectl get svc -A -o wide
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
default kubernetes ClusterIP 10.68.0.1 <none> 443/TCP 2d21h <none>
kube-system kube-dns ClusterIP 10.68.0.2 <none> 53/UDP,53/TCP,9153/TCP 46h k8s-app=kube-dns
kubernetes-dashboard dashboard-metrics-scraper ClusterIP 10.68.203.30 <none> 8000/TCP 8m8s k8s-app=dashboard-metrics-scraper
kubernetes-dashboard kubernetes-dashboard NodePort 10.68.183.139 <none> 443:30004/TCP 8m9s k8s-app=kubernetes-dashboard
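
The same change can also be applied with kubectl patch instead of editing the manifest (a sketch; it assumes the dashboard Service exposes HTTPS on port 443, as shown above):

kubectl -n kubernetes-dashboard patch svc kubernetes-dashboard \
  -p '{"spec":{"type":"NodePort","ports":[{"port":443,"nodePort":30004}]}}'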

With the port exposed, the dashboard can now be opened in a browser at any node IP on port 30004:

image-20220429174206542

A user and the corresponding permissions must be created before logging in; create them with a YAML file:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kubernetes-dashboard

Save the file as admin-user.yaml and create it with kubectl apply -f.

image-20220429180913245

Look up the token for admin-user: Kubernetes automatically generates a token for the user created in the YAML file above. Use kubectl get secrets -n kubernetes-dashboard to find the token name:

[root@master ~]# kubectl get secrets -n kubernetes-dashboard
NAME TYPE DATA AGE
admin-user-token-nfjt2 kubernetes.io/service-account-token 3 113s # the token for the user just created
default-token-swd6q kubernetes.io/service-account-token 3 38m
kubernetes-dashboard-certs Opaque 0 38m
kubernetes-dashboard-csrf Opaque 1 38m
kubernetes-dashboard-key-holder Opaque 2 38m
kubernetes-dashboard-token-crghv kubernetes.io/service-account-token 3 38m

The token's name is admin-user-token-nfjt2; view its value with kubectl describe secrets TOKENNAME -n kubernetes-dashboard:

[root@master ~]# kubectl describe secrets admin-user-token-nfjt2 -n kubernetes-dashboard
Name: admin-user-token-nfjt2
Namespace: kubernetes-dashboard
Labels: <none>
Annotations: kubernetes.io/service-account.name: admin-user
kubernetes.io/service-account.uid: 9ae3bcc6-3aec-4314-b6ba-203e7400fa0a

Type: kubernetes.io/service-account-token

Data
====
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IjdqUC10Ql90bTBTbzNSU0RmTmM1UGg3YUhmOUhsWG1sYWZhc2NzVGJiWEUifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyLXRva2VuLW5manQyIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI5YWUzYmNjNi0zYWVjLTQzMTQtYjZiYS0yMDNlNzQwMGZhMGEiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.ZO6Frkr1kQPA0nZ13ucnQOBQ03rnkjPwD8Ziag7XxY0G6GNiueLnyJKvCfy0WU7oknFYbuBwbx-3GrJ0qaq_sepTgVAnNPch14gZjnByIxOgQ-pDOxJFnW3o4o2fyF5SzyKhcG-VT35S4CTklDfohWrvpSAdxO7pAhFlp5nTa9qR5p3ktjO00zJSQEj97MUvvJWbCyXfdLui6p2cHD5leOTSmSqvF8QT6Miifo8L39r2jKfa39ShJ0na1WP_EUML-ilf0H_JIVBO_EjQyx_fPRxSPYVKc3ldotjRh76git6Alb4zGkCCsyKPP6xSHfLUiANTRtnCoplMcOGVWzRh-g
ca.crt: 1350 bytes
namespace: 20 bytes
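
If you prefer to grab the token in one go, something like this works (a sketch; it relies on the ServiceAccount still listing its token secret, which holds for k8s 1.20):

kubectl -n kubernetes-dashboard get secret \
  $(kubectl -n kubernetes-dashboard get sa admin-user -o jsonpath='{.secrets[0].name}') \
  -o jsonpath='{.data.token}' | base64 -d && echo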

Log in to the dashboard with the returned token.

image-20220429182343970

At this point, all the key k8s components are installed.
