Setting up a Kubernetes cluster with Vagrant
Three virtual machines

| Node | OS | IP |
|---|---|---|
| master | CentOS-8 | 192.168.10.90 |
| node1 | CentOS-8 | 192.168.10.91 |
| node2 | CentOS-8 | 192.168.10.92 |
Building the base box
cd k8sbase
vagrant box add centos8 ../vagrant_package/CentOS-8-generic.box
vagrant init centos8
# Start the VM
vagrant up
# Log in and do the base configuration
vagrant ssh
# Switch to root
sudo su
Base configuration
# Disable the firewall
systemctl stop firewalld && systemctl disable firewalld
# Disable SELinux
sed -i 's/^SELINUX=enforcing$/SELINUX=disabled/' /etc/selinux/config
# Disable the swap partition
sed -ri 's/.*swap.*/#&/' /etc/fstab
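The sed above only comments out the fstab entry so swap stays disabled after a reboot; to turn it off for the current session as well (my own addition, not in the original), you can also run:
swapoff -a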
# Pass bridged IPv4 traffic to iptables chains
cat <<EOF > /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sysctl --system
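One note of my own: on CentOS 8 the net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so you may need to load it first and make it persistent before the sysctl settings apply:
modprobe br_netfilter
echo br_netfilter > /etc/modules-load.d/br_netfilter.conf
sysctl --system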
# Sync the time
date  # check whether the time is correct; if not, run the following steps
rm -rf /etc/localtime
ln -s /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
# set the time zone
tzselect
# sync the time
yum install -y ntpdate
ntpdate cn.pool.ntp.org
date
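ntpdate may not be available from the default CentOS 8 repositories; if the install fails, chrony (which ships with CentOS 8) is an alternative I would try instead (my own suggestion, not in the original):
yum install -y chrony
systemctl enable --now chronyd
chronyc makestep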
# Yum repositories
vi /etc/yum.repos.d/docker-ce.repo
[docker-ce-stable]
name=Docker CE Stable - $basearch
baseurl=https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable
enabled=1
gpgcheck=1
gpgkey=https://mirrors.aliyun.com/docker-ce/linux/centos/gpg
vi /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
# Install
yum install docker-ce kubelet kubeadm kubectl
systemctl enable docker
systemctl enable kubelet
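Since the cluster is initialized with v1.22.0 later, pinning the kube* package versions at install time may avoid surprises when newer versions appear in the repo (my own suggestion; the exact version suffixes are an assumption):
yum install -y docker-ce kubelet-1.22.0 kubeadm-1.22.0 kubectl-1.22.0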
# Note: I ran into an issue here. Docker's cgroup driver and kubelet's cgroup driver were different and need to be unified. Set Docker's cgroup driver to systemd:
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
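After changing daemon.json, Docker has to be restarted for the new cgroup driver to take effect; a quick way to verify it (my addition):
systemctl restart docker
docker info | grep -i cgroup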
Packaging the box
vagrant halt
vagrant package --output k8s.box
vagrant box add k8s k8s.box
# This box has been uploaded to Vagrant Cloud and can be used directly
https://app.vagrantup.com/cuiw/boxes/k8s-centos8-base
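If you use the published box instead of packaging your own, it should be enough to add it by its Vagrant Cloud name and point config.vm.box at that name in the Vagrantfile below (a sketch, assuming the box name matches the URL above):
vagrant box add cuiw/k8s-centos8-base
# then in the Vagrantfile: config.vm.box = "cuiw/k8s-centos8-base"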
Creating the VMs in bulk from the base box
# Vagrantfile
# vim: set ft=ruby ts=2 :
Vagrant.configure("2") do |config|
  config.vm.box = "k8s"
  config.vm.provider "virtualbox" do |v|
    v.memory = 2048
    v.cpus = 2
  end
  config.vm.define :master do |cfg|
    cfg.vm.hostname = "master"
    cfg.vm.network :public_network, ip: "192.168.10.90"
  end
  config.vm.define :node1 do |cfg|
    cfg.vm.hostname = "node1"
    cfg.vm.network :public_network, ip: "192.168.10.91"
  end
  config.vm.define :node2 do |cfg|
    cfg.vm.hostname = "node2"
    cfg.vm.network :public_network, ip: "192.168.10.92"
  end
  config.vm.synced_folder "./", "/vagrant"
end
cd k8s-demo
# Bring up the 3 VMs
vagrant up
# Open a new terminal window
vagrant ssh master
# Open a new terminal window
vagrant ssh node1
# Open a new terminal window
vagrant ssh node2
Initializing the master node
# vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--node-ip=192.168.10.90 --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
service kubelet restart
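The original does not verify this step, but to check that kubelet actually picked up the extra flags after the restart, something like the following should work:
systemctl status kubelet
# once kubelet is running, the flag should be visible on its command line
ps -ef | grep kubelet | grep -- --node-ip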
# Pre-pull the base images with kubeadm
kubeadm config images pull --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers
# Initialize
kubeadm init \
--kubernetes-version v1.22.0 \
--apiserver-advertise-address=192.168.10.90 \
--ignore-preflight-errors=all \
--image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
The Pod network CIDR is 10.244.0.0/16, and the API server address is simply the master's own IP.
This step is critical: by default kubeadm pulls the required images from k8s.gcr.io, which is not reachable from mainland China, so --image-repository is used to point it at the Aliyun mirror registry. 1️⃣
This step may also fail because kubelet itself fails to start. 2️⃣
Parameter explanations:
--kubernetes-version: the Kubernetes version to install;
--apiserver-advertise-address: the IP address kube-apiserver listens on, i.e. the master's own IP;
--pod-network-cidr: the Pod network range, 10.244.0.0/16 here;
--service-cidr: the Service network range;
--image-repository: the Aliyun mirror registry address.
If it succeeds you will see:
Your Kubernetes control-plane has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
Alternatively, if you are the root user, you can run:
export KUBECONFIG=/etc/kubernetes/admin.conf
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.10.90:6443 --token 9vh4kc.5lcth0nqbhx7mugj \
--discovery-token-ca-cert-hash sha256:eb606e1c5f634d7a861fe09644dfdb12f9bfeb02534012445cc9712e4e8caede
# Run the 3 commands above in order
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
[root@master vagrant]# kubectl get nodes
NAME STATUS ROLES AGE VERSION
master NotReady control-plane,master 6m54s v1.22.0
The node1 node
# vi /etc/sysconfig/kubelet
KUBELET_EXTRA_ARGS="--node-ip=192.168.10.91 --pod-infra-container-image=registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
service kubelet restart
# Join the current node to the cluster; if you lost the join command, you can regenerate it on the master with: kubeadm token create --print-join-command
[root@node vagrant]# kubeadm join 192.168.10.90:6443 --token 9vh4kc.5lcth0nqbhx7mugj \
> --discovery-token-ca-cert-hash sha256:eb606e1c5f634d7a861fe09644dfdb12f9bfeb02534012445cc9712e4e8caede
[preflight] Running pre-flight checks
[WARNING FileExisting-tc]: tc not found in system path
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
# Back on the master, you can see that the INTERNAL-IP of each node is the one we wanted (the status is still NotReady because the network add-on has not been installed yet)
[root@master vagrant]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master NotReady control-plane,master 13m v1.22.0 192.168.10.90 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
node1 NotReady <none> 8s v1.22.0 192.168.10.91 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
node2 NotReady <none> 16s v1.22.0 192.168.10.92 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
The node2 node
Apart from the IP, the steps are identical to node1.
The flannel network add-on
Run on the master:
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
# The command above may fail due to network issues; in that case download the whole repository and apply from there
https://github.com/flannel-io/flannel
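If the direct apply fails, one way to do the "download the whole repository" step (a sketch, assuming git is not yet installed on the master):
yum install -y git
git clone https://github.com/flannel-io/flannel.git
cd flannel
# edit Documentation/kube-flannel.yml as described below, then apply it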
# Edit kube-flannel.yml and pin the interface to eth1 (on Vagrant/VirtualBox machines eth0 is the NAT interface; the public_network we configured is eth1)
...
containers:
- name: kube-flannel
  image: registry.cn-hangzhou.aliyuncs.com/google-containers/flannel:v0.9.0
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=eth1
...
# Apply
kubectl apply -f ./Documentation/kube-flannel.yml
# List all the pods; as expected, STATUS is Running for all of them
[root@master vagrant]# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-7d89d9b6b8-fcs2v 1/1 Running 0 42m 10.244.0.3 master <none> <none>
coredns-7d89d9b6b8-kb8f8 1/1 Running 0 42m 10.244.0.2 master <none> <none>
etcd-master 1/1 Running 1 (2m51s ago) 51m 192.168.10.90 master <none> <none>
kube-apiserver-master 1/1 Running 2 (2m30s ago) 51m 192.168.10.90 master <none> <none>
kube-controller-manager-master 1/1 Running 9 (45s ago) 51m 192.168.10.90 master <none> <none>
kube-flannel-ds-4g9k2 1/1 Running 0 4m28s 192.168.10.92 node2 <none> <none>
kube-flannel-ds-bzw9q 1/1 Running 0 5m39s 192.168.10.90 master <none> <none>
kube-flannel-ds-c5x4l 1/1 Running 0 5m42s 192.168.10.91 node1 <none> <none>
kube-proxy-bfl8p 1/1 Running 1 (2m8s ago) 38m 192.168.10.92 node2 <none> <none>
kube-proxy-pf8r9 1/1 Running 1 (2m10s ago) 38m 192.168.10.91 node1 <none> <none>
kube-proxy-vj85t 1/1 Running 1 (2m51s ago) 42m 192.168.10.90 master <none> <none>
kube-scheduler-master 0/1 Running 4 (46s ago) 49m 192.168.10.90 master <none> <none>
# List all the nodes; as expected, STATUS is Ready for all of them
[root@master vagrant]# kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
master Ready control-plane,master 51m v1.22.0 192.168.10.90 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
node1 Ready <none> 38m v1.22.0 192.168.10.91 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
node2 Ready <none> 38m v1.22.0 192.168.10.92 <none> CentOS Linux 8 4.18.0-240.15.1.el8_3.x86_64 docker://20.10.8
dashboard
ingress-nginx
The two ways to deploy a k8s ingress: NodePort and hostNetwork
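The original post does not go into detail here. As I understand it, NodePort exposes the ingress-nginx controller Service on a high port of every node, while hostNetwork runs the controller Pod directly on the host network so it can listen on ports 80/443 of the node. A minimal sketch of switching to NodePort (service name and namespace are assumptions based on the upstream ingress-nginx manifests):
kubectl -n ingress-nginx patch svc ingress-nginx-controller -p '{"spec":{"type":"NodePort"}}'
kubectl -n ingress-nginx get svc ingress-nginx-controller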
File downloads
https://github.com/chudaozhe/k8s-demo
References
https://developer.aliyun.com/mirror/kubernetes
1️⃣ For registry.aliyuncs.com/google_containers, see the Aliyun container image marketplace.
Problems
# Problem 1
[WARNING ImagePull]: failed to pull image registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4: output: Error response from daemon: manifest for registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4 not found: manifest unknown: manifest unknown
# Fix (needs to be run on all nodes)
# Pull
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.4
# Re-tag
docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.4 registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:v1.8.4
# Remove the original image
docker rmi registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.8.4
2️⃣ Problem 2
If kubeadm init fails, first check whether kubelet is running properly:
service kubelet status
If kubelet is not running properly, run the following command to investigate:
journalctl -xefu kubelet
If you see an error like the following:
"Failed to run kubelet" err="failed to run Kubelet: misconfiguration: kubelet cgroup driver: \"systemd\" is different from docker cgroup driver: \"cgroupfs\""
it roughly means the cgroup drivers of Docker and kubelet do not match.
# Check Docker's cgroup driver
docker info
# Check kubelet's cgroup driver
cat /var/lib/kubelet/config.yaml
# Set Docker's cgroup driver to systemd
cat > /etc/docker/daemon.json <<EOF
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
# Restart Docker
systemctl restart docker
# If kubeadm init has been run several times, it may complain that ports are already in use; reset first
kubeadm reset
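After a reset I would also clear out the old kubeconfig and CNI state before re-running kubeadm init (my own habit, not from the original):
rm -rf $HOME/.kube
rm -rf /etc/cni/net.d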