3、Installing Kubernetes (k8s)
1. Mainstream installation methods

  a. kubeadm

    kubeadm is the official Kubernetes tool for bootstrapping a minimal cluster. kubeadm init and kubeadm join bring a cluster up quickly and take care of dependencies, certificates, and component startup.

    Advantages:
    - Simple: a few commands build a single-node or multi-node cluster, which greatly lowers the barrier to deployment.
    - Highly available and maintainable: upgrades, certificate renewal, and node reset are built in.
    - Official standard with the best ecosystem compatibility: it works with the CNI/CRI/CSI ecosystem.
    - Well supported by the community: documentation is complete and troubleshooting material is plentiful, so problems are usually easy to resolve.

    Disadvantages:
    - Limited customization: the deployment workflow and directory layout are fixed. Component flags can be adjusted through the kubeadm configuration file, but the underlying process cannot be deeply customized.

  b. Binary installation

    No installer is involved. You download the binary of every component yourself and write the systemd units, certificates, network policy, and component dependencies by hand.

    Advantages:
    - Fully customizable: component flags, certificate policy, directory layout, startup order, and dependencies are entirely under your control.
    - Educational: the deployment process forces you to understand how the Kubernetes components communicate, how certificates work, the network model, and the dependencies between components.

    Disadvantages:
    - Very complex and error-prone.
    - High maintenance cost and high production risk.
    - Long deployment and debugging cycles; unsuitable for fast delivery or for scaling out many nodes.

2. Deployment architecture

  a. Enterprise high-availability architecture

    A load balancer sits in front of three Master nodes (the number of Masters can grow with the cluster) and gives clients a single entry address. The load balancer is built from HAProxy plus Keepalived. The three hosts use 192.168.1.101, 192.168.1.102, and 192.168.1.103, and the load balancer uses the VIP 192.168.1.100. To keep the Masters highly available, etcd must be highly available as well. If the business is very large, several Kubernetes clusters can be deployed, one per business module, which also reduces the load on each etcd.

  b. Test mode

    One master and two worker nodes, used for testing. The deployment below follows this layout:

        master: 192.168.1.101
        node:   192.168.1.102
        node:   192.168.1.103

3. Installation procedure

  a. Software preparation (offline installation)

    i. Kubernetes packages

      Download from https://mirrors.aliyun.com/kubernetes-new/core/stable/v1.35/rpm/x86_64/ . The following rpm packages are needed:

        cri-tools-1.35.0-150500.1.1.x86_64.rpm
        kubeadm-1.35.0-150500.1.1.x86_64.rpm
        kubectl-1.35.0-150500.1.1.x86_64.rpm
        kubelet-1.35.0-150500.1.1.x86_64.rpm
        kubernetes-cni-1.8.0-150500.1.1.x86_64.rpm

    ii. docker-ce packages

      Download from https://mirrors.aliyun.com/docker-ce/linux/centos/9.3/x86_64/stable/Packages/ . The following packages are needed:

        docker-ce-29.3.0-1.el9.x86_64.rpm
        docker-ce-cli-29.3.0-1.el9.x86_64.rpm
        docker-ce-rootless-extras-29.3.0-1.el9.x86_64.rpm
        docker-compose-plugin-5.1.0-1.el9.x86_64.rpm
        docker-buildx-plugin-0.31.1-1.el9.x86_64.rpm
        containerd.io-2.2.2-1.el9.x86_64.rpm

    iii. cri-docker package

      Download from https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.24/cri-dockerd-0.3.24.amd64.tgz

    iv. Kubernetes core images

      kubeadm init needs these images. Pull them first on a server with internet access:

        docker pull registry.aliyuncs.com/google_containers/coredns:v1.13.1
        docker pull registry.aliyuncs.com/google_containers/etcd:3.6.6-0
        docker pull registry.aliyuncs.com/google_containers/kube-apiserver:v1.35.0
        docker pull registry.aliyuncs.com/google_containers/kube-controller-manager:v1.35.0
        docker pull registry.aliyuncs.com/google_containers/kube-proxy:v1.35.0
        docker pull registry.aliyuncs.com/google_containers/kube-scheduler:v1.35.0
        docker pull registry.aliyuncs.com/google_containers/pause:3.10.1
        docker pull registry.aliyuncs.com/google_containers/pause:3.8

      Then export them:

        docker save -o k8s-1.35.0-images.tar \
          registry.aliyuncs.com/google_containers/coredns:v1.13.1 \
          registry.aliyuncs.com/google_containers/etcd:3.6.6-0 \
          registry.aliyuncs.com/google_containers/kube-apiserver:v1.35.0 \
          registry.aliyuncs.com/google_containers/kube-controller-manager:v1.35.0 \
          registry.aliyuncs.com/google_containers/kube-proxy:v1.35.0 \
          registry.aliyuncs.com/google_containers/kube-scheduler:v1.35.0 \
          registry.aliyuncs.com/google_containers/pause:3.10.1 \
          registry.aliyuncs.com/google_containers/pause:3.8

    v. Calico manifest

      Download it in advance:

        curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.7/manifests/calico-typha.yaml -o calico.yaml

    vi. Calico images

      Pull them first on a server with internet access:

        docker pull quay.io/calico/cni:v3.29.7
        docker pull quay.io/calico/kube-controllers:v3.29.7
        docker pull quay.io/calico/node:v3.29.7
        docker pull quay.io/calico/typha:v3.29.7

      Then export them:

        docker save -o calico-3.29.7-images.tar \
          quay.io/calico/cni:v3.29.7 \
          quay.io/calico/kube-controllers:v3.29.7 \
          quay.io/calico/node:v3.29.7 \
          quay.io/calico/typha:v3.29.7

  b. Installation steps

    i. NIC configuration (optional)

      Keep only one NIC active and turn off autoconnect on the others. With several active NICs, Calico may bind to the wrong interface. Pinning the interface Calico uses is an alternative fix.

        # Host-only NIC
        [root@rocky-server-1: ~]# cat /etc/NetworkManager/system-connections/ens160.nmconnection
        [connection]
        id=ens160
        uuid=55ff12b9-d0ae-33ef-9d0d-63e65df2e750
        type=ethernet
        autoconnect-priority=-999
        interface-name=ens160
        timestamp=1761792426

        [ethernet]

        [ipv4]
        address1=192.168.1.101/24
        method=manual
        dns=114.114.114.114;8.8.8.8

        [ipv6]
        addr-gen-mode=eui64
        method=auto

        [proxy]

        # NAT NIC
        [root@rocky-server-1: ~]# cat /etc/NetworkManager/system-connections/ens192.nmconnection
        [connection]
        id=ens192
        uuid=e8834466-539f-3848-9f06-46c41d5c66b7
        type=ethernet
        autoconnect-priority=-999
        interface-name=ens192
        timestamp=1761811392
        # Disable autoconnect (keyfile comments must be on their own line)
        autoconnect=false

        [ethernet]

        [ipv4]
        address1=192.168.73.101/24,192.168.73.2
        method=manual
        dns=114.114.114.114;8.8.8.8

        [ipv6]
        addr-gen-mode=eui64
        method=auto

        [proxy]

        # Re-read the edited connection files, then apply them with nmcli
        nmcli connection reload
        nmcli device disconnect ens192
        nmcli device reapply ens160

    ii. Switch the yum repositories

        # Replace the default repositories with the Aliyun mirror
        sed -e 's|^mirrorlist=|#mirrorlist=|g' \
            -e 's|^#baseurl=http://dl.rockylinux.org/$contentdir|baseurl=https://mirrors.aliyun.com/rockylinux|g' \
            -i.bak \
            /etc/yum.repos.d/[Rr]ocky*.repo

    iii. Disable SELinux

        setenforce 0
        sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
        grubby --update-kernel ALL --args selinux=0
        # Check whether it is disabled
        grubby --info DEFAULT
        # Roll back the kernel-level disable
        grubby --update-kernel ALL --remove-args selinux

    iv. Set the time zone

        timedatectl set-timezone Asia/Shanghai

    v. Disable the swap partition

        # Swap turns disk space into virtual memory when RAM runs short, but it is slow.
        # On a server dedicated to Kubernetes, disable it.
        swapoff -a                                                          # temporary
        sed -i 's:/dev/mapper/rl-swap:#/dev/mapper/rl-swap:g' /etc/fstab    # permanent

    vi. Set the hostnames

        # Kubernetes uses the hostname to identify each node
        hostnamectl set-hostname k8s-master01   # on 192.168.1.101
        hostnamectl set-hostname k8s-node01     # on 192.168.1.102
        hostnamectl set-hostname k8s-node02     # on 192.168.1.103

        # Append to /etc/hosts
        192.168.1.101 k8s-master01 m1
        192.168.1.102 k8s-node01 n1
        192.168.1.103 k8s-node02 n2

    vii. Install ipvs

        # Production clusters commonly run kube-proxy in IPVS mode (layer-4 load balancing);
        # ipvsadm is the tool used to manage and inspect IPVS rules.
        yum install -y ipvsadm

    viii. Enable IP forwarding

        # Forwarding lets the node route packets between network segments. Cross-node Pod traffic,
        # Service forwarding, and Pod access to external networks all depend on it; without it the
        # cluster network does not work.
        echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf
        sysctl -p

    ix. Load the bridge module

        # Install the bridge tools
        yum install -y epel-release
        yum install -y bridge-utils
        # Load the br_netfilter module
        modprobe br_netfilter
        # modprobe overlay    # also needed when containerd is the runtime
        # Load the module automatically at boot
        echo 'br_netfilter' > /etc/modules-load.d/bridge.conf
        # Let bridged traffic pass through iptables
        echo 'net.bridge.bridge-nf-call-iptables=1' >> /etc/sysctl.conf
        echo 'net.bridge.bridge-nf-call-ip6tables=1' >> /etc/sysctl.conf
        # Apply the settings immediately
        sysctl -p

    x. Install Docker

        # Add the docker-ce yum repository from USTC
        sudo dnf config-manager --add-repo https://mirrors.ustc.edu.cn/docker-ce/linux/centos/docker-ce.repo
        cd /etc/yum.repos.d
        # Point the repository at the USTC mirror
        sed -e 's|download.docker.com|mirrors.ustc.edu.cn/docker-ce|g' docker-ce.repo > docker-ce-ustc.repo
        mv docker-ce.repo docker-ce.repo.bak
        # Install docker-ce
        yum -y install docker-ce
        # Configure the daemon
        mkdir -p /etc/docker
        cat > /etc/docker/daemon.json <<EOF
        {
          "data-root": "/data/docker",
          "exec-opts": ["native.cgroupdriver=systemd"],
          "log-driver": "json-file",
          "log-opts": {
            "max-size": "100m",
            "max-file": "100"
          },
          "insecure-registries": ["harbor.bsoft.com"],
          "registry-mirrors": ["https://kfp63jaj.mirror.aliyuncs.com"]
        }
        EOF
        mkdir -p /etc/systemd/system/docker.service.d
        # Restart the Docker service
        systemctl daemon-reload
        systemctl restart docker
        systemctl enable docker
        # Reboot the server
        reboot

    xi. Install cri-docker

        # Install cri-dockerd
        wget https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.24/cri-dockerd-0.3.24.amd64.tgz
        tar -zvxf cri-dockerd-0.3.24.amd64.tgz
        cp cri-dockerd/cri-dockerd /usr/bin/
        chmod a+x /usr/bin/cri-dockerd

        # Create the cri-docker service (the quoted 'EOF' keeps $MAINPID from being expanded by the shell)
        cat <<'EOF' > /usr/lib/systemd/system/cri-docker.service
        [Unit]
        Description=CRI Interface for Docker Application Container Engine
        Documentation=https://docs.mirantis.com
        After=network-online.target firewalld.service docker.service
        Wants=network-online.target
        Requires=cri-docker.socket

        [Service]
        Type=notify
        ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.8
        ExecReload=/bin/kill -s HUP $MAINPID
        TimeoutSec=0
        RestartSec=2
        Restart=always
        StartLimitBurst=3
        StartLimitInterval=60s
        LimitNOFILE=infinity
        LimitNPROC=infinity
        LimitCORE=infinity
        TasksMax=infinity
        Delegate=yes
        KillMode=process

        [Install]
        WantedBy=multi-user.target
        EOF

        # Create the cri-docker socket
        cat <<'EOF' > /usr/lib/systemd/system/cri-docker.socket
        [Unit]
        Description=CRI Docker Socket for the API
        PartOf=cri-docker.service

        [Socket]
        ListenStream=%t/cri-dockerd.sock
        SocketMode=0660
        SocketUser=root
        SocketGroup=docker

        [Install]
        WantedBy=sockets.target
        EOF

        # Start the cri-docker service
        systemctl daemon-reload
        systemctl enable cri-docker
        systemctl start cri-docker
        systemctl is-active cri-docker

    xii. Open the required ports

        # 1. Kubernetes control-plane ports
        firewall-cmd --permanent --add-port=6443/tcp        # kube-apiserver
        firewall-cmd --permanent --add-port=2379-2380/tcp   # etcd
        firewall-cmd --permanent --add-port=10250/tcp       # kubelet
        firewall-cmd --permanent --add-port=10259/tcp       # kube-scheduler (10251 was the old insecure port)
        firewall-cmd --permanent --add-port=10257/tcp       # kube-controller-manager (10252 was the old insecure port)
        firewall-cmd --permanent --add-port=10255/tcp       # kubelet read-only port (disabled by default under kubeadm)
        # 2. CNI plugin ports (Calico/Flannel)
        firewall-cmd --permanent --add-port=8472/udp        # Flannel VXLAN
        firewall-cmd --permanent --add-port=5473/tcp        # Calico Typha
        firewall-cmd --permanent --add-port=4789/udp        # VXLAN (also used by Calico VXLAN)
        firewall-cmd --permanent --add-port=179/tcp         # Calico BGP
        # 3. kube-proxy ports
        firewall-cmd --permanent --add-port=10249/tcp       # kube-proxy metrics
        firewall-cmd --permanent --add-port=10256/tcp       # kube-proxy healthz
        # 4. Reload firewalld so the rules take effect
        firewall-cmd --reload
        # Verify the opened ports
        firewall-cmd --list-ports

    xiii. Install kubeadm 1.35.0

        # Install kubeadm 1.35.0 (run on all nodes)
        cd /usr/upload/kubernetes-1.35.0-150500.1.1
        yum -y install *.rpm

        # Pin the node IP (run on all nodes, using each node's own IP)
        vim /etc/sysconfig/kubelet
        # Enter the following content
        KUBELET_EXTRA_ARGS="--node-ip=192.168.1.101 --cgroup-driver=systemd"

        # Enable at boot (run on all nodes)
        systemctl enable kubelet.service

        # Initialize the master (run on the master node). With internet access this can run directly.
        # Without it, first import k8s-1.35.0-images.tar into Docker on the master
        # (docker load -i k8s-1.35.0-images.tar), then run the command.
        # Note: 10.10.0.0/12 is not aligned to a /12 boundary; the network it denotes is 10.0.0.0/12
        # (kubeadm's default service CIDR is 10.96.0.0/12).
        kubeadm init \
          --apiserver-advertise-address=192.168.1.101 \
          --image-repository registry.aliyuncs.com/google_containers \
          --kubernetes-version=1.35.0 \
          --service-cidr=10.10.0.0/12 \
          --pod-network-cidr=10.244.0.0/16 \
          --cri-socket unix:///var/run/cri-dockerd.sock

        # Alternatively, initialize from a configuration file
        kubeadm init --config kubeadm-config.yaml --upload-certs

        # Copy these commands from the init output and run them (on the master node)
        mkdir -p $HOME/.kube
        sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
        sudo chown $(id -u):$(id -g) $HOME/.kube/config

        # Join a worker (run on each worker node)
        kubeadm join 192.168.1.101:6443 --token yynve7.my0fozh4kfgrd39v \
          --discovery-token-ca-cert-hash sha256:56aa0ebaf54104e1bdec475d78698c42a1c93e8acc20b615ef7179d171d1a404 \
          --cri-socket unix:///var/run/cri-dockerd.sock

        # Join an additional master (optional, only for multi-master clusters)
        # token: run kubeadm token create to generate a new token
        # discovery-token-ca-cert-hash: obtained from
        #   openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
        # certificate-key: kubeadm certs certificate-key generates a key; once the uploaded certificates
        #   have expired, run kubeadm init phase upload-certs --upload-certs to upload them again and get a new key
        kubeadm join 192.168.1.101:6443 --token yynve7.my0fozh4kfgrd39v \
          --discovery-token-ca-cert-hash sha256:56aa0ebaf54104e1bdec475d78698c42a1c93e8acc20b615ef7179d171d1a404 \
          --control-plane \
          --certificate-key 6b5ae96e16cf35bc8d6d665458f014a8add06e8e9277fcbb880a2a7a3df0740a \
          --cri-socket unix:///var/run/cri-dockerd.sock

        # Verify the nodes
        kubectl get node
        kubectl get pod -A

    xiv. Deploy the Calico network plugin

        # Download the 3.29.7 manifest
        curl https://raw.githubusercontent.com/projectcalico/calico/v3.29.7/manifests/calico-typha.yaml -o calico.yaml
        # Replace the image registry in the manifest so image pulls do not fail in an offline deployment
        sed -i 's#docker.io/calico#quay.io/calico#g' calico.yaml

        # In calico.yaml, set CALICO_IPV4POOL_CIDR to the Pod network
        - name: CALICO_IPV4POOL_CIDR
          value: "10.244.0.0/16"

        # In calico.yaml, switch to BGP mode by turning IPIP off
        # Enable IPIP
        - name: CALICO_IPV4POOL_IPIP
          value: "Always"    # change to "Off"

        # Import the Calico images (offline environments: load calico-3.29.7-images.tar on all nodes first)
        docker load -i calico-3.29.7-images.tar

        # Make sure only one NIC is active, then apply Calico to the cluster
        cd /usr/upload
        kubectl apply -f calico.yaml    # starts Calico

  c. Cleaning up and resetting the cluster

    i. Clean up on the master and on every node

        # Reset the cluster
        kubeadm reset --cri-socket unix:///var/run/cri-dockerd.sock -f
        # Stop kubelet and cri-docker
        systemctl stop kubelet
        systemctl stop cri-docker
        systemctl disable kubelet
        # Remove leftover directories by hand (the key step)
        rm -rf $HOME/.kube/config
        rm -rf /etc/kubernetes/*
        rm -rf /var/lib/kubelet/*
        rm -rf /var/lib/etcd/*
        rm -rf /etc/cni/*
        rm -rf /opt/cni/*            # also removes the CNI plugin binaries; reinstall kubernetes-cni before re-initializing
        rm -rf /var/lib/cni/*
        rm -rf /var/lib/dockershim/*
        rm -rf /var/lib/calico/*
        rm -rf /var/run/kubernetes/*
        docker system prune -a       # removes all unused images, including the imported ones
        # Remove leftover network devices (optional)
        ip link delete cni0 2>/dev/null
        ip link delete flannel.1 2>/dev/null
        # Restart the core service
        systemctl daemon-reload
        systemctl start cri-docker

    ii. Remove the Kubernetes rpm packages

        yum remove -y cri-tools kubeadm kubectl kubelet kubernetes-cni
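
Note on the address plan. The kubeadm init step above passes a service CIDR of 10.10.0.0/12 and a Pod CIDR of 10.244.0.0/16, while the nodes sit on 192.168.1.0/24. These three ranges should not overlap, and 10.10.0.0/12 is not aligned to a /12 boundary. The following is a minimal bash sketch for checking both points before initializing; the function names (ip_to_int, cidr_network, cidr_overlap and so on) are made up for this illustration and are not part of kubeadm or any other tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for sanity-checking the CIDRs passed to kubeadm init.
# Pure bash arithmetic, IPv4 only; octets must not have leading zeros.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
  local n=$1
  echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

# Print the first and last address of a CIDR block as two integers.
cidr_range() {
  local ip=${1%/*} prefix=${1#*/}
  local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
  local start=$(( $(ip_to_int "$ip") & mask ))
  echo "$start $(( start | (~mask & 0xFFFFFFFF) ))"
}

# Print the network a CIDR actually denotes (host bits cleared).
cidr_network() {
  local start end
  read -r start end <<< "$(cidr_range "$1")"
  echo "$(int_to_ip "$start")/${1#*/}"
}

# Exit status 0 when the two CIDR blocks share at least one address.
cidr_overlap() {
  local s1 e1 s2 e2
  read -r s1 e1 <<< "$(cidr_range "$1")"
  read -r s2 e2 <<< "$(cidr_range "$2")"
  (( s1 <= e2 && s2 <= e1 ))
}

# Values taken from the kubeadm init command and the NIC configuration above.
SERVICE_CIDR=10.10.0.0/12
POD_CIDR=10.244.0.0/16
NODE_CIDR=192.168.1.0/24

for cidr in "$SERVICE_CIDR" "$POD_CIDR" "$NODE_CIDR"; do
  echo "$cidr -> $(cidr_network "$cidr")"
done
# 10.10.0.0/12 -> 10.0.0.0/12
# 10.244.0.0/16 -> 10.244.0.0/16
# 192.168.1.0/24 -> 192.168.1.0/24

for pair in "$SERVICE_CIDR $POD_CIDR" "$SERVICE_CIDR $NODE_CIDR" "$POD_CIDR $NODE_CIDR"; do
  read -r left right <<< "$pair"
  if cidr_overlap "$left" "$right"; then
    echo "OVERLAP: $left and $right"
  else
    echo "ok: $left and $right do not overlap"
  fi
done
```

With the values from this document the three ranges do not overlap, so the plan works as written; the first output line shows that the service range really covers 10.0.0.0 through 10.15.255.255, which is worth knowing if other 10.x networks are in use.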