## ✅ ToDos

## 📖 Documentation

The steps below are not fully sufficient for the initial Kubernetes deployment; follow https://medium.com/@sirtcp/deploying-a-three-node-kubernetes-cluster-on-debian-12-with-kubeadm-and-rook-ceph-for-persistent-eb080f31d3fc alongside them for the Kubernetes deployment.

### Prerequisites

- brew
- govc
- kubectl

### Installing K8S VMs

- VMs are based on Debian 12 bookworm
- users
  - kg
  - ansible
  - root
- updates/patches baseline will be done via Ansible
  - waves?

#### govc configuration

```bash
# govc variables
export GOVC_URL=vcenter.lab.sponar.de
export [email protected]
export GOVC_PASSWORD=
export GOVC_DATACENTER=OSN
export GOVC_RESOURCE_POOL=K8S
export GOVC_DATASTORE=vsanDatastore
export GOVC_NETWORK=NSX-SEG-1201-K8S-01
```

#### disk.EnableUUID=1

```bash
govc vm.change -vm '/OSN/vm/K8S-01' -e="disk.enableUUID=1"
govc vm.change -vm '/OSN/vm/K8S-02' -e="disk.enableUUID=1"
govc vm.change -vm '/OSN/vm/K8S-03' -e="disk.enableUUID=1"
govc vm.change -vm '/OSN/vm/K8S-04' -e="disk.enableUUID=1"
govc vm.change -vm '/OSN/vm/K8S-05' -e="disk.enableUUID=1"
govc vm.change -vm '/OSN/vm/K8S-06' -e="disk.enableUUID=1"
```

#### Virtual HW version

```bash
# check vHW version of VM
govc vm.option.info '/OSN/vm/K8S-01' | grep HwVersion

# upgrade HW version of VM if it is lower than 15
govc vm.upgrade -version=15 -vm '/OSN/vm/K8S-01'
```

#### remove swap file

```bash
sudo swapoff -a

# remove any swap entries inside the fstab file
sudo vim /etc/fstab
```

#### allow required ports

```bash
sudo ufw allow 6443/tcp   # Kubernetes API server
sudo ufw allow 2379/tcp   # etcd client
sudo ufw allow 2380/tcp   # etcd peer
sudo ufw allow 10250/tcp  # kubelet API
sudo ufw allow 10251/tcp  # kube-scheduler
sudo ufw allow 10252/tcp  # kube-controller-manager
sudo ufw allow 10255/tcp  # kubelet read-only
sudo ufw allow 22/tcp     # SSH
sudo ufw allow 179/tcp    # BGP for Calico
sudo ufw enable && sudo ufw reload
```

### Installing K8S

#### Containerd prerequisites

```bash
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
```

#### load modules

```bash
sudo modprobe overlay
sudo modprobe br_netfilter
```
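As a side note on the `disk.EnableUUID` step above: the six near-identical `govc vm.change` calls can be generated with a small loop. A sketch, assuming the VM names K8S-01 through K8S-06 as used above; it only prints the commands, so they can be reviewed before piping the output to `sh`:

```shell
# Print the govc command for each VM; review the output,
# then pipe it to sh to actually execute the changes.
for i in 01 02 03 04 05 06; do
  echo "govc vm.change -vm '/OSN/vm/K8S-$i' -e=\"disk.enableUUID=1\""
done
```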
#### edit kubernetes config

```bash
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
```

```bash
sudo /usr/sbin/sysctl --system
```

#### install containerd

```bash
sudo apt -y install containerd
```

#### HTTPS prerequisites

```bash
sudo apt install ca-certificates software-properties-common
```

#### add Docker GPG key

`apt-key` is deprecated on Debian 12 (and the Ubuntu key URL does not apply to Debian), so the key is added to `/etc/apt/keyrings` as part of the repository setup below.

#### install Docker apt repository

```bash
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
```

#### install docker-ce

```bash
sudo apt install docker-ce
```

#### setup daemon with default parameters

```bash
sudo tee /etc/docker/daemon.json >/dev/null <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF
```

#### create docker service folder

```bash
sudo mkdir -p /etc/systemd/system/docker.service.d
```

#### restart and reload docker daemon

```bash
sudo systemctl daemon-reload && sudo systemctl restart docker
```

#### Copy certificates from master to control plane nodes

certificates to copy:

- **CA Certificates**:
  - `ca.crt`
  - `ca.key`
- **Front Proxy CA**:
  - `front-proxy-ca.crt`
  - `front-proxy-ca.key`
- **Service Account Key Pair**:
  - `sa.pub`
  - `sa.key`
- **Etcd CA (if using an external etcd)**:
  - `etcd/ca.crt`
  - `etcd/ca.key`
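Before copying, a quick pre-flight check can confirm that every file in the list above actually exists. A minimal sketch; the `check_pki` helper and the scratch-directory demo are hypothetical, and on a real master you would point the helper at `/etc/kubernetes/pki`:

```shell
# Hypothetical pre-flight helper: verify all certificates from the
# list above exist under a pki directory before running scp.
check_pki() {
  dir=$1; missing=0
  for f in ca.crt ca.key front-proxy-ca.crt front-proxy-ca.key \
           sa.pub sa.key etcd/ca.crt etcd/ca.key; do
    [ -f "$dir/$f" ] || { echo "missing: $f"; missing=1; }
  done
  return $missing
}

# demo against a scratch directory standing in for /etc/kubernetes/pki
demo=$(mktemp -d)
mkdir -p "$demo/etcd"
touch "$demo/ca.crt" "$demo/ca.key" "$demo/front-proxy-ca.crt" \
      "$demo/front-proxy-ca.key" "$demo/sa.pub" "$demo/sa.key" \
      "$demo/etcd/ca.crt" "$demo/etcd/ca.key"
check_pki "$demo" && echo "all certificates present"
```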
Command to copy the certificates above from the master node:

```bash
scp -r /etc/kubernetes/pki/ca.crt /etc/kubernetes/pki/ca.key \
  /etc/kubernetes/pki/front-proxy-ca.crt /etc/kubernetes/pki/front-proxy-ca.key \
  /etc/kubernetes/pki/sa.pub /etc/kubernetes/pki/sa.key \
  [email protected]:/home/kg/pki/

scp -r /etc/kubernetes/pki/etcd/ca.crt /etc/kubernetes/pki/etcd/ca.key \
  [email protected]:/home/kg/pki/etcd/
```

#### Install Calico networking

```bash
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.0/manifests/calico.yaml
```

### Installing CPI (vSphere Cloud Provider Interface) & CSP (vSphere Container Storage Plug-in)

#### taint Kubernetes Nodes

Taint your primary control nodes with the following:

```bash
kubectl taint nodes <k8s-primary-name> node-role.kubernetes.io/control-plane=:NoSchedule
```

verify that the nodes are tainted:

```bash
kubectl describe nodes | egrep "Taints:|Name:"
```

#### create vmware-system-csi namespace

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.3.1/manifests/vanilla/namespace.yaml
```

#### create configuration file for block volumes

```ini
# /etc/kubernetes/csi-vsphere.conf
[VirtualCenter "<vcenter FQDN/IP>"]
insecure-flag = "true"
user = "[email protected]"
password = ""
port = "443"
datacenters = "OSN"
```

#### Create vSphere secret

```bash
kubectl create secret generic vsphere-config-secret --from-file=/etc/kubernetes/csi-vsphere.conf --namespace=vmware-system-csi
```

verify that the secret was created:

```bash
kubectl get secret vsphere-config-secret --namespace=vmware-system-csi
```

delete the secret config file:

```bash
sudo rm /etc/kubernetes/csi-vsphere.conf
```

#### deploy latest vSphere Container Storage Plugin

> [!info] If you encounter problems during this step, it might be due to wrong image URLs that were not fixed in 3.3.1 yet; check the next section "Problems during deployment of vSphere Container Storage Plugin".
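One quick way to see whether a manifest still references stale registries is to list its image lines before applying it. A sketch; it assumes the manifest has already been downloaded locally as `vsphere-csi-driver.yaml`:

```shell
# List the unique container images a downloaded manifest references,
# so stale gcr.io paths are visible before `kubectl apply`.
grep -E '^[[:space:]]*image:' vsphere-csi-driver.yaml | sort -u
```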
At the time of writing, the latest version was 3.3.1; check the [release notes](https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/3.0/vmware-vsphere-csp-getting-started/GUID-54BB79D2-B13F-4673-8CC2-63A772D17B3C.html) for the current version (see the next section "Problems during deployment").

```bash
kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.3.1/manifests/vanilla/vsphere-csi-driver.yaml
```

verify that the deployments are running:

```bash
kubectl get deployment --namespace=vmware-system-csi
```

#### Problems during deployment of vSphere Container Storage Plugin

First, the v3.3.1 yaml file downloaded above still had the old gcr repositories configured as images. Download the yaml:

```bash
wget https://raw.githubusercontent.com/kubernetes-sigs/vsphere-csi-driver/v3.3.1/manifests/vanilla/vsphere-csi-driver.yaml
```

Edit it and replace any image paths that still use the old gcr URL with the newer ones:

```
# v3.3.1 images
registry.k8s.io/csi-vsphere/driver:v3.3.1
registry.k8s.io/csi-vsphere/syncer:v3.3.1
```

Afterwards you can apply the edited yaml file:

```bash
kubectl apply -f vsphere-csi-driver.yaml
```

I had issues during the deployment of my controller/CSI nodes, the same as described in this issue: https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/2284

Temporary troubleshooting steps:

```bash
kubectl describe pod vsphere-csi-controller-6b84dbdbd7-5fl7g -n vmware-system-csi

# output
6 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling.
```

The `node.cloudprovider.kubernetes.io/uninitialized` taint stays in place until a cloud provider (CPI) has initialized the nodes, which is why the pods remain unschedulable.
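Going back to the image fix: instead of editing the manifest by hand, a `sed` substitution can rewrite the stale references in one pass. A sketch; it assumes the old images used the `gcr.io/cloud-provider-vsphere/csi/release` prefix, so verify the actual paths in your copy of the yaml first:

```shell
# Rewrite old gcr.io image references to registry.k8s.io in place
# (assumed old prefix: gcr.io/cloud-provider-vsphere/csi/release).
sed -i 's#gcr.io/cloud-provider-vsphere/csi/release#registry.k8s.io/csi-vsphere#g' vsphere-csi-driver.yaml
```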
```bash
# delete deployment
kubectl delete -f vsphere-csi-driver.yaml

# get pods in all namespaces
kubectl get pods --all-namespaces

# get CIDRs of pods
kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'

# force delete pod
kubectl delete pod <PODNAME> --grace-period=0 --force --namespace <NAMESPACE>

# describe pod for details on why it is still creating
kubectl describe pod vsphere-csi-controller-6b84dbdbd7-49dp5 -n vmware-system-csi
```

### creating Storage Class and Persistent Volume Claim

create a new `storage class`:

```yaml
# /home/kg/vsan-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
parameters:
  storagepolicyname: "k8s"
```

import into the Kubernetes cluster:

```bash
kubectl create -f /home/kg/vsan-sc.yaml
```

create a new `Persistent Volume Claim`:

```yaml
# /home/kg/vsan-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vsan-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsan-sc
```

import the `Persistent Volume Claim`:

```bash
kubectl create -f /home/kg/vsan-pvc.yaml
```

### Patch existing Storage class and expand pvc

If you want to edit an existing storage class, you can either edit its yaml or provide the values manually. I will edit the file `vsan-sc.yaml` that we created earlier.
Add `allowVolumeExpansion: true` to allow expansion of an already existing volume:

```yaml
# /home/kg/vsan-sc.yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: vsan-sc
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.vsphere.vmware.com
allowVolumeExpansion: true
parameters:
  storagepolicyname: "k8s"
```

patch the existing storage class with the edited file:

```bash
kubectl patch storageclass vsan-sc --patch-file vsan-sc.yaml
```

expand a Persistent Volume Claim:

```bash
kubectl patch pvc vsan-pvc -p '{"spec": {"resources": {"requests": {"storage": "20Gi"}}}}'
```

confirm that the pvc was expanded:

```bash
kubectl describe pvc vsan-pvc
```

> [!info] If no pod is using the expanded PVC, the size of the actual claim will not change until the PVC is mounted by a pod. The events will stop at the reason "FileSystemResizeRequired".

### Example Grafana Kubernetes deployment using a pvc

create a new namespace:

```bash
kubectl create namespace grafana
```

create the deployment file:

```yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: grafana-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: vsan-sc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: grafana
  name: grafana
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      securityContext:
        fsGroup: 472
        supplementalGroups:
          - 0
      containers:
        - name: grafana
          image: grafana/grafana:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: http-grafana
              protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /robots.txt
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 30
            successThreshold: 1
            timeoutSeconds: 2
          livenessProbe:
            failureThreshold: 3
            initialDelaySeconds: 30
            periodSeconds: 10
            successThreshold: 1
            tcpSocket:
              port: 3000
            timeoutSeconds: 1
          resources:
            requests:
              cpu: 250m
              memory: 750Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: grafana-pv
      volumes:
        - name: grafana-pv
          persistentVolumeClaim:
            claimName: grafana-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
spec:
  ports:
    - port: 3000
      protocol: TCP
      targetPort: http-grafana
  selector:
    app: grafana
  sessionAffinity: None
  type: LoadBalancer
```

Apply the deployment file:

```bash
kubectl apply -f grafana.yaml
```

## 🔗Resources

### Kubernetes install guide

- https://medium.com/@sirtcp/deploying-a-three-node-kubernetes-cluster-on-debian-12-with-kubeadm-and-rook-ceph-for-persistent-eb080f31d3fc

### CPI/CSP Guide

- https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/index.html

### CSP Release notes

- https://docs.vmware.com/en/VMware-vSphere-Container-Storage-Plug-in/3.0/vmware-vsphere-csp-getting-started/GUID-54BB79D2-B13F-4673-8CC2-63A772D17B3C.html