Ceph is an open-source SDS (Software Defined Storage) solution that integrates with various PaaS and IaaS platforms such as OpenStack, OpenNebula, or Kubernetes. Following up on our article on deploying a Kubernetes cluster on Raspberry Pi, today we will see how to interface with a Ceph cluster to host the persistent volumes of our applications.


CSI

Since January 2019, Kubernetes' CSI (Container Storage Interface) implementation has been considered stable and is becoming widespread, to the point that recent Kubernetes releases often no longer allow mounting volumes the way we documented back in 2018.

SDS vendors have been put to work and must implement their own controllers, which have to automate volume creation, expansion, and deletion, as well as attaching and detaching volumes. In the case of Ceph, everything we need can be found on GitHub.

Raspberry Pi

In the context of Raspberry Pi nodes, as we mentioned in our previous article, keep in mind that Raspbian does not ship the rbd kernel module, which is usually at the heart of any interaction with Ceph block volumes.

Note, however, that rbd is not strictly required. A lesser-known alternative client will let us manipulate our Ceph volumes on our Raspberry Pi.

To attach a volume from the command line, we could use:

# apt-get install ceph-common rbd-nbd
# scp -p root@ceph-host:/etc/ceph/ceph.conf /etc/ceph/
# scp -p root@ceph-host:/etc/ceph/ceph.client.admin.keyring /etc/ceph/
# rbd-nbd -v
<ceph version>
# rbd -p kube ls
<list of rbd volumes in the kube pool>
# rbd-nbd map kube/kubernetes-dynamic-pvc-d3c6c506-3247-11eb-a46b-4249a933eab7
/dev/nbd0
# mount /dev/nbd0 /mnt
# date >/mnt/the_date
# ls -1 /mnt
lost+found
the_date
# umount /mnt
# rbd-nbd unmap /dev/nbd0

ARM

We have seen that Raspbian, even without the rbd module, is still able to mount a Ceph volume. But what about CSI?

This is another setback: the official Ceph CSI controller images are not published for ARM. The question is being looked at, though; a first GitHub issue on the Rook project (which led to a second one opened against Ceph) mentions, among other things, the names of DockerHub repositories used by contributors for testing. While these images are not updated frequently, they are relatively recent for now, and they have the merit of working on ARM. Moreover, the fact that they come from Rook contributions gives good hope that they are not far from being integrated into the official releases.

One detail not to overlook: some of these images are not yet available for 32-bit, so you will have to flash your Raspberry Pis with the 64-bit Raspbian image.
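
A quick way to confirm that a node really runs a 64-bit kernel, and is labeled as such by Kubernetes (standard commands, nothing specific to this setup):

# uname -m
aarch64
$ kubectl get nodes -L kubernetes.io/arch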

Preparing Ceph

First, let's create a pool on our Ceph cluster to host the Kubernetes cluster's volumes.

# ceph -c /etc/ceph/ceph.conf osd pool create kube 32
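
Depending on your Ceph release, it may also be necessary (and is recommended since Luminous) to tag the pool with the rbd application, using the standard command:

# ceph -c /etc/ceph/ceph.conf osd pool application enable kube rbd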

To avoid reusing the cluster admin keyring, we will also create a Ceph user with the minimum required privileges:

# ceph -c /etc/ceph/ceph.conf auth get-or-create client.kube mon 'allow r' osd 'allow class-read object_prefix rbd_children, allow rwx pool=kube' -o /etc/ceph/ceph.client.kube.keyring

Take note of the list of Monitors, the Cluster ID, and the generated keyring:

# grep -E '^mon host|fsid' /etc/ceph/ceph.conf
fsid = eb53775c-ec88-484f-b5f5-b421b55079d7
mon host = [v2:10.42.253.110:3300,v1:10.42.253.110:6789],[v2:10.42.253.111:3300,v1:10.42.253.111:6789],[v2:10.42.253.112:3300,v1:10.42.253.112:6789]

# cat /etc/ceph/ceph.client.kube.keyring
[client.kube]
    key = AQAKfNJeC/z0FhAAbmBgMu0NdWUw2wfua4Lf9Q==
    caps mon = "allow r"
    caps osd = "allow class-read object_prefix rbd_children, allow rwx pool=kube"

Preparing Kubernetes

On the Kubernetes side, we first need to create a Namespace:

$ kubectl create ns rbd-provisioner

Next, let's prepare the configuration describing how to connect to our Ceph cluster. It consists of a ConfigMap, listing the cluster's Cluster ID and Monitors, and a Secret holding the key from the keyring created earlier:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: ceph-csi-config
  namespace: rbd-provisioner
data:
  config.json: |-
    [
      {
        "clusterID": "eb53775c-ec88-484f-b5f5-b421b55079d7",
        "monitors": [
          "10.42.253.110",
          "10.42.253.111",
          "10.42.253.112"
        ],
        "cephFS": {
          "subvolumeGroup": "k8srpicsi"
        }
      }
    ]
---
apiVersion: v1
kind: Secret
metadata:
  name: ceph-secret-user
  namespace: rbd-provisioner
stringData:
  userID: kube
  userKey: AQAKfNJeC/z0FhAAbmBgMu0NdWUw2wfua4Lf9Q==
EOF

Provisioner

Next, having enabled PodSecurityPolicies when deploying the cluster, we must create what is needed to grant the CSI provisioner access to part of the filesystem tree of our Kubernetes nodes:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: rbd-csi-provisioner-psp
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - 'SYS_ADMIN'
  fsGroup:
    rule: RunAsAny
  privileged: true
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'hostPath'
  allowedHostPaths:
  - pathPrefix: '/dev'
    readOnly: false
  - pathPrefix: '/sys'
    readOnly: false
  - pathPrefix: '/lib/modules'
    readOnly: true
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: rbd-provisioner
  name: rbd-csi-provisioner-psp
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['rbd-csi-provisioner-psp']
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-provisioner-psp
  namespace: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-csi-provisioner
  namespace: rbd-provisioner
roleRef:
  kind: Role
  name: rbd-csi-provisioner-psp
  apiGroup: rbac.authorization.k8s.io
EOF

The provisioner also depends on the cluster API: it must be able to list PersistentVolumeClaims, create and delete PersistentVolume objects, and so on.

We will therefore add:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rbd-csi-provisioner
  namespace: rbd-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-external-provisioner-runner
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["persistentvolumes"]
  verbs: ["get", "list", "watch", "create", "update", "delete", "patch"]
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "update"]
- apiGroups: [""]
  resources: ["persistentvolumeclaims/status"]
  verbs: ["update", "patch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshots"]
  verbs: ["get", "list"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotcontents"]
  verbs: ["create", "get", "list", "watch", "update", "delete"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["volumeattachments"]
  verbs: ["get", "list", "watch", "update", "patch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["volumeattachments/status"]
  verbs: ["patch"]
- apiGroups: ["storage.k8s.io"]
  resources: ["csinodes"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotcontents/status"]
  verbs: ["update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-provisioner-role
subjects:
- kind: ServiceAccount
  name: rbd-csi-provisioner
  namespace: rbd-provisioner
roleRef:
  kind: ClusterRole
  name: rbd-external-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: rbd-provisioner
  name: rbd-external-provisioner-cfg
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch", "create", "update", "delete"]
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "watch", "list", "delete", "update", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-provisioner-role-cfg
  namespace: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-csi-provisioner
  namespace: rbd-provisioner
roleRef:
  kind: Role
  name: rbd-external-provisioner-cfg
  apiGroup: rbac.authorization.k8s.io
EOF

Finally, let's deploy the provisioner.

Note that the images used here were chosen for ARM64. The original configurations distributed by Ceph, for x86, can be found in their GitHub repository.

$ cat <<EOF | kubectl apply -f-
---
kind: Service
apiVersion: v1
metadata:
  name: csi-rbdplugin-provisioner
  namespace: rbd-provisioner
  labels:
    app: csi-metrics
spec:
  selector:
    app: csi-rbdplugin-provisioner
  ports:
  - name: http-metrics
    port: 8080
    protocol: TCP
    targetPort: 8680
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: csi-rbdplugin-provisioner
  namespace: rbd-provisioner
spec:
  replicas: 3
  selector:
    matchLabels:
      app: csi-rbdplugin-provisioner
  template:
    metadata:
      labels:
        app: csi-rbdplugin-provisioner
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - csi-rbdplugin-provisioner
            topologyKey: "kubernetes.io/hostname"
      nodeSelector:
        kubernetes.io/arch: arm64
      serviceAccount: rbd-csi-provisioner
      containers:
      - name: csi-provisioner
        image: docker.io/jamesorlakin/multiarch-csi-provisioner:2.0.1
        args:
        - "--csi-address=$(ADDRESS)"
        - "--v=5"
        - "--timeout=150s"
        - "--retry-interval-start=500ms"
        - "--leader-election=true"
        - "--feature-gates=Topology=false"
        - "--default-fstype=ext4"
        - "--extra-create-metadata=true"
        env:
        - name: ADDRESS
          value: unix:///csi/csi-provisioner.sock
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: csi-snapshotter
        image: docker.io/jamesorlakin/multiarch-csi-snapshotter:2.1.1
        args:
        - "--csi-address=$(ADDRESS)"
        - "--v=5"
        - "--timeout=150s"
        - "--leader-election=true"
        env:
        - name: ADDRESS
          value: unix:///csi/csi-provisioner.sock
        imagePullPolicy: IfNotPresent
        securityContext:
          privileged: true
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: csi-attacher
        image: docker.io/jamesorlakin/multiarch-csi-attacher:3.0.0
        args:
        - "--v=5"
        - "--csi-address=$(ADDRESS)"
        - "--leader-election=true"
        - "--retry-interval-start=500ms"
        env:
        - name: ADDRESS
          value: /csi/csi-provisioner.sock
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: csi-resizer
        image: docker.io/jamesorlakin/multiarch-csi-resizer:1.0.0
        args:
        - "--csi-address=$(ADDRESS)"
        - "--v=5"
        - "--timeout=150s"
        - "--leader-election"
        - "--retry-interval-start=500ms"
        - "--handle-volume-inuse-error=false"
        env:
        - name: ADDRESS
          value: unix:///csi/csi-provisioner.sock
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
      - name: csi-rbdplugin
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN"]
        image: quay.io/cephcsi/cephcsi:v3.2.1-arm64
        args:
        - "--nodeid=$(NODE_ID)"
        - "--type=rbd"
        - "--controllerserver=true"
        - "--endpoint=$(CSI_ENDPOINT)"
        - "--v=5"
        - "--drivername=rbd.csi.ceph.com"
        - "--pidlimit=-1"
        - "--rbdhardmaxclonedepth=8"
        - "--rbdsoftmaxclonedepth=4"
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_ID
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CSI_ENDPOINT
          value: unix:///csi/csi-provisioner.sock
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
        - mountPath: /dev
          name: host-dev
        - mountPath: /sys
          name: host-sys
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - name: ceph-csi-config
          mountPath: /etc/ceph-csi-config/
        - name: keys-tmp-dir
          mountPath: /tmp/csi/keys
      - name: csi-rbdplugin-controller
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN"]
        image: quay.io/cephcsi/cephcsi:v3.2.1-arm64
        args:
        - "--type=controller"
        - "--v=5"
        - "--drivername=rbd.csi.ceph.com"
        - "--drivernamespace=$(DRIVER_NAMESPACE)"
        env:
        - name: DRIVER_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        imagePullPolicy: IfNotPresent
        volumeMounts:
        - name: ceph-csi-config
          mountPath: /etc/ceph-csi-config/
        - name: keys-tmp-dir
          mountPath: /tmp/csi/keys
      - name: liveness-prometheus
        image: quay.io/cephcsi/cephcsi:v3.2.1-arm64
        args:
        - "--type=liveness"
        - "--endpoint=$(CSI_ENDPOINT)"
        - "--metricsport=8680"
        - "--metricspath=/metrics"
        - "--polltime=60s"
        - "--timeout=3s"
        env:
        - name: CSI_ENDPOINT
          value: unix:///csi/csi-provisioner.sock
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
        imagePullPolicy: IfNotPresent
      volumes:
      - name: host-dev
        hostPath:
          path: /dev
      - name: host-sys
        hostPath:
          path: /sys
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: socket-dir
        emptyDir: {
          medium: "Memory"
        }
      - name: ceph-csi-config
        configMap:
          name: ceph-csi-config
      - name: keys-tmp-dir
        emptyDir: {
          medium: "Memory"
        }
EOF

Watch the provisioner Pods roll out. In this configuration, three Pods will start. A leader election will designate a primary controller, which will then take care of provisioning our volumes.
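
One way to follow that rollout, for example:

$ kubectl -n rbd-provisioner rollout status deployment/csi-rbdplugin-provisioner
$ kubectl -n rbd-provisioner get pods -l app=csi-rbdplugin-provisioner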

Node Plugin

At the same time, we need to deploy the second controller, the one in charge of attaching our volumes to our nodes.

Again, if PodSecurityPolicies are active on your cluster, we must create what is needed to allow the CSI controller to manipulate our volumes and attach them to the cluster's nodes.

$ cat <<EOF | kubectl apply -f-
---
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: rbd-csi-nodeplugin-psp
spec:
  allowPrivilegeEscalation: true
  allowedCapabilities:
  - 'SYS_ADMIN'
  fsGroup:
    rule: RunAsAny
  privileged: true
  hostNetwork: true
  hostPID: true
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - 'configMap'
  - 'emptyDir'
  - 'projected'
  - 'secret'
  - 'downwardAPI'
  - 'hostPath'
  allowedHostPaths:
  - pathPrefix: '/dev'
    readOnly: false
  - pathPrefix: '/run/mount'
    readOnly: false
  - pathPrefix: '/sys'
    readOnly: false
  - pathPrefix: '/lib/modules'
    readOnly: true
  - pathPrefix: '/var/lib/kubelet/pods'
    readOnly: false
  - pathPrefix: '/var/lib/kubelet/plugins/rbd.csi.ceph.com'
    readOnly: false
  - pathPrefix: '/var/lib/kubelet/plugins_registry'
    readOnly: false
  - pathPrefix: '/var/lib/kubelet/plugins'
    readOnly: false
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-nodeplugin-psp
  namespace: rbd-provisioner
rules:
- apiGroups: ['policy']
  resources: ['podsecuritypolicies']
  verbs: ['use']
  resourceNames: ['rbd-csi-nodeplugin-psp']
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-nodeplugin-psp
  namespace: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: rbd-csi-nodeplugin
  namespace: rbd-provisioner
roleRef:
  kind: Role
  name: rbd-csi-nodeplugin-psp
  apiGroup: rbac.authorization.k8s.io
EOF

In addition, this controller must be able to query the cluster API for ConfigMaps, Secrets, and Nodes. We will then add:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rbd-csi-nodeplugin
  namespace: rbd-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-nodeplugin
rules:
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rbd-csi-nodeplugin
subjects:
- kind: ServiceAccount
  name: rbd-csi-nodeplugin
  namespace: rbd-provisioner
roleRef:
  kind: ClusterRole
  name: rbd-csi-nodeplugin
  apiGroup: rbac.authorization.k8s.io
EOF

Finally, deploy this controller:

$ cat <<EOF | kubectl apply -f-
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: csi-rbdplugin
  namespace: rbd-provisioner
spec:
  selector:
    matchLabels:
      app: csi-rbdplugin
  template:
    metadata:
      labels:
        app: csi-rbdplugin
    spec:
      serviceAccount: rbd-csi-nodeplugin
      hostNetwork: true
      hostPID: true
      nodeSelector:
        kubernetes.io/arch: arm64
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: driver-registrar
        securityContext:
          privileged: true
        #image: docker.io/colek42/csi-node-driver-registrar:latest
        image: docker.io/jamesorlakin/multiarch-csi-node-registrar:2.0.0
        args:
        - "--v=5"
        - "--csi-address=/csi/csi.sock"
        - "--kubelet-registration-path=/var/lib/kubelet/plugins/rbd.csi.ceph.com/csi.sock"
        env:
        - name: KUBE_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
        - name: registration-dir
          mountPath: /registration
      - name: csi-rbdplugin
        securityContext:
          privileged: true
          capabilities:
            add: ["SYS_ADMIN"]
          allowPrivilegeEscalation: true
        image: quay.io/cephcsi/cephcsi:v3.2.1-arm64
        args:
        - "--nodeid=$(NODE_ID)"
        - "--type=rbd"
        - "--nodeserver=true"
        - "--endpoint=$(CSI_ENDPOINT)"
        - "--v=5"
        - "--drivername=rbd.csi.ceph.com"
        env:
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: NODE_ID
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CSI_ENDPOINT
          value: unix:///csi/csi.sock
        imagePullPolicy: "IfNotPresent"
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
        - mountPath: /dev
          name: host-dev
        - mountPath: /sys
          name: host-sys
        - mountPath: /run/mount
          name: host-mount
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - name: ceph-csi-config
          mountPath: /etc/ceph-csi-config/
        - name: plugin-dir
          mountPath: /var/lib/kubelet/plugins
          mountPropagation: "Bidirectional"
        - name: mountpoint-dir
          mountPath: /var/lib/kubelet/pods
          mountPropagation: "Bidirectional"
        - name: keys-tmp-dir
          mountPath: /tmp/csi/keys
      - name: liveness-prometheus
        securityContext:
          privileged: true
        image: quay.io/cephcsi/cephcsi:v3.2.1-arm64
        args:
        - "--type=liveness"
        - "--endpoint=$(CSI_ENDPOINT)"
        - "--metricsport=8680"
        - "--metricspath=/metrics"
        - "--polltime=60s"
        - "--timeout=3s"
        env:
        - name: CSI_ENDPOINT
          value: unix:///csi/csi.sock
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        volumeMounts:
        - name: socket-dir
          mountPath: /csi
        imagePullPolicy: "IfNotPresent"
      volumes:
      - name: socket-dir
        hostPath:
          path: /var/lib/kubelet/plugins/rbd.csi.ceph.com
          type: DirectoryOrCreate
      - name: plugin-dir
        hostPath:
          path: /var/lib/kubelet/plugins
          type: Directory
      - name: mountpoint-dir
        hostPath:
          path: /var/lib/kubelet/pods
          type: DirectoryOrCreate
      - name: registration-dir
        hostPath:
          path: /var/lib/kubelet/plugins_registry/
          type: Directory
      - name: host-dev
        hostPath:
          path: /dev
      - name: host-sys
        hostPath:
          path: /sys
      - name: host-mount
        hostPath:
          path: /run/mount
      - name: lib-modules
        hostPath:
          path: /lib/modules
      - name: ceph-csi-config
        configMap:
          name: ceph-csi-config
      - name: keys-tmp-dir
        emptyDir: {
          medium: "Memory"
        }
---
apiVersion: v1
kind: Service
metadata:
  name: csi-metrics-rbdplugin
  namespace: rbd-provisioner
  labels:
    app: csi-metrics
spec:
  ports:
  - name: http-metrics
    port: 8080
    protocol: TCP
    targetPort: 8680
  selector:
    app: csi-rbdplugin
EOF
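
Once the DaemonSet is applied, one csi-rbdplugin Pod should start on each arm64 node; a quick check, for instance:

$ kubectl -n rbd-provisioner get daemonset csi-rbdplugin
$ kubectl -n rbd-provisioner get pods -l app=csi-rbdplugin -o wide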

StorageClass

We can wrap up by adding the following StorageClass, which ties together our previous configurations:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.beta.kubernetes.io/is-default-class: "true"
  name: rwo-storage
allowVolumeExpansion: true
parameters:
  clusterID: eb53775c-ec88-484f-b5f5-b421b55079d7
  csi.storage.k8s.io/provisioner-secret-name: ceph-secret-user
  csi.storage.k8s.io/provisioner-secret-namespace: rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-name: ceph-secret-user
  csi.storage.k8s.io/controller-expand-secret-namespace: rbd-provisioner
  csi.storage.k8s.io/node-stage-secret-name: ceph-secret-user
  csi.storage.k8s.io/node-stage-secret-namespace: rbd-provisioner
  csi.storage.k8s.io/fstype: ext4
  imageFeatures: layering
  imageFormat: "2"
  mapOptions: nbds_max=16
  mounter: rbd-nbd
  pool: kube
  volumeNamePrefix: k8s-rpi-
provisioner: rbd.csi.ceph.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
EOF
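
Once applied, the class should now be listed:

$ kubectl get storageclass rwo-storage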

Note that we pass a mounter option set to rbd-nbd: as mentioned earlier, the goal is to avoid depending on the rbd module, which is missing from the Raspbian distribution.
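
rbd-nbd relies on the nbd kernel module instead; assuming it is available on your nodes (worth verifying), it can be loaded, with its maximum device count, and checked with:

# modprobe nbd nbds_max=16
# lsmod | grep nbd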

Testing

Everything is ready; all that is left is to test.

Let's first create a PersistentVolumeClaim:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-too
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
EOF

Before running this command, from another terminal, we can attach to the provisioner's logs to follow the creation of our volume:

$ kubectl get pods -n rbd-provisioner -w
NAME                                        READY   STATUS              RESTARTS   AGE
csi-rbdplugin-24wk8                         3/3     Running             0          55s
csi-rbdplugin-6kgnk                         3/3     Running             0          51s
csi-rbdplugin-94dnw                         3/3     Running             0          50s
csi-rbdplugin-9l59x                         3/3     Running             0          52s
csi-rbdplugin-c9bv2                         3/3     Running             0          53s
csi-rbdplugin-fl5dx                         3/3     Running             0          50s
csi-rbdplugin-ldqrb                         3/3     Running             0          47s
csi-rbdplugin-m8421                         3/3     Running             0          53s
csi-rbdplugin-n4ckr                         3/3     Running             0          54s
csi-rbdplugin-provisioner-7b59568f8-2n4hd   0/7     ContainerCreating   0          57s
csi-rbdplugin-provisioner-7b59568f8-5xs5q   7/7     Running             0          57s
csi-rbdplugin-provisioner-7b59568f8-h7ks7   0/7     ContainerCreating   0          57s
csi-rbdplugin-sjxkl                         3/3     Running             0          45s
csi-rbdplugin-wzrqm                         3/3     Running             0          49s
csi-rbdplugin-xnskj                         3/3     Running             0          51s
$ kubectl logs -n rbd-provisioner csi-rbdplugin-provisioner-7b59568f8-5xs5q -c csi-provisioner  -f
I0111 09:55:09.400769       1 csi-provisioner.go:121] Version: v2.0.1-0-g4e612cc-dirty
I0111 09:55:09.401039       1 csi-provisioner.go:135] Building kube configs for running in cluster...
I0111 09:55:09.425902       1 connection.go:153] Connecting to unix:///csi/csi-provisioner.sock
W0111 09:55:19.426118       1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0111 09:55:29.426155       1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
W0111 09:55:39.426148       1 connection.go:172] Still connecting to unix:///csi/csi-provisioner.sock
I0111 09:55:40.722714       1 common.go:111] Probing CSI driver for readiness
I0111 09:55:40.723054       1 connection.go:182] GRPC call: /csi.v1.Identity/Probe
I0111 09:55:40.723170       1 connection.go:183] GRPC request: {}
I0111 09:55:40.730806       1 connection.go:185] GRPC response: {}
I0111 09:55:40.730985       1 connection.go:186] GRPC error: <nil>
I0111 09:55:40.731014       1 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginInfo
I0111 09:55:40.731030       1 connection.go:183] GRPC request: {}
I0111 09:55:40.733538       1 connection.go:185] GRPC response: {"name":"rbd.csi.ceph.com","vendor_version":"v3.2.1"}
I0111 09:55:40.733699       1 connection.go:186] GRPC error: <nil>
I0111 09:55:40.733728       1 csi-provisioner.go:182] Detected CSI driver rbd.csi.ceph.com
W0111 09:55:40.733766       1 metrics.go:142] metrics endpoint will not be started because `metrics-address` was not specified.
I0111 09:55:40.733790       1 connection.go:182] GRPC call: /csi.v1.Identity/GetPluginCapabilities
I0111 09:55:40.733806       1 connection.go:183] GRPC request: {}
I0111 09:55:40.738163       1 connection.go:185] GRPC response: {"capabilities":[{"Type":{"Service":{"type":1}}},{"Type":{"VolumeExpansion":{"type":1}}},{"Type":{"Service":{"type":2}}}]}
I0111 09:55:40.738676       1 connection.go:186] GRPC error: <nil>
I0111 09:55:40.738816       1 connection.go:182] GRPC call: /csi.v1.Controller/ControllerGetCapabilities
I0111 09:55:40.739012       1 connection.go:183] GRPC request: {}
I0111 09:55:40.741701       1 connection.go:185] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":5}}},{"Type":{"Rpc":{"type":7}}},{"Type":{"Rpc":{"type":9}}}]}
I0111 09:55:40.742063       1 connection.go:186] GRPC error: <nil>
I0111 09:55:40.743107       1 controller.go:735] Using saving PVs to API server in background
I0111 09:55:40.746567       1 leaderelection.go:243] attempting to acquire leader lease  rbd-provisioner/rbd-csi-ceph-com...
I0111 09:55:40.760038       1 leader_election.go:172] new leader detected, current leader: csi-rbdplugin-provisioner-7b59568f8-xr2g8
I0111 09:56:00.421214       1 leaderelection.go:253] successfully acquired lease rbd-provisioner/rbd-csi-ceph-com
I0111 09:56:00.421229       1 leader_election.go:172] new leader detected, current leader: csi-rbdplugin-provisioner-7b59568f8-5xs5q
I0111 09:56:00.421553       1 leader_election.go:165] became leader, starting
I0111 09:56:00.522001       1 controller.go:820] Starting provisioner controller rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b59568f8-5xs5q_b3ea2092-486f-4ae8-ab41-0793c60da5f8!
I0111 09:56:00.522051       1 volume_store.go:97] Starting save volume queue
I0111 09:56:00.522170       1 clone_controller.go:66] Starting CloningProtection controller
I0111 09:56:00.522401       1 clone_controller.go:84] Started CloningProtection controller
I0111 09:56:00.623959       1 controller.go:869] Started provisioner controller rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b59568f8-5xs5q_b3ea2092-486f-4ae8-ab41-0793c60da5f8!

== At this point, we create the volume described above:

I0111 09:56:36.834686       1 controller.go:1317] provision "default/test-pvc-too" class "rwo-storage": started
I0111 09:56:36.835459       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-pvc-too", UID:"c06c760d-b718-48a5-b92f-56683049e6d2", APIVersion:"v1", ResourceVersion:"2703309", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/test-pvc-too"
I0111 09:56:36.834922       1 controller.go:570] CreateVolumeRequest {Name:pvc-c06c760d-b718-48a5-b92f-56683049e6d2 CapacityRange:required_bytes:8589934592  VolumeCapabilities:[mount: access_mode: ] Parameters:map[clusterID:eb53775c-ec88-484f-b5f5-b421b55079d7 csi.storage.k8s.io/controller-expand-secret-name:ceph-secret-admin csi.storage.k8s.io/controller-expand-secret-namespace:rbd-provisioner csi.storage.k8s.io/fstype:ext4 csi.storage.k8s.io/node-stage-secret-name:ceph-secret-user csi.storage.k8s.io/node-stage-secret-namespace:rbd-provisioner csi.storage.k8s.io/provisioner-secret-name:ceph-secret-admin csi.storage.k8s.io/provisioner-secret-namespace:rbd-provisioner imageFeatures:layering imageFormat:2 mapOptions:nbds_max=16 mounter:rbd-nbd pool:kube volumeNamePrefix:k8s-rpi-] Secrets:map[] VolumeContentSource: AccessibilityRequirements: XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0111 09:56:36.853789       1 connection.go:182] GRPC call: /csi.v1.Controller/CreateVolume
I0111 09:56:36.853869       1 connection.go:183] GRPC request: {"capacity_range":{"required_bytes":8589934592},"name":"pvc-c06c760d-b718-48a5-b92f-56683049e6d2","parameters":{"clusterID":"eb53775c-ec88-484f-b5f5-b421b55079d7","csi.storage.k8s.io/pv/name":"pvc-c06c760d-b718-48a5-b92f-56683049e6d2","csi.storage.k8s.io/pvc/name":"test-pvc-too","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","mapOptions":"nbds_max=16","mounter":"rbd-nbd","pool":"kube","volumeNamePrefix":"k8s-rpi-"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}}]}

I0111 09:56:41.441551       1 connection.go:185] GRPC response: {"volume":{"capacity_bytes":8589934592,"volume_context":{"clusterID":"eb53775c-ec88-484f-b5f5-b421b55079d7","csi.storage.k8s.io/pv/name":"pvc-c06c760d-b718-48a5-b92f-56683049e6d2","csi.storage.k8s.io/pvc/name":"test-pvc-too","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","imageName":"k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda","journalPool":"kube","mapOptions":"nbds_max=16","mounter":"rbd-nbd","pool":"kube","radosNamespace":"","volumeNamePrefix":"k8s-rpi-"},"volume_id":"0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda"}}
I0111 09:56:41.441998       1 connection.go:186] GRPC error: <nil>
I0111 09:56:41.442311       1 controller.go:652] create volume rep: {CapacityBytes:8589934592 VolumeId:0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda VolumeContext:map[clusterID:eb53775c-ec88-484f-b5f5-b421b55079d7 csi.storage.k8s.io/pv/name:pvc-c06c760d-b718-48a5-b92f-56683049e6d2 csi.storage.k8s.io/pvc/name:test-pvc-too csi.storage.k8s.io/pvc/namespace:default imageFeatures:layering imageFormat:2 imageName:k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda journalPool:kube mapOptions:nbds_max=16 mounter:rbd-nbd pool:kube radosNamespace: volumeNamePrefix:k8s-rpi-] ContentSource: AccessibleTopology:[] XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}
I0111 09:56:41.442523       1 controller.go:734] successfully created PV pvc-c06c760d-b718-48a5-b92f-56683049e6d2 for PVC test-pvc-too and csi volume name 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda
I0111 09:56:41.442582       1 controller.go:750] successfully created PV {GCEPersistentDisk:nil AWSElasticBlockStore:nil HostPath:nil Glusterfs:nil NFS:nil RBD:nil ISCSI:nil Cinder:nil CephFS:nil FC:nil Flocker:nil FlexVolume:nil AzureFile:nil VsphereVolume:nil Quobyte:nil AzureDisk:nil PhotonPersistentDisk:nil PortworxVolume:nil ScaleIO:nil Local:nil StorageOS:nil CSI:&CSIPersistentVolumeSource{Driver:rbd.csi.ceph.com,VolumeHandle:0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda,ReadOnly:false,FSType:ext4,VolumeAttributes:map[string]string{clusterID: eb53775c-ec88-484f-b5f5-b421b55079d7,csi.storage.k8s.io/pv/name: pvc-c06c760d-b718-48a5-b92f-56683049e6d2,csi.storage.k8s.io/pvc/name: test-pvc-too,csi.storage.k8s.io/pvc/namespace: default,imageFeatures: layering,imageFormat: 2,imageName: k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda,journalPool: kube,mapOptions: nbds_max=16,mounter: rbd-nbd,pool: kube,radosNamespace: ,storage.kubernetes.io/csiProvisionerIdentity: 1610358940742-8081-rbd.csi.ceph.com,volumeNamePrefix: k8s-rpi-,},ControllerPublishSecretRef:nil,NodeStageSecretRef:&SecretReference{Name:ceph-secret-user,Namespace:rbd-provisioner,},NodePublishSecretRef:nil,ControllerExpandSecretRef:&SecretReference{Name:ceph-secret-admin,Namespace:rbd-provisioner,},}}
I0111 09:56:41.443201       1 controller.go:1420] provision "default/test-pvc-too" class "rwo-storage": volume "pvc-c06c760d-b718-48a5-b92f-56683049e6d2" provisioned
I0111 09:56:41.443292       1 controller.go:1437] provision "default/test-pvc-too" class "rwo-storage": succeeded
I0111 09:56:41.443313       1 volume_store.go:154] Saving volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
I0111 09:56:41.502958       1 volume_store.go:157] Volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2 saved
I0111 09:56:41.504698       1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"test-pvc-too", UID:"c06c760d-b718-48a5-b92f-56683049e6d2", APIVersion:"v1", ResourceVersion:"2703309", FieldPath:""}): type: 'Normal' reason: 'ProvisioningSucceeded' Successfully provisioned volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
E0111 09:56:41.505723       1 controller.go:1443] couldn't create key for object pvc-c06c760d-b718-48a5-b92f-56683049e6d2: object has no meta: object does not implement the Object interfaces
I0111 09:56:41.505847       1 controller.go:1078] Claim processing succeeded, removing PVC c06c760d-b718-48a5-b92f-56683049e6d2 from claims in progress
I0111 09:56:41.505965       1 controller.go:1317] provision "default/test-pvc-too" class "rwo-storage": started
I0111 09:56:41.506001       1 controller.go:1326] provision "default/test-pvc-too" class "rwo-storage": persistentvolume "pvc-c06c760d-b718-48a5-b92f-56683049e6d2" already exists, skipping
I0111 09:56:41.506024       1 controller.go:1080] Stop provisioning, removing PVC c06c760d-b718-48a5-b92f-56683049e6d2 from claims in progress

We can then confirm that the volume was properly provisioned:

$ kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-pvc-too   Bound    pvc-c06c760d-b718-48a5-b92f-56683049e6d2   8Gi        RWO            rwo-storage    26s

Now, let's try to expand this volume:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc-too
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 16Gi
EOF

At this stage, the PersistentVolumeClaim's Capacity still shows 8Gi:

$ kubectl describe pvc test-pvc-too
Name:          test-pvc-too
Namespace:     default
StorageClass:  rwo-storage
Status:        Bound
Volume:        pvc-c06c760d-b718-48a5-b92f-56683049e6d2
Labels:        <none>
Annotations:   pv.kubernetes.io/bind-completed: yes
               pv.kubernetes.io/bound-by-controller: yes
               volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
Finalizers:    [kubernetes.io/pvc-protection]
Capacity:      8Gi
Access Modes:  RWO
VolumeMode:    Filesystem
Mounted By:    <none>
Conditions:
  Type                      Status  LastProbeTime                     LastTransitionTime                Reason  Message
  ----                      ------  -----------------                 ------------------                ------  -------
  FileSystemResizePending   True    Mon, 01 Jan 0001 00:00:00 +0000   Mon, 11 Jan 2021 10:05:05 +0000           Waiting for user to (re-)start a pod to finish file system resize of volume on node.
Events:
  Type     Reason                    Age                    From                                                                                             Message
  ----     ------                    ----                   ----                                                                                             -------
  Normal   Provisioning              9m39s                  rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b59568f8-5xs5q_b3ea2092-486f-4ae8-ab41-0793c60da5f8  External provisioner is provisioning volume for claim "default/test-pvc-too"
  Normal   ExternalProvisioning      9m36s (x2 over 9m39s)  persistentvolume-controller                                                                      waiting for a volume to be created, either by external provisioner "rbd.csi.ceph.com" or manually created by system administrator
  Normal   ProvisioningSucceeded     9m34s                  rbd.csi.ceph.com_csi-rbdplugin-provisioner-7b59568f8-5xs5q_b3ea2092-486f-4ae8-ab41-0793c60da5f8  Successfully provisioned volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
  Warning  ExternalExpanding         71s                    volume_expand                                                                                    Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
  Normal   Resizing                  71s                    external-resizer rbd.csi.ceph.com                                                                External resizer is resizing volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
  Normal   FileSystemResizeRequired  70s                    external-resizer rbd.csi.ceph.com                                                                Require file system resize of volume on node
$ kubectl get events
10m         Normal    FileSystemResizeRequired     persistentvolumeclaim/test-pvc-too   Require file system resize of volume on node
10m         Normal    Resizing                     persistentvolumeclaim/test-pvc-too   External resizer is resizing volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
10m         Warning   ExternalExpanding            persistentvolumeclaim/test-pvc-too   Ignoring the PVC: didn't find a plugin capable of expanding the volume; waiting for an external controller to process this PVC.
18m         Normal    ExternalProvisioning         persistentvolumeclaim/test-pvc-too   waiting for a volume to be created, either by external provisioner "rbd.csi.ceph.com" or manually created by system administrator
18m         Normal    Provisioning                 persistentvolumeclaim/test-pvc-too   External provisioner is provisioning volume for claim "default/test-pvc-too"
18m         Normal    ProvisioningSucceeded        persistentvolumeclaim/test-pvc-too   Successfully provisioned volume pvc-c06c760d-b718-48a5-b92f-56683049e6d2
21m         Normal    Provisioning                 persistentvolumeclaim/test-pvc-too   External provisioner is provisioning volume for claim "default/test-pvc-too"

We are now waiting for this volume to be attached to a node, so that its filesystem can be expanded.

We can create the following Pod to finalize the operation, while confirming that our volumes can indeed be attached to Kubernetes nodes:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: Pod
metadata:
  name: ceph-test
  namespace: default
spec:
  nodeSelector:
    kubernetes.io/arch: arm64
  containers:
  - name: dummy
    command: [ /bin/sh ]
    args:
    - -c
    - sleep 86400
    image: busybox
    volumeMounts:
    - name: mypvc
      mountPath: /data
  securityContext:
    runAsUser: 1001
  volumes:
  - name: mypvc
    persistentVolumeClaim:
      claimName: test-pvc-too
EOF

We can follow the volume attachment logs from the rbdplugin Pod running on the node where our test container was scheduled:

$ kubectl get pods -o wide | grep ceph-test
ceph-test   1/1     Running   0          8s    10.233.198.3   erato.friends.intra.example.com

$ kubectl get pods -o wide -n rbd-provisioner | grep erato
csi-rbdplugin-ldqrb    3/3     Running   0          24m   10.42.253.46    erato.friends.intra.example.com

$ kubectl logs -n rbd-provisioner csi-rbdplugin-ldqrb -c csi-rbdplugin
...
I0111 10:07:32.267613     537 utils.go:132] ID: 15 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0111 10:07:32.267854     537 utils.go:133] ID: 15 GRPC request: {}
I0111 10:07:32.269571     537 utils.go:138] ID: 15 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0111 10:07:32.296373     537 utils.go:132] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC call: /csi.v1.Node/NodeStageVolume
I0111 10:07:32.298122     537 utils.go:133] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"eb53775c-ec88-484f-b5f5-b421b55079d7","csi.storage.k8s.io/pv/name":"pvc-c06c760d-b718-48a5-b92f-56683049e6d2","csi.storage.k8s.io/pvc/name":"test-pvc-too","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","imageName":"k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda","journalPool":"kube","mapOptions":"nbds_max=16","mounter":"rbd-nbd","pool":"kube","radosNamespace":"","storage.kubernetes.io/csiProvisionerIdentity":"1610358940742-8081-rbd.csi.ceph.com","volumeNamePrefix":"k8s-rpi-"},"volume_id":"0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda"}
I0111 10:07:32.299358     537 rbd_util.go:808] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda setting disableInUseChecks on rbd volume to: false
I0111 10:07:32.375106     537 omap.go:84] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda got omap values: (pool="kube", namespace="", name="csi.volume.4b509ebf-53f3-11eb-a32d-e210cfc3feda"): map[csi.imageid:1004b3115f3d9b csi.imagename:k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda csi.volname:pvc-c06c760d-b718-48a5-b92f-56683049e6d2]
I0111 10:07:34.685087     537 cephcmds.go:59] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda command succeeded: rbd [device list --format=json --device-type nbd]
I0111 10:07:36.260359     537 rbd_attach.go:215] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda rbd: map mon 10.42.253.110,10.42.253.111,10.42.253.112
I0111 10:07:38.881072     537 cephcmds.go:59] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda command succeeded: rbd [--id kube -m 10.42.253.110,10.42.253.111,10.42.253.112 --keyfile=***stripped*** map kube/k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda --device-type nbd --options nbds_max=16]
I0111 10:07:38.881490     537 nodeserver.go:291] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda rbd image: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda/kube was successfully mapped at /dev/nbd0
I0111 10:07:42.568682     537 nodeserver.go:230] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda rbd: successfully mounted volume 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda to stagingTargetPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount/0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda
I0111 10:07:42.568974     537 utils.go:138] ID: 16 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC response: {}
I0111 10:07:42.584611     537 utils.go:132] ID: 17 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0111 10:07:42.585122     537 utils.go:133] ID: 17 GRPC request: {}
I0111 10:07:42.585505     537 utils.go:138] ID: 17 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0111 10:07:42.592659     537 utils.go:132] ID: 18 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0111 10:07:42.593737     537 utils.go:133] ID: 18 GRPC request: {}
I0111 10:07:42.594932     537 utils.go:138] ID: 18 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0111 10:07:42.606252     537 utils.go:132] ID: 19 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC call: /csi.v1.Node/NodeExpandVolume
I0111 10:07:42.606900     537 utils.go:133] ID: 19 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC request: {"capacity_range":{"required_bytes":17179869184},"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_id":"0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda","volume_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount"}
I0111 10:07:44.925118     537 cephcmds.go:59] ID: 19 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda command succeeded: rbd [device list --format=json --device-type nbd]
I0111 10:07:45.529426     537 utils.go:138] ID: 19 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC response: {}
I0111 10:07:45.583743     537 utils.go:132] ID: 20 GRPC call: /csi.v1.Node/NodeGetCapabilities
I0111 10:07:45.583850     537 utils.go:133] ID: 20 GRPC request: {}
I0111 10:07:45.584001     537 utils.go:138] ID: 20 GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}},{"Type":{"Rpc":{"type":2}}},{"Type":{"Rpc":{"type":3}}}]}
I0111 10:07:45.606729     537 utils.go:132] ID: 21 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC call: /csi.v1.Node/NodePublishVolume
I0111 10:07:45.607477     537 utils.go:133] ID: 21 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC request: {"staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount","target_path":"/var/lib/kubelet/pods/4e062abc-7fd4-4384-9c77-9469dde98fe4/volumes/kubernetes.io~csi/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/mount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":1}},"volume_context":{"clusterID":"eb53775c-ec88-484f-b5f5-b421b55079d7","csi.storage.k8s.io/pv/name":"pvc-c06c760d-b718-48a5-b92f-56683049e6d2","csi.storage.k8s.io/pvc/name":"test-pvc-too","csi.storage.k8s.io/pvc/namespace":"default","imageFeatures":"layering","imageFormat":"2","imageName":"k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda","journalPool":"kube","mapOptions":"nbds_max=16","mounter":"rbd-nbd","pool":"kube","radosNamespace":"","storage.kubernetes.io/csiProvisionerIdentity":"1610358940742-8081-rbd.csi.ceph.com","volumeNamePrefix":"k8s-rpi-"},"volume_id":"0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda"}
I0111 10:07:45.608877     537 nodeserver.go:518] ID: 21 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda target /var/lib/kubelet/pods/4e062abc-7fd4-4384-9c77-9469dde98fe4/volumes/kubernetes.io~csi/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/mount
isBlock false
fstype ext4
stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount/0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda
readonly false
mountflags [bind _netdev]
I0111 10:07:45.643938     537 nodeserver.go:426] ID: 21 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda rbd: successfully mounted stagingPath /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/globalmount/0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda to targetPath /var/lib/kubelet/pods/4e062abc-7fd4-4384-9c77-9469dde98fe4/volumes/kubernetes.io~csi/pvc-c06c760d-b718-48a5-b92f-56683049e6d2/mount
I0111 10:07:45.644138     537 utils.go:138] ID: 21 Req-ID: 0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda GRPC response: {}

And finally confirm that the volume was indeed resized:

$ kubectl exec -it ceph-test -- /bin/sh
/ $ df -h /data
Filesystem                Size      Used Available Use% Mounted on
/dev/nbd0                15.6G     44.0M     15.6G   0% /data
/ $ exit
$ kubectl get pvc
NAME           STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
test-pvc-too   Bound    pvc-c06c760d-b718-48a5-b92f-56683049e6d2   16Gi       RWO            rwo-storage    12m
$ kubectl delete pod ceph-test

Less critical, but still worth checking: let's verify that volume deletion works:

$ kubectl describe pv pvc-c06c760d-b718-48a5-b92f-56683049e6d2
Name:            pvc-c06c760d-b718-48a5-b92f-56683049e6d2
Labels:          <none>
Annotations:     pv.kubernetes.io/provisioned-by: rbd.csi.ceph.com
Finalizers:      [kubernetes.io/pv-protection]
StorageClass:    rwo-storage
Status:          Bound
Claim:           default/test-pvc-too
Reclaim Policy:  Delete
Access Modes:    RWO
VolumeMode:      Filesystem
Capacity:        16Gi
Node Affinity:   <none>
Message:         
Source:
    Type:              CSI (a Container Storage Interface (CSI) volume source)
    Driver:            rbd.csi.ceph.com
    FSType:            ext4
    VolumeHandle:      0001-0024-eb53775c-ec88-484f-b5f5-b421b55079d7-0000000000000004-4b509ebf-53f3-11eb-a32d-e210cfc3feda
    ReadOnly:          false
    VolumeAttributes:      clusterID=eb53775c-ec88-484f-b5f5-b421b55079d7
                           csi.storage.k8s.io/pv/name=pvc-c06c760d-b718-48a5-b92f-56683049e6d2
                           csi.storage.k8s.io/pvc/name=test-pvc-too
                           csi.storage.k8s.io/pvc/namespace=default
                           imageFeatures=layering
                           imageFormat=2
                           imageName=k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda
                           journalPool=kube
                           mapOptions=nbds_max=16
                           mounter=rbd-nbd
                           pool=kube
                           radosNamespace=
                           storage.kubernetes.io/csiProvisionerIdentity=1610358940742-8081-rbd.csi.ceph.com
                           volumeNamePrefix=k8s-rpi-
Events:                <none>
$ kubectl delete pvc test-pvc-too

Note the imageName, which corresponds to the volume's name from our Ceph cluster's point of view, before deleting the PersistentVolumeClaim.
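
To retrieve it without scrolling through the describe output, a jsonpath query works as well (same PV name as in our example):

$ kubectl get pv pvc-c06c760d-b718-48a5-b92f-56683049e6d2 \
    -o jsonpath='{.spec.csi.volumeAttributes.imageName}'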

From one of the Ceph cluster's nodes, we can confirm that this volume exists before the PersistentVolumeClaim is deleted, and that it no longer exists afterwards:

[root@mon1 ~]# rbd -p kube ls | grep k8s-rpi
k8s-rpi-4b509ebf-53f3-11eb-a32d-e210cfc3feda
k8s-rpi-606956de-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-60695852-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-82e2ef87-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-c85d564d-50ea-11eb-9407-4e8ab1a29c3f
k8s-rpi-e10db2bd-526d-11eb-a0b5-0eaaf84a28e4
...
[root@mon1 ~]# rbd -p kube ls | grep k8s-rpi
k8s-rpi-606956de-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-60695852-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-82e2ef87-51d5-11eb-a0b5-0eaaf84a28e4
k8s-rpi-c85d564d-50ea-11eb-9407-4e8ab1a29c3f
k8s-rpi-e10db2bd-526d-11eb-a0b5-0eaaf84a28e4

Once the volume deletion is effective on the Kubernetes side, we can confirm that the corresponding Ceph volume was indeed removed.

Creation, expansion, deletion, attaching volumes to and detaching them from nodes: everything is in order.

Docker Registry

We can now move on to more serious matters and deploy our first applications, starting with Prometheus, a topic we will come back to in another article, or, in the meantime, a Docker registry:

$ kubectl create ns registry
$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: Service
metadata:
  name: registry
  namespace: registry
spec:
  ports:
  - name: registry
    port: 5000
  selector:
    name: registry
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: registry-pvc
  namespace: registry
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
---
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: registry
  namespace: registry
spec:
  replicas: 1
  selector:
    matchLabels:
      name: registry
  template:
    metadata:
      labels:
        name: registry
    spec:
      containers:
      - env:
        - name: REGISTRY_HTTP_ADDR
          value: :5000
        - name: REGISTRY_STORAGE_FILESYSTEM_ROOTDIRECTORY
          value: /var/lib/registry
        image: docker.io/library/registry:2.7.1
        imagePullPolicy: IfNotPresent
        name: registry
        ports:
        - containerPort: 5000
          name: registry
          protocol: TCP
        volumeMounts:
        - mountPath: /var/lib/registry
          name: registry-pvc
      volumes:
      - persistentVolumeClaim:
          claimName: registry-pvc
        name: registry-pvc
EOF
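
Once applied, the PVC should be Bound and the registry Pod Running:

$ kubectl -n registry get pvc,pods,svc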

Recall that we had deployed Kubernetes with the following variable in the Kubespray inventory:

containerd_config:
  grpc:
    max_recv_message_size: 16777216
    max_send_message_size: 16777216
  debug:
    level: ""
  registries:
    docker.io: "https://registry-1.docker.io"
    katello: "https://katello.vms.intra.example.com:5000"
    "registry.registry.svc.cluster.local:5000": "http://registry.registry.svc.cluster.local:5000"
  max_container_log_line_size: -1
  metrics:
    address: ""
    grpc_histogram: false

Our nodes are therefore already able to reach this registry, and we are free to push our first images to it, … a point we will come back to in an upcoming article dedicated to Tekton.
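
For instance, pushing from a workstation could go through a port-forward, Docker treating localhost registries as insecure by default; a sketch, the alpine tag being purely illustrative:

$ kubectl -n registry port-forward svc/registry 5000:5000 &
$ docker pull alpine:3.13
$ docker tag alpine:3.13 localhost:5000/alpine:3.13
$ docker push localhost:5000/alpine:3.13

In-cluster, that image would then be pulled as registry.registry.svc.cluster.local:5000/alpine:3.13, matching the containerd configuration above.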

Snapshots

Finally, note that since Kubernetes 1.17, we can take advantage of a feature introduced by CSI, and deploy a snapshot controller.
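
Depending on how the cluster was deployed, the VolumeSnapshot CRDs may not exist yet. If kubectl get crd does not list them, they can be applied from the external-snapshotter repository, assuming here that the manifests still live under client/config/crd in the v3.0.2 release:

$ kubectl get crd | grep snapshot.storage.k8s.io
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v3.0.2/client/config/crd/snapshot.storage.k8s.io_volumesnapshotclasses.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v3.0.2/client/config/crd/snapshot.storage.k8s.io_volumesnapshotcontents.yaml
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes-csi/external-snapshotter/v3.0.2/client/config/crd/snapshot.storage.k8s.io_volumesnapshots.yaml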

Start by creating a ServiceAccount, to which we will delegate the necessary privileges:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: snapshot-controller
  namespace: rbd-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: snapshot-controller-runner
rules:
- apiGroups: [""]
  resources: ["persistentvolumes"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["events"]
  verbs: ["list", "watch", "create", "update", "patch"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotclasses"]
  verbs: ["get", "list", "watch"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshotcontents"]
  verbs: ["create", "get", "list", "watch", "update", "delete"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshots"]
  verbs: ["get", "list", "watch", "update"]
- apiGroups: ["snapshot.storage.k8s.io"]
  resources: ["volumesnapshots/status"]
  verbs: ["update"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: snapshot-controller-role
subjects:
- kind: ServiceAccount
  name: snapshot-controller
  namespace: rbd-provisioner
roleRef:
  kind: ClusterRole
  name: snapshot-controller-runner
  apiGroup: rbac.authorization.k8s.io
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: rbd-provisioner
  name: snapshot-controller-leaderelection
rules:
- apiGroups: ["coordination.k8s.io"]
  resources: ["leases"]
  verbs: ["get", "watch", "list", "delete", "update", "create"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: snapshot-controller-leaderelection
  namespace: rbd-provisioner
subjects:
- kind: ServiceAccount
  name: snapshot-controller
  namespace: rbd-provisioner
roleRef:
  kind: Role
  name: snapshot-controller-leaderelection
  apiGroup: rbac.authorization.k8s.io
EOF

Continue by deploying the controller:

$ cat <<EOF | kubectl apply -f-
---
kind: StatefulSet
apiVersion: apps/v1
metadata:
  name: snapshot-controller
  namespace: rbd-provisioner
spec:
  serviceName: snapshot-controller
  replicas: 1
  selector:
    matchLabels:
      app: snapshot-controller
  template:
    metadata:
      labels:
        app: snapshot-controller
    spec:
      serviceAccountName: snapshot-controller
      containers:
      - name: snapshot-controller
        image: k8s.gcr.io/sig-storage/snapshot-controller:v3.0.2
        args:
        - "--v=5"
        - "--leader-election=false"
        imagePullPolicy: Always
        securityContext:
          runAsUser: 1001
EOF
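
Before going any further, we may want to confirm that the controller is up and running:

$ kubectl -n rbd-provisioner get pods -l app=snapshot-controller
$ kubectl -n rbd-provisioner logs snapshot-controller-0 | head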

We then need to create a VolumeSnapshotClass, similar to our StorageClass:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshotClass
metadata:
  name: csi-rbdplugin-snapclass
driver: rbd.csi.ceph.com
parameters:
  clusterID: eb53775c-ec88-484f-b5f5-b421b55079d7
  snapshotNamePrefix: k8s-rpi-snap-
  csi.storage.k8s.io/snapshotter-secret-name: ceph-secret-admin
  csi.storage.k8s.io/snapshotter-secret-namespace: rbd-provisioner
deletionPolicy: Delete
EOF

From there, we can manage our Ceph snapshots through the API of our Kubernetes cluster.

Request the creation of a snapshot of an existing volume by creating a VolumeSnapshot:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: snapshot.storage.k8s.io/v1beta1
kind: VolumeSnapshot
metadata:
  name: rbd-pvc-snapshot
  namespace: downloads
spec:
  volumeSnapshotClassName: csi-rbdplugin-snapclass
  source:
    persistentVolumeClaimName: medusa-kube
EOF
$ kubectl get volumesnapshot -n downloads
NAME               READYTOUSE   SOURCEPVC     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS             SNAPSHOTCONTENT                                    CREATIONTIME   AGE
rbd-pvc-snapshot   false        medusa-kube                                         csi-rbdplugin-snapclass   snapcontent-3cb99542-97e1-45f9-9445-0bbb06e35225                  31s

We can follow the creation of the snapshot by querying the logs of our controller:

$ kubectl logs -n rbd-provisioner snapshot-controller-0 -f
I0129 13:57:28.404847       1 main.go:66] Version: v3.0.2
I0129 13:57:28.410398       1 main.go:93] Start NewCSISnapshotController with kubeconfig [] resyncPeriod [1m0s]
I0129 13:57:28.412022       1 reflector.go:207] Starting reflector *v1beta1.VolumeSnapshotContent (1m0s) from github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117
[...]
I0129 14:03:58.732509       1 snapshot_controller_base.go:158] enqueued "downloads/rbd-pvc-snapshot" for sync
I0129 14:03:58.732586       1 snapshot_controller_base.go:202] syncSnapshotByKey[downloads/rbd-pvc-snapshot]
I0129 14:03:58.732611       1 snapshot_controller_base.go:205] snapshotWorker: snapshot namespace [downloads] name [rbd-pvc-snapshot]
I0129 14:03:58.732632       1 snapshot_controller_base.go:328] checkAndUpdateSnapshotClass [rbd-pvc-snapshot]: VolumeSnapshotClassName [csi-rbdplugin-snapclass]
[...]
I0129 14:04:35.033443       1 snapshot_controller_base.go:222] Updating snapshot "downloads/rbd-pvc-snapshot"
I0129 14:04:35.033988       1 snapshot_controller_base.go:358] updateSnapshot "downloads/rbd-pvc-snapshot"
I0129 14:04:35.034056       1 util.go:258] storeObjectUpdate updating snapshot "downloads/rbd-pvc-snapshot" with version 21039253
I0129 14:04:35.034157       1 snapshot_controller.go:180] synchronizing VolumeSnapshot[downloads/rbd-pvc-snapshot]: bound to: "snapcontent-3cb99542-97e1-45f9-9445-0bbb06e35225", Completed: true
I0129 14:04:35.034182       1 snapshot_controller.go:182] syncSnapshot [downloads/rbd-pvc-snapshot]: check if we should remove finalizer on snapshot PVC source and remove it if we can
I0129 14:04:35.034228       1 snapshot_controller.go:903] checkandRemovePVCFinalizer for snapshot [rbd-pvc-snapshot]: snapshot status [&v1beta1.VolumeSnapshotStatus{BoundVolumeSnapshotContentName:(*string)(0x40004c1940), CreationTime:(*v1.Time)(0x4000350de0), ReadyToUse:(*bool)(0x4000527b9e), RestoreSize:(*resource.Quantity)(0x4000374c00), Error:(*v1beta1.VolumeSnapshotError)(nil)}]
I0129 14:04:35.034334       1 snapshot_controller.go:858] Checking isPVCBeingUsed for snapshot [downloads/rbd-pvc-snapshot]
I0129 14:04:35.034473       1 snapshot_controller.go:883] isPVCBeingUsed: no snapshot is being created from PVC downloads/medusa-kube
I0129 14:04:35.034515       1 snapshot_controller.go:911] checkandRemovePVCFinalizer[rbd-pvc-snapshot]: Remove Finalizer for PVC medusa-kube as it is not used by snapshots in creation
I0129 14:04:35.059226       1 snapshot_controller.go:851] Removed protection finalizer from persistent volume claim medusa-kube
I0129 14:04:35.059293       1 snapshot_controller.go:191] syncSnapshot[downloads/rbd-pvc-snapshot]: check if we should add invalid label on snapshot
I0129 14:04:35.059312       1 snapshot_controller.go:209] syncSnapshot[downloads/rbd-pvc-snapshot]: validate snapshot to make sure source has been correctly specified
I0129 14:04:35.059329       1 snapshot_controller.go:218] syncSnapshot[downloads/rbd-pvc-snapshot]: check if we should add finalizers on snapshot
I0129 14:04:35.059355       1 snapshot_controller.go:386] syncReadySnapshot[downloads/rbd-pvc-snapshot]: VolumeSnapshotContent "snapcontent-3cb99542-97e1-45f9-9445-0bbb06e35225" found
I0129 14:04:39.890672       1 reflector.go:515] github.com/kubernetes-csi/external-snapshotter/client/v3/informers/externalversions/factory.go:117: Watch close - *v1beta1.VolumeSnapshot total 5 items received

Once the snapshot is finalized, we should be able to confirm that the READYTOUSE column has turned to true:

$ kubectl get volumesnapshot -n downloads
NAME               READYTOUSE   SOURCEPVC     SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS             SNAPSHOTCONTENT                                    CREATIONTIME   AGE
rbd-pvc-snapshot   true         medusa-kube                           8Gi           csi-rbdplugin-snapclass   snapcontent-3cb99542-97e1-45f9-9445-0bbb06e35225   14s            46s

Let us also check, on the Ceph cluster, that our snapshot is indeed present:

$ rbd -p kube ls
k8s-rpi-02f29ddf-5e51-11eb-a107-3a439f3884b7
...
k8s-rpi-e462aa11-5e54-11eb-a107-3a439f3884b7
k8s-rpi-snap-e725bd91-623a-11eb-8935-9e78daa852af
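
A snapshot can also serve as the source for a new volume, by pointing a PersistentVolumeClaim dataSource at it. A sketch, the medusa-kube-restore name being purely illustrative:

$ cat <<EOF | kubectl apply -f-
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: medusa-kube-restore
  namespace: downloads
spec:
  storageClassName: rwo-storage
  dataSource:
    name: rbd-pvc-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
EOF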

Controversy

The situation is not perfect, and we must mention a sizeable problem here, common to all CSI implementations.

When a node stops responding, its status changes to NotReady. After a timeout, if it has not come back, Kubernetes has to reschedule the impacted Pods onto another node.

But as things stand, the CSI specification has the controller in charge of attaching a volume to a node create a lease when it claims that volume, preventing any other node from attaching it as long as the volume's accessModes is ReadWriteOnce.

This implies that when a node becomes unreachable, the attacher in question is no longer able to release its lock. Pods bound to such volumes will not be able to start until the failing node comes back.

We can reproduce this by unplugging the network of a node hosting such Pods. Once the replacement Pod is scheduled on a healthy node, we will find an event such as the following:

Events:
  Type     Reason              Age   From                     Message
  ----     ------              ----  ----                     -------
  Normal   Scheduled           13s   default-scheduler        Successfully assigned downloads/sabnzbd-kube-7fdff54694-5v6zb to melpomene.friends.intra.example.com
  Warning  FailedAttachVolume  13s   attachdetach-controller  Multi-Attach error for volume "pvc-33c81bfe-109f-4439-8979-d5e6309f42fe" Volume is already used by pod(s) sabnzbd-kube-7fdff54694-mzst9

This is a regression compared to older versions of Kubernetes.

The simplest option, at this stage, is to reboot the faulty node, hoping that it properly rejoins the cluster, which is not guaranteed, notably during incidents affecting a disk.

For now, NFS volumes, although rarely recommended, remain one of the last options able to work without CSI, and therefore do not suffer from this problem.

While the first versions of CSI offered no way at all to remount a volume during such incidents, note that recent releases have introduced a workaround. Indeed, these locks, which prevent a PersistentVolumeClaim from being accessed by other nodes, can nowadays be found by querying the cluster's VolumeAttachments.

# kubectl describe pvc -n downloads sabnzbd-kube
Name:          sabnzbd-kube
Namespace:     downloads
StorageClass:  rwo-storage
Status:        Bound
Volume:        pvc-33c81bfe-109f-4439-8979-d5e6309f42fe
...
# kubectl get volumeattachments
NAME                                                                 ATTACHER         PV                                       NODE                             ATTACHED   AGE
...
csi-f0c7854ca41926e4fd1f766f481c6492b43152b689244fbdb190e7b8bda4106f rbd.csi.ceph.com pvc-33c81bfe-109f-4439-8979-d5e6309f42fe clio.friends.intra.example.com   true       23m
...
# kubectl describe volumeattachment csi-f0c7854ca41926e4fd1f766f481c6492b43152b689244fbdb190e7b8bda4106f
Name:         csi-f0c7854ca41926e4fd1f766f481c6492b43152b689244fbdb190e7b8bda4106f
...
Spec:
  Attacher:   rbd.csi.ceph.com
  Node Name:  clio.friends.intra.example.com
  Source:
    Persistent Volume Name:  pvc-33c81bfe-109f-4439-8979-d5e6309f42fe
...
# kubectl delete volumeattachment csi-f0c7854ca41926e4fd1f766f481c6492b43152b689244fbdb190e7b8bda4106f
volumeattachment.storage.k8s.io "csi-f0c7854ca41926e4fd1f766f481c6492b43152b689244fbdb190e7b8bda4106f" deleted
# kubectl get events -n downloads -w
...
0s          Normal    SuccessfulAttachVolume   pod/sabnzbd-kube-7fdff54694-5v6zb    AttachVolume.Attach succeeded for volume "pvc-33c81bfe-109f-4439-8979-d5e6309f42fe"

After deleting the VolumeAttachment referencing the name of our PersistentVolume, we can confirm that the new Pod finally starts.

Conversely, we can see that as long as this VolumeAttachment is not deleted, our test Pod refuses to start:

root@pandore:~# kubectl get pods -o wide -n downloads -w
NAME                                  READY  STATUS             RESTARTS  AGE    IP              NODE
backups-kube-64fb9d44cd-dgdws         2/2    Running            0         13m    10.233.198.25   erato.friends.intra.example.com
medusa-kube-7ff5cbfd66-4jrj5          0/2    ContainerCreating  0         6m24s  <none>          euterpe.friends.intra.example.com
medusa-kube-7ff5cbfd66-zhr7f          2/2    Terminating        0         15m    10.233.197.116  clio.friends.intra.example.com
newznab-kube-669d69ddc8-8r682         4/4    Running            0         7d15h  10.233.193.25   epimethee.friends.intra.example.com
newznab-mariadb-kube-6c67d6b57-2kt5t  2/2    Running            0         7d16h  10.233.195.40   pyrrha.friends.intra.example.com
sabnzbd-kube-7fdff54694-5v6zb         3/4    Running            0         6m24s  10.233.200.39   melpomene.friends.intra.example.com
sabnzbd-kube-7fdff54694-mzst9         4/4    Terminating        0         15m    10.233.197.115  clio.friends.intra.example.com
transmission-kube-c68dd989-9fk2j      4/4    Running            0         7d17h  10.233.195.36   pyrrha.friends.intra.example.com

Once our applications are reachable again, we can turn our attention to the node itself. If a reboot was not enough, we may want to redeploy it, reusing the same names and IPs, or remove it from the cluster entirely, purging the Node, Pods, VolumeAttachments, … all of the unreachable resources.
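
A possible clean-up sequence, assuming the failed node were clio.friends.intra.example.com and that Pods stuck in Terminating may need a forced deletion:

$ kubectl drain clio.friends.intra.example.com --ignore-daemonsets --delete-local-data --force
$ kubectl delete node clio.friends.intra.example.com
$ kubectl delete pod -n downloads sabnzbd-kube-7fdff54694-mzst9 --grace-period=0 --force
$ kubectl get volumeattachments | grep clio.friends.intra.example.com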

Conclusion

While ARM platforms may have suffered from several missing features in the past, CSI among them, they are increasingly well suited to deploying Kubernetes clusters. All the essentials are now available there: Containerd, Prometheus, Tekton, …

Moreover, the arrival of the Raspberry Pi 4, available with 4 or 8GB of RAM, makes it a prime choice for deploying a Kubernetes cluster.

Combined with Ceph, it is a perfect recipe for deploying your own cloud at a low cost, even though CSI's shortcomings are to be regretted.