Use GlusterFS storage in JupyterHub for K8s

Posted by docpepper on Mon, 06 Jan 2020 11:12:11 +0100

There are two ways to use GlusterFS (https://www.gluster.org/) in Kubernetes: through an endpoint that points at an external GlusterFS cluster (external storage), or through heketi (a GlusterFS service built into K8s). This article focuses on using an endpoint to set up GlusterFS storage for JupyterHub for K8s. For simplicity, JupyterHub is installed with the default helm chart. After installing according to "Quick Setup JupyterHub for K8s", a hub-db-dir pvc appears in JupyterHub's installation namespace; I will back it with a GlusterFS volume.

1. Create endpoint for volume of GlusterFS

Save the following to file 0a-glusterfs-gvzr00-endpoint-jupyter.yaml:

apiVersion: v1
kind: Endpoints
metadata:
  name: glusterfs-gvzr00
  namespace: jupyter
subsets:
- addresses:
  - ip: 10.1.1.193
  - ip: 10.1.1.205
  - ip: 10.1.1.112
  ports:
  - port: 10000
    protocol: TCP
  • The addresses are the access addresses of the peer nodes of the corresponding GlusterFS replicated volume (three-node replication provides redundant storage here).

Create a service and save the following to file 0b-glusterfs-gvzr00-service-jupyter.yaml:

apiVersion: v1
kind: Service
metadata:
  name: glusterfs-gvzr00
  namespace: jupyter
spec:
  ports:
  - port: 10000
    protocol: TCP
    targetPort: 10000
  sessionAffinity: None
  type: ClusterIP
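
Once both objects have been applied, it can be worth confirming that the endpoint really lists the three peer addresses. The following is a minimal sketch, assuming kubectl is already configured for the cluster; the helper name endpoint_ips is hypothetical and not part of the original setup.

```shell
# Hypothetical helper: print the IPs registered in an Endpoints object.
# Usage: endpoint_ips <endpoints-name> <namespace>
endpoint_ips() {
  kubectl get endpoints "$1" -n "$2" \
    -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Example (requires cluster access):
#   endpoint_ips glusterfs-gvzr00 jupyter
```

The output should be the space-separated list of peer addresses from the endpoint file. The service carries the same name as the endpoint, which is what ties the two objects together.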

2. Create pv and pvc for hub-db-dir of jupyterhub system

Create the pv and pvc that the JupyterHub main service program (the hub) uses to store its system data.

2.1 Create pv

Save the following to file 1a-glusterfs-gvzr00-pv-jupyter-hub.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: hub-db-dir
  namespace: jupyter
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: "glusterfs-gvzr00"
    path: "gvzr00/jupyterhub/hub-db-dir"
    readOnly: false
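
The path field points at a subdirectory inside the gvzr00 volume. As far as I know, Kubernetes does not create that directory itself, so it presumably has to exist before a pod can mount the pv. The following is a minimal sketch, assuming a machine with the GlusterFS client installed and root privileges; the helper name gluster_mkdir and the mount point /mnt/gvzr00 are hypothetical.

```shell
# Hypothetical helper: create a subdirectory inside a GlusterFS volume
# by mounting the volume temporarily.
# Usage: gluster_mkdir <server> <volume> <subdir> <mountpoint>
gluster_mkdir() {
  mkdir -p "$4" &&                      # local mount point
  mount -t glusterfs "$1:/$2" "$4" &&   # mount the whole volume
  mkdir -p "$4/$3" &&                   # create the directory the pv refers to
  umount "$4"
}

# Example (run as root on a machine that can reach the GlusterFS peers):
#   gluster_mkdir 10.1.1.193 gvzr00 jupyterhub/hub-db-dir /mnt/gvzr00
```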

2.2 Create pvc

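First delete the hub-db-dir pvc that the helm installation created, so that it can be recreated against the GlusterFS pv (this article uses the jupyter namespace throughout).

kubectl delete pvc/hub-db-dir -n jupyter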

Save the following to file 1b-glusterfs-gvzr00-pvc-jupyter-hub.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: hub-db-dir
  namespace: jupyter
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 8Gi

3. Create pv and pvc of supermap of jupyterhub system

Each user gets their own pv and pvc to store the data of their notebook server; here they are created for the user supermap.

3.1 Create pv

Save the following to file 2a-glusterfs-gvzr00-pv-jupyter-supermap.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: claim-supermap
  namespace: jupyter
spec:
  capacity:
    storage: 16Gi
  accessModes:
    - ReadWriteMany
  glusterfs:
    endpoints: "glusterfs-gvzr00"
    path: "gvzr00/jupyterhub/claim-supermap"
    readOnly: false

3.2 Create pvc

Save the following to file 2b-glusterfs-gvzr00-pvc-jupyter-supermap.yaml:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: claim-supermap
  namespace: jupyter
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 16Gi
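
Neither claim names a specific volume, so Kubernetes matches each one to a pv by capacity and access mode. On clusters with a default StorageClass, a claim without storageClassName may be provisioned dynamically instead of binding to these pvs. It can therefore help to check which pv a claim actually bound to. The following is a minimal sketch, assuming kubectl is configured for the cluster; the helper name pvc_bound_to is hypothetical.

```shell
# Hypothetical helper: succeed only if a pvc is bound to the expected pv.
# Usage: pvc_bound_to <pvc> <namespace> <expected-pv>
pvc_bound_to() {
  actual=$(kubectl get pvc "$1" -n "$2" -o jsonpath='{.spec.volumeName}')
  [ "$actual" = "$3" ]
}

# Example (requires cluster access):
#   pvc_bound_to claim-supermap jupyter claim-supermap && echo "ok"
```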

4. Run Settings

Modify the above files according to your cluster address and storage capacity.

4.1 Create pv and pvc

Save the following to the file apply.sh:

echo "Create endpoint and svc, glusterfs-gvzr00 ..."
kubectl apply -f 0a-glusterfs-gvzr00-endpoint-jupyter.yaml
kubectl apply -f 0b-glusterfs-gvzr00-service-jupyter.yaml

echo "Create pv and pvc, hub-db-dir ..."
kubectl apply -f 1a-glusterfs-gvzr00-pv-jupyter-hub.yaml
kubectl apply -f 1b-glusterfs-gvzr00-pvc-jupyter-hub.yaml

echo "Create pv and pvc, claim-supermap ..."
kubectl apply -f 2a-glusterfs-gvzr00-pv-jupyter-supermap.yaml
kubectl apply -f 2b-glusterfs-gvzr00-pvc-jupyter-supermap.yaml

echo "Finished."
echo ""

Then run apply.sh.
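
Binding is not necessarily immediate, so a script that goes on to restart JupyterHub may want to wait until the claims are bound. The following is a minimal sketch, assuming kubectl is configured for the cluster; the helper name wait_bound, the default of 30 attempts, and the 2-second interval are arbitrary choices of mine.

```shell
# Hypothetical helper: poll a pvc until it reports Bound or attempts run out.
# Usage: wait_bound <pvc> <namespace> [attempts]
wait_bound() {
  attempts=${3:-30}
  while [ "$attempts" -gt 0 ]; do
    phase=$(kubectl get pvc "$1" -n "$2" -o jsonpath='{.status.phase}')
    [ "$phase" = "Bound" ] && return 0
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1   # still not bound after all attempts
}

# Example (requires cluster access):
#   wait_bound hub-db-dir jupyter && wait_bound claim-supermap jupyter
```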

4.2 Delete pv and pvc

Save the following to the file delete.sh:

# Delete the pvc and pv first, then the endpoint and svc they rely on.
echo "Delete pv and pvc, hub-db-dir ..."
kubectl delete pvc/hub-db-dir -n jupyter
kubectl delete pv/hub-db-dir -n jupyter

echo "Delete pv and pvc, claim-supermap ..."
kubectl delete pvc/claim-supermap -n jupyter
kubectl delete pv/claim-supermap -n jupyter

echo "Delete endpoint and svc, glusterfs-gvzr00 ..."
kubectl delete svc/glusterfs-gvzr00 -n jupyter
kubectl delete ep/glusterfs-gvzr00 -n jupyter

echo "Finished."
echo ""

Run delete.sh when everything created above needs to be removed.

4.3 View pv and pvc

Use the Dashboard or the following commands (pv is cluster-scoped, so it needs no namespace):

kubectl get pv

kubectl get pvc -n jupyter
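
For a more compact view from the pv side, the claim each volume is bound to can be printed next to its status. The following is a minimal sketch, assuming kubectl is configured for the cluster; the helper name list_pv_claims and the column titles are hypothetical.

```shell
# Hypothetical helper: list every pv with the claim it is bound to.
list_pv_claims() {
  kubectl get pv \
    -o custom-columns='NAME:.metadata.name,CLAIM_NS:.spec.claimRef.namespace,CLAIM:.spec.claimRef.name,STATUS:.status.phase'
}

# Example (requires cluster access):
#   list_pv_claims
```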
