A Machine Learning Engineer's Homepage

It is not just about machine learning ...


Understanding Persistent Volumes in Kubernetes: Part 1 - Stateful Persistent Volumes

💬 “大直若屈, 大巧若拙, 大辩若讷。(Great straightness appears bent; Great skill appears clumsy; Great eloquence appears hesitant)”
— Laozi

This tutorial is the first part of a series aimed at demystifying Persistent Volumes (PVs) in Kubernetes, focusing on stateful storage concepts. In this post, we’ll explore what Persistent Volumes are, why they matter, and how they work with Persistent Volume Claims (PVCs) and Pods. We’ll provide practical YAML examples and dive into implementation, troubleshooting, and common pitfalls. Part 2 will cover dynamic provisioning of PVs.

1. Introduction to Persistent Volumes

A Persistent Volume (PV) in Kubernetes is a cluster-wide resource that represents a piece of storage in your cluster. Unlike ephemeral storage (e.g., container storage that disappears when a Pod is terminated), PVs provide a way to manage durable storage that persists beyond the lifecycle of a Pod. They are essential for stateful applications like databases, file servers, or any workload requiring data retention.

Why Use Persistent Volumes?

Benefits of Persistent Volumes

Relationship Between PV, PVC, and Pods

A Persistent Volume (PV) is a cluster resource that defines the storage details (e.g., capacity, access mode, and storage type). A Persistent Volume Claim (PVC) is a request for storage by a user or application, specifying requirements like size and access mode. The PVC binds to a matching PV. A Pod then uses the PVC to mount the PV’s storage as a volume, making it accessible to containers. This abstraction separates storage provisioning (PV) from storage consumption (PVC/Pod).

In this tutorial, we’ll walk through creating a PV, binding it with a PVC, and using it in a Pod, followed by implementation and troubleshooting steps.

2. Sample Persistent Volume (PV) YAML

Below is an example of a Persistent Volume YAML configuration using an NFS-backed storage.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: nfs
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  nfs:
    path: /mnt/data
    server: 192.168.1.100

Explanation of PV YAML

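The capacity field sets the volume size to 1Gi. The ReadWriteMany access mode allows the volume to be mounted read-write by many nodes. The Retain reclaim policy keeps the volume and its data after the claim that used it is deleted, so cleanup is a manual step. The storageClassName (manual) is a class name that a PVC must match in order to bind, and the nfs block gives the NFS server address and the exported path.

Taken together, this PV defines a 1Gi NFS-backed volume that multiple Pods can read and write to.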
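If you want only one particular claim to be able to bind this volume, Kubernetes lets a PV name its intended claim through spec.claimRef. The variant below is a sketch rather than part of the tutorial's setup: it assumes the claim is called my-pvc (as in the next section) and lives in the default namespace.

```yaml
# Variant of my-pv that is reserved for a single claim.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
  labels:
    type: nfs
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: manual
  claimRef:                  # reserve this PV for one specific claim
    name: my-pvc
    namespace: default       # assumption: the PVC is created in the default namespace
  nfs:
    path: /mnt/data
    server: 192.168.1.100
```

Without claimRef, any claim that satisfies the PV's class, access mode, capacity, and labels may bind it first.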

3. Sample Persistent Volume Claim (PVC) YAML

The PVC requests storage from a PV that matches its requirements.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: manual
  selector:
    matchLabels:
      type: nfs

Explanation of PVC YAML

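When the PVC is created, Kubernetes looks for an available PV that has the same storageClassName, supports the requested access mode, offers at least the requested capacity, and carries the labels named in the selector. It then binds the claim to that PV (in this case, my-pv).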
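To make the matching rules concrete, here is a hypothetical claim that would not bind to my-pv. The name too-large-pvc and the 5Gi request are assumptions for illustration; everything else mirrors my-pvc. Because my-pv offers only 1Gi and the manual class has no provisioner behind it, this claim should stay in the Pending state until a PV of at least 5Gi in the same class exists.

```yaml
# Hypothetical claim for illustration: it asks for more than my-pv provides.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: too-large-pvc        # assumed name, not part of the tutorial's setup
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi           # exceeds my-pv's 1Gi capacity, so my-pv is not a match
  storageClassName: manual
  selector:
    matchLabels:
      type: nfs
```

A claim stuck in Pending is usually the first sign that one of the four criteria (class, access mode, capacity, labels) does not line up with any available PV.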

4. Sample Pod YAML Using the PVC

A Pod uses the PVC to mount the PV’s storage.

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
  - name: my-container
    image: nginx
    volumeMounts:
    - mountPath: /usr/share/nginx/html
      name: my-volume
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc

Explanation of Pod YAML

This Pod mounts the storage from my-pvc (bound to my-pv) into the Nginx container’s web directory.

5. How PV, PVC, and Pod Work Together

Here’s a step-by-step explanation of how a Pod uses a PV:

PV Creation: A cluster administrator creates a PV (my-pv) that defines storage (e.g., 1Gi on an NFS server).

PVC Creation: A user or application creates a PVC (my-pvc) to request storage with specific requirements (e.g., size, access mode, storage class).

PVC Binding: Kubernetes binds the PVC (my-pvc) to a matching PV (my-pv) based on capacity, access mode, storage class, and labels.

Pod Usage: The Pod (my-pod) references the PVC (my-pvc) in its volumes section, mounting the PV’s storage into the container’s filesystem. Because the access mode is ReadWriteMany in our case, multiple Pods on different nodes can mount the NFS storage simultaneously, as the diagram below shows.

Data Access: The container reads and writes data at the mounted path (/usr/share/nginx/html), and that data persists on the PV’s underlying storage.

[Figure: pv-stateful, showing Pods on different nodes mounting the same NFS-backed PV]

This process ensures that the Pod’s data is stored durably, even if the Pod is deleted or rescheduled.
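One way to see the shared-access behavior in practice is to run several replicas against the same claim. The Deployment below is a minimal sketch: the name my-web and the app label are assumptions, while the container, mount path, and claim name are taken from the Pod example above.

```yaml
# Hypothetical Deployment: two nginx replicas sharing the same claim.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-web               # assumed name
spec:
  replicas: 2                # both replicas mount the same NFS-backed volume
  selector:
    matchLabels:
      app: my-web            # assumed label, must match the template labels
  template:
    metadata:
      labels:
        app: my-web
    spec:
      containers:
      - name: my-container
        image: nginx
        volumeMounts:
        - mountPath: /usr/share/nginx/html
          name: my-volume
      volumes:
      - name: my-volume
        persistentVolumeClaim:
          claimName: my-pvc  # the claim created earlier in this tutorial
```

This works across nodes only because the PV and PVC both use ReadWriteMany; with ReadWriteOnce, replicas scheduled onto a different node would not be able to mount the volume.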

6. Concept Clarifications, Misconceptions, and Pitfalls

Common Misconceptions

Pitfalls

Best Practices

7. Implementation, Inspection, and Troubleshooting

Implementation

Set Up NFS Server: Ensure your NFS server is running and accessible (e.g., 192.168.1.100:/mnt/data).

Apply PV YAML:

   kubectl apply -f pv.yaml

Apply PVC YAML:

   kubectl apply -f pvc.yaml

Apply Pod YAML:

   kubectl apply -f pod.yaml

Inspection
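One possible way to inspect the volume from inside the cluster is a throwaway Pod that mounts the claim and prints what it sees. This is a sketch under assumptions: the Pod name pv-inspector, the busybox image, and the /data mount path are illustrative choices and not part of the setup above.

```yaml
# Hypothetical one-shot Pod that reports on the mounted volume and exits.
apiVersion: v1
kind: Pod
metadata:
  name: pv-inspector         # assumed name
spec:
  restartPolicy: Never       # run once and stop
  containers:
  - name: inspector
    image: busybox
    # Show the filesystem backing /data, then list the files stored on it.
    command: ["sh", "-c", "df -h /data && ls -la /data"]
    volumeMounts:
    - mountPath: /data       # assumed mount path inside this Pod
      name: my-volume
      readOnly: true         # inspection only, no writes
  volumes:
  - name: my-volume
    persistentVolumeClaim:
      claimName: my-pvc
```

After applying it, kubectl logs pv-inspector should show the mount details and file listing, and kubectl delete pod pv-inspector removes the Pod again.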

Troubleshooting

8. Summary

In this tutorial, we explored the fundamentals of Persistent Volumes (PVs) in Kubernetes, focusing on stateful storage. We learned that PVs provide durable storage for stateful applications, abstracted through PVCs for easy consumption by Pods. We walked through creating a PV, PVC, and Pod with YAML examples, explaining how they connect to enable persistent data storage. We also covered common pitfalls and troubleshooting steps to ensure smooth implementation. This foundational knowledge prepares you for Part 2, where we’ll dive into dynamic provisioning of PVs.

9. Additional Resources

To deepen your understanding, see the official Kubernetes documentation on Persistent Volumes, Persistent Volume Claims, and Storage Classes.