Continuing my Kubernetes series, this post covers storage – one of the trickier concepts to grasp when starting with Kubernetes. Understanding how containers handle data, the difference between ephemeral and persistent storage, and how Kubernetes manages volumes is crucial for running stateful applications.
Before diving into Kubernetes storage, let’s look at how Docker handles storage – it makes understanding Kubernetes much easier.
Docker Storage
Docker stores data on the local filesystem at /var/lib/docker, organizing it into directories for containers, images, volumes, and other data.
When you build an image from a Dockerfile, each instruction creates a layer. The first time you build, all layers are created fresh. On subsequent builds, Docker reuses cached layers for unchanged instructions, making builds much faster.
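As an illustration, consider a hypothetical Dockerfile for a Node.js app (file names and base image are just examples):

```dockerfile
# Each instruction below produces a layer; unchanged layers are reused from cache.
FROM node:20-alpine        # base image layers
WORKDIR /app               # metadata layer
COPY package.json .        # invalidated only when package.json changes
RUN npm install            # reused from cache as long as the COPY above is unchanged
COPY . .                   # invalidated whenever any source file changes
CMD ["node", "server.js"]  # metadata layer
```

Ordering matters: because dependencies are installed before the application source is copied, editing source code doesn't invalidate the npm install layer.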
By default, container storage is ephemeral – when the container is deleted, its data disappears. For persistent data, you need volumes.
Docker Volumes:
Create a volume with docker volume create volume-name. You can then mount this volume to your container, and the data persists even after the container is removed.
Volumes are stored at /var/lib/docker/volumes.
Two mounting methods:
- Volume mount – Mounts a volume from the Docker volumes directory
- Bind mount – Mounts any directory from the Docker host into the container
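The two approaches can be sketched with these commands (volume and directory names are illustrative):

```shell
# Volume mount: create a named volume and mount it into the container
docker volume create data_volume
docker run -v data_volume:/var/lib/mysql mysql

# Bind mount: mount an arbitrary host directory instead
docker run -v /data/mysql:/var/lib/mysql mysql

# The newer --mount syntax makes the same bind mount explicit
docker run --mount type=bind,source=/data/mysql,target=/var/lib/mysql mysql
```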
Storage Drivers vs Volume Drivers:
Storage drivers manage storage for images and containers – handling the layered filesystem.
Volume drivers handle volumes through plugins. These plugins can integrate with third-party storage platforms like NFS, cloud storage, or specialized storage solutions.
Container Storage Interface (CSI)
CSI is a standard that allows Kubernetes to work with various storage providers. It makes it possible to incorporate third-party storage platforms without modifying Kubernetes core code. Storage vendors implement CSI drivers, and Kubernetes uses those drivers to provision and manage storage.
Volumes in Kubernetes
Just like Docker, containers in Kubernetes have ephemeral storage by default. To make data persistent, you attach volumes.
Volumes persist even when containers are deleted. However, basic volumes are local to the node they’re on. If your pod moves to a different node, it won’t have access to that data. For sharing volumes across nodes, you need network storage like NFS or cloud solutions like AWS EBS or Azure Disks.
Example: Mounting a directory from node to container
```yaml
spec:
  containers:
    - name: myapp-container
      image: nginx
      volumeMounts:
        - name: app-storage
          mountPath: /data/app
  volumes:
    - name: app-storage
      hostPath:
        path: /data/foo
        type: Directory
```
This mounts the /data/foo directory from the node into /data/app inside the container.
Persistent Volumes (PVs)
Configuring storage at the pod level for every application gets cumbersome. Persistent Volumes let you manage storage centrally.
A PersistentVolume (PV) is a cluster resource created via YAML, just like other Kubernetes resources. It defines a block of storage with specific characteristics:
- Storage capacity (e.g., 10Gi)
- Access modes (ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
- Volume type (hostPath, NFS, AWS EBS, etc.)
- Reclaim policy (what happens when it’s no longer needed)
Example PV:
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data
```
This defines a 10Gi volume backed by local storage at /mnt/data.
Persistent Volume Claims (PVCs)
A PersistentVolumeClaim is how pods request storage. It’s also a Kubernetes object created via YAML.
Here’s the key concept: PVs are created on the cluster by admins. PVCs are created by users requesting storage. Kubernetes then matches claims to suitable PVs – it’s not a direct one-to-one relationship you configure manually.
Example PVC:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mealie-data
  namespace: mealie
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi
```
This requests 500Mi of storage with ReadWriteOnce access.
How Matching Works:
The Kubernetes control plane looks for a PV that matches the PVC based on:
- Storage capacity – the PV's capacity must be at least the requested size
- Access modes – the PV must support the requested access modes (ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
- Storage class – must match if specified
- Selectors/Labels – if specified in the PVC
The first suitable match wins – not necessarily the “best” match. Once bound, it’s a one-to-one exclusive relationship. The PV status changes from Available to Bound, and the PVC status changes from Pending to Bound.
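You can watch the binding happen with kubectl, using the resource names from the examples above:

```shell
kubectl get pv pv-1                          # STATUS moves from Available to Bound
kubectl get pvc mealie-data -n mealie        # STATUS moves from Pending to Bound
kubectl describe pvc mealie-data -n mealie   # shows which PV the claim bound to
```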
Using the PVC in a Pod:
Once your PVC is created and bound, you attach it to your container:
```yaml
spec:
  containers:
    - image: ghcr.io/mealie-recipes/mealie:v1.2.0
      name: mealie
      ports:
        - containerPort: 9000
      volumeMounts:
        - mountPath: /app/data
          name: mealie-data
  volumes:
    - name: mealie-data
      persistentVolumeClaim:
        claimName: mealie-data
```
Even if the pod is deleted, the claim remains. PVCs are independent of pods – that’s the whole point. Your data persists beyond the lifecycle of individual pods.
Reclaim Policies
What happens to a PV when its PVC is deleted? That’s determined by the reclaim policy.
Retain (default for manually created PVs):
- PV remains after PVC deletion
- Data is preserved
- PV status becomes Released (not Available)
- An admin must manually clean up the data and make it available again
Delete:
- PV and underlying storage are deleted when PVC is deleted
- Common with dynamic provisioning
- Be careful with this – deleting a PVC deletes your data
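The policy is set via the persistentVolumeReclaimPolicy field in the PV spec, and you can change it on an existing PV with kubectl patch (using pv-1 from the earlier example):

```shell
# Switch an existing PV to Retain so deleting its PVC won't delete the data
kubectl patch pv pv-1 -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
```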
Storage Classes
So far, we’ve been manually creating PVs. This means someone has to provision the underlying storage (create the disk, set up NFS shares, etc.) before creating the PV resource in Kubernetes.
Storage Classes automate this process – they enable dynamic provisioning.
The workflow with Storage Classes:
- Create a StorageClass that defines how storage should be provisioned (which provisioner to use, parameters like disk type, replication settings, etc.)
- Reference that StorageClass in your PVC
- Mount the PVC in your pod
When the PVC is created, Kubernetes automatically provisions the underlying storage and creates the PV. No manual disk creation needed.
This is especially useful in cloud environments where you can dynamically provision EBS volumes, Azure Disks, or Google Persistent Disks on demand.
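As a sketch, here's what a StorageClass and a PVC that uses it might look like. The provisioner name assumes the AWS EBS CSI driver, and the parameters are specific to that provisioner; adjust both for your environment:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com     # CSI driver for AWS EBS (assumed environment)
parameters:
  type: gp3                      # disk type, provisioner-specific parameter
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd     # triggers dynamic provisioning via the class above
  resources:
    requests:
      storage: 20Gi
```

When this PVC is created, Kubernetes asks the provisioner to create the disk and the matching PV automatically.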
Wrapping Up
Storage in Kubernetes starts with understanding Docker’s layered filesystem and volume concepts. In Kubernetes, we build on this with Persistent Volumes for centralized storage management and Persistent Volume Claims for requesting storage.
The key takeaways:
- Basic volumes are ephemeral and node-local
- PVs provide persistent, centrally-managed storage
- PVCs are how pods request storage from available PVs
- Kubernetes automatically matches PVCs to suitable PVs
- Reclaim policies determine what happens when PVCs are deleted
- Storage Classes enable dynamic provisioning, removing the need for manual storage setup
Understanding these concepts is essential for running databases, file storage, and any stateful application in Kubernetes.
That wraps up storage. See you in the next post, where we'll cover the final topic: Networking in Kubernetes.