Continuing my Kubernetes series, this post covers storage – one of the trickier concepts to grasp when starting with Kubernetes. Understanding how containers handle data, the difference between ephemeral and persistent storage, and how Kubernetes manages volumes is crucial for running stateful applications.

Before diving into Kubernetes storage, let’s look at how Docker handles storage – it makes understanding Kubernetes much easier.

Docker Storage

Docker stores data on the local filesystem at /var/lib/docker, organizing it into directories for containers, images, volumes, and other data.

When you build a container from a Dockerfile, each line creates a layer. The first time you build, all layers are created fresh. On subsequent builds, Docker uses cached layers for unchanged instructions, making builds much faster.
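As a minimal sketch (file names are illustrative), each instruction in a Dockerfile like this produces one layer:

```dockerfile
# Each instruction below becomes one cached layer
FROM nginx:1.27
# This layer is rebuilt only when index.html changes –
# and every layer after it is rebuilt too
COPY index.html /usr/share/nginx/html/
# Unchanged instructions above this line are served from the build cache
RUN echo "built" > /build-info.txt
```

This is why ordering matters: put instructions that change rarely near the top so later builds can reuse their cached layers.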

By default, container storage is ephemeral – when the container is deleted, its data disappears. For persistent data, you need volumes.

Docker Volumes:

Create a volume with docker volume create volume-name. You can then mount this volume to your container, and the data persists even after the container is removed.

Volumes are stored at /var/lib/docker/volumes.

Two mounting methods:

  • Volume mount – Mounts a volume from the Docker volumes directory
  • Bind mount – Mounts any directory from the Docker host into the container
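As a quick sketch (container names and paths are illustrative), the two methods look like this on the command line:

```shell
# Volume mount: Docker manages the storage under /var/lib/docker/volumes
docker volume create app-data
docker run -d --name web -v app-data:/usr/share/nginx/html nginx

# Bind mount: any directory on the host is mapped into the container
docker run -d --name web2 -v /opt/site:/usr/share/nginx/html nginx
```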

Storage Drivers vs Volume Drivers:

Storage drivers manage storage for images and containers – handling the layered filesystem.

Volume drivers handle volumes through plugins. These plugins can integrate with third-party storage platforms like NFS, cloud storage, or specialized storage solutions.

Container Storage Interface (CSI)

CSI is a standard that allows Kubernetes to work with various storage providers. It makes it possible to incorporate third-party storage platforms without modifying Kubernetes core code. Storage vendors implement CSI drivers, and Kubernetes uses those drivers to provision and manage storage.
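As an illustrative sketch, a PV backed by a CSI driver references the driver by name and points at a volume the vendor's driver understands (the driver shown is the AWS EBS CSI driver; the volume ID is hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-csi-example
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  csi:
    # Name of the installed CSI driver (vendor-specific)
    driver: ebs.csi.aws.com
    # ID of the pre-provisioned disk, as the vendor's driver expects it (hypothetical)
    volumeHandle: vol-0abc12345678
```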

Volumes in Kubernetes

Just like Docker, containers in Kubernetes have ephemeral storage by default. To make data persistent, you attach volumes.

Volumes persist even when containers are deleted. However, basic volumes are local to the node they’re on. If your pod moves to a different node, it won’t have access to that data. For sharing volumes across nodes, you need network storage like NFS or cloud solutions like AWS EBS or Azure Disks.

Example: Mounting a directory from node to container

spec:
  containers:
  - name: myapp-container
    image: nginx
    volumeMounts:
    - name: app-storage
      mountPath: /data/app
  volumes:
  - name: app-storage
    hostPath:
      path: /data/foo
      type: Directory

This mounts the /data/foo directory from the node into /data/app inside the container.

Persistent Volumes (PVs)

Configuring storage at the pod level for every application gets cumbersome. Persistent Volumes let you manage storage centrally.

A PersistentVolume (PV) is a cluster resource created via YAML, just like other Kubernetes resources. It defines a block of storage with specific characteristics:

  • Storage capacity (e.g., 10Gi)
  • Access modes (ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
  • Volume type (hostPath, NFS, AWS EBS, etc.)
  • Reclaim policy (what happens when it’s no longer needed)

Example PV:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-1
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data

This defines a 10Gi volume using local hostPath storage at /mnt/data.

Persistent Volume Claims (PVCs)

A PersistentVolumeClaim is how pods request storage. It’s also a Kubernetes object created via YAML.

Here’s the key concept: PVs are created on the cluster by admins. PVCs are created by users requesting storage. Kubernetes then matches claims to suitable PVs – you don’t manually wire a claim to a specific volume.

Example PVC:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mealie-data
  namespace: mealie
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi

This requests 500Mi of storage with ReadWriteOnce access.

How Matching Works:

The Kubernetes control plane looks for a PV that matches the PVC based on:

  • Storage capacity – PV must have >= requested storage
  • Access modes – must match (ReadWriteOnce, ReadOnlyMany, ReadWriteMany)
  • Storage class – must match if specified
  • Selectors/Labels – if specified in the PVC

The first suitable match wins – not necessarily the “best” match. Once bound, it’s a one-to-one exclusive relationship. The PV status changes from Available to Bound, and the PVC status changes from Pending to Bound.
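You can watch this happen with kubectl (using the PV and PVC names from the examples above):

```shell
# STATUS column changes from Available to Bound once a claim matches
kubectl get pv pv-1

# STATUS column changes from Pending to Bound
kubectl get pvc mealie-data -n mealie
```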

Using the PVC in a Pod:

Once your PVC is created and bound, you attach it to your container:

spec:
  containers:
  - image: ghcr.io/mealie-recipes/mealie:v1.2.0
    name: mealie
    ports:
    - containerPort: 9000
    volumeMounts:
    - mountPath: /app/data
      name: mealie-data
  volumes:
  - name: mealie-data
    persistentVolumeClaim:
      claimName: mealie-data

Even if the pod is deleted, the claim remains. PVCs are independent of pods – that’s the whole point. Your data persists beyond the lifecycle of individual pods.

Reclaim Policies

What happens to a PV when its PVC is deleted? That’s determined by the reclaim policy.

Retain (default for manually created PVs):

  • PV remains after PVC deletion
  • Data is preserved
  • PV status becomes Released (not Available)
  • An admin must manually clean up the data and make it available again

Delete:

  • PV and underlying storage are deleted when PVC is deleted
  • Common with dynamic provisioning
  • Be careful with this – deleting a PVC deletes your data
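You can change the policy on an existing PV with a patch – for example, switching the pv-1 volume from the examples above to Delete:

```shell
# Updates spec.persistentVolumeReclaimPolicy in place on a live PV
kubectl patch pv pv-1 -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}'
```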

Storage Classes

So far, we’ve been manually creating PVs. This means someone has to provision the underlying storage (create the disk, set up NFS shares, etc.) before creating the PV resource in Kubernetes.

Storage Classes automate this process – they enable dynamic provisioning.

The workflow with Storage Classes:

  1. Create a StorageClass that defines how storage should be provisioned (which provisioner to use, parameters like disk type, replication settings, etc.)
  2. Reference that StorageClass in your PVC
  3. Mount the PVC in your pod

When the PVC is created, Kubernetes automatically provisions the underlying storage and creates the PV. No manual disk creation needed.
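The workflow above can be sketched like this. The provisioner and parameters depend entirely on your environment – this example assumes the AWS EBS CSI driver, and the names are illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
# CSI driver that will create the underlying disks
provisioner: ebs.csi.aws.com
parameters:
  # Vendor-specific parameter (here, the EBS volume type)
  type: gp3
reclaimPolicy: Delete
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  accessModes:
    - ReadWriteOnce
  # Referencing the StorageClass triggers dynamic provisioning
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 20Gi
```

When this PVC is applied, Kubernetes calls the provisioner, which creates the disk and a matching PV, and binds the two automatically.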

This is especially useful in cloud environments where you can dynamically provision EBS volumes, Azure Disks, or Google Persistent Disks on demand.

Wrapping Up

Storage in Kubernetes starts with understanding Docker’s layered filesystem and volume concepts. In Kubernetes, we build on this with Persistent Volumes for centralized storage management and Persistent Volume Claims for requesting storage.

The key takeaways:

  • Basic volumes are ephemeral and node-local
  • PVs provide persistent, centrally-managed storage
  • PVCs are how pods request storage from available PVs
  • Kubernetes automatically matches PVCs to suitable PVs
  • Reclaim policies determine what happens when PVCs are deleted
  • Storage Classes enable dynamic provisioning, removing the need for manual storage setup

Understanding these concepts is essential for running databases, file storage, and any stateful application in Kubernetes.

That wraps up storage. I’ll see you in the next post, where we’ll cover the final topic: networking in Kubernetes.
