Understanding Cluster Architecture
I’ve been working through Kubernetes fundamentals in preparation for the CKA and wanted to document what I cover as I go. This is the first in a series where I’ll break down Kubernetes piece by piece, starting with the core concepts.
The Cluster: Nodes and Their Roles
A Kubernetes cluster is made up of nodes – essentially your servers. You have two types:
Worker nodes host your applications as containers. These are where your actual workloads run.
Master nodes (also called control plane nodes – the term the Kubernetes docs now prefer, though you’ll still see “master” everywhere in CKA material) manage, plan, schedule, and monitor the worker nodes. Think of the control plane as the brain of the operation.
The master node runs several critical components: etcd, kube-scheduler, controller-manager, and kube-apiserver (which orchestrates all operations on the cluster). We’ll dive into each of these shortly.
Every node also needs a container runtime, like containerd or Docker, to actually run applications as containers. Speaking of which – Kubernetes and Docker used to be tightly coupled, but Kubernetes introduced the Container Runtime Interface (CRI) so it could work with any compliant runtime. This is why you’ll see containerd used more often than Docker now (Kubernetes removed its built-in Docker support, dockershim, in v1.24).
Master Node Components
Let’s break down what’s running on that master node:
etcd – A reliable key-value store that holds all the cluster data. It stores information about nodes, pods, configs, secrets – everything. When you install Kubernetes manually (“the hard way”), you install etcd separately. If you use kubeadm, it comes bundled as a static pod in the kube-system namespace, named etcd-&lt;node-name&gt; (e.g. etcd-master).
kube-apiserver – This is what you interact with when you run kubectl commands. Every interaction with the cluster goes through the API server. When you run a kubectl command, it hits the kube-apiserver, which then communicates with other components like etcd, the scheduler, or kubelet. The API server handles authentication, validates requests, retrieves data, updates etcd, and coordinates with the scheduler and kubelet.
kube-controller-manager – Runs various controllers (like node-controller and replication-controller) that watch the cluster state and take action when needed. If a node goes down or a pod fails, controllers notice and work to remediate the situation.
kube-scheduler – Decides which pods go on which nodes. It filters nodes based on the pod’s requirements, ranks them, and assigns the pod to the best-fit node.
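On a kubeadm-built cluster you can actually see all of these control plane components running as pods. A quick way to check (this assumes you have kubectl access to a kubeadm cluster – pod name suffixes vary with your node names):

```shell
# List the control plane components, which kubeadm runs as static pods
kubectl get pods -n kube-system
# You should see entries like etcd-<node>, kube-apiserver-<node>,
# kube-controller-manager-<node>, and kube-scheduler-<node>
```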
Worker Node Components
On the worker nodes, you have:
kubelet – The agent running on each node. It communicates with the kube-apiserver and reports back on the node’s status and the pods running on it.
kube-proxy – Maintains the network rules on each node so that traffic sent to a Service actually reaches the right pods, even when they live on different nodes. (The pod network itself is set up by a CNI plugin; kube-proxy handles the Service-level routing on top of it.)
Pods: The Smallest Unit
Pods are the smallest deployable unit in Kubernetes – a single instance of an application. Usually, there’s a one-to-one relationship between a pod and a container, but you can have multiple containers in a pod (like an app container and a helper/sidecar container).
You can deploy pods directly via kubectl commands or using YAML files. YAML manifests include fields like apiVersion, kind (the type of resource), metadata (names and labels), and spec (the detailed configuration).
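Here’s what a minimal pod manifest looks like with those four top-level fields (the name and image are just placeholders):

```yaml
apiVersion: v1          # which API version the resource type belongs to
kind: Pod               # the type of resource
metadata:
  name: nginx-pod       # names and labels live under metadata
  labels:
    app: nginx
spec:                   # the detailed configuration
  containers:
    - name: nginx
      image: nginx
```

Save it as pod.yaml and create it with kubectl apply -f pod.yaml (or kubectl create -f pod.yaml).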
Replication and High Availability
Running a single pod isn’t great for availability. That’s where ReplicationControllers and their modern replacement, ReplicaSets, come in. They ensure a specified number of pod replicas are always running. If a pod crashes, the controller spins up a new one automatically.
Labels and selectors are key here – they help controllers identify which pods they’re responsible for managing. In your ReplicaSet definition, you specify both the replica count and the pod template.
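A sketch of a ReplicaSet definition tying those pieces together – note how the selector’s matchLabels must match the labels in the pod template (names and image are illustrative):

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: frontend-rs
spec:
  replicas: 3                 # the replica count
  selector:
    matchLabels:
      app: frontend           # must match the pod template's labels below
  template:                   # the pod template the controller stamps out
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: nginx
```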
Deployments: Managing Everything
Deployments sit one level above ReplicaSets. Instead of managing individual pods or even ReplicaSets manually, you create a Deployment that defines your desired state. Need to update an image? Scale up? Roll back? The Deployment handles it all.
This is what you’ll use most often – deployments make it easy to create, edit, manage, and patch multiple containers all at once.
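A Deployment manifest looks almost identical to a ReplicaSet – the Deployment creates and manages a ReplicaSet under the hood, and adds rollout behavior on top. A minimal sketch (names and image tag are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: frontend
          image: nginx:1.25   # change this tag and re-apply to trigger a rolling update
```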
Pro tip for the exam: When you need to generate a YAML file for a resource, add --dry-run=client -o yaml to your kubectl run or create command. This generates the YAML without actually creating the resource – super useful for quick templates.
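For example, either of these writes a ready-to-edit manifest to a file without touching the cluster:

```shell
# Generate a pod manifest without creating the pod
kubectl run nginx --image=nginx --dry-run=client -o yaml > pod.yaml

# Same trick for a deployment
kubectl create deployment web --image=nginx --dry-run=client -o yaml > deploy.yaml
```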
Services: Consistent Access to Pods
Pods are ephemeral – they come and go, IPs change constantly. So how do you consistently reach them? Services.
A Service provides a fixed address to access a set of pods. Think of it as a grouping mechanism. For example, you might have a frontend service that routes traffic to several frontend pods. Instead of tracking individual pod IPs, your application connects to the service, which handles routing to available pods.
Creating a service is straightforward:
kubectl expose deployment frontend --port 8080
This creates a service with its own fixed IP. Now you have a stable endpoint regardless of pod changes.
Types of Services:
ClusterIP (default) – Creates an internal IP for the service. Great for pod-to-pod communication within the cluster, like frontend pods talking to backend pods.
NodePort – Exposes a port (from the 30000–32767 range by default) on every node, allowing direct access via any node’s IP. Not commonly used in production, especially in cloud environments.
LoadBalancer – Used with cloud providers. Creates a load balancer that exposes the service externally with its own IP and routes traffic into the cluster.
When creating services, make sure your selectors and labels match the pods you’re exposing – that’s how the service knows which pods to route to.
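Declaratively, the frontend service from earlier might look like this – the selector is what links it to the pods (ports and labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  type: ClusterIP             # the default; gives the service an internal IP
  selector:
    app: frontend             # routes to any pod carrying this label
  ports:
    - port: 8080              # the port the service listens on
      targetPort: 8080        # the container port traffic is forwarded to
```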
Quick tip: To expose a pod with specific settings:
kubectl expose pod redis --port=6379 --name=redis-service --type=ClusterIP
Namespaces: Logical Separation
Namespaces provide logical grouping and isolation of objects. Useful when you have many users or want to separate resources by team, environment, or project.
Create a namespace:
kubectl create namespace mealie
Deploy pods to a specific namespace:
kubectl create deployment app --image=nginx --namespace=mealie
Change your default namespace:
kubectl config set-context --current --namespace=mealie
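You can do all of this declaratively too – create the namespace from a manifest and pin resources to it with metadata.namespace instead of a command-line flag (a sketch, reusing the mealie namespace from above):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mealie
---
apiVersion: v1
kind: Pod
metadata:
  name: app
  namespace: mealie           # this pod lands in the mealie namespace
spec:
  containers:
    - name: app
      image: nginx
```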
Imperative vs Declarative
Two approaches to managing Kubernetes resources:
Imperative – Telling Kubernetes exactly what to do and how to do it. Step-by-step instructions using kubectl commands like run, create, expose, edit, scale, and set. Good for quick tasks and learning.
Declarative – Specifying the desired end state using YAML files and kubectl apply. You declare what you want, Kubernetes figures out how to get there. This is the preferred approach for production – your YAML files become the source of truth for your infrastructure.
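As a quick side-by-side, both of these end up with a scaled nginx deployment, but only the declarative version leaves a file behind as the source of truth (the filename is a placeholder):

```shell
# Imperative: tell Kubernetes exactly what to do, step by step
kubectl create deployment web --image=nginx
kubectl scale deployment web --replicas=3

# Declarative: describe the desired end state and let Kubernetes reconcile
kubectl apply -f deployment.yaml
```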
Exam Resources
Two helpful commands for studying:
kubectl explain [resource] – Get detailed info about a resource
kubectl api-resources – List all available resources
Wrapping Up
Understanding these core components – how the control plane and worker nodes interact, what pods and services are, how deployments manage everything – is foundational to working with Kubernetes. In the next post, I’ll dive deeper into other topics I covered, such as scheduling, environment variables, and secrets.
See you in the next one!