Thursday, December 11, 2025

How to Diagnose a Crashed API Server

🩺 When the Kubernetes API Server Crashes: A Practical Survival Guide

Nothing strikes fear into a Kubernetes admin like a dead API server.
One minute you're applying a new manifest…
The next minute:

kubectl stops responding.
Your cluster is silent.
And your control plane pod flashes in crictl ps (or docker ps on older Docker-based nodes) for half a second, then disappears.

If this sounds familiar, don’t panic.
I’m going to walk you through a battle-tested, step-by-step process for diagnosing and fixing a crashed Kubernetes API server — even when kubectl is completely unusable.

These same techniques also work for other static pods like etcd, kube-scheduler, and kube-controller-manager.
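
On a standard kubeadm control plane node, all of these static pod manifests live in one directory (the path below assumes kubeadm defaults):

ls /etc/kubernetes/manifests/
# typical kubeadm output:
# etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml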


🔧 Step 1 — Restart the Kubelet

Static control plane pods are managed entirely by the kubelet, not by deployments.
Restarting kubelet speeds up the troubleshooting loop:

systemctl restart kubelet

If the API server can be started, the kubelet will attempt it immediately after the restart instead of waiting for its next retry backoff.
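
Before moving on, confirm that the kubelet itself came back healthy and is watching the manifest directory. The paths below assume a systemd-managed kubelet installed by kubeadm:

systemctl status kubelet --no-pager
# confirm the static pod directory kubelet watches
grep staticPodPath /var/lib/kubelet/config.yaml
# expected on kubeadm: staticPodPath: /etc/kubernetes/manifests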


🔍 Step 2 — Check If Kubelet Can Parse the Manifest

Most API server outages begin with a broken YAML file at:

/etc/kubernetes/manifests/kube-apiserver.yaml

Maybe you edited it.
Maybe a lab exercise changed it.
Maybe it became corrupted.

To see what kubelet thinks about your file, tail its logs:

journalctl -fu kubelet | grep apiserver

Watch for at least 30–60 seconds.
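
If nothing appears while following, you can also dump recent history instead of waiting:

journalctl -u kubelet --since "5 minutes ago" | grep -i apiserver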

You’ll usually find one of three types of errors:


1️⃣ YAML Syntax Errors

Kubelet can't even read the manifest.

You may see errors like:

“Could not process manifest file…”

This means your YAML structure is invalid:

  • wrong indentation
  • misplaced fields
  • tabs instead of spaces
  • missing dashes

Fix the file, save it, and kubelet will retry.

📝 Remember: YAML parsers only report the first error.
If the server still won’t restart, repeat the process.
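
If you'd rather not wait on kubelet's retry loop, you can sanity-check the manifest's syntax directly on the node. This is just a sketch; it assumes Python 3 with the PyYAML module is installed there:

python3 -c "import yaml; yaml.safe_load(open('/etc/kubernetes/manifests/kube-apiserver.yaml'))" \
  && echo "YAML parses cleanly" || echo "YAML is broken"

A parse failure prints the offending line and column, which is much faster than scanning the file by eye.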


2️⃣ Invalid Fields or Arguments

Here, the YAML structure is correct, but the pod spec isn't.

Examples:

  • wrong volume path
  • wrong flag passed to kube-apiserver
  • unsupported field under spec

Kubelet logs will show that the file parsed, but the pod spec was rejected as invalid.

Fix the offending fields and let kubelet restart the pod.
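
A quick way to review what the pod is actually told to run is to pull the command block out of the manifest and compare each flag against the kube-apiserver documentation (the -A 40 window below is arbitrary; widen it if your manifest is longer):

grep -n -A 40 'command:' /etc/kubernetes/manifests/kube-apiserver.yaml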


3️⃣ CrashLoopBackOff

The manifest is valid, the pod starts… and immediately exits with an error.

This is your cue to dig deeper.


💥 Step 3 — When the API Server Starts but Crashes Immediately

Even if the control plane is down, kubelet still saves pod logs to disk.

Navigate to the static pod log directory:

cd /var/log/pods
ls -ld *apiserver*

You’ll see a directory like:

kube-system_kube-apiserver-controlplane_<random-id>

Note:
The directory name changes every time the pod is recreated, so always re-check it.


Enter the directory:

cd kube-system_kube-apiserver-controlplane_<id>
ls -l

You’ll find a folder named kube-apiserver:

cd kube-apiserver
ls -l

Inside, you’ll see log files such as:

0.log
1.log
2.log

Open the most recent one (the highest-numbered file):

cat 1.log

This is where the truth reveals itself — certificate errors, invalid flags, CIDR conflicts, missing files, or admission plugin failures.
Whatever caused the crash will appear clearly here.
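
Because the directory name keeps changing, a glob saves retyping. Here's a rough one-liner, assuming the default kubeadm log paths, that tails the newest log file directly:

ls -t /var/log/pods/kube-system_kube-apiserver-*/kube-apiserver/*.log | head -1 | xargs tail -n 50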


🎯 Summary: Your API Server Recovery Playbook

When Kubernetes loses its API server, follow this proven sequence:

1. Restart kubelet

For faster retries and easier debugging.

2. Check kubelet logs

journalctl -fu kubelet | grep apiserver

Identify whether you have:

  • Syntax errors
  • Invalid manifest fields
  • CrashLoopBackOff

3. If it’s crashing, read the real logs

Stored in:

/var/log/pods/.../kube-apiserver/*.log

Once you know the root cause, fixing the issue becomes straightforward.
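
While you iterate on a fix, a simple watch loop tells you the moment the container stays up. This assumes a containerd-based node with crictl; on older Docker-based nodes, substitute docker ps:

watch -n 5 "crictl ps --name kube-apiserver"

Once the status column stays Running between refreshes, kubectl should start answering again.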


🧪 Want to Practice?

Practice breaking and fixing the API server — it’s the fastest way to master control-plane debugging.

