
🚀🩺 KUBERNETES STARTUP PROBE: BOOT SAFELY, AVOID CRASHLOOPS

November 3, 2025

🔸 TL;DR

▪️ Use startupProbe for apps with slow or spiky boot times.

▪️ It holds off liveness (and readiness) checks until your app has finished starting, so a slow boot is not mistaken for a hang and restarted into CrashLoopBackOff.

▪️ Keep readinessProbe for traffic gating and livenessProbe for hung-process recovery. Each probe has a different job. 💡

🔸 WHAT IT IS

▪️ A probe that tells K8s: “⏳ Don’t kill me yet—I’m still starting.”

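▪️ Until startupProbe succeeds, livenessProbe and readinessProbe are disabled; once it succeeds, they take over. If it never succeeds within its failure budget, the container is killed and restarted per the pod’s restartPolicy.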

🔸 WHEN TO USE IT

▪️ Heavy frameworks (e.g., Spring Boot with lots of auto-config) or apps warming caches/JIT on cold start.

▪️ Containers that run DB migrations or contact external systems on boot.

▪️ Any service that sometimes takes >30–60s to become healthy.

🔸 HOW IT WORKS (MENTAL MODEL)

▪️ startupProbe = boot watchdog (one-time gate).

▪️ readinessProbe = traffic gate (can flap).

▪️ livenessProbe = hang detector (restart if stuck).

🔸 MINIMAL EXAMPLE (HTTP)
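
A minimal sketch of the three probes working together, not a production manifest. The pod name, image, port 8080, and the /healthz and /readyz paths are placeholders assumed for illustration; swap in whatever your app actually exposes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-starter                                  # hypothetical name
spec:
  containers:
    - name: app
      image: registry.example.com/slow-starter:1.0    # placeholder image
      ports:
        - containerPort: 8080                         # assumed app port
      # Boot watchdog: liveness and readiness stay disabled until this succeeds.
      startupProbe:
        httpGet:
          path: /healthz                              # assumed endpoint
          port: 8080
        periodSeconds: 10
        failureThreshold: 30                          # 30 x 10s = up to 300s to start
        timeoutSeconds: 2
      # Traffic gate: a failing check removes the pod from Service endpoints.
      readinessProbe:
        httpGet:
          path: /readyz                               # assumed endpoint
          port: 8080
        periodSeconds: 10
        timeoutSeconds: 2
      # Hang detector: repeated failures restart the container.
      livenessProbe:
        httpGet:
          path: /healthz                              # assumed endpoint
          port: 8080
        periodSeconds: 10
        timeoutSeconds: 2
```

With these numbers the app gets roughly five minutes to boot; after the first successful startup check, the startup probe stops running and the other two take over.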

🔸 TUNING TIPS

▪️ Set startupProbe.failureThreshold × periodSeconds ≈ worst-case boot time (be generous); for example, 30 × 10s allows up to 300s.

▪️ Prefer fast, dependency-light endpoints (no DB calls).

▪️ For TCP-only apps, use tcpSocket; for scripts, use exec.

▪️ Keep liveness cheap and fast, but more tolerant than readiness (higher failureThreshold), so a struggling pod is taken out of rotation before it is restarted.

▪️ Avoid identical endpoints for all three probes—use dedicated health groups if possible.
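
The sizing rule and the non-HTTP handlers above might look like the sketch below. Container names, images, port 5432, and the /tmp/started marker file are hypothetical; the boot budgets in the comments are plain arithmetic on the values shown.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-variants                                # hypothetical name
spec:
  containers:
    # TCP-only service: the probe passes once the port accepts a connection.
    - name: tcp-only
      image: registry.example.com/tcp-service:1.0     # placeholder image
      ports:
        - containerPort: 5432                         # assumed port
      startupProbe:
        tcpSocket:
          port: 5432
        periodSeconds: 5
        failureThreshold: 24                          # 24 x 5s = 120s boot budget
    # Script-based check: exit code 0 is success, anything else is a failure.
    - name: script-checked
      image: registry.example.com/batch-worker:1.0    # placeholder image
      startupProbe:
        exec:
          command: ["cat", "/tmp/started"]            # assumes the app writes this file when booted
        periodSeconds: 10
        failureThreshold: 18                          # 18 x 10s = 180s boot budget
```

If you also set initialDelaySeconds, add it to the budget: the container is given about initialDelaySeconds + failureThreshold × periodSeconds before it is killed.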

🔸 COMMON PITFALLS

▪️ ❌ Too-short startup window → unnecessary restarts.

▪️ ❌ Using liveness only, with no readinessProbe → pods receive traffic before they’re ready.

▪️ ❌ Heavy health endpoint (makes DB/MQ calls) → the probe itself becomes a bottleneck.

▪️ ❌ Forgetting timeoutSeconds (default 1s): a slow response can cause false failures.

🔸 TAKEAWAYS

▪️ startupProbe prevents premature restarts, stabilizing deployments with slow cold starts.

▪️ Combine startup + readiness + liveness for clear, complementary responsibilities.

▪️ Right-sizing thresholds = fewer CrashLoopBackOffs, smoother rollouts, happier on-calls. ✅

#Kubernetes #Containers #CloudNative #DevOps #SRE #PlatformEngineering #Observability #SpringBoot #LivenessProbe #ReadinessProbe #StartupProbe