
SIGTERM and PID1

I’ve been working a lot with Kubernetes this year and an interesting problem surfaced when attempting to get a container to terminate gracefully.

What Kubernetes does

The Kubernetes termination lifecycle is detailed in “Kubernetes best practices: terminating with grace”, but the most important bits are:

  • When a pod is set to the Terminating state, the main process of each container is sent SIGTERM
  • Kubernetes waits a grace period (default is 30s) for containers to handle SIGTERM
    • If a container process has no handler for SIGTERM, the Linux kernel will kill the process immediately
    • If a container process does have a handler for SIGTERM, the handler can do whatever is needed to wrap up, then exit
  • At the end of the grace period, any container still alive is sent SIGKILL and deleted
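The sequence above can be sketched in a few lines of Python. This only illustrates the SIGTERM, wait, SIGKILL pattern against an ordinary child process; it is not how the kubelet or a container runtime is implemented, and the name terminate_with_grace and its grace_seconds parameter are made up for this example.

```python
import signal
import subprocess


def terminate_with_grace(proc: subprocess.Popen, grace_seconds: float = 30.0) -> int:
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the child's return code. On POSIX, a negative value -N means
    the child was terminated by signal N.
    """
    proc.send_signal(signal.SIGTERM)
    try:
        # The child exited within the grace period (handler ran, or default action).
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # Still alive after the grace period: SIGKILL cannot be caught or ignored.
        proc.kill()
        return proc.wait()
```

A child that leaves SIGTERM at its default disposition exits right away; a child that ignores SIGTERM only goes away once the grace period runs out.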

This is all very reasonable, but it depends on container processes either handling SIGTERM or leaving it unhandled and letting the kernel kill the process.

Not having a handler for SIGTERM

In general, without an explicit handler for SIGTERM, the kernel will kill the process. However, there is one very important exception: a process with Process ID (PID) 1 will not be killed, because the kernel only delivers a signal to PID 1 if PID 1 has installed a handler for it.

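Note that this is true for all termination signals sent from inside the container; even SIGKILL has no effect there. A SIGKILL sent from outside the container’s PID namespace (for example, by the container runtime on the host) does still kill the process, which is how Kubernetes can force termination at the end of the grace period.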

When a container is run (e.g. via docker run), whatever process is started from the declared entrypoint is PID 1.
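One way to spot this situation from inside a program is a startup check along these lines. It is a rough sketch: the function name sigterm_would_be_dropped is hypothetical, and it only looks at the two conditions described above (being PID 1 and leaving SIGTERM at its default disposition).

```python
import os
import signal


def sigterm_would_be_dropped(pid: int, disposition) -> bool:
    """True if a process with this PID and SIGTERM disposition would not be
    stopped by SIGTERM: it is PID 1 and has no handler installed."""
    return pid == 1 and disposition == signal.SIG_DFL


def current_process_at_risk() -> bool:
    # os.getpid() reports the PID as seen inside the process's own PID
    # namespace, so the entrypoint process of a container sees 1 here.
    return sigterm_would_be_dropped(os.getpid(), signal.getsignal(signal.SIGTERM))
```

Python leaves SIGTERM at its default disposition unless the program installs a handler, so a plain Python script used as the entrypoint would return True from current_process_at_risk().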

As for why PID 1 is unkillable, I couldn’t find an exact reason, but given that PID 1 is usually an init daemon, I’d wager it’s due to the importance of init, as it’s the ancestor of all userspace processes. That said, while this all makes sense in the context of a full-featured Linux distro, this protection is questionable when it comes to running processes within a container.

Responding to SIGTERM

So, for a container process that runs as PID 1 without an explicit signal handler, nothing happens when Kubernetes issues SIGTERM. The process simply keeps running. Only after the termination grace period, when Kubernetes forcibly deletes the container, will the process be killed. In some cases, this is not a problem (the termination grace period doesn’t matter and graceful termination isn’t a concern), but for programs that handle long-running tasks, it can certainly be an issue.

If the process running is an application or script where the source code is available, the solution is obvious: write a handler for SIGTERM.
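As an illustration, a minimal handler in Python might look like the following. The GracefulShutdown class and run_until_sigterm function are names invented for this sketch, and the "work" is whatever callable the application supplies.

```python
import signal


class GracefulShutdown:
    """Tracks whether SIGTERM has been received."""

    def __init__(self) -> None:
        self.requested = False

    def install(self) -> None:
        # signal.signal() may only be called from the main thread.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame) -> None:
        # Keep the handler tiny: set a flag and let the main loop wrap up.
        self.requested = True


def run_until_sigterm(shutdown: GracefulShutdown, do_one_unit_of_work) -> None:
    """Call do_one_unit_of_work repeatedly until SIGTERM has been seen."""
    while not shutdown.requested:
        do_one_unit_of_work()
```

With a handler installed, SIGTERM is delivered even when the process is PID 1. Each unit of work should be short relative to the grace period, so the loop gets a chance to notice the flag and exit before SIGKILL arrives.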

If writing a handler isn’t possible, add an init program to the container and use it to run the application or script. tini works great here, as it’s lightweight and designed for containers (it’s also what docker run uses when the --init flag is specified).
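To show why an init program helps, here is a rough Python sketch of the core idea: the wrapper takes the PID 1 slot, installs handlers, and relays signals to the real program, which then runs under an ordinary PID with default signal behaviour. This is not how tini is implemented and is not a substitute for it (for one thing, it does not reap orphaned grandchildren); run_as_init is a hypothetical name.

```python
import signal
import subprocess
from typing import Sequence


def run_as_init(argv: Sequence[str]) -> int:
    """Run argv as a child, forward SIGTERM/SIGINT to it, and return an exit status."""
    child = subprocess.Popen(list(argv))

    def forward(signum, frame):
        # Relay the signal. The child is not PID 1, so a SIGTERM it does not
        # handle falls back to the default action and terminates it.
        child.send_signal(signum)

    # A signal arriving before these handlers are installed is not forwarded;
    # a real init program takes more care here.
    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    returncode = child.wait()
    # Mirror the shell convention of 128 + N for a child killed by signal N.
    return 128 - returncode if returncode < 0 else returncode
```

As a container entrypoint, a script built around this could end with sys.exit(run_as_init(sys.argv[1:])), so the container's exit status reflects what happened to the wrapped program.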