Deep dive into Kubernetes Probe's period seconds
Kubernetes has a container health check mechanism called Probe.
This is probably a familiar feature to almost all Kubernetes users.
As of v1.18, there are three types of probes ➡️ Liveness Probe, Readiness Probe and Startup Probe.
These three types of probes differ only in what happens when a health check fails; the basic mechanism of the health check itself is largely the same.
Probes also support three types of health checks ➡️ command execution, TCP, or HTTP request.
This kind of health check mechanism is a popular idea not only in Kubernetes but also in other products. However, the detailed behavior differs from product to product.
Probing loop period seconds
Let’s take the following example 👀 To check that your application is ready to accept requests, you might sometimes need to check not only the application itself but also the databases or external services it depends on. In such a case, the health check may take longer than expected due to network latency or external service congestion.
Suppose that a Kubernetes Readiness Probe is enabled with PeriodSeconds set to 3 seconds, and a single HTTP probe request takes 2 seconds.
In this case, will the next Probe request begin 3 seconds after the previous request receives its response? Or will it begin 3 seconds after the previous request started, regardless of the response time (that is, 1 second after the previous response arrives)?
Furthermore, if the Probe request takes more than 3 seconds, will the next request start in parallel with the previous one?
The official Kubernetes documentation does not describe this behavior in detail (I worked on translating the official website into Japanese).
Deep dive into the kubelet code
Probes are performed by the kubelet on the node where the container is running. The kubelet's prober package contains the code for this.
https://github.com/kubernetes/kubernetes/tree/master/pkg/kubelet/prober
You can see three major components here: prober.go, prober_manager.go, and worker.go.
Their comments alone are enough to understand what each one does.
- prober
  Prober helps to check the liveness/readiness/startup of a container.
- prober_manager
  Manager manages pod probing. It creates a probe “worker” for every container that specifies a probe (AddPod). The worker periodically probes its assigned container and caches the results.
- worker
  worker handles the periodic probing of its assigned container. Each worker has a go-routine associated with it which runs the probe loop until the container permanently terminates, or the stop channel is closed.
From the comments we can see that each worker runs its Probe periodically in its own goroutine.
Let’s look at the worker’s run() function.
// run periodically probes the container.
func (w *worker) run() {
	probeTickerPeriod := time.Duration(w.spec.PeriodSeconds) * time.Second

	// If kubelet restarted the probes could be started in rapid succession.
	// Let the worker wait for a random portion of tickerPeriod before probing.
	time.Sleep(time.Duration(rand.Float64() * float64(probeTickerPeriod)))

	probeTicker := time.NewTicker(probeTickerPeriod)

	defer func() {
		// Clean up.
		probeTicker.Stop()
		if !w.containerID.IsEmpty() {
			w.resultsManager.Remove(w.containerID)
		}

		w.probeManager.removeWorker(w.pod.UID, w.container.Name, w.probeType)
		ProberResults.Delete(w.proberResultsSuccessfulMetricLabels)
		ProberResults.Delete(w.proberResultsFailedMetricLabels)
		ProberResults.Delete(w.proberResultsUnknownMetricLabels)
	}()

probeLoop:
	for w.doProbe() {
		// Wait for next probe tick.
		select {
		case <-w.stopCh:
			break probeLoop
		case <-probeTicker.C:
			// continue
		}
	}
}
In probeLoop, the waiting logic is implemented by <-probeTicker.C.
It is the channel of a time.Ticker, and receiving from it blocks the probe loop until the next tick arrives. As a result, doProbe() is executed at regular intervals.
https://golang.org/pkg/time/#Ticker
NewTicker returns a new Ticker containing a channel that will send the time with a period specified by the duration argument. It adjusts the intervals or drops ticks to make up for slow receivers.
So how does the time.Ticker loop behave depending on the execution time of doProbe()?
Let’s write a small program with the same logic and see how it behaves.
In this program, the execution time of doProbe() increases by 1 second on each call.
PeriodSeconds is set to 3 seconds.
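package main

import (
	"context"
	"log"
	"time"
)

func main() {
	// Stop the experiment after 20 seconds.
	ctx, cancel := context.WithTimeout(context.Background(), time.Second*20)
	defer cancel()

	// Plays the role of PeriodSeconds: one tick every 3 seconds.
	t := time.NewTicker(time.Millisecond * 3000)
	defer t.Stop()

probeLoop:
	for doProbe() {
		// Wait for the next tick, as the kubelet worker does.
		select {
		case <-ctx.Done():
			break probeLoop
		case <-t.C:
			log.Println("ticker next")
		}
	}
	log.Println("finished")
}

// i is the number of seconds the next doProbe() call will take.
var i uint

// doProbe simulates a probe whose execution time grows by 1 second per call.
func doProbe() (keepGoing bool) {
	log.Println("doProbe start", i)
	time.Sleep(time.Second * time.Duration(i))
	log.Println("doProbe end")
	i++
	return true
}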
The output is as follows.
$ go run .
2020/08/18 23:29:30 doProbe start 0
2020/08/18 23:29:30 doProbe end
2020/08/18 23:29:33 ticker next # next after 3 seconds
2020/08/18 23:29:33 doProbe start 1
2020/08/18 23:29:34 doProbe end
2020/08/18 23:29:36 ticker next # next after 2 seconds
2020/08/18 23:29:36 doProbe start 2
2020/08/18 23:29:38 doProbe end
2020/08/18 23:29:39 ticker next # next after 1 seconds
2020/08/18 23:29:39 doProbe start 3
2020/08/18 23:29:42 doProbe end
2020/08/18 23:29:42 ticker next # next after 0 seconds
2020/08/18 23:29:42 doProbe start 4
2020/08/18 23:29:46 doProbe end
2020/08/18 23:29:46 ticker next # next after 0 seconds
2020/08/18 23:29:46 doProbe start 5
2020/08/18 23:29:51 doProbe end
2020/08/18 23:29:51 finished
When a Probe takes less than 3 seconds, it runs every 3 seconds regardless of the Probe’s execution time. For example, when a Probe takes 2 seconds, the next Probe starts 1 second after the previous Probe ends. When a Probe takes longer than 3 seconds, the next Probe starts as soon as the previous one finishes, so Probes for the same container never run in parallel.
☑️ Here is a summary of this page
- When a Probe takes less than period seconds, it runs every period seconds regardless of the Probe’s execution time.
- When a Probe takes longer than period seconds, the next Probe starts as soon as the previous Probe finishes.
As you can see, the behavior of Kubernetes depends on that of Go. Even if you are not a Kubernetes developer or contributor, reading the code (or even only the comments) will give you some good tips.
Have a better k8s life 🙌