Availability

Last modified Apr 20, 2023

What is ‘up’?

First, we determine whether each service is up. A service is considered available if:

It has at least 1 replica available

Consider a cluster with 2 services, each with two desired replicas. A service with no available replicas is considered down; otherwise it is up. The table below shows the possible states for one of these services.

Service	Available replicas	Desired replicas	Status
ServiceOne	2	2	Up
ServiceOne	1	2	Up
ServiceOne	0	2	Down
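
The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `is_up` and its parameters are hypothetical and not taken from any actual implementation.

```python
def is_up(available_replicas: int, desired_replicas: int) -> bool:
    """Return True if the service counts as up.

    Assumption: only the number of available replicas matters. The desired
    count is accepted for context but does not influence the result.
    """
    return available_replicas >= 1


# The three rows from the table above, each with 2 desired replicas.
for available in (2, 1, 0):
    print(available, is_up(available, desired_replicas=2))
```

Note that a service running at half capacity (1 of 2 replicas) is still reported as up under this rule.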

Uptime calculation

If we have two services, A and B, the current uptime is the share of services that are up at that moment:

`A` status	`B` status	Up
		100%
		50%
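
As a rough sketch, the current uptime could be computed as the percentage of services that are up. The function name `current_uptime` and the dictionary layout are assumptions made for illustration only.

```python
def current_uptime(statuses: dict[str, bool]) -> float:
    """Return the percentage of services that are currently up.

    `statuses` maps a service name to True (up) or False (down).
    """
    if not statuses:
        raise ValueError("no services to measure")
    # True counts as 1 and False as 0, so sum() gives the number of services up.
    return 100.0 * sum(statuses.values()) / len(statuses)


print(current_uptime({"A": True, "B": True}))   # 100.0
# Which of the two services is down here is an arbitrary choice for the example.
print(current_uptime({"A": True, "B": False}))  # 50.0
```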

We then average these values over a time span:

Time	`A` status	`B` status	Uptime
1			100%
2			50%
3			100%
4			0%
Total uptime			62.5%

The total uptime for the entire cluster is therefore (100% + 50% + 100% + 0%) / 4 = 62.5% over a period of 4 time units.
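
The averaging step can be reproduced with a short Python sketch. It assumes the samples are taken at equally spaced time units, so a plain arithmetic mean applies. The name `total_uptime` is hypothetical.

```python
from statistics import mean

# Cluster uptime (in percent) at each of the four time units in the table above.
samples = [100.0, 50.0, 100.0, 0.0]


def total_uptime(samples: list[float]) -> float:
    """Average the per-time-unit uptime values over the whole period."""
    return mean(samples)


print(total_uptime(samples))  # 62.5
```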

What we don't measure

Replicas not used

Suppose an ingress is misconfigured while the replica behind it is still up and running. From the user's perspective the service is down, but since there are available replicas, we report it as up.