Monitoring
Metrics can be retrieved from the Kubernetes cluster's REST API using HTTP GET requests to various `/metrics` endpoints. Accessing these endpoints requires authentication against the Kubernetes API, as described in https://kubernetes.io/docs/reference/access-authn-authz/authentication/. Alternatively, if kubectl is already configured for this cluster, the following command starts a local, pre-authenticated proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
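While the proxy is running, the endpoints can be read with a plain HTTP GET and no extra credentials. The following is a minimal Python sketch, assuming the proxy listens on the default address printed above. The helper names `metrics_url` and `fetch` are illustrative and not part of any Kubernetes client library.

```python
import urllib.request

# Address printed by `kubectl proxy` above; adjust if the proxy uses another port.
PROXY = "http://127.0.0.1:8001"


def metrics_url(path, base=PROXY):
    """Join the proxy base address and an API path into a full URL."""
    return base.rstrip("/") + "/" + path.lstrip("/")


def fetch(path, base=PROXY, timeout=10):
    """Send an HTTP GET through the local proxy and return the body as text."""
    with urllib.request.urlopen(metrics_url(path, base), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Usage (requires `kubectl proxy` to be running in another terminal):
#   text = fetch("/metrics")
```

Without the proxy, the same request would have to carry credentials accepted by the Kubernetes API, as described in the authentication documentation linked above.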
Metrics Endpoints
Cluster Metrics
The primary endpoint for metrics is /metrics, and its output looks like this:
# HELP APIServiceOpenAPIAggregationControllerQueue1_adds Total number of adds handled by workqueue: APIServiceOpenAPIAggregationControllerQueue1
# TYPE APIServiceOpenAPIAggregationControllerQueue1_adds counter
APIServiceOpenAPIAggregationControllerQueue1_adds 120577
# HELP APIServiceOpenAPIAggregationControllerQueue1_depth Current depth of workqueue: APIServiceOpenAPIAggregationControllerQueue1
# TYPE APIServiceOpenAPIAggregationControllerQueue1_depth gauge
APIServiceOpenAPIAggregationControllerQueue1_depth 0
# HELP APIServiceOpenAPIAggregationControllerQueue1_queue_latency How long an item stays in workqueueAPIServiceOpenAPIAggregationControllerQueue1 before being requested.
# TYPE APIServiceOpenAPIAggregationControllerQueue1_queue_latency summary
APIServiceOpenAPIAggregationControllerQueue1_queue_latency{quantile="0.5"} 21
APIServiceOpenAPIAggregationControllerQueue1_queue_latency{quantile="0.9"} 55
APIServiceOpenAPIAggregationControllerQueue1_queue_latency{quantile="0.99"} 55
APIServiceOpenAPIAggregationControllerQueue1_queue_latency_sum 2.987667e+06
APIServiceOpenAPIAggregationControllerQueue1_queue_latency_count 120577
# HELP APIServiceOpenAPIAggregationControllerQueue1_retries Total number of retries handled by workqueue: APIServiceOpenAPIAggregationControllerQueue1
# TYPE APIServiceOpenAPIAggregationControllerQueue1_retries counter
APIServiceOpenAPIAggregationControllerQueue1_retries 120597
# HELP APIServiceOpenAPIAggregationControllerQueue1_work_duration How long processing an item from workqueueAPIServiceOpenAPIAggregationControllerQueue1 takes.
# TYPE APIServiceOpenAPIAggregationControllerQueue1_work_duration summary
APIServiceOpenAPIAggregationControllerQueue1_work_duration{quantile="0.5"} 4488
APIServiceOpenAPIAggregationControllerQueue1_work_duration{quantile="0.9"} 66467
APIServiceOpenAPIAggregationControllerQueue1_work_duration{quantile="0.99"} 66467
APIServiceOpenAPIAggregationControllerQueue1_work_duration_sum 4.531781208e+09
APIServiceOpenAPIAggregationControllerQueue1_work_duration_count 120577
# HELP APIServiceRegistrationController_adds Total number of adds handled by workqueue: APIServiceRegistrationController
# TYPE APIServiceRegistrationController_adds counter
APIServiceRegistrationController_adds 675630
....
The format is described in depth in https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md.
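To make the structure of this output concrete, here is a small Python sketch that reads the text format into plain data structures. It is a simplified reader, not a complete implementation of the format: it keeps `# TYPE` declarations, skips `# HELP` lines, ignores optional timestamps, and leaves escape sequences in label values untouched. The function name `parse_metrics` is hypothetical.

```python
import re

# Metric name, optional {labels}, then the value (a timestamp may follow).
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces, e.g. quantile="0.5".
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse exposition-format text into (types, samples).

    types maps a metric name to the type declared on its "# TYPE" line.
    samples is a list of (name, labels_dict, float_value) tuples.
    """
    types, samples = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split(None, 3)
            if len(parts) == 4 and parts[1] == "TYPE":
                types[parts[2]] = parts[3]
            continue  # HELP lines and other comments are skipped
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this sketch does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return types, samples
```

Applied to the output above, a summary such as `..._queue_latency` yields one sample per `quantile` label plus separate `_sum` and `_count` samples, while a counter or gauge yields a single sample with no labels.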
Furthermore, Kubernetes runs a metrics server that serves additional metrics about nodes and pods at http(s)://<k8s-api>/apis/metrics.k8s.io.
Node Metrics
Node metrics can be retrieved from https://<k8s-api>/apis/metrics.k8s.io/v1beta1/nodes:
{
  "kind": "NodeMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes"
  },
  "items": [
    {
      "metadata": {
        "name": "gke-th-infra-v2-cluster-default-pool-7b9b2482-zff5",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/nodes/gke-th-infra-v2-cluster-default-pool-7b9b2482-zff5",
        "creationTimestamp": "2019-09-20T16:27:40Z"
      },
      "timestamp": "2019-09-20T16:27:07Z",
      "window": "30s",
      "usage": {
        "cpu": "211506344n",
        "memory": "2185068Ki"
      }
    },
    ...
  ]
}
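The `cpu` and `memory` values in this response are Kubernetes quantity strings: a number followed by an optional suffix such as `n` (10^-9) or `Ki` (2^10). The Python sketch below converts them to CPU cores and bytes. It handles only a subset of the suffixes, and the names `parse_quantity` and `node_usage` are illustrative, not taken from a Kubernetes library.

```python
from decimal import Decimal

# Subset of the suffixes used by Kubernetes resource quantities.
SUFFIXES = {
    "n": Decimal("1e-9"), "u": Decimal("1e-6"), "m": Decimal("1e-3"),
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6,
    "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
}


def parse_quantity(text):
    """Convert a quantity string such as "211506344n" to a Decimal."""
    # Try two-character suffixes before one-character ones.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[:-len(suffix)]) * SUFFIXES[suffix]
    return Decimal(text)  # plain number without a suffix


def node_usage(node_metrics_list):
    """Map node name -> (CPU cores, memory bytes) from a NodeMetricsList dict."""
    return {
        item["metadata"]["name"]: (
            parse_quantity(item["usage"]["cpu"]),
            parse_quantity(item["usage"]["memory"]),
        )
        for item in node_metrics_list["items"]
    }
```

For the node shown above, `211506344n` corresponds to roughly 0.21 CPU cores and `2185068Ki` to 2237509632 bytes (about 2.08 GiB), averaged over the 30-second `window`.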
Pod Metrics
Pod metrics can be retrieved from https://<k8s-api>/apis/metrics.k8s.io/v1beta1/namespaces/<namespace>/pods:
{
  "kind": "PodMetricsList",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/demo/pods"
  },
  "items": [
    {
      "metadata": {
        "name": "index-77886848dd-msj6q",
        "namespace": "demo",
        "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/demo/pods/index-77886848dd-msj6q",
        "creationTimestamp": "2019-09-20T16:31:20Z"
      },
      "timestamp": "2019-09-20T16:30:56Z",
      "window": "30s",
      "containers": [
        {
          "name": "index",
          "usage": {
            "cpu": "0",
            "memory": "3048Ki"
          }
        }
      ]
    },
    ...
  ]
}
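Unlike node metrics, pod metrics report usage per container, so a consumer usually has to walk two levels of the response. The Python sketch below builds the request path for a namespace and flattens a decoded PodMetricsList into one row per container. The function names `pod_metrics_path` and `container_usage` are hypothetical, and the usage values are passed through as the raw strings from the response.

```python
from urllib.parse import quote


def pod_metrics_path(namespace):
    """Return the API path that lists pod metrics for one namespace."""
    return "/apis/metrics.k8s.io/v1beta1/namespaces/%s/pods" % quote(namespace, safe="")


def container_usage(pod_metrics_list):
    """Yield (namespace, pod, container, cpu, memory) for every container.

    cpu and memory are the raw quantity strings, for example "3048Ki".
    """
    for pod in pod_metrics_list.get("items", []):
        meta = pod["metadata"]
        for container in pod.get("containers", []):
            usage = container["usage"]
            yield (meta["namespace"], meta["name"], container["name"],
                   usage["cpu"], usage["memory"])
```

For the response above, this yields a single row for the `index` container of pod `index-77886848dd-msj6q` in namespace `demo`.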