Troubleshooting Kubernetes: API error at resource listing

In this post I will show you how to troubleshoot your Kubernetes cluster when you get an API error while listing resources.

Yesterday when I tried to use my Kubernetes cluster, I got an error:

couldn't get resource list for custom.metrics.k8s.io/v1beta1: the server is currently unable to handle the request

I didn’t understand it at first, so I started debugging.

custom.metrics.k8s.io/v1beta1 is not a standard Kubernetes API. The Kubernetes API can be extended in two ways. The first is Custom Resources, which are fully managed by Kubernetes and available to kubectl and other tools. When you get one with kubectl, the Kubernetes API server finds it in etcd like any other API object. The other way is to run a third-party API server in the Kubernetes cluster as a pod to serve the objects. In that case, when you get an object with kubectl, the Kubernetes API server forwards the request to the third-party API server. For this forwarding to work you need an APIService object, like this one for metrics-server:

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
    port: 443
  version: v1beta1
  versionPriority: 100
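
If you want to check which Service an APIService forwards to on your own cluster, you can read it from the object's spec. This is a minimal sketch, assuming kubectl is already configured for the cluster; the function name apiservice_backend is my own and not part of kubectl.

```shell
# Print the namespace/name of the Service that backs an aggregated APIService.
# $1 is the APIService name, for example v1beta1.metrics.k8s.io.
# Only meaningful for aggregated APIs: APIServices shown as "Local"
# have no spec.service field.
apiservice_backend() {
  kubectl get apiservice "$1" \
    -o jsonpath='{.spec.service.namespace}/{.spec.service.name}'
}
```

For the object above, apiservice_backend v1beta1.metrics.k8s.io should report the metrics-server Service in kube-system.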

At first I believed it was the CRD of the metrics-server, so I started debugging the metrics-server. I checked its logs but found no problem. Then I found the APIService objects and realized that custom.metrics.k8s.io is not the same object as metrics.k8s.io.

kubectl get apiservice
NAME                                   SERVICE                                                          AVAILABLE                 AGE
v1.                                    Local                                                            True                      3y362d
v1.acme.cert-manager.io                Local                                                            True                      131d
v1.admissionregistration.k8s.io        Local                                                            True                      3y4d
v1.apiextensions.k8s.io                Local                                                            True                      3y4d
v1.apps                                Local                                                            True                      3y362d
v1.authentication.k8s.io               Local                                                            True                      3y362d
v1.authorization.k8s.io                Local                                                            True                      3y362d
v1.autoscaling                         Local                                                            True                      3y362d
v1.autoscaling.k8s.io                  Local                                                            True                      79d
v1.batch                               Local                                                            True                      3y362d
v1.cert-manager.io                     Local                                                            True                      79d
v1.certificates.k8s.io                 Local                                                            True                      623d
v1.coordination.k8s.io                 Local                                                            True                      3y362d
...
v1beta1.custom.metrics.k8s.io          cattle-monitoring-system/rancher-monitoring-prometheus-adapter   False (ServiceNotFound)   679d
...
v1beta1.metrics.k8s.io                 kube-system/metrics-server                                       True                      2y36d

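What we are looking for in this table are the APIServices where AVAILABLE is False. Here v1beta1.custom.metrics.k8s.io points to the rancher-monitoring-prometheus-adapter Service in the cattle-monitoring-system namespace, which no longer exists (ServiceNotFound). So I figured out that the problem was caused by the removal of Rancher, which left some objects behind when it was deleted.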
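
On a cluster with a long APIService list, the unavailable entries can be picked out with a small filter instead of reading the table by eye. This is only a sketch: the function name unavailable_apiservices is hypothetical, and it assumes the default kubectl get apiservice column layout shown above (NAME, SERVICE, AVAILABLE, AGE).

```shell
# Read `kubectl get apiservice` output on stdin and print the names of
# the APIServices whose AVAILABLE column starts with "False".
# NR > 1 skips the header row; $3 is the AVAILABLE column because
# neither NAME nor SERVICE contains spaces.
unavailable_apiservices() {
  awk 'NR > 1 && $3 ~ /^False/ { print $1 }'
}
```

Usage: kubectl get apiservice | unavailable_apiservices. On the rows shown above it prints v1beta1.custom.metrics.k8s.io. One way to clear the error in a case like this, assuming the adapter is not going to be reinstalled and nothing else depends on that API, is to delete the leftover object with kubectl delete apiservice v1beta1.custom.metrics.k8s.io.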

Conclusion

One of the most common causes of the ‘couldn’t get resource list for’ error is a Kubernetes metrics API served by metrics-server or prometheus-adapter, because metrics are not stored in etcd like other resources. They are kept in memory or in Prometheus and then exposed through an API extension. If that API is not available, you will get this error.
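
To check whether one of these extension APIs is answering at all, you can ask the Kubernetes API server for its discovery document directly. This is a rough sketch, assuming kubectl is configured for the cluster; api_group_status is a hypothetical helper name, and it relies only on kubectl exiting non-zero when the request fails.

```shell
# Report whether an aggregated API group/version answers discovery requests.
# $1 is a group/version such as custom.metrics.k8s.io/v1beta1.
api_group_status() {
  if kubectl get --raw "/apis/$1" >/dev/null 2>&1; then
    echo "$1: available"
  else
    echo "$1: unavailable"
    return 1
  fi
}
```

For example, api_group_status custom.metrics.k8s.io/v1beta1 checks the API from the error message at the top of this post.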