Custom Kube-Scheduler

In this post I will show you how to create a custom Kube-Scheduler to change the scheduling options.

What is Kube-Scheduler?

Kube-Scheduler is the control-plane component that decides where to run Pods based on various criteria such as node selectors, affinities, hardware constraints, and resource requests. By default the scheduler scores nodes with the LeastAllocated strategy, so Pods are placed on the least used node. In this example I will change this to the MostAllocated strategy, which packs Pods onto the nodes that are already the most utilized. Packing Pods this way can leave whole nodes idle so they can be removed, which saves resources and money.
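
To make the difference concrete, here is a rough sketch of how the two strategies score a single node. It is a simplified model of the NodeResourcesFit score with equal weights for cpu and memory, not the scheduler's actual code. The function names and the sample node sizes are made up for illustration, and in a real cluster other score plugins also contribute to the final ranking.

```shell
#!/bin/sh
# Simplified model of NodeResourcesFit scoring (0-100 per node).
# "requested" = requests of Pods already on the node plus the incoming Pod.
# Function names and numbers are illustrative assumptions.

# Score one resource under MostAllocated: fuller nodes score higher.
most_score() {   # usage: most_score REQUESTED ALLOCATABLE
  requested=$1; allocatable=$2
  if [ "$allocatable" -eq 0 ]; then echo 0; return; fi
  if [ "$requested" -gt "$allocatable" ]; then requested=$allocatable; fi
  echo $(( requested * 100 / allocatable ))
}

# Score one resource under LeastAllocated: emptier nodes score higher.
least_score() {  # usage: least_score REQUESTED ALLOCATABLE
  requested=$1; allocatable=$2
  if [ "$allocatable" -eq 0 ]; then echo 0; return; fi
  if [ "$requested" -gt "$allocatable" ]; then echo 0; return; fi
  echo $(( (allocatable - requested) * 100 / allocatable ))
}

# Average the cpu and memory scores (both weights are 1 in this post's config).
node_score() {   # usage: node_score most|least REQ_CPU ALLOC_CPU REQ_MEM ALLOC_MEM
  fn="${1}_score"
  cpu=$("$fn" "$2" "$3")
  mem=$("$fn" "$4" "$5")
  echo $(( (cpu + mem) / 2 ))
}

# Busy node: 3000m of 4000m CPU and 6144Mi of 8192Mi memory requested.
node_score most  3000 4000 6144 8192   # prints 75
node_score least 3000 4000 6144 8192   # prints 25
# Quiet node: 1000m of 4000m CPU and 2048Mi of 8192Mi memory requested.
node_score most  1000 4000 2048 8192   # prints 25
node_score least 1000 4000 2048 8192   # prints 75
```

With MostAllocated the busy node wins (75 against 25), so new Pods keep landing on it until it is full. With LeastAllocated the order is reversed.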

This post was tested on Kubernetes version 1.25 and later.

Create Config for the scheduler

# v1 is available from Kubernetes 1.25; v1beta2 is deprecated there and removed in 1.28
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
leaderElection:
   leaderElect: false
profiles:
   - schedulerName: custom-scheduler
     pluginConfig:
       - args:
           apiVersion: kubescheduler.config.k8s.io/v1
           kind: NodeResourcesFitArgs
           scoringStrategy:
               resources:
                   - name: cpu
                     weight: 1
                   - name: memory
                     weight: 1
               type: MostAllocated
         name: NodeResourcesFit
     plugins:
       score:
           enabled:
               - name: NodeResourcesFit
                 weight: 1

Save the configuration as scheduler-config.yaml, then create a ConfigMap from it:

kubectl create configmap custom-scheduler-config -n kube-system --from-file=scheduler-config.yaml
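
Because the ConfigMap is created with --from-file, its key is the file name, and that key has to match the subPath used when the file is mounted into the scheduler Pod. A quick way to confirm what was stored is to print that key. This is only a sketch; the helper name show_scheduler_config is made up for illustration.

```shell
# Print the scheduler configuration stored in the ConfigMap.
# The key is the file name passed to --from-file; the dot is escaped for JSONPath.
# show_scheduler_config is a hypothetical helper name.
show_scheduler_config() {
  kubectl -n kube-system get configmap custom-scheduler-config \
    -o 'jsonpath={.data.scheduler-config\.yaml}'
}
```

Running show_scheduler_config should print the YAML from above. Empty output usually means the file was created under a different name.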

Create the ServiceAccount for the custom kube-scheduler

apiVersion: v1
kind: ServiceAccount
metadata:
 name: custom-scheduler
 namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
 name: custom-scheduler
rules:
- apiGroups:
 - ""
 resources:
 - pods
 - pods/status
 - pods/binding
 verbs:
 - get
 - list
 - watch
 - create
 - update
 - patch
 - delete
- apiGroups:
 - ""
 resources:
 - nodes
 verbs:
 - get
 - list
 - watch
- apiGroups:
 - storage.k8s.io
 resources:
 - storageclasses
 - csinodes
 - csidrivers
 - csistoragecapacities
 verbs:
 - watch
 - list
 - get
- apiGroups:
 - apps
 resources:
 - replicasets
 - statefulsets
 verbs:
 - watch
 - list
 - get
- apiGroups:
 - ""
 resources:
 - persistentvolumeclaims
 - services
 - namespaces
 - configmaps
 - replicationcontrollers
 - persistentvolumes
 verbs:
 - watch
 - list
 - get
- apiGroups:
 - policy
 resources:
 - poddisruptionbudgets
 verbs:
 - watch
 - list
 - get
# Allow the scheduler to record events such as Scheduled and FailedScheduling
- apiGroups:
 - ""
 - events.k8s.io
 resources:
 - events
 verbs:
 - create
 - patch
 - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
 name: custom-scheduler
roleRef:
 apiGroup: rbac.authorization.k8s.io
 kind: ClusterRole
 name: custom-scheduler
subjects:
- kind: ServiceAccount
 name: custom-scheduler
 namespace: kube-system

Save the manifests as scheduler-sa.yaml and apply them:

kubectl apply -f scheduler-sa.yaml
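
Before deploying the scheduler it can help to check that the binding really grants what the scheduler needs. One way, sketched below, is to impersonate the ServiceAccount with kubectl auth can-i. The wrapper name scheduler_can is an assumption for this example, and your own user needs permission to impersonate ServiceAccounts.

```shell
# Ask the API server whether the custom-scheduler ServiceAccount may perform
# VERB on RESOURCE by impersonating it. kubectl answers "yes" or "no".
# scheduler_can is a hypothetical helper name.
scheduler_can() {   # usage: scheduler_can VERB RESOURCE [extra kubectl flags]
  kubectl auth can-i "$@" \
    --as=system:serviceaccount:kube-system:custom-scheduler
}
```

For example, scheduler_can list nodes and scheduler_can create pods --subresource=binding should both answer yes once the ClusterRoleBinding exists.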

Create the custom scheduler deployment

apiVersion: apps/v1
kind: Deployment
metadata:
 name: custom-scheduler
 namespace: kube-system
spec:
 replicas: 1
 selector:
   matchLabels:
     name: custom-scheduler
 template:
   metadata:
     labels:
       component: scheduler
       name: custom-scheduler
       tier: control-plane
   spec:
     containers:
     - command:
       - /usr/local/bin/kube-scheduler
       - --leader-elect=false
       - --config=/etc/kubernetes/scheduler-config.yaml
       - -v=5
       env: []
        image: registry.k8s.io/kube-scheduler:v1.25.12  # pick the tag that matches your cluster version
       imagePullPolicy: IfNotPresent
       resources:
         requests:
           cpu: 200m
           memory: 128Mi
         limits:
           memory: 128Mi
       livenessProbe:
         httpGet:
           path: /healthz
           port: 10259
           scheme: HTTPS
       name: custom-scheduler
       readinessProbe:
         httpGet:
           path: /healthz
           port: 10259
           scheme: HTTPS
       volumeMounts:
       - mountPath: /etc/kubernetes/scheduler-config.yaml
         name: custom-scheduler-config
         subPath: scheduler-config.yaml
     serviceAccountName: custom-scheduler
     volumes:
     - configMap:
         name: custom-scheduler-config
       name: custom-scheduler-config

Save the manifest as custom-scheduler-deployment.yaml and apply it:

kubectl apply -f custom-scheduler-deployment.yaml
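
Once the Deployment is applied, a quick check that the scheduler started and accepted its configuration could look like the sketch below. The function name check_custom_scheduler and the 120 second timeout are arbitrary choices for this example.

```shell
# Wait for the custom scheduler to become ready, then show its latest log lines.
# A rejected configuration file normally shows up as an error in these logs.
# check_custom_scheduler is a hypothetical helper name.
check_custom_scheduler() {
  kubectl -n kube-system rollout status deployment/custom-scheduler --timeout=120s &&
    kubectl -n kube-system logs deployment/custom-scheduler --tail=20
}
```

If the rollout does not finish, kubectl -n kube-system describe pod -l name=custom-scheduler is a good next step to see why the Pod is not ready.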

Schedule Pods with the custom kube-scheduler

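Set schedulerName in the Pod template to the profile name from the scheduler configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
   name: my-app
spec:
   replicas: 3
   # apps/v1 requires a selector that matches the Pod template labels
   selector:
     matchLabels:
       app: my-app
   template:
     metadata:
       labels:
         app: my-app
     spec:
       # must match schedulerName in the scheduler profile
       schedulerName: custom-scheduler
       containers:
       - name: my-app
         image: my-app-image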
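
To see whether the Pods were really placed by the custom scheduler, you can list each Pod together with its scheduler and node. This is a small sketch; the function name show_pod_placement is made up, and it assumes the Pods run in the default namespace unless you pass another one.

```shell
# List Pods with the scheduler that was requested for them and the node they run on.
# show_pod_placement is a hypothetical helper name.
show_pod_placement() {   # usage: show_pod_placement [NAMESPACE]
  kubectl get pods -n "${1:-default}" \
    -o custom-columns=NAME:.metadata.name,SCHEDULER:.spec.schedulerName,NODE:.spec.nodeName
}
```

Every my-app Pod should show custom-scheduler in the SCHEDULER column, and with MostAllocated they should tend to land on the same node while it has room. If a Pod stays Pending without any scheduling events, the schedulerName probably does not match a running scheduler.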