Scheduler & Controller Manager¶

🎯 Learning Objectives

Understand scheduler internals and algorithms
Master custom scheduler implementation
Learn controller reconciliation patterns
Build custom controllers and operators
Troubleshoot scheduling and controller issues

The scheduler and controller manager are critical for maintaining desired cluster state. Understanding their internals enables advanced troubleshooting and customization.

State Reconciliation

Controllers continuously reconcile desired state with actual state. Understanding this pattern is key to troubleshooting.

Controller Conflicts

Multiple controllers managing the same resource can cause conflicts. Always understand controller ownership.

Scheduler Architecture¶

Scheduling Algorithm¶

The scheduler uses a two-phase algorithm:

Phase 1: Filtering (Predicates) - Filters out nodes that cannot host the pod - Checks resource availability, node selectors, taints/tolerations

Phase 2: Scoring (Priorities) - Ranks remaining nodes - Considers resource balance, affinity, anti-affinity

# Scheduler policy example
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  plugins:
    filter:
      enabled:
      - name: NodeResourcesFit
      - name: NodeAffinity
    score:
      enabled:
      - name: NodeResourcesLeastAllocated
        weight: 1
      - name: NodeAffinity
        weight: 1

Scheduler Extensibility

Kubernetes scheduler is highly extensible. You can implement custom schedulers or extend the default.

Custom Scheduler¶

# Pod with custom scheduler
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  schedulerName: my-custom-scheduler
  containers:
  - name: app
    image: my-app:latest

Custom Schedulers

Custom schedulers enable specialized scheduling logic for specific workloads (e.g., GPU scheduling, batch jobs).

Controller Manager¶

Controller Pattern¶

Controllers follow a reconciliation loop:

// Pseudo-code for controller pattern
for {
    desiredState := getDesiredState()
    actualState := getActualState()

    if desiredState != actualState {
        reconcile(desiredState, actualState)
    }

    sleep(reconciliationInterval)
}

Reconciliation Loop

Controllers continuously watch for changes and reconcile state. This ensures eventual consistency.

Core Controllers¶

Deployment Controller: - Manages ReplicaSets - Handles rolling updates - Maintains deployment history

ReplicaSet Controller: - Maintains desired pod replicas - Creates/deletes pods as needed - Handles pod failures

StatefulSet Controller: - Manages stateful workloads - Maintains pod identity - Handles ordered scaling

Controller Dependencies

Controllers have dependencies (Deployment → ReplicaSet → Pod). Understanding these helps troubleshoot issues.

Troubleshooting¶

Scheduler Issues¶

Pods Stuck in Pending¶

# Check pod events
kubectl describe pod <pod-name>

# Check node resources
kubectl describe node <node-name>

# Check taints
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints

# Check scheduler logs
kubectl logs -n kube-system kube-scheduler-<node> --tail=100

Pending Pods

Common causes: insufficient resources, node selectors, taints without tolerations, affinity rules.

Controller Issues¶

Controllers Not Reconciling¶

# Check controller manager logs
kubectl logs -n kube-system kube-controller-manager-<node>

# Check leader election
kubectl get endpoints -n kube-system kube-controller-manager

# Check controller status
kubectl get deployments,replicasets,pods

Leader Election

Only the leader performs reconciliation. If leader election fails, controllers won't work.

Best Practices¶

Production Recommendations

Monitor scheduler metrics (scheduling latency, pending pods)
Set appropriate resource requests and limits
Use node affinity for workload placement
Implement custom schedulers for specialized needs
Monitor controller reconciliation loops
Test controller behavior under failure scenarios

Next Chapter: Advanced Networking & CNI