HPA & VPA Deep Dive
🎯 Learning Objectives
- Master Horizontal Pod Autoscaler (HPA)
- Understand Vertical Pod Autoscaler (VPA)
- Learn advanced autoscaling patterns
- Troubleshoot autoscaling issues
- Optimize autoscaling configurations
Autoscaling adjusts resources dynamically to match demand: HPA changes the number of pod replicas, while VPA changes the CPU and memory requests of each pod. Understanding both is essential for cost optimization and performance.
Autoscaling Benefits
Autoscaling optimizes resource usage, reduces costs, and maintains performance under varying load.
Scaling Limits
Set minReplicas high enough to handle baseline load and maxReplicas low enough that a scale-out cannot exhaust cluster capacity or budget.
Horizontal Pod Autoscaler (HPA)
Basic HPA
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
HPA Metrics
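HPA can scale on resource metrics (CPU, memory), per-pod custom metrics, object metrics, and external metrics. When several metrics are configured, HPA computes a desired replica count for each one and uses the largest.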
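To make the metric targets concrete, the replica calculation documented for the Kubernetes HPA controller can be written as below. The worked numbers are assumptions for illustration only: 4 current replicas of web, an observed average CPU utilization of 90%, and an observed average memory utilization of 60%, measured against the 70% and 80% targets in web-hpa.

```latex
\[
\text{desiredReplicas} =
  \left\lceil \text{currentReplicas} \times
  \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
\]
% Assumed observations: 4 replicas, CPU at 90\%, memory at 60\%
\[
\text{CPU: } \left\lceil 4 \times \frac{90}{70} \right\rceil = \lceil 5.14 \rceil = 6
\qquad
\text{memory: } \left\lceil 4 \times \frac{60}{80} \right\rceil = \lceil 3 \rceil = 3
\]
```

Under these assumptions the CPU result, 6, wins and is then clamped to the minReplicas/maxReplicas range (2 to 10 here). The controller also skips scaling when the ratio of current to desired value is within a small tolerance of 1.0 (0.1 by default).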
Custom Metrics
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: custom-metrics-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
Custom Metrics
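Custom metrics enable scaling on application-specific signals such as request rate or queue depth. They require a metrics adapter (for example, Prometheus Adapter) that serves the custom.metrics.k8s.io API.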
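Queue depth usually lives outside the cluster's pods, which is the case the External metric type covers. The sketch below assumes a metrics adapter already exposes a metric named queue_messages_ready with a queue label; the HPA name, the worker Deployment, the metric name, the label value, and the target of 30 are all hypothetical.

```yaml
# Sketch: scale a hypothetical "worker" Deployment on queue depth.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-hpa                    # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker                     # hypothetical Deployment
  minReplicas: 1
  maxReplicas: 15
  metrics:
  - type: External
    external:
      metric:
        name: queue_messages_ready   # assumed metric served by an adapter
        selector:
          matchLabels:
            queue: worker_tasks      # assumed label
      target:
        type: AverageValue
        averageValue: "30"           # aim for roughly 30 messages per replica
```

With an AverageValue target, the metric's total is divided by the current number of pods before it is compared with the target.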
Vertical Pod Autoscaler (VPA)
VPA Configuration
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: web
      minAllowed:
        cpu: 100m
        memory: 128Mi
      maxAllowed:
        cpu: 2
        memory: 2Gi
VPA Modes
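- Off: Computes recommendations only; never changes pods
- Initial: Applies recommended requests only when a pod is created
- Auto: Applies recommendations to running pods; currently this evicts and recreates them, the same as Recreate
- Recreate: Evicts and recreates pods when their requests differ significantly from the recommendation
Do not use Auto or Recreate on a workload that an HPA scales on CPU or memory; the two controllers will work against each other.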
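A common low-risk starting point is to run VPA in recommendation-only mode and apply its suggestions by hand. The manifest below is a minimal sketch of that for the web Deployment; the name web-vpa-recommend is an assumption.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa-recommend      # hypothetical name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  updatePolicy:
    updateMode: "Off"          # compute recommendations, never evict pods
```

The recommendations then appear in the output of kubectl describe vpa web-vpa-recommend.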
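Troubleshooting
HPA Not Scaling
# Check HPA status
kubectl get hpa
# Describe HPA (see Conditions and Events)
kubectl describe hpa <hpa-name>
# Check that the resource metrics API returns data (requires metrics-server)
kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/<ns>/pods
# Check HPA controller logs. The controller runs inside kube-controller-manager;
# on managed clusters the control plane logs are usually exposed another way.
kubectl logs -n kube-system <kube-controller-manager-pod>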
Troubleshooting Steps
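- Verify metrics are available (metrics-server for CPU and memory, an adapter for custom metrics)
- Check the HPA configuration and its conditions
- Verify the target resource exists
- Review HPA controller logs
- Check that every container sets resource requests; Utilization targets are computed as a percentage of requests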
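A frequent cause of an HPA reporting unknown utilization is a target whose containers declare no resource requests. The fragment below sketches what the web Deployment would need for the CPU and memory targets to be computable; the image, labels, and request values are placeholders, not recommendations.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  # replicas is left unset so the HPA owns the replica count
  selector:
    matchLabels:
      app: web               # assumed label
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.27    # placeholder image
        resources:
          requests:          # Utilization targets are a percentage of these
            cpu: 250m
            memory: 256Mi
```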
VPA Issues
# Check VPA status
kubectl get vpa
# Check VPA recommendations
kubectl describe vpa <vpa-name>
# Check VPA recommender logs
kubectl logs -n kube-system <vpa-recommender-pod>
VPA Recommendations
VPA needs historical usage data before its recommendations are meaningful. Let the workload run under representative load for at least several hours before acting on them.
Best Practices
Production Recommendations
- Set appropriate min/max replicas
- Use multiple metrics for HPA
- Test scaling behavior under load
- Monitor autoscaling events
- Use VPA in Off mode for right-sizing recommendations
- Document autoscaling policies
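Replica bounds limit how far an HPA scales; the optional behavior field in autoscaling/v2 limits how fast. The sketch below adds it to the web-hpa example, with window and policy values chosen only for illustration.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # act on the highest recommendation of the last 5 minutes
      policies:
      - type: Percent
        value: 50                       # remove at most 50% of replicas
        periodSeconds: 60               # per 60-second period
    scaleUp:
      policies:
      - type: Pods
        value: 4                        # add at most 4 pods
        periodSeconds: 60               # per 60-second period
```

Watching kubectl describe hpa web-hpa during a load test is one way to check whether limits like these behave as intended.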