Question 33
Domain 5: Deployment, Scaling, Safety, and Compliance
What Kubernetes deployment strategy would best meet these requirements?
Correct answer: B
Explanation
A separate Deployment for each agent matches Kubernetes’ model for managing independently scalable workloads. Resource requests let the scheduler place pods on nodes with enough capacity, and limits cap what each container may consume. HPA is the standard way to scale CPU-bound pods, node selectors target GPU-capable nodes, and a rolling update strategy supports gradual replacement with minimal downtime.
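To make the HPA behavior concrete, the sketch below applies the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function name and parameters are illustrative, and the 10% tolerance is the documented default. The real controller also accounts for unready pods, missing metrics, and stabilization windows, which this sketch ignores.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a CPU utilization target.

    Simplified sketch: ignores pod readiness, missing metrics, and
    scale-up/scale-down stabilization behavior.
    """
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# 4 replicas averaging 90% CPU against a 60% target scale out to 6.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))
```

Because utilization is measured as a percentage of each container's CPU request, the autoscaler can only act on CPU utilization if the agent's pod spec sets CPU requests.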
Why each option is right or wrong
A. Deploy all agents in a single pod with all resources allocated statically.
B. Deploy each agent as a separate Kubernetes Deployment with appropriate resource requests/limits, use Horizontal Pod Autoscaler (HPA) for CPU-based agents, use node selectors for GPU allocation, and implement rolling update strategy.
Each Deployment (apps/v1) defines its own pod template, replica count, and update policy, so agents can be scaled and updated independently of one another. For CPU-driven agents, the HorizontalPodAutoscaler in autoscaling/v2 scales between a configured minimum and maximum replica count based on observed CPU utilization, measured relative to each container's CPU request. GPU-bound pods are scheduled onto GPU-capable nodes using node selectors or node affinity in the pod spec. The scheduler uses resource requests to place pods on nodes with enough capacity, while limits are enforced at runtime by the kubelet and container runtime. RollingUpdate is the default Deployment update strategy and replaces pods incrementally to keep the service available during changes.
C. Deploy all agents on a single large node with all GPUs.
D. Manually scale agents by changing replica counts based on time of day.
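As an illustration of option B, the following sketch builds the manifests as Python dictionaries and prints them as JSON, which kubectl accepts alongside YAML. The agent names, image references, replica bounds, and the `accelerator: nvidia-gpu` node label are hypothetical. `nvidia.com/gpu` is the extended resource name exposed by the NVIDIA device plugin and is assumed to be installed on the GPU nodes.

```python
import json


def deployment(name, image, resources, node_selector=None, replicas=2):
    """Build an apps/v1 Deployment with a rolling update strategy."""
    pod_spec = {
        "containers": [{"name": name, "image": image, "resources": resources}]
    }
    if node_selector:
        # Restrict scheduling to nodes carrying these labels.
        pod_spec["nodeSelector"] = node_selector
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            # Add one new pod at a time and never drop below desired capacity.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": pod_spec,
            },
        },
    }


def cpu_hpa(name, min_replicas, max_replicas, target_cpu_pct):
    """Build an autoscaling/v2 HPA that targets average CPU utilization."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1", "kind": "Deployment", "name": name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": target_cpu_pct,
                    },
                },
            }],
        },
    }


# Hypothetical agents: one CPU-bound, one GPU-bound.
manifests = [
    deployment(
        "cpu-agent", "example.com/cpu-agent:1.0",
        resources={"requests": {"cpu": "500m", "memory": "512Mi"},
                   "limits": {"cpu": "1", "memory": "1Gi"}},
    ),
    cpu_hpa("cpu-agent", min_replicas=2, max_replicas=10, target_cpu_pct=60),
    deployment(
        "gpu-agent", "example.com/gpu-agent:1.0",
        resources={"limits": {"nvidia.com/gpu": 1}},
        node_selector={"accelerator": "nvidia-gpu"},  # hypothetical node label
    ),
]

for manifest in manifests:
    print(json.dumps(manifest, indent=2))
```

Keeping each agent in its own Deployment is what lets the CPU agent autoscale and roll out without touching the GPU agent, which is the property options A and C give up.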