Kubernetes v1.36: A New Era of Workload Scheduling with Separate PodGroup API
Introduction
Kubernetes v1.36 marks a significant milestone in the evolution of workload-aware scheduling, building on the foundation laid in v1.35. AI, ML, and batch workloads often demand scheduling logic that goes far beyond simple per-Pod placement. To address these challenges, the Kubernetes community has introduced a streamlined architecture that cleanly separates the static template definition from runtime state management. This release not only refines the Workload and PodGroup APIs but also debuts topology-aware scheduling, workload-aware preemption, and deeper integration with the Job controller. Let’s dive into the key improvements.
Workload and PodGroup API: A Clean Separation
From v1alpha1 to v1alpha2
In Kubernetes v1.35, both the Pod group definition and its runtime state were bundled within the same Workload resource. This design worked but limited scalability and clarity. Kubernetes v1.36 introduces version v1alpha2 of the scheduling.k8s.io API group, which entirely replaces the previous v1alpha1 version. Now, the Workload API serves as a static template, while the new PodGroup API handles all runtime aspects. This separation simplifies the scheduler’s logic and improves performance by allowing per-replica sharding of status updates.
How the New Model Works
The kube-scheduler can now read the PodGroup object directly, without needing to parse the Workload resource itself. The scheduler only cares about the runtime state: the group’s current membership, scheduling conditions, and policy. This decoupling makes the scheduling cycle more efficient and paves the way for future enhancements like atomic workload processing.
Configuration Example
A Workload controller (such as the Job controller) defines a Workload object that acts as a template. For instance, a training job might define a template for worker pods:
apiVersion: scheduling.k8s.io/v1alpha2
kind: Workload
metadata:
  name: training-job-workload
  namespace: some-ns
spec:
  podGroupTemplates:
  - name: workers
    schedulingPolicy:
      gang:
        minCount: 4
Controllers then stamp out runtime PodGroup instances based on these templates. Each PodGroup object holds the actual scheduling policy and a reference to the template it was created from. It also includes status conditions that reflect the scheduling state of all member Pods, enabling the scheduler to make informed decisions.
apiVersion: scheduling.k8s.io/v1alpha2
kind: PodGroup
metadata:
  name: training-job-workers-xyz
  namespace: some-ns
spec:
  # ... policy fields inherited from template
status:
  conditions:
  - type: Scheduled
    status: "True"
    lastTransitionTime: "..."
This example demonstrates how the runtime PodGroup carries the necessary information for the scheduler to work with, without requiring access to the original Workload object.
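The template-to-instance relationship in the two manifests above can be modeled in a few lines of Python. This is only a conceptual sketch, not the Job controller's code: the class names, the `template_ref` field, the `stamp_pod_group` helper, and the naming scheme are all hypothetical and chosen for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class PodGroupTemplate:
    name: str
    scheduling_policy: dict  # e.g. {"gang": {"minCount": 4}}


@dataclass
class Workload:
    """Static template object: it never carries runtime status."""
    name: str
    namespace: str
    pod_group_templates: list


@dataclass
class PodGroup:
    """Runtime object: carries its own policy copy plus status conditions."""
    name: str
    namespace: str
    template_ref: str  # hypothetical back-reference to the source template
    scheduling_policy: dict
    conditions: list = field(default_factory=list)


def stamp_pod_group(workload, template_name, suffix):
    """Create one runtime PodGroup from a named template in the Workload."""
    template = next(
        t for t in workload.pod_group_templates if t.name == template_name
    )
    return PodGroup(
        name=f"{workload.name}-{template.name}-{suffix}",  # assumed naming scheme
        namespace=workload.namespace,
        template_ref=f"{workload.name}/{template.name}",
        # Deep copy so later changes to the instance never touch the template.
        scheduling_policy=copy.deepcopy(template.scheduling_policy),
    )
```

Because the instance holds its own copy of the policy, a consumer that only sees the `PodGroup` has everything it needs, which mirrors the point that the scheduler does not have to read the Workload.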
Enhanced Scheduling Capabilities
PodGroup Scheduling Cycle
Kubernetes v1.36 introduces a dedicated PodGroup scheduling cycle in the kube-scheduler. This cycle enables atomic processing of workloads: the scheduler evaluates all Pods in a PodGroup together, ensuring that either the entire group is scheduled or none are (gang scheduling semantics). This is critical for batch workloads where all workers must be available simultaneously.
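The all-or-nothing behavior can be illustrated with a small Python model. This is a minimal sketch under simplifying assumptions (one CPU number per Pod, greedy first-fit placement, a hypothetical `schedule_gang` function); it is not how the kube-scheduler is implemented, and a greedy pass can miss placements that a smarter search would find.

```python
def schedule_gang(pod_cpu_requests, node_free_cpu, min_count):
    """Place a group of pods atomically.

    pod_cpu_requests: list of CPU requests, one per pod in the group.
    node_free_cpu:    dict of node name -> free CPU; updated only on success.
    min_count:        minimum number of pods that must fit for the group to run.

    Returns {pod_index: node_name} on success, or None if fewer than
    min_count pods fit, in which case node_free_cpu is left untouched.
    """
    tentative = dict(node_free_cpu)  # work on a copy so failure has no side effects
    placement = {}
    for index, request in enumerate(pod_cpu_requests):
        for node in sorted(tentative):  # deterministic first-fit order
            if tentative[node] >= request:
                tentative[node] -= request
                placement[index] = node
                break

    if len(placement) < min_count:
        return None  # nothing is committed: the whole group stays pending

    # Commit every reservation in one step.
    node_free_cpu.clear()
    node_free_cpu.update(tentative)
    return placement
```

The key design point is that reservations are made against a scratch copy and committed in one step, so a group that cannot reach its minimum never holds partial capacity that other workloads could have used.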
Topology-Aware Scheduling and Workload-Aware Preemption
The release also debuts the first iterations of topology-aware scheduling and workload-aware preemption. Topology-aware scheduling allows the scheduler to place Pods from a PodGroup in a way that respects node topology (e.g., same rack or availability zone), reducing latency for tightly coupled workloads. Workload-aware preemption ensures that preemption decisions consider the entire PodGroup. For example, the scheduler avoids preempting a single Pod from a group if doing so would prevent the rest of the group from running.
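Both ideas can be sketched in Python to make the group-level reasoning concrete. The functions below are hypothetical illustrations, not scheduler APIs: `pick_topology_domain` assumes each node advertises a single topology domain and a count of free Pod slots, and `effective_disruption` assumes a gang is unusable once it drops below its minimum member count.

```python
def pick_topology_domain(nodes, group_size):
    """Choose one topology domain (e.g. a rack) that can host the whole group.

    nodes: dict of node name -> (domain, free_slots).
    Returns the tightest-fitting domain name, or None if no single
    domain has room for group_size pods.
    """
    capacity = {}
    for domain, free_slots in nodes.values():
        capacity[domain] = capacity.get(domain, 0) + free_slots

    fitting = [d for d, slots in capacity.items() if slots >= group_size]
    if not fitting:
        return None
    # Prefer the smallest domain that fits, leaving larger ones for larger groups.
    return min(fitting, key=lambda d: (capacity[d], d))


def effective_disruption(running_pods, victims, min_count):
    """Count how many pods are really disrupted by evicting `victims` pods.

    If the group stays at or above min_count, only the victims are lost.
    Otherwise the remaining pods cannot make progress either, so the
    whole group counts as disrupted.
    """
    if running_pods - victims >= min_count:
        return victims
    return running_pods
```

Comparing candidates by `effective_disruption` rather than by raw victim count is what makes a preemption decision workload-aware in this model: evicting one Pod from a gang that is already at its minimum costs as much as evicting the entire gang.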
Dynamic Resource Allocation via ResourceClaims
Another highlight is the support for ResourceClaim objects within workloads. This unlocks Dynamic Resource Allocation (DRA) for PodGroups, allowing workloads to request specialized hardware (e.g., GPUs, FPGAs) on a per-PodGroup basis. The scheduler can then coordinate the allocation of these resources across all members of the group.
Integration with Job Controller
To demonstrate real-world readiness, v1.36 delivers the first phase of integration between the Job controller and the new Workload/PodGroup APIs. This integration allows existing batch jobs to seamlessly leverage the improved scheduling capabilities without requiring manual configuration of Workload resources. The Job controller can automatically create the appropriate Workload template and PodGroup instances, enabling users to benefit from gang scheduling and topology awareness with minimal changes to their workflows.
Conclusion
Kubernetes v1.36 represents a leap forward in workload-aware scheduling. By cleanly separating static templates from runtime state, introducing a dedicated PodGroup scheduling cycle, and adding support for topology-aware scheduling and DRA, the release provides a robust foundation for AI/ML and batch workloads. The integration with the Job controller ensures that these features are practical and easy to adopt. As the community continues to refine these APIs in future releases, users can expect even more powerful scheduling capabilities for complex workloads.