Add WorkerPool scheduling fields by han2ni3bal-pixel · Pull Request #247 · agent-substrate/substrate

han2ni3bal-pixel · 2026-06-14T15:21:35Z

Summary

This PR adds a small, controlled set of scheduling/resource fields to WorkerPool so WorkerPool-managed Pods can be placed onto appropriate Kubernetes nodes and get proper resource guarantees.

Added fields:

resources
nodeSelector
tolerations
priorityClassName
nodeAffinity

Scope

This intentionally does not expose a full PodTemplateSpec, and does not add support for full affinity, podAffinity, or podAntiAffinity.

The goal is to keep the first version small while addressing the WorkerPool-to-Kubernetes-Node scheduling gap discussed in #212.

Related issue

Part of #212.
Related to #47.

google-cla · 2026-06-14T15:21:40Z

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Julian Gutierrez Oschmann (juli4n) · 2026-06-15T22:45:03Z

+// WorkerPoolPodTemplate defines optional scheduling and resource settings for
+// worker pods. Resources apply to the fixed ateom container; nodeAffinity is
+// mapped to spec.affinity.nodeAffinity on the pod.
+type WorkerPoolPodTemplate struct {


Can we scope this down this CL to NodeSelector and Tolerations? Do you have use cases for node affinity and priorityClass?

I would leave resources out of scope as that is being discussed in https://docs.google.com/document/d/1_i__JEdmRexKOGP4VJ7bOWrgdS9vPd-BQSR0sxg7GQ4/edit?resourcekey=0-jtYHhxfAGZCNcoknXn2GbA&tab=t.0

Julian Gutierrez Oschmann (@juli4n) Thanks for the feedback!

That makes sense to me regarding resources. We can leave resource requests/limits out of scope for this CL while the WorkerPool resource model is still being discussed separately.

For nodeAffinity, the use case I had in mind is supporting WorkerPools across heterogeneous node pools where nodeSelector is too limited. For example, a WorkerPool may want to require or prefer one of several equivalent accelerator/local-SSD/cache node pools, or express a soft preference during a node pool migration. In those cases, nodeSelector only supports exact key/value matching, while nodeAffinity gives us required/preferred terms and set-based matching.

For priorityClassName, the use case is preserving warm WorkerPool capacity under cluster pressure. Since WorkerPool pods are the slots available for actor resume, giving certain pools a higher priority can help protect latency-sensitive or interactive actor workloads from being displaced by lower-priority batch workloads. It also allows different WorkerPools to represent different service classes, e.g. interactive vs. batch.

That said, I’m happy to scope this PR down if you still prefer to keep the initial API surface smaller. Would you prefer that I keep nodeAffinity and priorityClassName in this CL with the above use cases, or should I remove them and follow up separately after nodeSelector/tolerations land?

Thanks again for taking a look.

I’ve removed the resources field from this CL, btw.

Julian Gutierrez Oschmann (juli4n) · 2026-06-16T15:12:56Z

+				WithPath(ateompath.BasePath).
+				WithType(corev1.HostPathDirectoryOrCreate)))
+
+	applyWorkerPoolPodTemplate(podSpecAC, containerAC, wp.Spec.Template)


While I understand that with the introduction of the template now creating the DeploymentApplyConfiguration is more complex / dynamic (it cannot be represented with a single struct literal), I find the approach of creating the static part here and then calling a method (in another file) to add the dynamic part a bit convoluted.

How about we move buildDeploymentApplyConfig to a new file (workerpool_apply.go?) and implement the whole generation of *appsv1ac.DeploymentApplyConfiguration in that method? Then we can add some table driven tests that exercise the buildDeploymentApplyConfig method e2e that take WorkerPool as input and assert expected *appsv1ac.DeploymentApplyConfiguration as output.

Julian Gutierrez Oschmann (juli4n) reviewed Jun 15, 2026

View reviewed changes

han2ni3bal-pixel force-pushed the feat-workerpool-scheduling-fields branch 2 times, most recently from ef4ff52 to bb06b0c Compare June 16, 2026 02:45

feat: add workerpool pod template fields

674aca3

han2ni3bal-pixel force-pushed the feat-workerpool-scheduling-fields branch from bb06b0c to 674aca3 Compare June 16, 2026 03:01

Julian Gutierrez Oschmann (juli4n) reviewed Jun 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add WorkerPool scheduling fields#247

Add WorkerPool scheduling fields#247
han2ni3bal-pixel wants to merge 1 commit into
agent-substrate:mainfrom
han2ni3bal-pixel:feat-workerpool-scheduling-fields

han2ni3bal-pixel commented Jun 14, 2026

Uh oh!

google-cla Bot commented Jun 14, 2026

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jun 15, 2026

Uh oh!

han2ni3bal-pixel Jun 16, 2026

Uh oh!

han2ni3bal-pixel Jun 16, 2026

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

han2ni3bal-pixel commented Jun 14, 2026

Summary

Scope

Related issue

Uh oh!

google-cla Bot commented Jun 14, 2026

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

han2ni3bal-pixel Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

han2ni3bal-pixel Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Julian Gutierrez Oschmann (juli4n) Jun 16, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants