Server: decouple held long-poll Publishes from the request-queue worker budget by marcschier · Pull Request #3950 · OPCFoundation/UA-.NETStandard

marcschier · 2026-07-03T13:36:26Z

Description

Decouples held long-poll Publish requests from the server's request-queue worker/thread budget (a follow-up to the scalability analysis added by #3941). A parked Publish — one waiting for the next subscription notification — no longer occupies one of MaxRequestThreadCount worker slots for the whole wait, so a small, fixed worker pool can hold many thousands of outstanding Publishes and MaxRequestThreadCount no longer has to scale with session count.

The problem

The OS thread is already released while a Publish is parked (SessionPublishQueue.PublishAsync returns a RunContinuationsAsynchronously TaskCompletionSource task). But ServerBase.RequestQueue.WorkerLoopAsync awaited each request to completion inline and held the active-worker counter for the entire long-poll. Each held Publish therefore occupied a worker-loop slot; ScheduleIncomingRequest spawns workers up to MaxRequestThreadCount and the queue raises the process ThreadPool ceiling to match. Consequence: to serve N sessions you had to size MaxRequestThreadCount ≥ N.

The change — park-or-complete, via an explicit park sink

IRequestParkSink (Opc.Ua.Types) + a nullable RequestLifetime.ParkSink carrier. One-shot, idempotent.
Opc.Ua.Core: internal RequestParkSink (TaskCompletionSource<bool>, cross-TFM) and a compat-safe internal IParkableIncomingRequest : IEndpointIncomingRequest. EndpointIncomingRequest creates a sink only for Publish requests and flows it onto the RequestLifetime. ServerBase.RequestQueue awaits Task.WhenAny(processing, ParkSink.ParkedTask) and releases the active-worker slot at the park point rather than at completion, observing the detached parked task fault-safely (ProcessRequestSafeAsync never throws and always faults the request to the client, exactly as the legacy path did).
Opc.Ua.Server: StandardServer.PublishAsync flows the sink; SubscriptionManager.PublishAsync and SessionPublishQueue.PublishAsync take an IRequestParkSink parameter (used throughout — the pre-existing 2.0-only signatures are replaced, not duplicated); NotifyParked() is raised at the single park site (queuing the incomplete Tcs.Task).
Escape hatch: ServerConfiguration.DecoupleHeldPublishRequests (default true) restores the legacy inline-await worker path when set to false.

Response delivery, publish ordering, timeout and cancellation are unchanged — the response still flows through the request's value-task source, and the per-request timeout/cancel still fault the TaskCompletionSource.

Performance — no fast-path regression by construction

Only Publish requests carry a sink and take the park-or-complete path. Every other request (Read/Write/Browse/Call/…) keeps the legacy inline path with no extra per-request allocation and no Task.WhenAny — it is byte-for-byte the prior behavior. Publish adds a single WhenAny + one small TaskCompletionSource, which is negligible against a multi-second park and is dwarfed by the worker slot it releases. Worker-pool footprint stays flat as the number of held Publishes grows (see the 200-parked-on-2-workers test).

Tests

Opc.Ua.Core.Tests/RequestQueueTests: a parked request releases its worker so a single worker services the next request while the first is still parked; with DecoupleHeldPublishRequests = false the single worker stays blocked until completion; 200 parked requests are served by a 2-worker pool.
Opc.Ua.Server.Tests/SessionPublishQueueTests: NotifyParked is raised exactly once when a Publish parks and never when it returns/faults immediately.
Regression (net10.0): Opc.Ua.Core.Tests 3592 passed, Opc.Ua.Server.Tests 1904 passed, Opc.Ua.Subscriptions.Tests 505 passed (real Publish long-polls, failover, transfer, durable). Opc.Ua.Server/Opc.Ua.Core build clean on net48 and net10.0.

Docs

Docs/ServerScalability.md rewritten as user-facing documentation covering the server's scalability characteristics, admission controls, the held-Publish decoupling, and DecoupleHeldPublishRequests / MaxRequestThreadCount sizing guidance.

Related Issues

Follow-up to the scalability analysis added in #3941 (Docs/ServerScalability.md).

Checklist

I have signed the CLA and read the CONTRIBUTING doc.
I have added tests that prove my fix is effective or that my feature works and increased code coverage.
I have added all necessary documentation.
I have verified that my changes do not introduce (new) build or analyzer warnings.
I ran all tests locally using the UA.slnx solution against at least .net framework and .net 10, and all passed.
I fixed all failing and flaky tests in the CI pipelines and all CodeQL warnings.
I have addressed all PR feedback received.

Draft — do not merge until reviewed.

…r budget A parked Publish (waiting for notifications) now releases its processing worker at the park point instead of occupying one of MaxRequestThreadCount worker slots for the whole wait, so the worker/thread budget no longer scales with session count. - Opc.Ua.Types: new IRequestParkSink; RequestLifetime.ParkSink carrier. - Opc.Ua.Core: internal RequestParkSink + IParkableIncomingRequest; EndpointIncomingRequest creates the sink and flows it onto the RequestLifetime; ServerBase.RequestQueue awaits park-or-complete (Task.WhenAny) and releases the active-worker slot at park, observing the detached parked task fault-safely. New ServerConfiguration.DecoupleHeldPublishRequests (default true) escape hatch, plumbed via InitializeRequestQueue. - Opc.Ua.Server: StandardServer.PublishAsync flows the sink; SubscriptionManager.PublishAsync and SessionPublishQueue.PublishAsync gain IRequestParkSink? overloads; NotifyParked() is raised at the single park site (queuing the incomplete Tcs.Task). Response delivery, ordering, timeout and cancellation are unchanged. net10 + net48 clean.

- Opc.Ua.Core.Tests RequestQueueTests: a parked request releases its worker so a single worker services the next request while the first is still parked; the DecoupleHeldPublishRequests=false path keeps the worker blocked; 200 parked requests are served by a 2-worker pool. - Opc.Ua.Server.Tests SessionPublishQueueTests: NotifyParked is raised exactly once when a Publish parks and never when it returns/faults immediately.

…h hot path free Only requests that can actually park (Publish long-polls) allocate a park sink and take the park-or-complete worker path; every other request (Read/Write/Browse/...) keeps the legacy inline path with no extra per-request allocation or Task.WhenAny. This makes the dominant non-Publish request path byte-for-byte equivalent to before the change, so there is no fast-path regression, while held Publishes still release their worker at the park point. ParkSink is now nullable (null = cannot park).

…lability.md

CLAassistant · 2026-07-03T13:36:33Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ marcschier
❌ agent

agent seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

codecov · 2026-07-03T14:08:03Z

Codecov Report

❌ Patch coverage is 75.00000% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.49%. Comparing base (26ef554) to head (dc8ac3d).

Files with missing lines	Patch %	Lines
...pc.Ua.Core/Stack/Server/ServerBase.RequestQueue.cs	65.51%	10 Missing ⚠️
Stack/Opc.Ua.Core/Stack/Server/ServerBase.cs	71.42%	2 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3950      +/-   ##
==========================================
- Coverage   73.52%   73.49%   -0.03%     
==========================================
  Files        1170     1171       +1     
  Lines      170163   170202      +39     
  Branches    29361    29366       +5     
==========================================
- Hits       125117   125096      -21     
- Misses      34043    34101      +58     
- Partials    11003    11005       +2

Files with missing lines	Coverage Δ
Libraries/Opc.Ua.Server/Server/StandardServer.cs	`75.64% <100.00%> (+0.01%)`	⬆️
.../Opc.Ua.Server/Subscription/SessionPublishQueue.cs	`85.15% <100.00%> (+0.05%)`	⬆️
.../Opc.Ua.Server/Subscription/SubscriptionManager.cs	`82.02% <100.00%> (-0.68%)`	⬇️
...ack/Server/EndpointBase.EndpointIncomingRequest.cs	`81.25% <100.00%> (+1.25%)`	⬆️
Stack/Opc.Ua.Core/Stack/Server/RequestParkSink.cs	`100.00% <100.00%> (ø)`
Stack/Opc.Ua.Types/BuiltIn/RequestLifetime.cs	`90.32% <ø> (ø)`
Stack/Opc.Ua.Core/Stack/Server/ServerBase.cs	`63.22% <71.42%> (+0.03%)`	⬆️
...pc.Ua.Core/Stack/Server/ServerBase.RequestQueue.cs	`83.49% <65.51%> (-3.10%)`	⬇️

... and 15 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

- ServerScalability.md: rewrite as user-facing documentation — drop tracking IDs (B1-B5/S1-S4) and status/temporal wording (delivered/now/roadmap/options), and describe the server's scalability characteristics, admission controls, and the held-Publish decoupling as current functionality. - Remove the delegating PublishAsync compat overloads (2.0-only API, no migration needed) in favor of the IRequestParkSink signature throughout: ISubscriptionManager, SubscriptionManager, SessionPublishQueue now expose only the park-sink method; test/benchmark callers updated to pass the sink (null when none).

…anch Integrates PR #3950 "Server: decouple held long-poll Publishes from the request-queue worker budget" (the S1 scalability item) into this branch. Conflict resolution: Docs/ServerScalability.md - took #3950's concise rewrite (its "Admission control and rate limiting" section already summarizes the rate-limiting + retry-after work, and it references RateLimiting.md for the full surface) and added a See-also link to RetryAfterSignaling.md. All code files auto-merged cleanly (StandardServer.cs regions are disjoint).

…y numbers With DecoupleHeldPublishRequests default-on (S1), a held Publish releases its request-processing worker at the park point, so the worker pool no longer has to scale with the session count. An A/B load-test campaign (ServerManySessionsLoadTest, Xeon W-2235 6c/12t, quiet machine) shows a ~200-worker pool cleanly establishes and serves ~4000 sessions (8-17x the ~140-350 the coupled-off path manages at the same budget), while a session-count-sized pool (10500) is ~2x slower to establish and reaches a LOWER ceiling (thread oversubscription during the RSA-handshake burst). - LoadTest fixture: MaxRequestThreadCount 10500->200, MinRequestThreadCount 200->50, with the rationale comment rewritten for the S1 small-pool model. - Benchmarks.md: refresh the server-session scalability table (2000/2500/4000 now establish cleanly; 10000 remains the establishment wall) and the MaxRequestThreadCount guidance (size to active concurrency, not session count). ServerScalability.md prose was rewritten upstream (8f7a95d); measured numbers now live in Benchmarks.md, which that doc references.

marcschier · 2026-07-04T04:29:17Z

Merged into #3946. The held long-poll Publish decoupling (S1) commits from this branch — including the latest dc8ac3dc3 (right-size load-test worker pool / refresh benchmark numbers) — are now part of #3946, which combines the admission-control / rate-limiting, retry-after signaling, and held-Publish decoupling work into one server-session-scalability PR. Closing as its changes are carried by #3946.

agent added 4 commits July 3, 2026 15:31

Docs: mark S1 (held-Publish worker decoupling) delivered in ServerSca…

a397ecf

…lability.md

marcschier commented Jul 3, 2026

View reviewed changes

marcschier changed the title ~~Server: decouple held long-poll Publishes from the request-queue worker budget (S1)~~ Server: decouple held long-poll Publishes from the request-queue worker budget Jul 3, 2026

marcschier mentioned this pull request Jul 4, 2026

Server session scalability: admission control, retry-after signaling, and held-Publish decoupling #3946

Open

7 tasks

marcschier closed this Jul 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Server: decouple held long-poll Publishes from the request-queue worker budget#3950

Server: decouple held long-poll Publishes from the request-queue worker budget#3950
marcschier wants to merge 6 commits into
masterfrom
copilot/decouple-held-publishes

marcschier commented Jul 3, 2026 •

edited

Loading

Uh oh!

CLAassistant commented Jul 3, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Jul 3, 2026 •

edited

Loading

Uh oh!

marcschier commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

marcschier commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

The problem

The change — park-or-complete, via an explicit park sink

Performance — no fast-path regression by construction

Tests

Docs

Related Issues

Checklist

Uh oh!

CLAassistant commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented Jul 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

marcschier commented Jul 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

marcschier commented Jul 3, 2026 •

edited

Loading

CLAassistant commented Jul 3, 2026 •

edited

Loading

codecov Bot commented Jul 3, 2026 •

edited

Loading