Server: restore monitored-item data/event queues on failover (#3939) #6
Merged
marcschier merged 1 commit on Jul 4, 2026
…dation#3939)

Add an async queue-restore path so a networked ISubscriptionStore can re-hydrate per-monitored-item data/event queues without blocking the synchronous MonitoredItem creation path, plus a continuous shared-store mirror so queued-but-unpublished values survive an HA failover.

- ISubscriptionStore: add Restore{DataChange,Event}MonitoredItemQueueAsync (sync hooks kept as fallback)
- IStoredMonitoredItem/StoredMonitoredItem: transient Restored* queue properties
- MasterNodeManager: pre-hydrate queues before MonitoredItem construction
- MonitoredItem.RestoreQueue: prefer pre-hydrated queue, fall back to sync
- Make core queue mutators virtual for mirroring subclasses
- StandardServer + hosted service: DI seams for ISubscriptionStore and IMonitoredItemQueueFactory
- Opc.Ua.Redundancy.Server: SharedKeyValueMonitoredItemQueueFactory + mirroring queues; wire into UseDistributedSubscriptionMirroring and SharedKeyValueSubscriptionStore async restore
- Tests (core, redundancy, AOT) and docs (HighAvailability, migration guide)
Implements OPCFoundation#3939 — the remaining failover work tracked from OPCFoundation#3918.
Stacked on OPCFoundation#3918 (base branch `nodestatestorage`); this PR adds only the OPCFoundation#3939 changes on top and should merge after / into OPCFoundation#3918.

## Problem
`SharedKeyValueSubscriptionStore` mirrors subscription definitions and retransmission state, but the per-monitored-item data/event queues were not restored: `RestoreDataChangeMonitoredItemQueue` / `RestoreEventMonitoredItemQueue` run on the synchronous monitored-item creation path, so a networked/async store cannot re-hydrate them. After a failover, values that were queued but not yet published on the failed replica were lost.

## Changes
### Async restore plumbing (`Opc.Ua.Server`)

- `ISubscriptionStore`: add `RestoreDataChangeMonitoredItemQueueAsync` / `RestoreEventMonitoredItemQueueAsync` (existing sync hooks kept as the fallback for local/durable stores).
- `IStoredMonitoredItem` / `StoredMonitoredItem`: transient (never-serialized) `RestoredDataChangeQueue` / `RestoredEventQueue` used to carry a pre-hydrated queue.
- `MasterNodeManager.RestoreMonitoredItemsAsync` pre-fetches each queue asynchronously and hands it to the still-synchronous `MonitoredItem` constructor, so a networked store never blocks the creation path.
- `MonitoredItem.RestoreQueue` prefers the pre-hydrated queue, else falls back to the synchronous store method.
- Core queue mutators are now `virtual` so a mirror can subclass them.
- `StandardServer` + `OpcUaServerHostedService`: DI seams for `ISubscriptionStore` and `IMonitoredItemQueueFactory`.

### Continuous mirror (`Opc.Ua.Redundancy.Server`)

- `SharedKeyValueMonitoredItemQueueFactory` + `Mirroring{DataChange,Event}MonitoredItemQueue`: snapshot queue contents on each mutation, coalesce, and persist via a non-blocking background drain (encrypted at rest via the configured `IRecordProtector`); on promotion the restore rebuilds a still-mirroring queue.
- `SharedKeyValueSubscriptionStore` delegates its async restore to the factory and cleans stale queue keys; `UseDistributedSubscriptionMirroring` registers the factory + store together.

### Tests & docs

- Tests cover core, redundancy, and AOT.
- `Docs/HighAvailability.md` (the two "queues not restored" notes) and `Docs/migrate/2.0.x/sessions-subscriptions.md` (new async `ISubscriptionStore` members).

Conventions honored: async TAP only (no sync-over-async), `System.Threading.Lock`, `ByteString` / `ArrayOf`, sealed + DI-injectable with direct-construct fallback, NativeAOT-safe, additive/non-breaking (new 2.0 surface), MIT headers, no regions.
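
For reviewers who want the shape of the pre-hydration flow at a glance, here is a minimal sketch of the "pre-fetch asynchronously, construct synchronously, fall back to the sync hook" pattern described under the async restore plumbing. Every type and member name in it (`IDemoQueueStore`, `DemoStoredItem`, `DemoMonitoredItem`, `DemoRestore`, `InMemoryDemoStore`) is a hypothetical stand-in, not the actual `Opc.Ua.Server` API, and the queue is simplified to `Queue<int>`. Treating a `null` async result as "try the sync hook" is also an assumption of this sketch.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-ins only; the real OPC UA .NET types and signatures differ.
public interface IDemoQueueStore
{
    // Synchronous hook: fine for local/durable stores, kept as the fallback.
    Queue<int>? RestoreQueue(uint itemId);

    // Asynchronous hook: a networked store can await its I/O here.
    ValueTask<Queue<int>?> RestoreQueueAsync(uint itemId, CancellationToken ct);
}

public sealed class DemoStoredItem
{
    public uint Id { get; init; }

    // Transient: carries a pre-hydrated queue and is never serialized.
    public Queue<int>? RestoredQueue { get; set; }
}

public sealed class DemoMonitoredItem
{
    public Queue<int> Queue { get; }

    // The constructor stays synchronous; it never awaits the store.
    public DemoMonitoredItem(DemoStoredItem stored, IDemoQueueStore store)
    {
        Queue = stored.RestoredQueue          // prefer the pre-hydrated queue
            ?? store.RestoreQueue(stored.Id)  // else fall back to the sync hook
            ?? new Queue<int>();              // else start with an empty queue
    }
}

public static class DemoRestore
{
    public static async Task<List<DemoMonitoredItem>> RestoreItemsAsync(
        IReadOnlyList<DemoStoredItem> storedItems,
        IDemoQueueStore store,
        CancellationToken ct = default)
    {
        var items = new List<DemoMonitoredItem>(storedItems.Count);
        foreach (DemoStoredItem stored in storedItems)
        {
            // Pre-fetch the queue before the synchronous constructor runs.
            stored.RestoredQueue =
                await store.RestoreQueueAsync(stored.Id, ct).ConfigureAwait(false);
            items.Add(new DemoMonitoredItem(stored, store));
        }
        return items;
    }
}

// In-memory store used only to exercise the sketch.
public sealed class InMemoryDemoStore : IDemoQueueStore
{
    public Dictionary<uint, Queue<int>> Data { get; } = new();
    public int SyncCalls { get; private set; }

    public Queue<int>? RestoreQueue(uint itemId)
    {
        SyncCalls++;
        return null; // this demo store has nothing to offer synchronously
    }

    public async ValueTask<Queue<int>?> RestoreQueueAsync(uint itemId, CancellationToken ct)
    {
        await Task.Yield(); // stands in for network I/O
        return Data.TryGetValue(itemId, out var queue) ? queue : null;
    }
}
```

With an `InMemoryDemoStore` that holds a queue for one item only, that item is built from the pre-hydrated queue and never touches the sync hook. An item with no stored queue falls through to `RestoreQueue` and then to an empty queue.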