feat(spider-scheduler): Add gRPC binary. by sitaowang1998 · Pull Request #369 · y-scope/spider

sitaowang1998 · 2026-07-01T03:13:08Z

Description

This PR:

Adds scheduler requests unpack.
Adds scheduler gRPC service handlers and error mapping to gRPC status.
Adds scheduler gRPC server binary.

Note

This PR does not add new tests.

Checklist

The PR satisfies the contribution guidelines.
This is a breaking change and that has been indicated in the PR title, OR this isn't a
breaking change.
Necessary docs have been updated, OR no docs need to be updated.

Validation performed

GitHub workflows pass.

Summary by CodeRabbit

New Features
- Added a scheduler gRPC server command-line entrypoint that starts from a YAML config file.
- Added gRPC support for next task, heartbeat, and shutdown requests.
- Expanded scheduler configuration to include storage endpoint and pool sizing.
Bug Fixes
- Improved task assignment protobuf-to-core conversion and request handling.
- Enhanced scheduler service error mapping and shutdown/cancellation behavior.
Tests
- Added a unit test to verify assignment record conversions.

# Conflicts: # components/spider-scheduler/src/error.rs # components/spider-scheduler/src/lib.rs # components/spider-scheduler/src/service.rs

coderabbitai · 2026-07-01T03:13:18Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 7fa84b49-cbc7-4273-b647-5900c6aac7b3

📥 Commits

Reviewing files that changed from the base of the PR and between 41f0a80 and 4f97796.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (1)

components/spider-scheduler/src/grpc.rs

🚧 Files skipped from review as they are similar to previous changes (1)

components/spider-scheduler/src/grpc.rs

Walkthrough

This PR adds protobuf-to-core conversion for task assignment records, scheduler request unpacking, a tonic gRPC adapter, server configuration types, and a new scheduler gRPC server binary with Cargo wiring.

Changes

Scheduler gRPC server

Layer / File(s)	Summary
Proto-to-core assignment conversion `components/spider-proto-rust/src/assignment.rs`	Adds `From<ProtoTaskAssignmentRecord>` for `TaskAssignmentRecord` and a unit test for the conversion.
Scheduler request unpacking `components/spider-proto-rust/src/unpack/mod.rs`, `components/spider-proto-rust/src/unpack/scheduler.rs`	Registers the scheduler unpack module and implements `RequestUnpack` for `NextTaskRequest`, `HeartbeatRequest`, and `ShutdownRequest`.
GrpcSchedulerService adapter `components/spider-scheduler/src/grpc.rs`, `components/spider-scheduler/src/lib.rs`	Adds `GrpcSchedulerService`, maps scheduler errors to tonic `Status`, converts `next_task` responses, and exports the new gRPC module.
Server configuration and derives `components/spider-scheduler/src/config.rs`, `components/spider-scheduler/src/runtime.rs`, `components/spider-scheduler/src/execution_manager_registry.rs`, `components/spider-scheduler/src/lib.rs`	Adds `ServerConfig`, updates related derives, and re-exports `ServerConfig` from the crate.
gRPC server binary and Cargo wiring `components/spider-scheduler/src/bin/grpc_server.rs`, `components/spider-scheduler/Cargo.toml`	Adds the `spider_scheduler_grpc_server` binary, config loading, storage connection, shutdown handling, and dependency/bin target updates.

Estimated code review effort: 3 (Moderate) | ~30 minutes

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant GrpcSchedulerService
  participant SchedulerServiceState
  participant RequestUnpack

  Client->>GrpcSchedulerService: next_task / heartbeat / shutdown
  GrpcSchedulerService->>RequestUnpack: unpack protobuf request
  RequestUnpack-->>GrpcSchedulerService: domain inputs
  GrpcSchedulerService->>SchedulerServiceState: next_task / heartbeat / shutdown
  SchedulerServiceState-->>GrpcSchedulerService: result or SchedulerServiceError
  GrpcSchedulerService-->>Client: protobuf response or tonic Status

sequenceDiagram
  participant Main
  participant ServerConfig
  participant GrpcSchedulerStorageClient
  participant GrpcSchedulerService
  participant TonicServer

  Main->>ServerConfig: load YAML config
  Main->>GrpcSchedulerStorageClient: connect(storage_endpoint)
  Main->>GrpcSchedulerService: new(state, cancellation_token)
  Main->>TonicServer: serve_with_shutdown(grpc_service)
  TonicServer-->>Main: shutdown on Ctrl-C or cancellation
  Main->>Main: runtime.stop().await

Possibly related PRs

y-scope/spider#342: Adds related protobuf-to-core task assignment handling used by the scheduler RPC flow.
y-scope/spider#365: Introduces scheduler service state and error types consumed by the new gRPC adapter.
y-scope/spider#368: Adds runtime wiring that the new scheduler server entrypoint uses.

Suggested reviewers: LinZhihao-723

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely captures the main change: adding a gRPC binary for spider-scheduler.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

sitaowang1998 · 2026-07-01T04:56:32Z

@CodeRabbit review

coderabbitai · 2026-07-01T04:57:16Z

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

🧹 Nitpick comments (4)

components/spider-proto-rust/src/unpack/scheduler.rs (1)
22-32: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Unbounded client-supplied wait_time_ms can hold a request open indefinitely.

Duration::from_millis(self.wait_time_ms) accepts any client-supplied u64 unmodified and is later passed straight to dispatch_source.dequeue(wait_time). A malicious or buggy client can request an effectively unbounded wait (e.g., u64::MAX ms), tying up server resources for that RPC until an assignment appears or the connection drops. Consider clamping wait_time_ms to a sane maximum before converting to Duration.
🛡️ Proposed clamp
+const MAX_WAIT_TIME_MS: u64 = 60_000;
+
 impl RequestUnpack for NextTaskRequest {
     type Unpacked = (ExecutionManagerId, Option<TaskAssignmentRecord>, Duration);

     fn unpack(self) -> Result<Self::Unpacked, UnpackError> {
         Ok((
             ExecutionManagerId::from(self.execution_manager_id),
             self.prev_assignment.map(ProtoTaskAssignmentRecord::into),
-            Duration::from_millis(self.wait_time_ms),
+            Duration::from_millis(self.wait_time_ms.min(MAX_WAIT_TIME_MS)),
         ))
     }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-proto-rust/src/unpack/scheduler.rs` around lines 22 - 32,
The NextTaskRequest::unpack path currently converts client-controlled
wait_time_ms directly into a Duration, allowing an effectively unbounded wait.
Update NextTaskRequest::unpack to clamp self.wait_time_ms to a sensible maximum
before calling Duration::from_millis, and keep the rest of the unpacked tuple
unchanged. Make the limit explicit near the Duration::from_millis conversion so
the behavior is easy to find and maintain.
components/spider-scheduler/src/grpc.rs (1)
161-212: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Minor inconsistency: TAG constant only used in next_task.

next_task defines a local const TAG while heartbeat/shutdown pass string literals directly to service_error_handler. Purely cosmetic, no functional issue.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-scheduler/src/grpc.rs` around lines 161 - 212, The local
TAG constant in next_task is inconsistent with heartbeat and shutdown, which
pass string literals directly to service_error_handler. Clean this up by either
reusing a shared tag symbol across next_task, heartbeat, and shutdown, or by
removing TAG and passing the same literal style everywhere so the error handling
calls are uniform in grpc.rs.
components/spider-scheduler/src/bin/grpc_server.rs (2)
44-59: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

No SIGTERM handling for graceful shutdown.

Shutdown is only wired to the cancellation token and Ctrl-C (SIGINT). In containerized/orchestrated deployments (e.g., Kubernetes, systemd), shutdown is typically signalled via SIGTERM, which this server would not currently handle gracefully, likely resulting in a hard kill instead of a clean drain/stop.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-scheduler/src/bin/grpc_server.rs` around lines 44 - 59, The
gRPC server shutdown path in serve_with_shutdown only reacts to the cancellation
token and tokio::signal::ctrl_c, so SIGTERM is not handled. Update the shutdown
future in grpc_server.rs to also listen for SIGTERM alongside Ctrl-C and
cancellation_token, and trigger cancellation when SIGTERM is received. Use the
existing serve_with_shutdown, cancellation_token, and select! shutdown block so
the server can drain cleanly in orchestrated environments.
61-64: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Stop error is silently swallowed when serve also fails.

stop_result is computed unconditionally, but if serve_result is Err, the ? on line 62 returns before line 63 is evaluated, so any error from runtime.stop() is dropped without ever being surfaced (not even logged).
Suggested fix
     let stop_result = runtime.stop().await;
-    serve_result?;
-    stop_result?;
+    if let Err(ref error) = stop_result {
+        tracing::error!(error = %error, "Failed to stop scheduler runtime cleanly.");
+    }
+    serve_result?;
+    stop_result?;
     Ok(())
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/spider-scheduler/src/bin/grpc_server.rs` around lines 61 - 64, The
grpc_server main flow is dropping a runtime.stop() failure whenever serve_result
is Err because the serve_result? return short-circuits before stop_result? is
reached. Update the grpc_server logic to evaluate and report both outcomes
explicitly in this block around runtime.stop(), serve_result, and stop_result,
so a stop error is not silently lost even when serving also fails.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@components/spider-proto-rust/src/unpack/scheduler.rs`:
- Around line 22-32: The NextTaskRequest::unpack path currently converts
client-controlled wait_time_ms directly into a Duration, allowing an effectively
unbounded wait. Update NextTaskRequest::unpack to clamp self.wait_time_ms to a
sensible maximum before calling Duration::from_millis, and keep the rest of the
unpacked tuple unchanged. Make the limit explicit near the Duration::from_millis
conversion so the behavior is easy to find and maintain.

In `@components/spider-scheduler/src/bin/grpc_server.rs`:
- Around line 44-59: The gRPC server shutdown path in serve_with_shutdown only
reacts to the cancellation token and tokio::signal::ctrl_c, so SIGTERM is not
handled. Update the shutdown future in grpc_server.rs to also listen for SIGTERM
alongside Ctrl-C and cancellation_token, and trigger cancellation when SIGTERM
is received. Use the existing serve_with_shutdown, cancellation_token, and
select! shutdown block so the server can drain cleanly in orchestrated
environments.
- Around line 61-64: The grpc_server main flow is dropping a runtime.stop()
failure whenever serve_result is Err because the serve_result? return
short-circuits before stop_result? is reached. Update the grpc_server logic to
evaluate and report both outcomes explicitly in this block around
runtime.stop(), serve_result, and stop_result, so a stop error is not silently
lost even when serving also fails.

In `@components/spider-scheduler/src/grpc.rs`:
- Around line 161-212: The local TAG constant in next_task is inconsistent with
heartbeat and shutdown, which pass string literals directly to
service_error_handler. Clean this up by either reusing a shared tag symbol
across next_task, heartbeat, and shutdown, or by removing TAG and passing the
same literal style everywhere so the error handling calls are uniform in
grpc.rs.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 2e12ec90-561a-4584-889b-6dfd122af107

📥 Commits

Reviewing files that changed from the base of the PR and between dca2b8a and 41f0a80.

⛔ Files ignored due to path filters (1)

Cargo.lock is excluded by !**/*.lock

📒 Files selected for processing (10)

components/spider-proto-rust/src/assignment.rs
components/spider-proto-rust/src/unpack/mod.rs
components/spider-proto-rust/src/unpack/scheduler.rs
components/spider-scheduler/Cargo.toml
components/spider-scheduler/src/bin/grpc_server.rs
components/spider-scheduler/src/config.rs
components/spider-scheduler/src/execution_manager_registry.rs
components/spider-scheduler/src/grpc.rs
components/spider-scheduler/src/lib.rs
components/spider-scheduler/src/runtime.rs

sitaowang1998 and others added 16 commits June 29, 2026 11:24

Add scheduler service

de929da

Update doc

182d09d

Add grpc server

a82298d

Fix docstring

bea09d5

Merge branch 'main' into scheduler-service

09c351b

Fix merge

3baaf8b

Merge branch 'scheduler-service' into scheduler-grpc-server

7c82fef

Implement scheduler runtime.

c7b2216

Merge branch 'scheduler-runtime' into scheduler-grpc-server

351b00f

Add scheduler grpc server

b3d3407

Fix toml

c876a54

Merge branch 'oss-main' into scheduler-runtime

04e3e1e

# Conflicts: # components/spider-scheduler/src/error.rs # components/spider-scheduler/src/lib.rs # components/spider-scheduler/src/service.rs

Merge branch 'scheduler-runtime' into scheduler-grpc-server

50a562f

Fix merge

bbf288d

Merge branch 'main' into scheduler-grpc-server

5edb5b8

Fix grpc status

41f0a80

sitaowang1998 requested a review from a team as a code owner July 1, 2026 03:13

coderabbitai Bot reviewed Jul 1, 2026

View reviewed changes

sitaowang1998 added 2 commits July 2, 2026 15:33

Merge branch 'main' into scheduler-grpc-server

fa97dd6

Fix logging

4f97796

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(spider-scheduler): Add gRPC binary.#369

feat(spider-scheduler): Add gRPC binary.#369
sitaowang1998 wants to merge 18 commits into
y-scope:mainfrom
sitaowang1998:scheduler-grpc-server

sitaowang1998 commented Jul 1, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

Uh oh!

sitaowang1998 commented Jul 1, 2026

Uh oh!

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

sitaowang1998 commented Jul 1, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Checklist

Validation performed

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Uh oh!

sitaowang1998 commented Jul 1, 2026

Uh oh!

coderabbitai Bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sitaowang1998 commented Jul 1, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading

coderabbitai Bot commented Jul 1, 2026 •

edited

Loading