perf(runtime): size Tokio worker pool from Lambda memory tier by duncanista · Pull Request #1277 · DataDog/datadog-lambda-extension

duncanista · 2026-06-24T03:14:29Z

DRAFT — stacked on #1271 (jordan.gonzalez/cold-start-instrumentation/feature). Review/merge #1271 first; this PR's diff is against that branch, not main.

Jira: none yet — add before marking ready.

Overview

Cold-start hypothesis H15: right-size the Tokio runtime.

Today the extension uses #[tokio::main], which sizes the worker pool from std::thread::available_parallelism(). In a Lambda sandbox, that value reflects the host's core count, not the fraction of a vCPU the function is actually granted, so low-memory functions can spin up more worker threads than they have CPU for, paying for extra thread stacks and scheduler overhead during init.
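To see the value the default runtime would size itself from, a small probe along these lines can be run inside the sandbox. This is a minimal sketch, not the H0 log from #1271: the helper name reported_parallelism is hypothetical, and falling back to 1 when the platform cannot report a value is an assumption made for the example.

```rust
use std::thread;

/// Returns the parallelism the standard library reports for this process.
/// Hypothetical helper; the fallback of 1 is an assumption for this sketch,
/// used only when the platform cannot report a value.
fn reported_parallelism() -> usize {
    thread::available_parallelism().map_or(1, |n| n.get())
}
```

Comparing this number with the configured memory tier shows how far the default pool size can drift from the CPU share a function actually receives.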

This PR replaces #[tokio::main] with an explicit multi-thread runtime whose worker count is derived from the Lambda memory tier:

AWS allocates roughly 1 vCPU per 1769 MB of configured memory.
workers = round(mem_mb / 1769), computed with integer math ((mem_mb + 884) / 1769, since 884 = 1769 / 2) to avoid clippy::pedantic float-cast lints, then clamped to 1..=4.
The memory value is read from AWS_LAMBDA_FUNCTION_MEMORY_SIZE and parsed as u32; if the variable is missing or unparseable, we fall back to 2 workers (no .unwrap()).
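The three rules above can be sketched as follows. This is an illustration under stated assumptions rather than the PR's actual code: the function and constant names are hypothetical, the saturating add is an extra guard not mentioned in the description, and the trim() reflects the whitespace-padded case listed under Testing.

```rust
/// Approximate MB of configured memory per vCPU, as stated in the description.
const MB_PER_VCPU: u32 = 1769;
/// Worker count used when the memory size is missing or unparseable.
const FALLBACK_WORKERS: usize = 2;

/// round(mem_mb / 1769) in integer math, clamped to 1..=4.
/// Adding half the divisor (1769 / 2 = 884) before dividing rounds to nearest.
/// The saturating add is an assumption of this sketch to rule out overflow.
fn workers_for_memory(mem_mb: u32) -> usize {
    let rounded = mem_mb.saturating_add(MB_PER_VCPU / 2) / MB_PER_VCPU;
    // The clamped value always fits in usize; the fallback only keeps the
    // conversion free of unwrap() and `as` casts.
    usize::try_from(rounded.clamp(1, 4)).unwrap_or(FALLBACK_WORKERS)
}

/// Maps the raw environment value to a worker count, falling back to 2
/// when the value is absent or does not parse as u32.
fn workers_from_env_value(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<u32>().ok())
        .map_or(FALLBACK_WORKERS, workers_for_memory)
}

/// Reads AWS_LAMBDA_FUNCTION_MEMORY_SIZE and derives the worker count.
fn worker_count() -> usize {
    let raw = std::env::var("AWS_LAMBDA_FUNCTION_MEMORY_SIZE").ok();
    workers_from_env_value(raw.as_deref())
}
```

Separating the pure mapping (workers_from_env_value) from the environment read keeps the heuristic testable without mutating process state; the result of worker_count() would then be handed to the runtime builder's worker-thread setting.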

Resulting mapping (a few points):

Memory	vCPU (approx)	Workers
≤ 1769 MB	≤ 1.0	1
~2654–3538 MB	~1.5–2.0	2
~5307 MB	~3.0	3
≥ 7076 MB (incl. 10240 max)	≥ 4.0	4 (clamped)
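The table lists sample points rather than exact boundaries. Assuming the integer formula quoted above and Lambda's 128 to 10240 MB configuration range, the sketch below scans every memory size to find where each worker count first appears. The function names are hypothetical.

```rust
/// Integer form of round(mem_mb / 1769), clamped to 1..=4,
/// restated here so the sketch is self-contained.
fn rounded_workers(mem_mb: u32) -> u32 {
    ((mem_mb + 884) / 1769).clamp(1, 4)
}

/// Returns (workers, first_mem_mb) pairs: the smallest configured memory
/// at which each worker count is first reached, scanning 128..=10240 MB.
fn tier_starts() -> Vec<(u32, u32)> {
    let mut starts = Vec::new();
    let mut previous = 0;
    for mem_mb in 128..=10_240 {
        let workers = rounded_workers(mem_mb);
        if workers != previous {
            starts.push((workers, mem_mb));
            previous = workers;
        }
    }
    starts
}
```

Under these assumptions the steps fall at 2654 MB (2 workers), 4423 MB (3 workers) and 6192 MB (4 workers), so a 2048 MB function, for instance, would still get a single worker.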

.enable_all() is preserved and the rt-multi-thread Tokio feature is unchanged. The init body is moved verbatim from async fn main into a new async fn run() that the runtime drives via block_on(run()); no other behavior changes, and all H0 cold-start instrumentation (the available_parallelism debug log, log_init_checkpoint helper and its calls) is preserved.

This pairs with the H0 available_parallelism log added in #1271: that log makes the value the runtime would have used observable per tier, and this change makes the value it does use deliberate. Net cold-start and steady-state effect of the chosen worker count still needs benchmarking across memory tiers before this is considered a confirmed win.

Testing

cargo fmt and cargo clippy --bin bottlecap --no-deps are clean (only the pre-existing buf_redux/multipart future-incompatibility note remains; pedantic + unwrap_used are denied crate-wide and pass).
The heuristic was unit-checked against representative inputs — 128 / 512 / 884 / 1769 → 1, 2654 / 3008 / 3538 → 2, 5307 → 3, 7076 / 10240 → 4, 0 → 1, and missing/empty/non-numeric/whitespace-padded inputs → fallback/trimmed as expected.
Worker-count performance impact across memory tiers to be measured (see note above).

Replace #[tokio::main] with an explicit multi-thread runtime whose worker count is derived from AWS_LAMBDA_FUNCTION_MEMORY_SIZE. AWS grants ~1 vCPU per 1769 MB, so workers = round(mem_mb / 1769) clamped to 1..=4 (integer math, no float casts; defaults to 2 when the env var is missing or unparseable). The init body moves verbatim into run(); all H0 cold-start instrumentation is preserved.

datadog-official · 2026-06-24T03:28:29Z

⚠️ Warnings

🚦 7 Pipeline jobs failed

DataDog/datadog-lambda-extension | integration-suite: [on-demand]

DataDog/datadog-lambda-extension | e2e-test-status (amd64)

DataDog/datadog-lambda-extension | e2e-test-status (amd64, fips)

Commit SHA: a75e3cb

duncanista mentioned this pull request Jun 24, 2026

Cold-start improvements — integration (combined testing) #1284

Draft

12 tasks

duncanista wants to merge 1 commit into jordan.gonzalez/cold-start-instrumentation/feature from jordan.gonzalez/tokio-runtime/feature
