[Testing] [ENH] Pre-publish CI: run aws-sam-cli durable integration tests against candidate emulator image #430

@dhegberg

Migration from aws/aws-durable-execution-sdk-python-testing#225


Background

The emulator image public.ecr.aws/durable-functions/aws-durable-execution-emulator:latest is consumed automatically by aws-sam-cli: on every sam local invoke of a durable function, sam-cli pulls :latest and refreshes the local cache (see durable_functions_emulator_container.py). Customers can override the tag per invoke with DURABLE_EXECUTIONS_EMULATOR_IMAGE_TAG, but the default is :latest. Any image we publish therefore ships immediately to every durable-functions customer running sam-cli, with no version pin in between.
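
To make the override-versus-default behaviour concrete, here is a minimal sketch of the tag resolution described above. It is not sam-cli's code: the function name resolve_emulator_image is hypothetical, and treating an empty override the same as an unset one is an assumption. Only the repository name, the environment variable, and the :latest default come from this issue.

```python
import os

# Repository and variable names are taken from the issue text.
EMULATOR_REPOSITORY = "public.ecr.aws/durable-functions/aws-durable-execution-emulator"
TAG_ENV_VAR = "DURABLE_EXECUTIONS_EMULATOR_IMAGE_TAG"
DEFAULT_TAG = "latest"


def resolve_emulator_image(env=None):
    """Return the image reference implied by the given environment.

    Hypothetical helper; it does not mirror sam-cli's implementation.
    """
    env = os.environ if env is None else env
    # Assumption: an unset or empty override falls back to the floating tag.
    tag = env.get(TAG_ENV_VAR) or DEFAULT_TAG
    return f"{EMULATOR_REPOSITORY}:{tag}"
```

With no override in the environment the function returns the :latest reference, which is why a newly published image reaches every default invocation.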

PR #216 in this repo recently demonstrated the blast radius: ~26 sam-cli durable integration tests went red across the local-invoke, local-start-lambda, tier1-finch, and tier1-windows-other jobs (e.g. aws-sam-cli Integration Tests #496, run #8779 / local-start-lambda) the moment v1.2.0 went to :latest. Customer-visible symptom: a fresh samdev local invoke against any durable function returns a 500 on the first checkpoint or a 404 on the first local execution get|history|stop|callback call. Mitigations are in flight on the sam-cli side (aws/aws-sam-cli#9038 merged, #9040 open), but they do not address the underlying class of problem: this repo's release pipeline has no signal from sam-cli before publishing :latest.

Why our existing tests didn't catch this

tests/web/e2e/routes_arn_encoding_int_test.py (added in #222) drives a real boto client against this repo's WebServer and would have caught the emulator-side routing bug. It does not — and cannot — exercise sam-cli's LocalLambdaHttpService, which is a separate Flask service that customers' boto clients actually hit when using samdev local invoke. Anything we change in the ARN, callback ID, or function-qualifier shape can break sam-cli's service without touching ours.

Proposal

Add a pre-publish CI step that builds the candidate emulator image and runs sam-cli's durable integration suite against it. Concrete shape:

  1. Build the emulator image from this repo (we already do this in ecr-release.yml).
  2. Tag it locally with a candidate tag, e.g. aws-durable-execution-emulator:pr-${SHA}.
  3. Check out aws/aws-sam-cli at develop, install in SAM_CLI_DEV=1 mode.
  4. Run, with DURABLE_EXECUTIONS_EMULATOR_IMAGE_TAG=pr-${SHA}:
    pytest -vv \
      tests/integration/local/invoke/test_invoke_durable.py \
      tests/integration/local/start_lambda/test_start_lambda_durable.py \
      tests/integration/local/execution/test_execution.py \
      tests/integration/local/callback/test_callback.py
    That's the durable subset: about 50 tests, expected to run in roughly 3–5 min in CI, judging by the local-invoke and local-start-lambda timings above.
  5. Publish to ECR only if step 4 is green.
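
Steps 4 and 5 above can be sketched as a small gate script. This is an illustration under stated assumptions, not the workflow itself: candidate_tag, pytest_invocation, and run_gate are hypothetical names, publish stands in for whatever ecr-release.yml does today, and the sketch assumes the sam-cli checkout and the image build from steps 1–3 have already happened.

```python
import os
import subprocess

# The four durable test files listed in step 4.
DURABLE_TEST_FILES = [
    "tests/integration/local/invoke/test_invoke_durable.py",
    "tests/integration/local/start_lambda/test_start_lambda_durable.py",
    "tests/integration/local/execution/test_execution.py",
    "tests/integration/local/callback/test_callback.py",
]


def candidate_tag(sha):
    """Candidate tag in the pr-${SHA} shape suggested in step 2."""
    return f"pr-{sha}"


def pytest_invocation(sha, base_env=None):
    """Return (argv, env) for the durable subset run in step 4."""
    env = dict(os.environ if base_env is None else base_env)
    env["DURABLE_EXECUTIONS_EMULATOR_IMAGE_TAG"] = candidate_tag(sha)
    return ["pytest", "-vv", *DURABLE_TEST_FILES], env


def run_gate(sha, sam_cli_dir, publish, runner=subprocess.run):
    """Run the subset in the sam-cli checkout; publish only on success (step 5).

    `publish` is a hypothetical callable; `runner` is injectable for testing.
    """
    argv, env = pytest_invocation(sha)
    result = runner(argv, cwd=sam_cli_dir, env=env)
    if result.returncode != 0:
        return False
    publish()
    return True
```

Keeping the pytest arguments and environment in one function means the PR check and the publish job can share the same definition of "the durable subset".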

Gate this on PRs that touch src/** so we get the signal pre-merge as well as pre-publish.
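
As a rough illustration of the src/** gate, the check below decides whether a list of changed paths should trigger the job. The helper touches_src is hypothetical; in a GitHub Actions workflow the same effect would more likely come from a paths filter on the pull_request trigger, and it is worth checking how required checks behave when such a filter skips the workflow.

```python
from pathlib import PurePosixPath


def touches_src(changed_files, root="src"):
    """Return True if any changed path lies under the given top-level directory.

    Hypothetical helper; paths are expected in repo-relative POSIX form.
    """
    for name in changed_files:
        parts = PurePosixPath(name).parts
        # Require at least one component below the root directory.
        if len(parts) > 1 and parts[0] == root:
            return True
    return False
```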

Acceptance criteria

  • A workflow (e.g. .github/workflows/sam-cli-compat.yml) that runs the four sam-cli durable test files against the locally-built emulator image and is required for PRs that change src/.
  • The publish job (ecr-release.yml) gated on the same workflow's success.
  • A README / CONTRIBUTING note explaining that any change affecting the emulator's HTTP contract — ARN shape, callback-token shape, route layout, response codes — must keep this job green.

Out of scope

  • Pinning sam-cli to a specific emulator tag. That just inverts the dependency: customers stop picking up emulator fixes until sam-cli ships a new release. Roll-forward + this CI gate is the durable answer.
  • Running the full sam-cli integration suite. The four files above cover every code path that talks to the emulator.
