Skip to content

fix(adagents): follow safe well-known redirects#967

Merged
bokelley merged 1 commit into
mainfrom
fix/adagents-well-known-redirects
Jul 1, 2026
Merged

fix(adagents): follow safe well-known redirects#967
bokelley merged 1 commit into
mainfrom
fix/adagents-well-known-redirects

Conversation

@bokelley

@bokelley bokelley commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #947.

This updates the initial /.well-known/adagents.json discovery fetch to follow only safe HTTP redirects:

  • HTTPS-only Location targets
  • same registrable domain, anchored to the originally requested publisher domain
  • Public Suffix List private-section behavior through the existing same_registrable_domain helper
  • IDNA/ACE normalization before eTLD+1 comparison
  • 3-hop cap and circular redirect detection
  • fresh SDK-owned pinned clients after the first hop

authoritative_location dereferences remain no-follow; a 30x response from that exact named URL is still surfaced as an error.

Validation

  • PYTHONPATH=src pytest tests/test_adagents.py -q
  • PYTHONPATH=src pytest tests/test_adagents.py::TestFetchAdagents tests/test_adagents.py::TestSSRFProtection tests/test_etld.py -q
  • uv run --python 3.11 --extra dev ruff check src/adcp/adagents.py tests/test_adagents.py
  • pre-commit hooks on commit: black, ruff, mypy, adopter type-check fixtures, bandit, whitespace, YAML/JSON, large files, merge/case conflict, private key checks

Expert review

  • Security review: no blockers; specifically checked SSRF/DNS rebinding, private PSL/eTLD anchoring, IDNA normalization, and authoritative_location no-follow behavior.
  • Code review: no blockers; suggested extra relative-Location and circular redirect coverage, both added.

@aao-ipr-bot aao-ipr-bot Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving. Redirect-following is re-implemented above the follow_redirects=False primitive, so every hop re-clears the same SSRF gate the old no-follow policy relied on — the boundary moved, the invariant didn't.

Things I checked

  • SSRF holds on every followed hop. _fetch_well_known_adagents_url sets current_client = None after the first response, so hops ≥1 build a fresh _owned_pinned_client (IP-pinned, trust_env=False, adagents.py:272-306) and _resolve_well_known_redirect_url re-runs _check_safe_host + _dns_validate_host before any bytes. The caller-injected (possibly unpinned) client only ever touches the original publisher URL — no regression.
  • eTLD+1 anchor is correct against PSL private abuse. Both sides ACE-normalized via _idna_ascii_host(uts46=True) before same_registrable_domain (etld.py:113), which sets include_psl_private_domains=True — so victim.github.ioattacker.github.io is rejected. original_hostname is held constant across hops (anchored to origin, not chained). IDN test (bücher.comxn--bcher-kva.com) passes.
  • authoritative_location stays strictly no-follow. _resolve_direct routes is_redirect=True to _fetch_adagents_url (no-follow); a 30x on the named authoritative URL falls to status_code != 200 and raises HTTP {status}. Confirmed by test_authoritative_location_http_redirect_is_refused (test_adagents.py:896, match=\"HTTP 302\").
  • Hop math. range(MAX_WELL_KNOWN_REDIRECT_HOPS + 1) = 4 iterations, raises at hop >= 3 → at most 3 redirects (4 fetches). Circular detection keys on the resolved absolute URL per iteration. Cap test and circular test both trace correctly.
  • Non-redirect path equivalence. The _fetch_adagents_url_adagents_headers + _fetch_adagents_response + _parse_adagents_response split carries every old inline branch (304/404/403-cf-challenge/non-200/JSON-object/authorized_agents-vs-authoritative_location) over verbatim. The validator reset (_adagents_headers(user_agent, None) + current_cache = None) on redirect hops correctly fail-closes via the existing "304 without a cache entry" guard.
  • Deps. idna>=3.7,<4 declared in pyproject.toml — the >=3.7 floor excludes CVE-2024-3651, which matters now that attacker-controlled redirect hostnames flow into idna.encode. Load-bearing; don't relax it.

Follow-ups (non-blocking — file as issues)

  • Spec is silent on well-known redirects. ad-tech-protocol-expert: sound-with-caveats — AdCP doesn't normatively say the well-known fetch is no-redirect, nor mandate apex→www normalization, so this is a defensible unilateral SDK policy rather than a spec requirement. Worth an upstream clarification so a publisher whose apex 301s to www resolves the same on a strict implementation (e.g. the JS SDK) as it does here.
  • Independent visited_urls sets. security-reviewer (Low): the well-known loop's cycle set is separate from _resolve_direct's. Not exploitable — both chains are independently hop-capped (3 and MAX_REDIRECT_DEPTH=5) — but one shared set would tighten posture.

Minor nits (non-blocking)

  1. Bounded non-repeating bounce. The circular check is exact-string on the urljoin-normalized URL, so a server can still bounce /a/b/a?x=1 without repeating. Fully bounded by the hop cap, so it's cosmetic — noting for completeness.

security-reviewer: no blockers. code-reviewer: no Blocker/Major. ad-tech-protocol-expert: sound-with-caveats. LGTM. Follow-ups noted below.

@bokelley bokelley merged commit fbb91c1 into main Jul 1, 2026
26 checks passed
@bokelley bokelley deleted the fix/adagents-well-known-redirects branch July 1, 2026 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant