fix(export): surface bulk partial failures via X-Export-Warnings header#73
Conversation
Accumulate structured ExportFailure entries in BulkExportResult, return 422 EXPORT_ALL_FAILED when every candidate fails, and attach X-Export-Warnings on partial 200 ZIP responses without changing the stable ZIP body contract.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
📝 WalkthroughWalkthroughThis PR captures per-session export failures as structured ExportFailure entries, exposes them via a bounded X-Export-Warnings header and export-warnings.json inside ZIPs for partial success, and returns ErrorCode.EXPORT_ALL_FAILED (HTTP 422) with failures when every attempted session fails; tests and docs updated accordingly. ChangesStructured bulk-export failure reporting
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/export_api.py`:
- Around line 180-182: The current code writes the entire failure_payload into
resp.headers["X-Export-Warnings"] which can exceed header size limits; change
this to a bounded header strategy: when result.failures is non-empty, compute a
compact summary (e.g., total_failures count, first N failure summaries or IDs)
and set resp.headers["X-Export-Warnings"] to a JSON object containing the count,
truncated=true flag, and the sampled entries instead of the full
failure_payload; put the full failure_payload into the response body/manifest
(or a downloadable URL) for clients that need details. Ensure you update the
logic that populates failure_payload and references to result.failures so the
header uses only the capped sample and includes truncation metadata.
In `@utils/export_engine.py`:
- Around line 289-294: The current ExportFailure uses raw exception text
(str(exc)) which can leak sensitive info; instead, map the exception to a
stable, sanitized message using the failure code / ErrorCode mapping and include
that in ExportFailure.message. Update the code around result.failures.append
(the ExportFailure constructor call where session_id=sid and
code=failure_code_for_exception(exc, phase=phase)) to call an existing or new
helper (e.g., failure_code_for_exception -> use its returned code with a lookup
like message_for_error_code(code) or a small function
sanitized_message_for_exception(exc, phase)) and pass that sanitized, stable
text as message rather than str(exc).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 266ab9dd-dd9b-4997-a7dd-a0c62cf82b3b
📒 Files selected for processing (5)
api/error_codes.pyapi/export_api.pydocs/api-reference.mdtests/test_export_api_bulk.pyutils/export_engine.py
|
@CodeRabbit , Can you reeviewe this PR fully? |
|
@CodeRabbit. check this PR? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== utils/export_engine.py ==="
cat utils/export_engine.py
echo ""
echo "=== api/export_api.py ==="
cat api/export_api.py
echo ""
echo "=== api/error_codes.py ==="
cat api/error_codes.py
echo ""
echo "=== docs/api-reference.md ==="
cat docs/api-reference.md
echo ""
echo "=== tests/test_export_api_bulk.py ==="
cat tests/test_export_api_bulk.pyLength of output: 44546 🏁 Script executed: #!/bin/bash
echo "=== conftest.py (root) ==="
fd -i "conftest.py" --exec cat {}
echo ""
echo "=== Check failure_count usage ==="
rg -n "failure_count" --type py
echo ""
echo "=== Check ExportFailure import usage ==="
rg -n "ExportFailure" --type py
echo ""
echo "=== Check _EXPORT_WARNINGS_HEADER_MAX constants ==="
rg -n "_EXPORT_WARNINGS_HEADER_MAX" --type py
echo ""
echo "=== Check str(item.code) / StrEnum behaviour ==="
rg -n "StrEnum" --type pyLength of output: 4396
PR
|
| # | Severity | File | Item |
|---|---|---|---|
| 1 | Medium | utils/export_engine.py |
Redundant isinstance(exc, json.JSONDecodeError) branch |
| 2 | Minor | utils/export_engine.py |
phase should use Literal["parse","export"] |
| 3 | Minor | utils/export_engine.py |
failure_count vs len(failures) dual bookkeeping |
| 4 | Minor | api/export_api.py |
Single-entry overflow not guarded in _export_warnings_header_payload |
| 5 | Minor | api/export_api.py |
_serialize_export_failures return type |
| 6 | Minor | tests/test_export_api_bulk.py |
No test for byte-level header truncation |
| 7 | Minor | tests/test_export_api_bulk.py |
No test for since="incremental" with partial failures |
| 8 | Minor | docs/api-reference.md |
EXPORT_ALL_FAILED failures shape wording ambiguous |
The core logic (failure accumulation, response routing, ZIP embedding, state bookmark exclusion) is correct and the acceptance criteria from issue #69 are all met. Items 1 and 4 are the most worth fixing before merge; the rest can reasonably be follow-up work.
Remove redundant JSONDecodeError branch, tighten phase typing, add last-resort header byte cap, and add truncation/incremental bookmark tests.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/test_export_api_bulk.py (1)
129-141: ⚡ Quick winTighten the assertion for entry-limit truncation.
The test name indicates it's specifically testing entry-limit truncation, and there's a separate test for byte-size overflow (lines 144-156). The assertion on line 141 should use
== 20instead of<= 20to precisely verify that the entry limit is applied correctly. Using<=weakens the test and could miss regressions in the truncation logic.✏️ Proposed fix
- assert len(payload["failures"]) <= 20 + assert len(payload["failures"]) == 20🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_export_api_bulk.py` around lines 129 - 141, The test test_export_warnings_header_payload_truncates_at_entry_limit should assert the exact entry-limit truncation; update the assertion that currently checks len(payload["failures"]) <= 20 to require len(payload["failures"]) == 20 so the test precisely verifies _export_warnings_header_payload enforces the 20-entry cap; modify the assertion in the test function test_export_warnings_header_payload_truncates_at_entry_limit accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@tests/test_export_api_bulk.py`:
- Around line 129-141: The test
test_export_warnings_header_payload_truncates_at_entry_limit should assert the
exact entry-limit truncation; update the assertion that currently checks
len(payload["failures"]) <= 20 to require len(payload["failures"]) == 20 so the
test precisely verifies _export_warnings_header_payload enforces the 20-entry
cap; modify the assertion in the test function
test_export_warnings_header_payload_truncates_at_entry_limit accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ec63f992-f2f4-478b-b00c-442aac2c9585
📒 Files selected for processing (4)
api/export_api.pydocs/api-reference.mdtests/test_export_api_bulk.pyutils/export_engine.py
✅ Files skipped from review due to trivial changes (1)
- docs/api-reference.md
🚧 Files skipped from review as they are similar to previous changes (2)
- utils/export_engine.py
- api/export_api.py
Extract ErrorCode to models/error_codes.py so export_engine no longer transitively imports Flask via api. Document export-phase exc handling.
|
@CodeRabbit , all comments are addressed? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: #!/bin/bash
echo "=== utils/export_engine.py ==="
cat utils/export_engine.py
echo ""
echo "=== api/export_api.py ==="
cat api/export_api.py
echo ""
echo "=== tests/test_export_api_bulk.py ==="
cat tests/test_export_api_bulk.py
echo ""
echo "=== docs/api-reference.md ==="
cat docs/api-reference.mdLength of output: 45857
The one remaining open item is |
closes #69
Summary
Surfaces per-session bulk export failures to HTTP clients instead of silently omitting failed sessions from the ZIP. Keeps the stable 200 + ZIP body contract from Week 1; partial failures are signaled via X-Export-Warnings.
Problem
POST /api/export caught per-session parse/export errors, logged them, and returned HTTP 200 with a ZIP missing failed sessions. BulkExportResult.failure_count was populated but never exposed to the client.
Changes
X-Export-Warnings shape:
[{"session_id": "session_def456", "code": "PARSE_ERROR", "message": "human-readable detail"}]
Export state bookmark (_write_state) only includes successfully exported session IDs — failed sessions are not advanced in incremental mode.
Acceptance criteria
Test plan
pytest tests/test_export_api_bulk.py -v
pytest -q
mypy -p api -p utils -p models
ruff check .
ruff format --check .
Manual smoke: POST /api/export against a corpus with one corrupt .jsonl → inspect X-Export-Warnings header.
Out of scope
Dependencies
Sprint item #2 · 5 pt · branch fix/export-partial-failure-warnings
Summary by CodeRabbit
New Features
Documentation
Tests