Skip to content

test: add parse/export/search performance benchmarks and CI artifacts#76

Merged
wpak-ai merged 5 commits into
masterfrom
chore/parse-export-benchmarks
Jun 12, 2026
Merged

test: add parse/export/search performance benchmarks and CI artifacts#76
wpak-ai merged 5 commits into
masterfrom
chore/parse-export-benchmarks

Conversation

@clean6378-max-it

@clean6378-max-it clean6378-max-it commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Closes #74

Summary

  • Add pytest-benchmark harness under tests/benchmarks/ for parse, bulk export, and search paths
  • Generate synthetic corpora at session scope (10 / 500 / 5000+ line JSONL; 10 / 50 / 100 session export trees; 50-session search corpus)
  • Add tracemalloc memory ceiling test for large-file parse (peak < 10× on-disk file size)
  • Add non-gating benchmarks CI job that uploads benchmark-results.json as an artifact
  • Document local runs in benchmarks/README.md and link from CONTRIBUTING.md
  • Skip timing benchmarks in normal pytest via --benchmark-skip; memory test runs in main CI

Sprint items

Closes Chen W2 #7

Test plan

  • pytest tests/benchmarks/ --benchmark-only -o addopts= -v
  • pytest tests/benchmarks/test_parse_memory.py -v
  • pytest -q green locally
  • mypy -p api -p utils -p models green
  • ruff check . and ruff format --check . green
  • CI benchmarks job completes and benchmark-results artifact is downloadable

Summary by CodeRabbit

  • Tests

    • Added performance benchmarks for parsing, search, and bulk export across multiple dataset sizes.
    • Added a memory-usage test, fixtures to generate synthetic corpora, and a test client for search benchmarking.
  • Documentation

    • Added a benchmarking guide with local run instructions and baseline refresh notes.
    • Updated contributing docs with benchmark execution commands.
  • Chores

    • CI now runs benchmarks and uploads a benchmark-results artifact.
    • Added pytest-benchmark as a dev dependency and included a baseline snapshot.

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds pytest-benchmark benchmarks (parse, export, search), fixtures to synthesize JSONL corpora (small/medium/large and multi-session export/search corpora), a tracemalloc memory ceiling test, CI job uploading benchmark JSON, and documentation for local runs and baselines.

Changes

Performance Benchmark Suite

Layer / File(s) Summary
Dependencies and pytest configuration
requirements-dev.txt, pyproject.toml, benchmarks/baselines.json
Added pytest-benchmark==5.2.3, registered the benchmark pytest marker, updated addopts to skip benchmarks by default, and added an initial benchmarks/baselines.json.
Test fixture infrastructure
tests/benchmarks/conftest.py
Built synthetic JSONL corpus generation and fixtures: write_jsonl, seed_search_corpus, small/medium/large parse files, parametrized export_corpus, and bench_client_search_corpus Flask test client.
Parse benchmarks
tests/benchmarks/test_parse_bench.py, tests/benchmarks/test_parse_memory.py
Implemented three parse throughput benchmarks (small/medium/large) using pytest-benchmark and a tracemalloc memory ceiling test asserting peak < 10× on-disk file size.
Export and search benchmarks
tests/benchmarks/test_export_bench.py, tests/benchmarks/test_search_bench.py
Added parametrized bulk export benchmarks (10, 50, 100 sessions) measuring run_bulk_export and a full-corpus search benchmark via Flask test client GET /api/search.
CI integration and documentation
.github/workflows/ci.yml, CONTRIBUTING.md, benchmarks/README.md
Configured CI benchmarks job that runs pytest --benchmark-only, emits benchmark-results.json, uploads it as an artifact, added a contributing-guide test command, and added README with run and baseline instructions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • timon0305
  • wpak-ai

"🐰 Hop hop, I ran the tests with care,
JSONL forests everywhere,
Benchmarks hum, baselines wait,
Tracemalloc guards memory's gate,
Little rabbit claps — performance fair! 🥕"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 26.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: adding parse/export/search performance benchmarks and CI artifacts.
Linked Issues check ✅ Passed The PR fulfills all core objectives from issue #74: pytest-benchmark harness with parse/export/search tests, synthetic fixtures (small ~10, medium ~500, large ≥5000 lines), tracemalloc memory test, non-gating CI job with artifact upload, and documentation.
Out of Scope Changes check ✅ Passed All changes directly support benchmark infrastructure. Minor updates to pyproject.toml (ruff glob fix to tests/**/*.py) and requirements-dev.txt (pytest-benchmark dependency) are necessary supporting changes within scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch chore/parse-export-benchmarks

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@requirements-dev.txt`:
- Line 8: Update verification for the pytest-benchmark pin: confirm that the
project's benchmarks and test suite are compatible with pytest-benchmark 5.x by
running CI/tests with pytest-benchmark==5.2.3 (or latest) and reviewing 5.x
release notes for breaking changes; if incompatible or unsafe, pin
requirements-dev.txt to a specific compatible version (e.g.,
pytest-benchmark==4.0.0) or add an upper bound (e.g.,
pytest-benchmark>=4.0.0,<5.0.0) and document the decision in the PR, referencing
the package name pytest-benchmark when making the change.

In `@tests/benchmarks/conftest.py`:
- Line 26: This is a static-analysis false positive: keep using json.dumps when
writing to the file but annotate the call site (the f.write(json.dumps(entry,
separators=(",", ":")) + "\n") expression) with a short inline comment
explaining that jsonify is a Flask helper for HTTP responses and not for file
serialization, and add a suppression directive for the static analyzer (e.g., an
ast-grep/linters-ignore comment) on that line so the hint is not raised again.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f2f5f062-13bd-46db-a7f0-dd9ac150905b

📥 Commits

Reviewing files that changed from the base of the PR and between 8083f3e and 6283017.

📒 Files selected for processing (12)
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • benchmarks/README.md
  • benchmarks/baselines.json
  • pyproject.toml
  • requirements-dev.txt
  • tests/benchmarks/__init__.py
  • tests/benchmarks/conftest.py
  • tests/benchmarks/test_export_bench.py
  • tests/benchmarks/test_parse_bench.py
  • tests/benchmarks/test_parse_memory.py
  • tests/benchmarks/test_search_bench.py

Comment thread requirements-dev.txt Outdated
Comment thread tests/benchmarks/conftest.py Outdated

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/benchmarks/conftest.py (1)

26-30: 💤 Low value

Documentation is good; suppression format note.

The inline comment clearly documents why json.dumps is correct for file I/O (not Flask's HTTP jsonify), which addresses the past review feedback well. However, the linters-ignore directive on line 28 is not a recognized suppression format for ast-grep, so the [info] hint may still appear. Since the code is correct and well-documented, this is acceptable—the comment's documentation value is more important than suppressing the false-positive hint.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/conftest.py` around lines 26 - 30, The inline suppression
"linters-ignore: prefer-jsonify" is not a recognized ast-grep format and should
be removed; keep the explanatory comment and the json.dumps call that builds the
serialized variable, and if you need to silence the false positive instead add
the repository's supported suppression syntax (use the project's documented
ast-grep/flake8/pytest suppression keyword) next to the json.dumps line or in a
nearby comment so the linter recognizes it.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/benchmarks/conftest.py`:
- Around line 26-30: The inline suppression "linters-ignore: prefer-jsonify" is
not a recognized ast-grep format and should be removed; keep the explanatory
comment and the json.dumps call that builds the serialized variable, and if you
need to silence the false positive instead add the repository's supported
suppression syntax (use the project's documented ast-grep/flake8/pytest
suppression keyword) next to the json.dumps line or in a nearby comment so the
linter recognizes it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 575ac0e3-08ed-4c88-ab95-b08edf66daae

📥 Commits

Reviewing files that changed from the base of the PR and between 6283017 and e38b910.

📒 Files selected for processing (2)
  • requirements-dev.txt
  • tests/benchmarks/conftest.py

…tures

Pin pytest-benchmark to 5.2.3 after verifying compatibility with pytest 9.0 and the benchmark suite. Annotate benchmark JSONL serialization to document that json.dumps is intentional for file I/O, not Flask jsonify.
@clean6378-max-it clean6378-max-it force-pushed the chore/parse-export-benchmarks branch from e38b910 to 68ba23c Compare June 11, 2026 16:56

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/benchmarks/conftest.py (1)

16-31: ⚡ Quick win

Parse the template once outside the loop.

The json.loads(TEMPLATE_LINE) call on line 20 is executed for every iteration, re-parsing the same JSON string up to 5,000 times for the large fixture. Parsing once and using copy.deepcopy() in the loop would eliminate this overhead and improve fixture generation performance.

♻️ Proposed refactor
+import copy
 import json
 from pathlib import Path
 
 ...
 
 def write_jsonl(path: Path, line_count: int) -> Path:
     """Write a JSONL session file with *line_count* rows derived from the template fixture."""
+    template = json.loads(TEMPLATE_LINE)
     with path.open("w", encoding="utf-8") as f:
         for i in range(line_count):
-            entry = json.loads(TEMPLATE_LINE)
+            entry = copy.deepcopy(template)
             entry["timestamp"] = f"2026-06-12T10:{i % 60:02d}:00Z"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/benchmarks/conftest.py` around lines 16 - 31, The write_jsonl function
currently calls json.loads(TEMPLATE_LINE) inside the for-loop causing repeated
parsing; parse TEMPLATE_LINE once before the loop (e.g., template =
json.loads(TEMPLATE_LINE)) and inside the loop use copy.deepcopy(template) to
get a fresh entry, then modify entry["timestamp"] and the message content as
before; ensure copy is imported (from copy import deepcopy or import copy) and
replace json.loads(TEMPLATE_LINE) references with deepcopy(template) to
eliminate the repeated parse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/benchmarks/conftest.py`:
- Around line 34-45: The seed_search_corpus function creates the bench-project
directory with project.mkdir(parents=True) which can fail if the directory
already exists; update the call in seed_search_corpus to include exist_ok=True
(i.e., project.mkdir(parents=True, exist_ok=True)) so the function is robust to
repeated runs or pre-existing directories.

---

Nitpick comments:
In `@tests/benchmarks/conftest.py`:
- Around line 16-31: The write_jsonl function currently calls
json.loads(TEMPLATE_LINE) inside the for-loop causing repeated parsing; parse
TEMPLATE_LINE once before the loop (e.g., template = json.loads(TEMPLATE_LINE))
and inside the loop use copy.deepcopy(template) to get a fresh entry, then
modify entry["timestamp"] and the message content as before; ensure copy is
imported (from copy import deepcopy or import copy) and replace
json.loads(TEMPLATE_LINE) references with deepcopy(template) to eliminate the
repeated parse.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7a9d7b84-0ef6-4ab8-93a8-4ddfb51b4e73

📥 Commits

Reviewing files that changed from the base of the PR and between e38b910 and 68ba23c.

📒 Files selected for processing (12)
  • .github/workflows/ci.yml
  • CONTRIBUTING.md
  • benchmarks/README.md
  • benchmarks/baselines.json
  • pyproject.toml
  • requirements-dev.txt
  • tests/benchmarks/__init__.py
  • tests/benchmarks/conftest.py
  • tests/benchmarks/test_export_bench.py
  • tests/benchmarks/test_parse_bench.py
  • tests/benchmarks/test_parse_memory.py
  • tests/benchmarks/test_search_bench.py
✅ Files skipped from review due to trivial changes (2)
  • benchmarks/README.md
  • benchmarks/baselines.json
🚧 Files skipped from review as they are similar to previous changes (8)
  • .github/workflows/ci.yml
  • pyproject.toml
  • requirements-dev.txt
  • tests/benchmarks/test_parse_bench.py
  • tests/benchmarks/test_search_bench.py
  • tests/benchmarks/test_export_bench.py
  • tests/benchmarks/test_parse_memory.py
  • CONTRIBUTING.md

Comment thread tests/benchmarks/conftest.py
Parse the JSONL template once per write_jsonl call and deepcopy entries
in the loop. Use exist_ok=True when creating bench-project in
seed_search_corpus.
@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

@CodeRabbit please review this PR inn more depth

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
echo "=== tests/benchmarks/conftest.py ===" && cat tests/benchmarks/conftest.py
echo ""
echo "=== tests/benchmarks/test_parse_bench.py ===" && cat tests/benchmarks/test_parse_bench.py
echo ""
echo "=== tests/benchmarks/test_export_bench.py ===" && cat tests/benchmarks/test_export_bench.py
echo ""
echo "=== tests/benchmarks/test_search_bench.py ===" && cat tests/benchmarks/test_search_bench.py
echo ""
echo "=== tests/benchmarks/test_parse_memory.py ===" && cat tests/benchmarks/test_parse_memory.py
echo ""
echo "=== pyproject.toml ===" && cat pyproject.toml
echo ""
echo "=== .github/workflows/ci.yml ===" && cat .github/workflows/ci.yml
echo ""
echo "=== benchmarks/README.md ===" && cat benchmarks/README.md
echo ""
echo "=== requirements-dev.txt ===" && cat requirements-dev.txt
echo ""
echo "=== CONTRIBUTING.md ===" && cat CONTRIBUTING.md
echo ""
echo "=== benchmarks/baselines.json ===" && cat benchmarks/baselines.json

Length of output: 23586


🏁 Script executed:

#!/bin/bash
# Check the fixtures directory and what session_with_tools.jsonl looks like
echo "=== Fixtures directory ===" && ls tests/fixtures/ 2>/dev/null || ls fixtures/ 2>/dev/null
echo ""
echo "=== session_with_tools.jsonl (first 3 lines) ===" && head -3 tests/fixtures/session_with_tools.jsonl 2>/dev/null || head -3 fixtures/session_with_tools.jsonl 2>/dev/null
echo ""
# Check existing utils for parse_session and run_bulk_export signatures
echo "=== parse_session function ===" 
ast-grep --pattern 'def parse_session($_) {$$$}'
echo ""
echo "=== run_bulk_export function ===" 
ast-grep --pattern 'def run_bulk_export($$$) {$$$}'
echo ""
# Check for NoopSink
echo "=== NoopSink ===" && rg -n "NoopSink" --type py -C 3
echo ""
# Check create_app signature
echo "=== create_app ===" && rg -n "def create_app" --type py -C 5

Length of output: 5766



@clean6378-max-it — here is an in-depth review of PR #76.


Deep Review: PR #76 — Parse/Export/Search Performance Benchmarks

tests/benchmarks/conftest.py

1. Module-level file I/O is fragile (medium)

TEMPLATE_LINE is read at import time, before any test runs:

TEMPLATE_LINE = (FIXTURES / "session_with_tools.jsonl").read_text(...).splitlines()[0]

If the fixture file is missing or the path changes, the entire conftest fails to import, which silently kills all benchmark discovery with a cryptic ModuleNotFoundError. Move this inside write_jsonl or into a @pytest.fixture(scope="session") so the failure surface is a clear pytest error at the right point.

2. export_corpus fixture scope vs. parametrized size (medium)

export_corpus is function-scoped (default @pytest.fixture). For the 100-session parametrized case that writes 2 000 files, pytest recreates the corpus directory on every test invocation. Upgrade to scope="session" with tmp_path_factory to match the parse fixtures:

`@pytest.fixture`(scope="session", params=[10, 50, 100], ids=["sessions-10","sessions-50","sessions-100"])
def export_corpus(tmp_path_factory: pytest.TempPathFactory, request: pytest.FixtureRequest) -> Path:
    count = request.param
    root = tmp_path_factory.mktemp(f"export_corpus_{count}")
    project = root / "bench-project"
    project.mkdir()
    for i in range(count):
        write_jsonl(project / f"session_{i:04d}.jsonl", 20)
    return project

Note: with indirect=True in the test, the parametrize values still flow through request.param correctly.

3. bench_client_search_corpus should also be session-scoped (low-medium)

While only one test currently uses this fixture, keeping it function-scoped means a future second test pays the full 50-session + Flask-app setup cost again. Session scope costs nothing here.

4. Timestamp cycling in write_jsonl (low)

entry["timestamp"] = f"2026-06-12T10:{i % 60:02d}:00Z"

For line_count=5000 this produces 83 full cycles with duplicate timestamps. If parse_session deduplicates or sorts on timestamp, the large-file benchmark exercises artificially compressed data. Use a more unique scheme:

minute = (i // 60) % 60
second = i % 60
entry["timestamp"] = f"2026-06-12T{10 + i // 3600:02d}:{minute:02d}:{second:02d}Z"

tests/benchmarks/test_parse_memory.py

5. tracemalloc.clear_traces() missing — contamination risk (high)

tracemalloc.start()
try:
    parse_session(str(path))
    _, peak = tracemalloc.get_traced_memory()
finally:
    tracemalloc.stop()

If any prior test (including a pytest plugin or coverage tool) called tracemalloc.start() without stopping, calling it again here is a no-op. Worse, get_traced_memory() will return the peak since that earlier start, inflating the number. Fix:

tracemalloc.start()
tracemalloc.clear_traces()   # ← add this
try:
    parse_session(str(path))
    _, peak = tracemalloc.get_traced_memory()
finally:
    tracemalloc.stop()

6. No assertion on parse_session return value (low)

If parse_session silently returns an empty result (e.g., due to a schema mismatch in generated data), the memory test still passes while proving nothing. Add a minimal sanity check:

result = parse_session(str(path))
assert result is not None and len(result) > 0, "parse_session returned empty — corpus data invalid"

pyproject.toml

7. per-file-ignores glob doesn't cover benchmark subdirectory (medium)

[tool.ruff.lint.per-file-ignores]
"tests/*.py" = ["E402"]

The glob tests/*.py matches files directly in tests/, but not tests/benchmarks/conftest.py or any file in subdirectories. Ruff will enforce E402 (import-not-at-top) on the benchmark files if they ever need a sys.path prepend. Fix:

"tests/**/*.py" = ["E402"]

Or add an explicit entry:

"tests/benchmarks/*.py" = ["E402"]

.github/workflows/ci.ymlbenchmarks job

8. Artifact upload skipped when benchmarks fail (medium)

- name: Run benchmarks
  run: >
    pytest tests/benchmarks/ --benchmark-only ...

- uses: actions/upload-artifact@...
  with:
    name: benchmark-results
    path: benchmark-results.json

If the pytest step exits non-zero (import error, fixture crash, etc.), the upload step is skipped entirely. Add if: always() to preserve the partial JSON for debugging:

- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02
  if: always()
  with:
    name: benchmark-results
    path: benchmark-results.json

9. Job fails loudly but is labelled "informational" — consider continue-on-error (low)

The job name says "informational" and the README says there is no gate, yet a benchmark crash will mark the CI run red. Since it has no needs: gate either, it may block merge if added to branch protection later. Consider:

benchmarks:
  name: Performance benchmarks (informational)
  runs-on: ubuntu-latest
  continue-on-error: true   # ← prevents a flaky benchmark from blocking merges

10. benchmarks job has no dependency on pytest passing (low)

The benchmarks job starts in parallel with all other jobs and could spend CI minutes on a commit that already fails unit tests. Adding needs: [pytest] would avoid that waste:

benchmarks:
  needs: [pytest]

benchmarks/README.md

11. README placement vs. actual test location (low)

The README lives at benchmarks/README.md but all test code lives under tests/benchmarks/. The top of the README should make that explicit so new contributors don't hunt for benchmark files:

> Test files live under `tests/benchmarks/`; this directory holds only documentation and the informational `baselines.json` snapshot.

Minor / Nits

Location Issue
All test files benchmark fixture parameter has no type hint. Consider from pytest_benchmark.fixture import BenchmarkFixture and annotating as benchmark: BenchmarkFixture.
conftest.py L33 entry.setdefault("message", {}) is never a non-dict for the given fixture; the isinstance(msg, dict) guard is redundant, but harmless.
baselines.json Committing an all-null snapshot is fine as a placeholder, but "updated": null, "machine": null could be dropped and documented as optional fields to reduce confusion.
test_export_bench.py The inner _run closure captures export_corpus from the outer scope. Works correctly; just worth a comment since it's non-obvious.

Summary table

# Severity File Issue
1 Medium conftest.py Module-level file I/O — fragile on missing fixture
2 Medium conftest.py export_corpus function-scoped, should be session
3 Low conftest.py bench_client_search_corpus function-scoped
4 Low conftest.py Timestamp cycles for large files
5 High test_parse_memory.py Missing tracemalloc.clear_traces() — contamination risk
6 Low test_parse_memory.py No assertion on parse_session return value
7 Medium pyproject.toml per-file-ignores glob misses tests/benchmarks/
8 Medium ci.yml Artifact upload not guarded by if: always()
9 Low ci.yml No continue-on-error: true for informational job
10 Low ci.yml No needs: [pytest] guard
11 Low benchmarks/README.md Doesn't state actual test file location

The most actionable fixes are #5 (tracemalloc contamination), #2 (fixture scope for corpus creation), #7 (ruff glob), and #8 (CI artifact upload guard). The rest are low-risk polish items.

@clean6378-max-it

Copy link
Copy Markdown
Collaborator Author

5, 6, 7, 11 are low priority, Rest are optional , no need to fix

@CodeRabbit

@coderabbitai

coderabbitai Bot commented Jun 11, 2026

Copy link
Copy Markdown

Tip

For best results, initiate chat on the files or code changes.

@clean6378-max-it — Understood, thanks for clarifying! All noted:

  • #5, #6, #7, #11 — acknowledged as low priority; can be revisited in a follow-up if needed.
  • #1, #2, #3, #4, #8, #9, #10 — optional polish items, won't be pursued in this PR.

The core benchmark harness, CI job, fixtures, and documentation are all in good shape. Happy to help if anything else comes up. 🐇

Reset tracemalloc peak before measuring large-file parse, assert non-empty
message count, extend E402 per-file-ignores to tests/**, and clarify README
that benchmark tests live under tests/benchmarks/.
Comment thread .github/workflows/ci.yml Outdated
Comment thread tests/benchmarks/test_search_bench.py Outdated
Comment thread tests/benchmarks/test_parse_memory.py
Comment thread tests/benchmarks/test_export_bench.py
Comment thread tests/benchmarks/conftest.py
Drop unnecessary actions: write on benchmarks CI job; assert explicit
search hit count on list response; document 10x memory ceiling and v1
template limitations in README; note NoopSink export rounds are stateless.
@timon0305 timon0305 requested a review from wpak-ai June 12, 2026 13:46
@wpak-ai wpak-ai merged commit 0435b2d into master Jun 12, 2026
16 checks passed
@wpak-ai wpak-ai deleted the chore/parse-export-benchmarks branch June 12, 2026 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

claude-code-chat-browser: Add parse/export/search performance benchmarks and CI artifacts

3 participants