refactor(scorer): score batches over typed change details#207

Closed
behinddwalls wants to merge 2 commits into main from scorer

@behinddwalls (Collaborator) commented Jun 5, 2026

Summary

Why?

The scorer took entity.Change (just URIs), so it could not score on real
change size — the example heuristic counted URIs as a placeholder. With
typed change details now persisted on change records, the scorer can score a
batch on its actual lines/files changed.

What?

  • Add entity.BatchChanges — the normalized, batch-level view of all changes
    in a batch (BatchID, Queue, []ChangeInfo) with aggregation helpers.
  • Scorer.Score now takes entity.BatchChanges; the heuristic ValueFunc and the
    composite scorer operate over it.
  • The score controller resolves each request's change records, flattens their
    details into BatchChanges, and scores the batch once — replacing the
    per-request multiplicative product over len(URIs).
  • Example wiring buckets by total lines changed.
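
The batch-level view and the line-count bucketing described above could look roughly like the sketch below. Only the names entity.BatchChanges, ChangeInfo, ChangedFile, ChangeDetails, LinesModified, BatchID, and Queue come from the PR descriptions; the remaining field names, the helper names (`TotalLinesChanged`, `TotalFilesChanged`, `Flatten`, `ScoreByLines`), and the bucket thresholds are assumptions made for illustration, not the repository's actual API.

```go
package main

// ChangedFile, ChangeDetails, ChangeInfo, and BatchChanges are hypothetical
// stand-ins for the entity types named in the description. Field names other
// than LinesModified, BatchID, and Queue are assumptions.
type ChangedFile struct {
	Path          string
	LinesModified int
}

type ChangeDetails struct {
	Files []ChangedFile
}

type ChangeInfo struct {
	URI     string
	Details ChangeDetails
}

type BatchChanges struct {
	BatchID string
	Queue   string
	Changes []ChangeInfo
}

// TotalLinesChanged sums LinesModified over every file of every change.
func (b BatchChanges) TotalLinesChanged() int {
	total := 0
	for _, c := range b.Changes {
		for _, f := range c.Details.Files {
			total += f.LinesModified
		}
	}
	return total
}

// TotalFilesChanged counts files across all changes without de-duplicating
// paths that appear in more than one change.
func (b BatchChanges) TotalFilesChanged() int {
	total := 0
	for _, c := range b.Changes {
		total += len(c.Details.Files)
	}
	return total
}

// Flatten merges the per-request change lists into one batch-level view,
// so the batch is scored once rather than once per request.
func Flatten(batchID, queue string, perRequest [][]ChangeInfo) BatchChanges {
	var all []ChangeInfo
	for _, infos := range perRequest {
		all = append(all, infos...)
	}
	return BatchChanges{BatchID: batchID, Queue: queue, Changes: all}
}

// ScoreByLines buckets the batch by total lines changed. The thresholds and
// score values are illustrative only.
func ScoreByLines(b BatchChanges) float64 {
	switch n := b.TotalLinesChanged(); {
	case n <= 50:
		return 0.9
	case n <= 500:
		return 0.6
	default:
		return 0.3
	}
}
```

Under these assumptions, a batch of two requests totalling 75 modified lines falls into the middle bucket, whatever the number of URIs each request carries.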

Consumes the typed details persisted by the change-details change.

Test Plan

  • make build, make test, make lint, make check-mocks/gazelle/tidy
  • make integration-test, make e2e-test (start -> validate enrich ->
    score normalizes the batch and scores on real change size)

Issues

Stack

  1. @ refactor(scorer): score batches over typed change details #207
  2. feat(extensions): fake implementations with error injection #197
  3. feat(example): per-queue extension wiring; retire buildrunner/noop #193
  4. refactor(conflict): take a batch of changes as analyzer input #202

## Summary

### Why?

The change provider already produces rich per-URI facts (author, changed
files, line counts), but its value types lived in the extension layer and
the data was thrown away — validate fetched ChangeInfo only to log a file
count, and ChangeRecord stored an opaque Metadata JSON string that was never
written. Nothing downstream could read typed change facts.

### What?

- Move the change value types into entities: entity.User, entity.ChangedFile
  (now with LinesModified), entity.ChangeDetails (the facts), and
  entity.ChangeInfo (URI -> Details), with aggregation helpers. The
  changeprovider extension and GitHub impl now produce these.
- Replace ChangeRecord.Metadata (opaque string) with typed Details
  (ChangeDetails); the change table's metadata JSON column becomes details.
- Add ChangeStore.UpdateDetails, a version-guarded conditional write that
  follows the optimistic-locking contract (the version arithmetic stays in
  the controller).
- validate now persists each fetched ChangeInfo onto the request's change
  records (per-URI, idempotent; ErrVersionMismatch is a benign no-op).

This is the producer half: typed details now exist and are persisted. The
score controller consumes them in a follow-up.

## Test Plan

- ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy`
- ✅ `make integration-test` (storage contract suite round-trips Details and
  covers UpdateDetails create/update/version-mismatch)

@behinddwalls (Collaborator, Author) commented

Closing as a duplicate. The scorer change already landed as #196 (merged into main); this draft was auto-created during a stack re-publish after #195 and #196 merged. The remaining stack (#197, #193, #202) has been rebased directly onto main.
