refactor(scorer): score batches over typed change details#196
Merged
Conversation
This was referenced Jun 4, 2026
89968ae to
2f34df7
Compare
This was referenced Jun 4, 2026
2f34df7 to
a0d29b9
Compare
a0d29b9 to
d991a29
Compare
d991a29 to
c979c6b
Compare
JamyDev
approved these changes
Jun 5, 2026
c979c6b to
3ebb7c8
Compare
behinddwalls
added a commit
that referenced
this pull request
Jun 5, 2026
## Summary ### Why? Two "change"-related stores were in the wrong shape. `ChangeStore` — the real, used store that records per-URI claims for in-flight requests and backs `start`'s URI claiming and `validate`'s overlap detection — lived as its own top-level extension and was injected into controllers as a separate dependency, bypassing the `storage.Storage` aggregator that owns every other entity store. Meanwhile `ChangeProviderStore` (exposed via `Storage.GetChangeProviderStore()`, with a mysql impl, mock, and schema) was dead code: no controller ever called it, and its `entity.ChangeProvider` was orphaned alongside it. ### What? - Move `ChangeStore` into `package storage`: the interface, mysql impl, and `change.sql` schema now live under `extension/storage[/mysql]`, and `ChangeStore` is a first-class member of the `Storage` aggregator via `GetChangeStore()` — matching every other store. - `start` and `validate` controllers drop their separate `changeStore` constructor param and read `store.GetChangeStore()`; the example orchestrator no longer constructs/injects it separately. - Delete the dead `ChangeProviderStore` (interface, mysql, mock, schema), `GetChangeProviderStore()`, and the orphaned `entity/change_provider.go`. - Fold the standalone changestore integration suite into the shared `StorageContractSuite` (driven through `GetChangeStore()`); e2e drops the now-redundant changestore schema apply. ## Test Plan - ✅ `make build`, `make test` (start/validate controllers pass against the storage-package mock) - ✅ `make lint`, `make check-gazelle`, `make check-mocks`, `make check-tidy` (no drift; `go.mod` / `MODULE.bazel` unchanged) - ✅ `make integration-test` (storage mysql suite now exercises the change store via `GetChangeStore()`) - ✅ `make e2e-test` (full land→validate flow: URI claim + overlap detection, `change` table applied from the storage schema dir) ## Issues ## Stack 1. @ #191 1. #195 1. #196 1. #197
3ebb7c8 to
e2441a7
Compare
## Summary ### Why? The change provider already produces rich per-URI facts (author, changed files, line counts), but its value types lived in the extension layer and the data was thrown away — validate fetched ChangeInfo only to log a file count, and ChangeRecord stored an opaque Metadata JSON string that was never written. Nothing downstream could read typed change facts. ### What? - Move the change value types into entities: entity.User, entity.ChangedFile (now with LinesModified), entity.ChangeDetails (the facts), and entity.ChangeInfo (URI -> Details), with aggregation helpers. The changeprovider extension and GitHub impl now produce these. - Replace ChangeRecord.Metadata (opaque string) with typed Details (ChangeDetails); the change table's metadata JSON column becomes details. - Add ChangeStore.UpdateDetails — a version-guarded conditional write, following the optimistic-locking contract (arithmetic in the controller). - validate now persists each fetched ChangeInfo onto the request's change records (per-URI, idempotent; ErrVersionMismatch is a benign no-op). This is the producer half: typed details now exist and are persisted. The score controller consumes them in a follow-up. ## Test Plan - ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy` - ✅ `make integration-test` (storage contract suite round-trips Details and covers UpdateDetails create/update/version-mismatch)
## Summary ### Why? The scorer took entity.Change (just URIs), so it could not score on real change size — the example heuristic counted URIs as a placeholder. With typed change details now persisted on change records, the scorer can score a batch on its actual lines/files changed. ### What? - Add entity.BatchChanges — the normalized, batch-level view of all changes in a batch (BatchID, Queue, []ChangeInfo) with aggregation helpers. - Scorer.Score now takes entity.BatchChanges; the heuristic ValueFunc and the composite scorer operate over it. - The score controller resolves each request's change records, flattens their details into BatchChanges, and scores the batch once — replacing the per-request multiplicative product over len(URIs). - Example wiring buckets by total lines changed. Consumes the typed details persisted by the change-details change. ## Test Plan - ✅ `make build`, `make test`, `make lint`, `make check-mocks/gazelle/tidy` - ✅ `make integration-test`, `make e2e-test` (start -> validate enrich -> score normalizes the batch and scores on real change size)
e2441a7 to
dc47a06
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Why?
The scorer took entity.Change (just URIs), so it could not score on real
change size — the example heuristic counted URIs as a placeholder. With
typed change details now persisted on change records, the scorer can score a
batch on its actual lines/files changed.
What?
in a batch (BatchID, Queue, []ChangeInfo) with aggregation helpers.
composite scorer operate over it.
details into BatchChanges, and scores the batch once — replacing the
per-request multiplicative product over len(URIs).
Consumes the typed details persisted by the change-details change.
Test Plan
make build,make test,make lint,make check-mocks/gazelle/tidymake integration-test,make e2e-test(start -> validate enrich ->score normalizes the batch and scores on real change size)
Issues
Stack