Canonicalize rule hash inputs (attributes + rule inputs) #379

Merged
tinder-maxwellelliott merged 1 commit into master from claude/canonicalize-hash-inputs
Jun 19, 2026

@tinder-maxwellelliott

Collaborator

What

Hash a rule's attributes in name-sorted order and dedupe + sort its (configured) rule inputs before mixing them into the target digest, so a target's hash is invariant to the order Bazel happens to emit them in.
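
As a rough illustration of the attribute half of this change, here is a minimal Python sketch. It is not the Kotlin code in this PR: `digest_as_emitted`, `digest_sorted`, and the sample attribute pairs are hypothetical, and the SHA-256 with length-prefixed fields is an assumption chosen only to keep the example self-contained.

```python
import hashlib


def digest_as_emitted(attributes):
    """Hash (name, value) pairs in whatever order they arrive."""
    h = hashlib.sha256()
    for name, value in attributes:
        # Length-prefix each field so adjacent fields cannot run together.
        for field in (name, value):
            data = field.encode("utf-8")
            h.update(len(data).to_bytes(4, "big"))
            h.update(data)
    return h.hexdigest()


def digest_sorted(attributes):
    """Hash the same pairs after sorting them by attribute name."""
    return digest_as_emitted(sorted(attributes, key=lambda kv: kv[0]))


# Hypothetical attribute lists: same set, different emission order.
run_a = [("srcs", "[a.kt]"), ("deps", "[//lib:core]")]
run_b = [("deps", "[//lib:core]"), ("srcs", "[a.kt]")]

# A real value change that sorting must not hide.
run_c = [("srcs", "[b.kt]"), ("deps", "[//lib:core]")]
```

Here `digest_as_emitted` gives different results for `run_a` and `run_b`, while `digest_sorted` gives the same result for both and still differs for `run_c`.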

Why

A rule's attributes and rule inputs are conceptually sets, but today BazelRule.digest() and ruleInputList() hash them in Bazel's emission order:

  • digest() iterates rule.attributeList as-emitted.
  • ruleInputList() concatenates configuredRuleInputList / ruleInputList with no global dedupe or sort.

That emission order is not guaranteed stable, so an otherwise-unchanged target can hash differently between two graphs and show up as a spurious diff. This is most acute on the configuration-aware cquery path (#359 / #363): configuredRuleInputList can surface the same dep label across multiple configurations, and cquery does not promise a stable order for those edges. Without canonicalization, flipping the emission order of two configured edges changes the parent's hash even though nothing changed.
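
The configured-edge case can be sketched the same way. In this Python sketch (again not the actual Kotlin), each configured input is modeled as a `(label, configuration)` pair. That modeling, the `canonicalize_rule_inputs` name, and the sample labels and configuration ids are all assumptions for illustration; the PR does not spell out which key the real dedupe uses.

```python
def canonicalize_rule_inputs(configured_inputs):
    """Dedupe, then sort, a list of (label, configuration) edges."""
    return sorted(set(configured_inputs))


# Two hypothetical cquery emissions of the same edges: the second one is
# reordered and repeats one edge.
emission_1 = [
    ("//lib:core", "cfg-target"),
    ("//lib:core", "cfg-exec"),
    ("//lib:util", "cfg-target"),
]
emission_2 = [
    ("//lib:util", "cfg-target"),
    ("//lib:core", "cfg-exec"),
    ("//lib:core", "cfg-target"),
    ("//lib:core", "cfg-exec"),
]

canonical_1 = canonicalize_rule_inputs(emission_1)
canonical_2 = canonicalize_rule_inputs(emission_2)
```

Both emissions reduce to the same three-element list, so anything hashed from the canonical form no longer depends on the order or multiplicity of the edges.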

Source

Backported from bazel-contrib/target-determinator commit d4b6125 ("Canonicalize target hash inputs"), which introduces the equivalent sortedAttributesForHashing + canonicalizeRuleInputs canonicalization in its Go hashing core. This brings our configuration-aware hashing (recently landed in #363) to parity.

Compatibility note

This is a one-time change to absolute hash values for any rule with ≥2 attributes or whose inputs weren't already emitted in sorted order. The standard bazel-diff workflow generates hashes for both the "before" and "after" revisions with the same binary, so diff results are unaffected — and unchanged targets now hash identically across runs that previously differed only by input/attribute ordering. Stale hashes generated by a prior version should be regenerated (as is already expected across versions).
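
The compatibility argument can be checked with a toy model. The Python sketch below assumes a deliberately simplified, non-transitive hasher. `hash_targets`, `changed_targets`, and the two sample graphs are hypothetical, and the `canonical` flag stands in for "old binary" (`False`) versus "new binary" (`True`).

```python
import hashlib


def hash_targets(graph, canonical):
    """Return {target: digest}; `canonical` toggles dedupe + sort of inputs."""
    out = {}
    for target, inputs in graph.items():
        items = sorted(set(inputs)) if canonical else list(inputs)
        out[target] = hashlib.sha256("\n".join(items).encode("utf-8")).hexdigest()
    return out


def changed_targets(before, after):
    """Targets whose digest differs or that exist on only one side."""
    return {t for t in before.keys() | after.keys() if before.get(t) != after.get(t)}


# Hypothetical revisions: only //lib:core really changes.
before_rev = {"//app:bin": ["//lib:util", "//lib:core"], "//lib:core": ["core.kt"]}
after_rev = {"//app:bin": ["//lib:util", "//lib:core"], "//lib:core": ["core.kt", "extra.kt"]}

# Same hasher on both sides: old binary, then new binary.
old_diff = changed_targets(hash_targets(before_rev, False), hash_targets(after_rev, False))
new_diff = changed_targets(hash_targets(before_rev, True), hash_targets(after_rev, True))

# Stale "before" hashes from the old binary mixed with new "after" hashes.
mixed_diff = changed_targets(hash_targets(before_rev, False), hash_targets(after_rev, True))
```

In this model `old_diff` and `new_diff` both contain only `//lib:core`, even though the absolute hash of `//app:bin` differs between the two hashers. `mixed_diff` additionally reports `//app:bin`, which is why hashes from a prior version should be regenerated.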

Tests

Added to BazelRuleTest:

  • testDigestInvariantToAttributeOrder — same attribute set, different order → equal digest.
  • testDigestStillDetectsAttributeValueChange — sorting doesn't mask a real value change.
  • testNonCqueryRuleInputListDedupesAndSorts — non-cquery inputs come back deduped + sorted.
  • testCqueryRuleInputListInvariantToConfiguredInputOrder — reordered/duplicated configured inputs → identical ruleInputList().

BuildGraphHasherTest's pinned hashes mock BazelRule, so they're unaffected; full local CLI suite passes except E2ETest, which fails identically on master in this sandbox due to a missing nested Java runtime (environmental, unrelated to this change).

🤖 Generated with Claude Code

Hash a rule's attributes in name-sorted order and dedupe + sort its
(configured) rule inputs before mixing them into the digest, so a
target's hash is invariant to the order Bazel happens to emit them in.

A rule's attributes and rule inputs are conceptually sets, but
`BazelRule.digest()` / `ruleInputList()` hashed them in Bazel's emission
order. That order is not guaranteed stable, so an otherwise-unchanged
target could hash differently between two graphs. This is most acute on
the configuration-aware (#359) cquery path: `configuredRuleInputList`
can surface the same dep label across multiple configurations, and
cquery does not promise a stable order for those edges.

Backported from bazel-contrib/target-determinator commit d4b6125
("Canonicalize target hash inputs"), which adds the same
`sortedAttributesForHashing` + `canonicalizeRuleInputs` canonicalization.

Note: this is a one-time change to absolute hash values. The standard
workflow generates hashes for both the "before" and "after" revisions
with the same binary, so diff results are unaffected — and unchanged
targets now hash identically across runs that previously differed only
by input/attribute ordering.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
tinder-maxwellelliott force-pushed the claude/canonicalize-hash-inputs branch from 07dd9c8 to 9c928f4 on June 19, 2026 13:44
tinder-maxwellelliott merged commit f053abc into master on Jun 19, 2026
15 checks passed
tinder-maxwellelliott deleted the claude/canonicalize-hash-inputs branch on June 19, 2026 15:08