efficient-did: thread vcov_type as narrow {hc1} contract per Chen-Sant'Anna-Xie 2025 (Phase 1b #4) by igerber · Pull Request #495 · igerber/diff-diff

igerber · 2026-05-26T14:37:07Z

Summary

Phase 1b interstitial Claude/setup pip install k2 m4j #4: thread vcov_type through EfficientDiD as permanently narrow to {"hc1"} per Chen-Sant'Anna-Xie (2025) EIF-based variance achieving the semiparametric efficiency bound. Analytical-sandwich families {classical, hc2, hc2_bm} and conley are rejected at __init__ / set_params with methodology-rooted messages (the per-unit EIF aggregation has no equivalent single design matrix on which hat-matrix leverage or Bell-McCaffrey Satterthwaite DOF can be defined).
BC change on EfficientDiDResults: cluster field renamed to cluster_name; new n_clusters + vcov_type fields added; new to_dict() method (mirrors TripleDifferenceResults / ImputationDiDResults). DiagnosticReport._pt_hausman updated to read the renamed field.
Summary block harmonized: emits Variance estimator: <HC1 / CR1 cluster-robust at <col>, G=<n>> after the survey block; suppressed under bootstrap (replaced with Inference method: bootstrap + Bootstrap replications: <n>). Default cluster=None renders "HC1 heteroskedasticity-robust" — methodologically correct (per-unit EIF SE is HC1-style with no Liang-Zeger G/(G-1) correction); diverges from ImputationDiD which auto-clusters at unit per BJS Theorem 3.
Defensive fix: survey-PSU multiplier bootstrap with <2 PSUs now returns NaN SE (was ≈0 from BLAS roundoff). Single-site guard in the survey-PSU branch + new _build_nan_bootstrap_results helper that keys NaN dicts to the same (g,t) / event-time / group reductions the downstream override loop expects.
Eager set_params validation: EfficientDiD.set_params(vcov_type=bad) raises immediately (matches existing pattern for pt_assumption/control_group/etc); intentional deviation from sibling IF-based estimators which use sklearn mutate-then-validate-at-use. Documented in REGISTRY.md.

Methodology references (required if estimator / math changes)

Method name(s): EfficientDiD (Chen-Sant'Anna-Xie 2025) IF-based variance for ATT(g,t); Liang-Zeger CR1 on cluster-aggregated EIF; Taylor Series Linearization (Binder 1983) on the combined IF under survey designs.
Paper / source link(s): Chen, J., Sant'Anna, P.H.C., & Xie, X. (2025). Efficient Difference-in-Differences with Multiple Time Periods; Liang, K.Y., & Zeger, S.L. (1986). Longitudinal data analysis using generalized linear models; Binder, D.A. (1983). On the variances of asymptotically normal estimators from complex surveys.
Any intentional deviations from the source (and why):
- Permanently narrow vcov_type={"hc1"} contract — structural property of the IF-based variance derivation (no global design matrix on which HC2 leverage or Bell-McCaffrey DOF can be defined). Documented in docs/methodology/REGISTRY.md "IF-based variance estimators vs analytical-sandwich estimators" section + EfficientDiD per-estimator Notes. Matches the established pattern across CallawaySantAnna / TripleDifference / ImputationDiD.
- Default cluster=None renders HC1, NOT CR1-at-unit — per-unit EIF SE sqrt(mean(EIF²)/n) is HC1-style (no Liang-Zeger G/(G-1) finite-sample correction). Diverges from ImputationDiD which auto-clusters at unit per BJS Theorem 3.
- Eager set_params(vcov_type) validation — EfficientDiD.set_params calls _validate_params() which now invokes _validate_vcov_type, so bad vcov raises immediately rather than at fit-time. Matches EfficientDiD's existing eager-validation pattern; diverges from sibling IF-based estimators (sklearn mutate-then-validate-at-use). Documented in REGISTRY.md.

Validation

Tests added/updated:
- tests/test_efficient_did.py — new TestEfficientDiDVcovType class with 40 tests across 7 surfaces (default / cluster / TSL-survey / replicate-survey bit-equal parametrized over aggregate ∈ {None, event_study, group, all}; bootstrap × cluster + bootstrap × survey bit-equal with per-horizon/per-group SE override branches asserted; set_params eager revalidation; bootstrap n_psu<2 NaN propagation; DR (covariates=) path bit-equality; 5 input-rejection pins; cluster + replicate_weights rejection via blanket guard; 7 introspection tests).
- tests/test_business_report.py::TestHausmanPretestPropagatesCluster — assertion updated from getattr(edid, "cluster", None) → getattr(edid, "cluster_name", None) to track the field rename.
Backtest / simulation / notebook evidence (if applicable): N/A — vcov_type="hc1" is bit-equal to current default across all aggregate modes, the DR covariates path, and all bootstrap branches; no R parity needed.

Security / privacy

Confirm no secrets/PII in this PR: Yes

…t'Anna-Xie 2025 (Phase 1b interstitial #4) Mirrors the IF-based narrow-contract template from CallawaySantAnna #487, TripleDifference #488, and ImputationDiD #492. EfficientDiD uses influence-function-based variance per Chen-Sant'Anna-Xie (2025) achieving the semiparametric efficiency bound; the per-unit EIF aggregation has no equivalent single design matrix, so analytical-sandwich families cannot be defined and vcov_type is permanently narrow to {"hc1"}. Input contract: - vcov_type kwarg threaded through __init__ / get_params / set_params - _validate_vcov_type rejects {classical, hc2, hc2_bm} with methodology-rooted message citing Chen-Sant'Anna-Xie (2025) and the missing design matrix - conley rejected as deferred (TODO.md follow-up row) - set_params(vcov_type=bad) raises immediately via _validate_params chain (intentional eager-validation; diverges from sibling IF-based estimators which defer to fit-time per sklearn mutate-then-validate-at-use) EfficientDiDResults harmonization (BC): - cluster field renamed to cluster_name (matches IF-based estimator naming) - New n_clusters field for the G=<n> suffix on the variance summary line - New vcov_type field - New to_dict() method (mirrors TripleDifferenceResults / ImputationDiDResults) - summary() now renders "Variance estimator: <label>" after the survey block, suppressed under bootstrap (renders "Inference method: bootstrap" instead) - Default cluster=None renders "HC1 heteroskedasticity-robust" — methodologically correct (per-unit EIF SE is HC1-style, no Liang-Zeger G/(G-1) correction); diverges from ImputationDiD which auto-clusters at unit per BJS Theorem 3 - DiagnosticReport._pt_hausman updated to read renamed cluster_name field Bootstrap defensive fix: - Survey-PSU multiplier bootstrap with G<2 PSUs collapses to constant draws → BLAS roundoff produces ≈0 SE (not NaN). Single-site guard at the survey-PSU branch returns NaN dicts keyed to the same (g,t)/event-time/ group reductions the downstream override loop expects (cluster path already guards n_clusters≥2 at fit-time; unit path guarded upstream by balanced-panel validator). - New _build_nan_bootstrap_results helper constructs the all-NaN object. Tests (40 new in TestEfficientDiDVcovType across 7 surfaces): - Default / cluster / TSL-survey / replicate-survey bit-equal parametrized over aggregate ∈ {None, event_study, group, all} - Bootstrap × cluster + bootstrap × survey bit-equal (per-horizon + per-group SE override branches asserted) - set_params eager revalidation - Bootstrap n_psu<2 NaN propagation (regression on defensive fix) - DR (covariates=) path bit-equal under explicit vcov_type="hc1" - 5 input-rejection pins (classical/hc2/hc2_bm/conley/unknown) - cluster + replicate_weights rejection (validates blanket guard at L357) - 7 introspection tests (default attr, get_params, Results carries, to_dict default vs cluster, summary HC1 default vs CR1 cluster vs bootstrap-suppressed, cluster_name suppression under both analytical and replicate survey, fit-clone idempotence) Documentation: - REGISTRY.md: IF-based taxonomy "Enforced today" expanded to four-way (CS+TD+ImputationDiD+EfficientDiD); 3 EfficientDiD Notes added - llms-full.txt: vcov_type added to EfficientDiD signature; shared staggered-results variance metadata section updated - CHANGELOG [Unreleased] entry surfaces both vcov_type threading AND the Results-field BC break - TODO.md Phase 1b row collapses to TwoStageDiD only (last remaining); new EfficientDiD Conley follow-up row added - EfficientDiD class + module + Results docstrings updated 210 tests pass across test_efficient_did.py + test_efficient_did_validation.py + test_diagnostics.py. black + ruff clean.

…plicate Bootstrap summary line Two follow-up fixes: 1. cluster → cluster_name propagation. The original PR renamed the EfficientDiDResults.cluster field to cluster_name but left two consumers still referencing the old name: - tests/test_business_report.py:1267 (TestHausmanPretestPropagatesCluster regression test asserting the field is persisted for the Hausman pretest replay) — updated assertion + class docstring. - docs/api/_autosummary/diff_diff.EfficientDiDResults.rst — the checked-in Sphinx autosummary still listed cluster and was missing cluster_name, n_clusters, vcov_type, and the new to_dict() method. 2. Suppress duplicate Bootstrap summary header. EfficientDiDResults.summary() previously emitted both the legacy "Bootstrap: <n> (<weights>)" header line (L250-L251 pre-PR) and the new "Inference method: bootstrap" + "Bootstrap replications: <n>" block. The legacy header is now gated on `bootstrap_results is None` so analytical-only fits keep the original render and bootstrap-overwritten fits use the canonical inference-method block alone (matches the sibling result-container convention). 211 tests pass across test_efficient_did.py, test_efficient_did_validation.py, test_diagnostics.py, and test_business_report.py::TestHausmanPretestPropagatesCluster. black + ruff clean on touched Python files.

github-actions · 2026-05-26T14:48:08Z

Overall Assessment

✅ Looks good

Executive Summary

No P0/P1 methodology defects found. The new vcov_type="hc1"-only contract is consistent with EfficientDiD’s EIF-based source framing, and the PR documents the estimator-specific variance behavior instead of changing it silently in code. docs/methodology/REGISTRY.md:L306-L367, docs/methodology/REGISTRY.md:L1052-L1059. citeturn1academia12
The new survey-bootstrap NaN override is wired through overall, group-time, event-study, and group outputs, so the changed path does not introduce the misleading finite-inference pattern from degenerate bootstrap draws. diff_diff/efficient_did_bootstrap.py:L116-L137, diff_diff/efficient_did.py:L1089-L1138
vcov_type propagation is otherwise complete across constructor params, get_params(), results metadata, summary()/to_dict(), and the visible Hausman replay call site.
One non-blocking issue remains: rejected set_params(vcov_type=...) calls mutate the estimator before validation fails.

Methodology

No findings. The hc1-only contract, the default unclustered HC1 labeling, and the documented IF-vs-sandwich taxonomy are explicitly recorded in docs/methodology/REGISTRY.md:L306-L367 and docs/methodology/REGISTRY.md:L1052-L1059; I did not find an undocumented mismatch against the cited EfficientDiD source framing. citeturn1academia12

Code Quality

Severity P2; Impact: EfficientDiD.set_params() assigns attributes first and validates afterward in diff_diff/efficient_did.py:L353-L360. A rejected call such as set_params(vcov_type="classical", alpha=0.1) therefore raises only after leaving self.vcov_type and any other batched kwargs partially applied. That weakens the new eager-validation surface for callers that catch ValueError and continue using the object. The added tests in tests/test_efficient_did.py:L2510-L2522 only pin the raise, not post-error state. Concrete fix: stage updates in a copy/snapshot and commit only on success, or rollback on exception; add a regression test mirroring the project’s existing atomic set_params() guards.

Performance

No findings.

Maintainability

Severity P3; Impact: the public results surface is now a hard rename from cluster to cluster_name in diff_diff/efficient_did_results.py:L161-L174. The visible internal caller was updated in diff_diff/diagnostic_report.py:L2437-L2446, so I do not see an internal break in this diff, but external consumers of results.cluster will break on upgrade. Concrete fix: none required if this release is intentionally breaking; otherwise add a one-release deprecated alias.

Tech Debt

No findings. The deferred vcov_type="conley" work is explicitly tracked in TODO.md:L107-L115, so it is not untracked blocker debt in this PR.

Security

No findings.

Documentation/Tests

No findings in the changed docs/tests themselves.
I could not execute the targeted tests in this review environment because pytest and numpy are not installed, so runtime validation here was static-only.

…ons on validation failure Previously, EfficientDiD.set_params iterated over the kwargs and called setattr() before invoking _validate_params(). A rejected batched call such as set_params(vcov_type="classical", alpha=0.1, anticipation=2) raised on validation but left every kwarg in the partially-mutated state — the rejected vcov_type plus the otherwise-valid alpha and anticipation. Callers that catch ValueError and continue using the estimator would then operate on a silently corrupted parameter configuration, defeating the eager-validation contract introduced when _validate_vcov_type was wired into _validate_params. Fix: snapshot original attribute values for every kwarg before applying mutations, run validation, and on exception restore the snapshot before re-raising. The snapshot pass also moves the "Unknown parameter" check ahead of any mutation so that even unknown-name rejections leave the estimator untouched. Regression test added: test_set_params_rollback_on_validation_failure fires the 3-kwarg batched call and pins all three attribute values to their pre-call snapshot after the raise.

github-actions · 2026-05-26T14:56:03Z

🔁 AI review rerun (requested by @igerber)

Head SHA: f808ae69a18363104e65bf3bb92932e084a172c3

Overall Assessment

✅ Looks good

Executive Summary

Previous review’s set_params() partial-mutation concern is resolved: EfficientDiD.set_params() now snapshots state, validates, and rolls back on failure, with regression coverage in diff_diff/efficient_did.py:L353-L377 and tests/test_efficient_did.py:L2510-L2540.
No P0/P1 methodology defects found. The current Chen-Sant'Anna-Xie paper frames EfficientDiD as an efficient staggered-DiD/event-study estimator under multiple-period parallel-trends assumptions, and this PR’s vcov_type="hc1"-only contract is explicitly documented in the registry/docstrings rather than introduced silently. docs/methodology/REGISTRY.md:L1057-L1059, diff_diff/efficient_did.py:L14-L21. citeturn0view0
The new survey-PSU <2 bootstrap guard returns all-NaN inference and keys every downstream aggregation override, so the old misleading near-zero-SE path no longer leaks through. diff_diff/efficient_did_bootstrap.py:L116-L137, diff_diff/efficient_did_bootstrap.py:L383-L448, diff_diff/efficient_did.py:L1106-L1155.
Rename propagation from cluster to cluster_name is complete in the visible Hausman replay path and is covered by diff_diff/diagnostic_report.py:L2438-L2446 and tests/test_business_report.py:L1243-L1285.
Static review only: I could not run pytest here because it is not installed in the review environment.

Methodology

No findings. The paper’s current framing is an efficient staggered-DiD/event-study estimator under multiple-period parallel-trends assumptions, and the PR’s narrow {"hc1"} contract plus cluster/survey notes are documented in the repository rather than constituting an undocumented methodology change. docs/methodology/REGISTRY.md:L1054-L1059, diff_diff/efficient_did.py:L14-L21. citeturn0view0

Code Quality

No findings. The prior rollback issue is fixed by the new atomic set_params() path and directly tested in diff_diff/efficient_did.py:L353-L377 and tests/test_efficient_did.py:L2524-L2540.

Performance

No findings.

Maintainability

Severity P3; Impact: EfficientDiDResults.cluster was renamed to cluster_name without a compatibility alias, so out-of-tree consumers of result.cluster will break even though in-tree callers were updated. diff_diff/efficient_did_results.py:L163-L174, diff_diff/diagnostic_report.py:L2438-L2446. Concrete fix: if this release is intended to be non-breaking, add a deprecated read-only cluster alias for one release; otherwise no action is required.

Tech Debt

No findings. The deferred vcov_type="conley" work is explicitly tracked in TODO.md:L107-L115, so it is properly recorded rather than silently deferred.

Security

No findings.

Documentation/Tests

No findings. Coverage added in tests/test_efficient_did.py:L2330-L2785 and tests/test_business_report.py:L1243-L1285 is aligned with the changed surfaces: constructor/setter validation, aggregation modes, survey/bootstrap branches, NaN propagation, result metadata, and Hausman replay wiring.
Runtime validation was not possible in this environment because pytest is unavailable.

igerber added 2 commits May 26, 2026 10:18

igerber added the ready-for-ci Triggers CI test workflows label May 26, 2026

igerber merged commit 2656e44 into main May 26, 2026
33 of 34 checks passed

igerber deleted the feature/efficient-did-vcov-type-phase1b branch May 26, 2026 16:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

efficient-did: thread vcov_type as narrow {hc1} contract per Chen-Sant'Anna-Xie 2025 (Phase 1b #4)#495

efficient-did: thread vcov_type as narrow {hc1} contract per Chen-Sant'Anna-Xie 2025 (Phase 1b #4)#495
igerber merged 3 commits into
mainfrom
feature/efficient-did-vcov-type-phase1b

igerber commented May 26, 2026

Uh oh!

github-actions Bot commented May 26, 2026

Uh oh!

github-actions Bot commented May 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

igerber commented May 26, 2026

Summary

Methodology references (required if estimator / math changes)

Validation

Security / privacy

Uh oh!

github-actions Bot commented May 26, 2026

Uh oh!

github-actions Bot commented May 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant