Skip to content

Thread vcov_type="conley" through SunAbraham + WooldridgeDiD (defer StackedDiD)#503

Merged
igerber merged 1 commit into
mainfrom
feature/conley-vcov-sandwich
May 31, 2026
Merged

Thread vcov_type="conley" through SunAbraham + WooldridgeDiD (defer StackedDiD)#503
igerber merged 1 commit into
mainfrom
feature/conley-vcov-sandwich

Conversation

@igerber
Copy link
Copy Markdown
Owner

@igerber igerber commented May 31, 2026

Summary

  • Thread vcov_type="conley" (Conley 1999 spatial-HAC) through SunAbraham and WooldridgeDiD-OLS, mirroring the verified DiD/MPD/TWFE conley path. Pure threading — reuses the existing, already-conleyreg-validated solve_ols / conley.py machinery (no new variance code).
  • Both estimators: five conley_* constructor params + get_params/set_params; the shared _validate_conley_estimator_inputs front-door gate; reject conley + survey_design / weights / n_bootstrap>0; route conley through the within-transform (FWL) path with row-aligned coord/time/unit arrays built from the post-filter regression frame; drop the unit auto-cluster on the conley path (an explicit cluster= enables the spatial+cluster product kernel); Results gain conley_lag_cutoff + a Conley variance-label line in summary() (SunAbrahamResults also gains cluster_name). WooldridgeDiD conley is OLS-path-only (method ∈ {logit, poisson} stays rejected). Existing SEs are bit-identical (conley is purely additive on the math).
  • StackedDiD conley is NOT threaded — deferred for a methodology reason (the stacked design replicates each control unit across sub-experiments, so Conley would see same-unit copies at distance 0; no conleyreg analogue; paper-gated). Its prior "same shape as the SunAbraham follow-up" framing is corrected in the rejection message, REGISTRY, and TODO.

Methodology references (required if estimator / math changes)

  • Method name(s): Conley (1999) spatial-HAC variance, threaded through SunAbraham (Sun & Abraham 2021 interaction-weighted) and WooldridgeDiD (Wooldridge 2021/2023 ETWFE) OLS paths.
  • Paper / source link(s): Conley, T. G. (1999), GMM Estimation with Cross-Sectional Dependence, J. Econometrics 92(1). R-parity anchor for the underlying machinery: conleyreg (Düsterhöft 2021, CRAN v0.1.9). See docs/methodology/REGISTRY.md §ConleySpatialHAC and the SunAbraham / WooldridgeDiD variance-family sections.
  • Any intentional deviations from the source (and why): None new — reuses the existing conley machinery and its documented deviations (1-D radial Bartlett kernel; time-label normalization). StackedDiD conley deferred for methodology (unit replication breaks the spatial-distance interpretation; no reference implementation).

Validation

  • Tests added/updated: tests/test_conley_vcov.py (TestConleySunAbraham / TestConleyWooldridge) — the FWL-composability anchor (estimator within-transform conley SE == a hand-built full-dummy [intercept, treated*post, C(unit), C(time)] solve_ols conley SE at atol≈1e-7), within-period-spatial vs panel-serial paths, spatial+cluster product kernel, every rejection (survey / weights / n_bootstrap / logit / malformed validator inputs), unbalanced-panel alignment, cohort_trends=True (full-dummy), aggregations (group/calendar/event, incl. the cohort_trends combination), multi-cohort IW + delta-method event-study, control_group="never_treated", and get/set_params round-trip. Updated two stale conley-rejected-at-init tests (test_sun_abraham.py, test_wooldridge.py). 403 tests pass.
  • Backtest / simulation / notebook evidence (if applicable): N/A
  • Local AI review: codex (full standard mode, --force-fresh) — ✅ approved and defect-free across 5 rounds; every finding was test-coverage or doc-wording (all addressed).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

@github-actions
Copy link
Copy Markdown

Overall Assessment

✅ Looks good

Executive Summary

  • Affected methods: SunAbraham and WooldridgeDiD(method="ols") now thread the existing Conley (1999) spatial-HAC machinery through their OLS paths.
  • I did not find an undocumented methodology deviation, missing assumption check, or incorrect variance/SE path in the new Conley routing.
  • The implementation is consistent with the Methodology Registry and reuses the shared Conley validator and existing solve_ols/conley.py machinery rather than introducing new variance math.
  • Parameter propagation looks complete across constructor args, get_params(), fit-time routing, result metadata, and Wooldridge aggregation/summary surfaces.
  • StackedDiD non-support is now correctly documented as methodology-gated and tracked, so it is not a blocker.
  • P3 only: a few internal docstrings still describe the pre-PR routing and should be updated.
  • I could not execute tests locally because python -m pytest fails in this environment (No module named pytest).

Methodology

  • No findings. The shipped behavior documented in docs/methodology/REGISTRY.md:L1114-L1123 and docs/methodology/REGISTRY.md:L1590-L1595 matches the implementation in diff_diff/sun_abraham.py:L692-L705, diff_diff/sun_abraham.py:L1509-L1542, diff_diff/wooldridge.py:L685-L710, and diff_diff/wooldridge.py:L958-L1161. Unsupported compositions are fail-closed, not silently downgraded.
  • No findings. The StackedDiD Conley deferral is explicitly documented as a methodology limitation, not an implementation omission, in diff_diff/stacked_did.py:L208-L221, docs/methodology/REGISTRY.md:L1480-L1484, and TODO.md:L106-L106.

Code Quality

  • No findings. The new conley_* parameters are threaded through constructor state, get_params(), fit-time validation, and results metadata for both estimators in the changed code.

Performance

  • No findings. This PR reuses the existing Conley engine and only adds lightweight row-aligned array assembly; it does not introduce a second spatial-HAC implementation or a new high-cost variance path.

Maintainability

  • Severity: P3
    Impact: A few internal docstrings still describe the pre-PR routing, which can mislead future reviewers about where Conley is supported and which path is used. See diff_diff/sun_abraham.py:L1351-L1359, diff_diff/wooldridge.py:L946-L953, and diff_diff/conley.py:L213-L215.
    Concrete fix: Update those docstrings to mention vcov_type="conley", the Wooldridge cohort_trends=True full-dummy exception, and the expanded estimator consumers of _validate_conley_estimator_inputs.

Tech Debt

  • No findings. Deferred work is properly tracked, and the newly clarified StackedDiD limitation is already recorded in TODO.md, so it should not block this PR.

Security

  • No findings.

Documentation/Tests

  • No PR-level findings. The added coverage is aimed at the risky surfaces: FWL composability, rejected survey/bootstrap/non-OLS combinations, unbalanced-panel alignment, explicit cluster product-kernel behavior, Wooldridge aggregation, and cohort_trends=True in tests/test_conley_vcov.py:L3251-L3400 and tests/test_conley_vcov.py:L3435-L3640.
  • Reviewer note: I could not run the tests in this environment because pytest is not installed.

@igerber igerber force-pushed the feature/conley-vcov-sandwich branch from b05a57b to 5c378f5 Compare May 31, 2026 10:22
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 5c378f59881003f22cc3407d6d243cf57249f359


Overall Assessment

✅ Looks good

Executive Summary

  • Affected methods: Conley (1999) spatial-HAC is now threaded through SunAbraham and WooldridgeDiD(method="ols"); StackedDiD remains correctly fail-closed for a documented methodology reason.
  • I did not find an undocumented methodology deviation, missing assumption check, incorrect variance/SE path, or misleading NaN/inference behavior in the new Conley routing.
  • The implementation reuses the existing solve_ols / conley.py machinery rather than introducing new variance math, and the Methodology Registry now documents the shipped surfaces and the StackedDiD deferral.
  • Parameter propagation looks complete across constructor state, get_params(), fit-time validation, variance routing, and results metadata on both estimators.
  • Prior P3 docstring drift in diff_diff/conley.py, the SunAbraham routing docstring, and the Wooldridge OLS-routing docstring appears resolved.
  • P3 only: two class-level docstrings still describe the pre-PR status in diff_diff/sun_abraham.py:L480-L481 and diff_diff/stacked_did.py:L107-L109.
  • I could not execute the tests locally because python -m pytest fails in this environment with No module named pytest.

Methodology

  • No findings. The PR’s methodology change is Conley (1999) variance threading through Sun-Abraham IW and Wooldridge ETWFE-OLS, and the implementation stays within the documented registry contract: shared front-door validation in diff_diff/conley.py:L201-L315, SunAbraham Conley routing in diff_diff/sun_abraham.py:L687-L712 and diff_diff/sun_abraham.py:L1510-L1543, Wooldridge Conley routing in diff_diff/wooldridge.py:L685-L711, diff_diff/wooldridge.py:L1014-L1015, and diff_diff/wooldridge.py:L1130-L1164, with the shipped behavior documented in docs/methodology/REGISTRY.md:L1114-L1123 and docs/methodology/REGISTRY.md:L1590-L1594. The StackedDiD non-support is also properly documented as methodology-gated, not plumbing, in docs/methodology/REGISTRY.md:L1479-L1484 and TODO.md:L106-L106.

Code Quality

  • No findings. I did not see any new inline inference anti-patterns, and the new conley_* parameters are propagated through estimator state and results surfaces in diff_diff/sun_abraham.py:L550-L605, diff_diff/sun_abraham.py:L1317-L1318, diff_diff/sun_abraham.py:L2230-L2258, diff_diff/wooldridge.py:L361-L406, diff_diff/wooldridge.py:L480-L537, and diff_diff/wooldridge.py:L1475-L1505.

Performance

  • No findings. This PR adds lightweight row-aligned Conley array assembly on top of the existing Conley engine; it does not introduce a second spatial-HAC implementation. The only full-dummy Conley route is the already-documented cohort_trends=True Wooldridge path in diff_diff/wooldridge.py:L1014-L1078.

Maintainability

  • Severity: P3 [Newly identified]
    Impact: Two class docstrings still describe the pre-PR surface and can mislead future users/reviewers. diff_diff/sun_abraham.py:L480-L481 still says Conley is “not yet wired up,” and diff_diff/stacked_did.py:L107-L109 still describes the StackedDiD rejection as a plumbing follow-up rather than the now-documented methodology deferral.
    Concrete fix: Update those docstrings to match the shipped SunAbraham Conley support and the methodology-gated StackedDiD rejection already documented in docs/methodology/REGISTRY.md and TODO.md.

Tech Debt

  • No findings. The only intentional non-support in scope, StackedDiD(vcov_type="conley"), is explicitly tracked and documented as deferred methodology work rather than left as silent behavior: TODO.md:L106-L106.

Security

  • No findings.

Documentation/Tests

  • No blocker findings. The added tests cover the risky new surfaces: FWL composability, Conley + explicit cluster product kernel, survey/bootstrap/non-OLS rejection gates, unbalanced-panel alignment, Wooldridge aggregation paths, cohort_trends=True, and control_group="never_treated" in tests/test_conley_vcov.py:L3219-L3639, with constructor-surface updates in tests/test_sun_abraham.py:L1657-L1667 and tests/test_wooldridge.py:L1755-L1762.
  • Reviewer note: I could not run the test suite locally because python -m pytest -q tests/test_conley_vcov.py -k 'ConleySunAbraham or ConleyWooldridge' fails with No module named pytest.

@igerber igerber force-pushed the feature/conley-vcov-sandwich branch from 5c378f5 to e6e1771 Compare May 31, 2026 10:30
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: e6e1771997e76d2187d9cf74f43ca38d75f7f1df


Overall Assessment

✅ Looks good

Executive Summary

  • Affected methods: Conley (1999) spatial-HAC is now threaded through Sun-Abraham IW and Wooldridge ETWFE-OLS; StackedDiD remains correctly fail-closed for a documented methodology reason.
  • I did not find an undocumented methodology deviation, missing assumption check, incorrect variance/SE path, or misleading NaN/inference behavior in the changed code.
  • Both estimators reuse the existing solve_ols / conley.py machinery rather than introducing new variance math, and the Methodology Registry now matches the shipped surfaces.
  • Parameter propagation looks complete across constructor state, get_params(), fit-time validation, variance routing, and results metadata/summary.
  • The prior P3 docstring drift from the last AI review appears resolved.
  • I could not execute the tests in this environment because numpy and pytest are unavailable; syntax-only compile() checks on the changed library and test files passed.

Methodology

  • No findings. The updated registry contract and the implementation are aligned for both affected estimators: Sun-Abraham Conley routing is documented in REGISTRY.md, implemented via the shared Conley validator and within-transform routing in conley.py and sun_abraham.py, and threaded into the regression call in sun_abraham.py. Wooldridge’s OLS-only Conley contract, including the cohort_trends=True full-dummy route, is documented in REGISTRY.md and implemented in wooldridge.py, wooldridge.py, and wooldridge.py. The StackedDiD non-support is also now consistently documented as methodology-gated, not plumbing, in REGISTRY.md and stacked_did.py.

Code Quality

Performance

  • No findings. This PR adds only lightweight row-aligned Conley array assembly on top of the existing Conley backend; it does not introduce a second spatial-HAC implementation or an extra estimator-specific variance routine, as seen in sun_abraham.py and wooldridge.py.

Maintainability

Tech Debt

  • No findings. The only intentional non-support in scope, StackedDiD(vcov_type="conley"), is explicitly tracked in TODO.md, so it is correctly treated as deferred methodology work rather than a blocker.

Security

  • No findings.

Documentation/Tests

  • No findings. The added tests cover the risky new surfaces: FWL composability, explicit cluster product-kernel behavior, survey/bootstrap rejection, unbalanced-panel alignment, Wooldridge aggregation paths, cohort_trends=True, and control_group="never_treated" in tests/test_conley_vcov.py. I could not run them locally because this environment lacks numpy and pytest; I was only able to perform syntax-only compile() checks, which passed for the changed library and test files.

@igerber igerber added the ready-for-ci Triggers CI test workflows label May 31, 2026
… (defer StackedDiD)

Thread Conley (1999) spatial-HAC variance through SunAbraham and WooldridgeDiD-OLS,
mirroring the verified DiD/MPD/TWFE conley path. Pure threading — reuses the existing,
already-conleyreg-validated solve_ols / conley.py machinery; no new variance code.

Both estimators:
- accept vcov_type="conley" + 5 conley_* constructor params (conley_coords,
  conley_cutoff_km, conley_metric, conley_kernel, conley_lag_cutoff) in __init__ +
  get_params/set_params
- call the shared conley._validate_conley_estimator_inputs front-door gate (rejects
  conley + survey_design; checks coords/cutoff/unit/lag/cluster columns) with a literal
  inference="analytical" (neither estimator has a self.inference attr)
- reject conley + n_bootstrap>0 (the multiplier/pairs bootstrap overrides the analytical
  sandwich; the shared validator only knows about wild_bootstrap)
- route conley through the within-transform (FWL) path; build coord/time/unit arrays from
  the post-filter regression frame, row-aligned to the design
- drop the unit auto-cluster on the conley path (an explicit cluster= enables the
  spatial+cluster product kernel)
- carry conley_lag_cutoff on Results + a Conley variance-label line in summary()
  (SunAbrahamResults also gains cluster_name)

WooldridgeDiD conley is OLS-path-only (the method != "ols" guard keeps it off
logit/poisson). Existing SEs stay bit-identical (conley is purely additive).

StackedDiD conley is NOT threaded — deferred for a methodology reason (the stacked design
replicates units across sub-experiments, so Conley would see same-unit copies at distance
0; no conleyreg anchor; paper-gated). The prior "same shape as the SunAbraham follow-up"
framing is corrected in the rejection message, REGISTRY, and TODO.

Tests (tests/test_conley_vcov.py TestConleySunAbraham / TestConleyWooldridge): smoke
(cross-sectional + panel finite SE, label, metadata), FWL-composability (the
within-transform conley SE equals a full-dummy [intercept, treated*post, C(unit), C(time)]
solve_ols conley SE at atol 1e-7 — the primary, non-tautological correctness anchor),
conley+cluster product kernel, rejections (survey / n_bootstrap / weights / malformed
validator inputs / logit), unbalanced-panel alignment, get/set_params round-trip.

Docs: REGISTRY SA + WooldridgeDiD sections flipped to supported + a distinct non-survey
OLS-sandwich entry in the Conley deferral-status note; CHANGELOG Added; TODO rows resolved
+ StackedDiD reclassified; llms-full.txt + param docstrings.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@igerber igerber force-pushed the feature/conley-vcov-sandwich branch from e6e1771 to 2fa90f4 Compare May 31, 2026 13:54
@github-actions
Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 2fa90f4c85d322839af1f6e2aa338090ccfadc3f


Overall Assessment

✅ Looks good

Executive Summary

  • Affected methods: Conley (1999) spatial-HAC is now threaded through SunAbraham and WooldridgeDiD(method="ols"); StackedDiD remains correctly fail-closed for a documented methodology reason.
  • No unmitigated P0/P1 issue surfaced in the changed estimator code, registry updates, or tests.
  • The PR reuses the existing solve_ols / conley.py machinery rather than introducing new variance math, and the shipped restrictions match the Methodology Registry.
  • Parameter propagation is complete across constructor state, fit-time validation, variance routing, result metadata, and summary labeling on both estimator surfaces.
  • The added tests hit the risky branches: FWL composability, explicit cluster product-kernel behavior, survey/bootstrap rejection, unbalanced-panel alignment, cohort_trends=True, aggregation paths, and control_group="never_treated".
  • Runtime verification was not possible here because numpy, pandas, scipy, and pytest are unavailable; validation is source inspection plus syntax-only compilation of the touched Python files.

Methodology

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

Security

  • No findings.

Documentation/Tests

@igerber igerber merged commit dd18dc5 into main May 31, 2026
26 checks passed
@igerber igerber deleted the feature/conley-vcov-sandwich branch May 31, 2026 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant