PowerAnalysis methodology review (PR-A): Bloom 1995 + Burlig 2020 source audits#506
Conversation
|
Overall Assessment ✅ Looks good Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
|
f70b9d6 to
aa64e26
Compare
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Path to Approval
|
…rce audits Step-1 fidelity artifact for the PowerAnalysis methodology validation (the paper-review PR of the 2-PR sequence; source validation + code reconciliation + tests are deferred to PR-B). No estimator logic, formula, or test changes. New source audits under docs/methodology/papers/ (sourced only from the papers): - bloom-1995-review.md: the MDE-multiplier framing — MDE = M*SE with M from the NORMAL distribution (Bloom builds 1.65 + 0.84 = 2.49 from z-quantiles, p.548-549), the cross-sectional impact-estimator SE sigma*sqrt((1-R^2)/(n*T*(1-T))) (Eq. 1/2), the T(1-T) allocation factor (optimal at 50/50), one- and two-sided multipliers; documents that Bloom explicitly excludes clustering/multi-site design effects (Note 1). - burlig-preonas-woerman-2020-review.md: the serial-correlation-robust (SCR) panel-DD variance (Eq. 2, verbatim — three covariance terms psi^B/psi^A/psi^X over m pre- and r post-periods, psi^X entering negatively), the McKenzie special case (Eq. 3), the increase-MDE condition (Eq. 4); Eq. 1 uses the t-distribution; pcpanel is the panel parity reference. The audits surfaced discrepancies between the authoritative PowerAnalysis surfaces and the source material. Per the agreed approach these are DOCUMENTED as under-review now (not yet fixed — reconciliation is deferred to PR-B and tracked in TODO.md): - REGISTRY.md ## PowerAnalysis: umbrella **Note:** enumerating the four discrepancies (t-vs-normal-z multiplier; SE R^2/cluster-m terms; missing T(1-T) allocation factor in the displayed sample-size formula; panel (1+(T-1)rho)/T is an equicorrelated/Moulton design effect, NOT Burlig SCR — an attribution overclaim). - REGISTRY.md R-equivalents table: annotate the PowerAnalysis row as under-review (analytical path is normal-based, so pwr.t.test is not the faithful parity target; panel parity ref is Stata pcpanel) — resolves the cross-reference inconsistency the audits introduced. - power.py: docstring notes on PowerAnalysis and simulate_power flagging the panel attribution and normal-vs-t approximation as under review; the class docstring panel-variance display corrected from a self-canceling factor to the implemented (1/N_treated + 1/N_control) * (1+(T-1)rho)/T (docstring-only; no logic change). - references.rst: clarify the analytical panel path uses an equicorrelated approximation, not Burlig's SCR formula. - TODO.md: tracker row (Methodology/Correctness) for the PR-B reconciliation. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
aa64e26 to
2470be6
Compare
|
🔁 AI review rerun (requested by @igerber) Head SHA: Overall Assessment ✅ Looks good Executive Summary
Methodology
Code Quality No findings. The docstring correction in Performance No findings. The diff is documentation-only for runtime purposes. Maintainability No findings. The new paper-review files are explicitly marked non-authoritative, and the follow-up ownership is clear in Tech Debt No findings. The deferred reconciliation is properly tracked under “Tech Debt from Code Reviews,” so the remaining methodology cleanup is visible and non-blocking for this PR. See Security No findings. Documentation/Tests No findings in the changed diff. I did not run tests because this environment is missing |
… Eq.2 equicorrelated; tracker -> Complete Reconciles diff_diff/power.py with the Bloom (1995) + Burlig, Preonas & Woerman (2020) source audits (paper reviews added in PR-A #506). Behavior change: the analytical panel-DiD variance was the Moulton (1+(T-1)rho)/T factor (wrong period-scaling, ~4x too small at rho=0/m=r=5, AND opposite rho-sign). Replaced with the within-unit equicorrelated special case of Burlig Eq. 2, sigma^2 (1/n_T+1/n_C)(1/m+1/r)(1-rho), so within-unit correlation now LOWERS the MDE. The MDE multiplier stays the normal-z Bloom multiplier (documented as a deliberate large-sample approximation to Burlig's t). - power.py: equicorrelated variance in _compute_variance + _compute_required_n; input validation for ALL designs (n_pre>=1, n_post>=1, rho in [-1/(T-1), 1)) enforced BEFORE the 2x2-vs-panel router, so invalid two-period shapes no longer fall through silently; the (1-rho) factor applies at T=2 too (Burlig footnote 11, the m=r=1 case), so rho is never silently ignored and rho=0 recovers Bloom's 2*sigma^2; docstrings rewritten; PR-A under-review notes removed. - REGISTRY ## PowerAnalysis equation block rewritten (z not t; unified equicorrelated SE with the 2x2 as the m=r=1 special case; cluster-m and inverted-R^2 terms removed; both reference surfaces; checklist ticked). - New tests/test_methodology_power.py (Bloom Table 1; 2x2 + panel closed forms; literal-equicorrelated Monte-Carlo; sample_size<->mde round-trip; input-guard + rho-at-T=2 + compute_* wrapper validation; base-R qnorm parity). - benchmarks/R/generate_power_golden.R + benchmarks/data/r_power_golden.json. - tests/test_power.py: inverted test_icc_effect + test_extreme_icc to Burlig's sign. - references.rst: + Frison & Pocock (1992), McKenzie (2012) lineage. - docs/tutorials/06_power_analysis.ipynb: corrected rho cells + summary. - METHODOLOGY_REVIEW.md row -> Complete; TODO row removed; CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
… Eq.2 equicorrelated; tracker -> Complete Reconciles diff_diff/power.py with the Bloom (1995) + Burlig, Preonas & Woerman (2020) source audits (paper reviews added in PR-A #506). Behavior change: the analytical panel-DiD variance was the Moulton (1+(T-1)rho)/T factor (wrong period-scaling, ~4x too small at rho=0/m=r=5, AND opposite rho-sign). Replaced with the within-unit equicorrelated special case of Burlig Eq. 2, sigma^2 (1/n_T+1/n_C)(1/m+1/r)(1-rho), so within-unit correlation now LOWERS the MDE. The MDE multiplier stays the normal-z Bloom multiplier (documented as a deliberate large-sample approximation to Burlig's t). - power.py: equicorrelated variance in _compute_variance + _compute_required_n; input validation for ALL designs (n_pre>=1, n_post>=1, rho in [-1/(T-1), 1)) enforced BEFORE the 2x2-vs-panel router, so invalid two-period shapes no longer fall through silently; the (1-rho) factor applies at T=2 too (Burlig footnote 11, the m=r=1 case), so rho is never silently ignored and rho=0 recovers Bloom's 2*sigma^2; docstrings rewritten; PR-A under-review notes removed. - REGISTRY ## PowerAnalysis equation block rewritten (z not t; unified equicorrelated SE with the 2x2 as the m=r=1 special case; cluster-m and inverted-R^2 terms removed; both reference surfaces; checklist ticked). - New tests/test_methodology_power.py (Bloom Table 1; 2x2 + panel closed forms; literal-equicorrelated Monte-Carlo; sample_size<->mde round-trip; input-guard + rho-at-T=2 + compute_* wrapper validation; base-R qnorm parity). - benchmarks/R/generate_power_golden.R + benchmarks/data/r_power_golden.json. - tests/test_power.py: inverted test_icc_effect + test_extreme_icc to Burlig's sign. - references.rst: + Frison & Pocock (1992), McKenzie (2012) lineage. - docs/tutorials/06_power_analysis.ipynb: corrected rho cells + summary. - METHODOLOGY_REVIEW.md row -> Complete; TODO row removed; CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
Summary
docs/methodology/papers/, sourced only from the papers.PowerAnalysissurfaces and the source material; reconciliation is deferred to PR-B and tracked inTODO.md._compute_variancealready implements — docstring only.)Methodology references (required if estimator / math changes)
**Note:**+power.pydocstring notes + areferences.rstclarification, with reconciliation tracked inTODO.mdfor PR-B:1/√(1−R²)and cluster-sizemterms are not whatpower.pyimplements;T(1−T)allocation factor the code applies;(1+(T−1)ρ)/Tfactor is an equicorrelated/Moulton design effect, not Burlig's serial-correlation-robust (SCR) variance (Eq. 2) — an attribution overclaim pending PR-B re-attribution or implementation.Validation
pcpanelfor the panel path; a normal-based reference for the analytical path) are deferred to PR-B.Security / privacy
Generated with Claude Code