Prepare v0.2.0 release by igerber · Pull Request #7 · igerber/diff-diff

igerber · 2026-01-02T16:04:14Z

Add MultiPeriodDiD documentation to README with usage examples and API reference
Bump version from 0.1.0 to 0.2.0
Add feature bullet for multi-period analysis
Fix license format for newer setuptools compatibility


Addresses axis B findings #6 and #7 from the silent-failures audit: trop_global.py:448 outer alternating-min loop, trop_global.py:466 hard-coded range(20) inner FISTA loop, and trop_local.py:680 alternating-minimization loop all exited silently on max_iter exhaustion, returning the current iterate as if converged.

- trop_global._solve_global_with_lowrank: thread a converged flag through the outer loop; count non-convergence events from the inner FISTA and surface the count in the outer warning for diagnostic context. One warn_if_not_converged call per solver invocation.
- trop_local._estimate_model: thread a converged flag through the outer alternating-min loop; call warn_if_not_converged on exhaustion.
- REGISTRY updated under TROP.

New TestTROPConvergenceWarnings class (4 tests) exercises both global and local paths with forced non-convergence (max_iter=1, tol=1e-15) and a convergent negative control.

Notable: the default TROP local config (max_iter=100, tol=1e-6) does not converge within max_iter on typical synthetic panels, so this PR surfaces a previously silent non-convergence that affected routine user fits. No numerical change in the returned iterate; the warning is additive.

Axis-B regression-lint baseline: 5 -> 2 silent range(max_iter) loops remaining (minor loops in honest_did/power not yet addressed).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
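The converged-flag pattern this commit message describes can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the diff-diff code: `fixed_point` and `_warn_on_exhaustion` are hypothetical names, and the helper's signature is invented here because the signature of the real `warn_if_not_converged` is not shown in this thread.

```python
import warnings


def _warn_on_exhaustion(converged, max_iter, tol, n_inner_failures=0):
    """Hypothetical stand-in for a warn-if-not-converged helper."""
    if converged:
        return
    msg = f"solver did not converge within max_iter={max_iter} (tol={tol})"
    if n_inner_failures:
        # Optional diagnostic context, e.g. a count of inner-loop failures.
        msg += f"; {n_inner_failures} inner solves also hit their iteration cap"
    warnings.warn(msg, RuntimeWarning, stacklevel=3)


def fixed_point(step, x0, max_iter=100, tol=1e-6):
    """Iterate x <- step(x); return the last iterate and a converged flag."""
    x = x0
    converged = False
    for _ in range(max_iter):
        x_new = step(x)
        if abs(x_new - x) < tol:
            x = x_new
            converged = True
            break
        x = x_new
    # One warning call per solver invocation; the returned iterate is
    # the same value the loop would have returned silently before.
    _warn_on_exhaustion(converged, max_iter, tol)
    return x, converged
```

Calling `fixed_point(math.cos, 1.0, max_iter=1, tol=1e-15)` forces the exhaustion path and raises the warning, which mirrors the forced non-convergence setup (max_iter=1, tol=1e-15) the tests above are described as using.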

Wave 3 #6+#7 of the dCDH by_path follow-up sequence (after PR #378 shipped #5 by_path + controls). Removes the two NotImplementedError gates at chaisemartin_dhaultfoeuille.py:1014-1023 and adds:

- New `path_cumulated_event_study` Results field surfacing the cumulated level effect delta_l per path under trends_linear=True (mirrors the global linear_trends_effects cumulation; this is what R returns under did_multiplegt_dyn(..., by_path, trends_lin)).
- `set_ids` parameter threaded through the four per-path IF helpers so trends_nonparam's set-restricted control pool reaches per-path analytical SE, bootstrap, placebos, and sup-t bands automatically.
- to_dataframe(level="by_path") gains cumulated_effect / cumulated_se columns (always present, NaN-when-None; mirrors cband_*).
- summary() renders a per-path "Cumulated Level Effects" sub-section.

Validated against R DIDmultiplegtDYN 2.3.3 via two new golden scenarios:

- single_baseline_multi_path_by_path_trends_lin (custom DGP: F_g >= 4, cohort-single-path, n_periods=13): per-path cumulated point estimates match R bit-exactly (POINT_RTOL=1e-9), cumulated SE within ±20%
- multi_path_reversible_by_path_trends_nonparam: per-path point estimates AND placebos match R bit-exactly, SE within ±15%

Placebo parity for trends_linear is intentionally skipped: R's per-path placebo re-runs on the path-restricted subsample with different control eligibility than Python's global-then-disaggregate architecture, so the divergence is methodological, not a tolerance issue. Internal regression covers placebo + trends_linear (finite values, bootstrap inheritance).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

dCDH by_path: lift trends_linear + trends_nonparam gates (Wave 3 #6+#7)

Closes BR/DR foundation gap igerber#6 from project_br_dr_foundation.md: BusinessReport and DiagnosticReport now name what the headline scalar actually represents as an estimand, for each of the 16 result classes. Baker et al. (2025) Step 2 ("define the target parameter") was previously in BR's next_steps list but not done by BR itself; this PR closes that gap.

New top-level ``target_parameter`` block (additive schema change; experimental per REPORTING.md stability policy):

    {
        "name": str,                # stakeholder-facing name
        "definition": str,          # plain-English description
        "aggregation": str,         # machine-readable dispatch tag
        "headline_attribute": str,  # which raw result attribute
        "reference": str,           # REGISTRY.md citation pointer
    }

Schema placement: top-level block (user preference, selected via AskUserQuestion in planning). Aggregation tags include "simple", "event_study", "group", "2x2", "twfe", "iw", "stacked", "ddd", "staggered_ddd", "synthetic", "factor_model", "M", "l", "l_x", "l_fd", "l_x_fd", "dose_overall", "pt_all_combined", "pt_post_single_baseline", "unknown".

Per-estimator dispatch lives in the new ``diff_diff/_reporting_helpers.py::describe_target_parameter`` (own module rather than business_report / diagnostic_report to avoid circular-import risk; plan-review LOW igerber#7). All 17 result classes covered (16 from _APPLICABILITY + BaconDecompositionResults); exhaustiveness locked in by TestTargetParameterCoversEveryResultClass.

Fit-time config reads:

- ``EfficientDiDResults.pt_assumption`` branches the aggregation tag between pt_all_combined and pt_post_single_baseline.
- ``StackedDiDResults.clean_control`` varies the definition clause (never_treated / strict / not_yet_treated).
- ``ChaisemartinDHaultfoeuilleResults.L_max`` + ``covariate_residuals`` + ``linear_trends_effects`` branches the dCDH estimand between DID_M / DID_l / DID^X_l / DID^{fd}_l / DID^{X,fd}_l.

Fixed-tag branches (per plan-review CRITICAL igerber#1 and igerber#2):

- ``CallawaySantAnna`` / ``ImputationDiD`` / ``TwoStageDiD`` / ``WooldridgeDiD``: the fit-time ``aggregate`` kwarg does not change the ``overall_att`` scalar; it only populates additional horizon / group tables on the result object. Disambiguating those tables in prose is tracked under gap igerber#9.
- ``ContinuousDiDResults``: the PT-vs-SPT regime is a user-level assumption, not a library setting. Emits a single "dose_overall" tag with disjunctive definition naming both regime readings (ATT^loc under PT, ATT^glob under SPT).

Prose rendering:

- BR ``_render_summary``: emits "Target parameter: <name>." after the headline sentence (short name only; full definition lives in the full_report and schema).
- BR ``_render_full_report``: "## Target Parameter" section between "## Headline" and "## Identifying Assumption".
- DR ``_render_overall_interpretation``: mirror sentence.
- DR ``_render_dr_full_report``: "## Target Parameter" section with name, definition, aggregation tag, headline attribute, and reference.

Cross-surface parity: both BR and DR consume the same helper (the single source of truth), so their ``target_parameter`` blocks are byte-identical (verified by TestTargetParameterCrossSurfaceParity).

Tests: 37 new (TestTargetParameterPerEstimator + TestTargetParameterFitConfigReads + TestTargetParameterCoversEveryResultClass + TestTargetParameterCrossSurfaceParity + TestTargetParameterProseRendering). Existing BR/DR top-level-key contract tests updated to include ``target_parameter``. Total 319 tests pass (282 prior + 37 new).

Docs: REPORTING.md gains a "Target parameter" section documenting the per-estimator dispatch and schema shape. business_report.rst and diagnostic_report.rst note the new field with a pointer to REPORTING.md. CHANGELOG entry under Unreleased.

Out of scope: REGISTRY.md per-estimator "Target parameter" sub-sections (plan-review additional-note); the reporting-layer doc in REPORTING.md is the current source of truth. A follow-up docs PR can land those sub-sections if maintainers want the registry to own the canonical wording directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Closes the WooldridgeDiD (ETWFE) methodology-review-tracker promotion in METHODOLOGY_REVIEW.md (In Progress → Complete), following the primary-source review for Wooldridge (2025) merged in PR-A (#484). Adds two paper-driven implementation surfaces and extends R-parity goldens to the nonlinear paths.

Implementation:

- `aggregate(weights="cohort_share")` on WooldridgeDiDResults implements paper Eqs. 7.4 (simple-overall) and 7.6 (event-time, restricted to k>=0) cohort-share aggregation weights as an opt-in alternative to the default cell-count weighting (matching Stata `jwdid_estat`). Inference fields fail-closed to NaN with UserWarning per paper Section 7.5 conditional-on-shares semantics; raises on `survey_design` (design-consistent totals deferred); raises on `type ∈ {"group","calendar"}` (no paper closed-form); raises on bootstrap fits (no matching bootstrap variant). Closes TODO row 95.
- `cohort_trends=True` on `WooldridgeDiD.__init__` adds linear `dg_i · t` cohort-specific trend interactions (paper Section 8 / Eq. 8.1) for the OLS path. Rejects on logit/poisson per paper Section 8 OLS scope; rejects on survey_design pending full-dummy/TSL validation; enforces per-cohort pre-period identification check (≥ 2 observed pre-periods per treated cohort). Auto-routes to full-dummy mode regardless of vcov_type. Closes the PR-A Requirements Checklist heterogeneous-trends gap.

Tests:

- `tests/test_methodology_wooldridge.py` extended with 6 paper-equation-numbered methodology classes (Theorem 3.1, Proposition 5.1, Section 6 event study, Section 7 aggregation paths, Section 8 heterogeneous trends, Section 10 unbalanced panels) + `TestW2025LibraryDeviations` consolidating 5 surviving deviations. Mirrors the HAD PR #473 precedent.
- Two new R-parity surface classes (`TestWooldridgeParityRPoisson`, `TestWooldridgeParityRLogit`) lock the structural surface against R `etwfe(family=...)` log-link goldens.
- 209 tests total (60 methodology + 149 R-parity + unit regressions).

R Goldens:

- `benchmarks/R/generate_wooldridge_golden.R` extended with Poisson + logit DGPs via R `etwfe`; augmented panel CSV retains the same seed-generated `y_pois` + `y_logit` columns for cross-language reproducibility.
- `benchmarks/R/requirements.R` pins `etwfe >= 0.5.0`.

Tracker promotion:

- METHODOLOGY_REVIEW.md L52 status flip with merge date; detail section L583-605 rewritten to the Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns template mirroring HAD / ContinuousDiD / DCDH. L27 example re-pointed; priority queue items #7-#10 renumbered to #6-#9.
- REGISTRY.md `## WooldridgeDiD (ETWFE)` extended with `### Deviations from the paper / from R / library extensions` block consolidating 7 surviving deviations + opt-in notes for cohort_share + cohort_trends + survey rejection + bootstrap cohort_share rejection contracts.
- CHANGELOG.md `[Unreleased]` `### Added` documents the new parameters, R-parity extension, and tracker flip.
- `docs/methodology/papers/wooldridge-2025-review.md` Requirements Checklist + Gaps & Uncertainties items 1 + 11 marked `**Status:** Closed in PR-B`.
- `docs/api/wooldridge_etwfe.rst` updated with weighting-scheme notes alongside the existing aggregation table.

Second of two PRs for the WooldridgeDiD methodology-review-tracker promotion. PR-A merged at e416aed (#484).

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Flip the ChaisemartinDHaultfoeuille (DCDH) row from In Progress to Complete. Adds the Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns detail section mirroring the ContinuousDiD (PR igerber#476) and HAD (PR igerber#473) precedents. Consolidates 7 DCDH deviations from the paper, from R DIDmultiplegtDYN, and library extensions into a labeled REGISTRY surface per the AI-review "Documenting Deviations" convention.

CHANGELOG [Unreleased] gains a new Added entry. L27 In Progress example re-pointed to WooldridgeDiD; L1289 priority-order queue item igerber#6 removed and items igerber#7-igerber#11 renumbered to igerber#6-igerber#10.

No source code changes, no new tests, no new docstrings: documentation consolidation only.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Prepare v0.2.0 release

07ed82b

- Add MultiPeriodDiD documentation to README with usage examples and API reference
- Bump version from 0.1.0 to 0.2.0
- Add feature bullet for multi-period analysis
- Fix license format for newer setuptools compatibility

igerber merged commit 0cf0687 into main Jan 2, 2026

igerber deleted the claude/prepare-release-CWW9Z branch January 2, 2026 16:04

igerber mentioned this pull request Feb 16, 2026

Add workflow improvements to reduce PR review rounds #157

Merged

igerber mentioned this pull request Apr 18, 2026

Surface PowerAnalysis simulation-failure count and narrow except clause #326

Merged

igerber mentioned this pull request Apr 26, 2026

dCDH by_path: lift trends_linear + trends_nonparam gates (Wave 3 #6+#7) #393

Merged

6 tasks

igerber added a commit that referenced this pull request Apr 26, 2026

Merge pull request #393 from igerber/dcdh-by-path-trends

c6cb864

dCDH by_path: lift trends_linear + trends_nonparam gates (Wave 3 #6+#7)

igerber mentioned this pull request May 21, 2026

ChaisemartinDHaultfoeuille (DCDH) methodology-review-tracker promotion: In Progress -> Complete #481

Merged

igerber merged 1 commit into main from claude/prepare-release-CWW9Z
