Prepare v0.2.0 release#7
Merged
Merged
Conversation
Owner
igerber
commented
Jan 2, 2026
- Add MultiPeriodDiD documentation to README with usage examples and API reference
- Bump version from 0.1.0 to 0.2.0
- Add feature bullet for multi-period analysis
- Fix license format for newer setuptools compatibility
- Add MultiPeriodDiD documentation to README with usage examples and API reference - Bump version from 0.1.0 to 0.2.0 - Add feature bullet for multi-period analysis - Fix license format for newer setuptools compatibility
igerber
added a commit
that referenced
this pull request
Apr 18, 2026
Addresses axis B findings #6 and #7 from the silent-failures audit: trop_global.py:448 outer alternating-min loop, trop_global.py:466 hard-coded range(20) inner FISTA loop, and trop_local.py:680 alternating-minimization loop all exited silently on max_iter exhaustion, returning the current iterate as if converged. - trop_global._solve_global_with_lowrank: thread a converged flag through the outer loop; count non-convergence events from the inner FISTA and surface the count in the outer warning for diagnostic context. One warn_if_not_converged call per solver invocation. - trop_local._estimate_model: thread a converged flag through the outer alternating-min loop; call warn_if_not_converged on exhaustion. - REGISTRY updated under TROP. New TestTROPConvergenceWarnings class (4 tests) exercises both global and local paths with forced non-convergence (max_iter=1, tol=1e-15) and a convergent negative control. Notable: the default TROP local config (max_iter=100, tol=1e-6) does not converge within max_iter on typical synthetic panels, so this PR surfaces a previously silent non-convergence that affected routine user fits. No numerical change in the returned iterate; the warning is additive. Axis-B regression-lint baseline: 5 -> 2 silent range(max_iter) loops remaining (minor loops in honest_did/power not yet addressed). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
6 tasks
igerber
added a commit
that referenced
this pull request
Apr 26, 2026
Wave 3 #6+#7 of the dCDH by_path follow-up sequence (after PR #378 shipped #5 by_path + controls). Removes the two NotImplementedError gates at chaisemartin_dhaultfoeuille.py:1014-1023 and adds: - New `path_cumulated_event_study` Results field surfacing the cumulated level effect delta_l per path under trends_linear=True (mirrors the global linear_trends_effects cumulation; this is what R returns under did_multiplegt_dyn(..., by_path, trends_lin)). - `set_ids` parameter threaded through the four per-path IF helpers so trends_nonparam's set-restricted control pool reaches per-path analytical SE, bootstrap, placebos, and sup-t bands automatically. - to_dataframe(level="by_path") gains cumulated_effect / cumulated_se columns (always present, NaN-when-None — mirrors cband_*). - summary() renders a per-path "Cumulated Level Effects" sub-section. Validated against R DIDmultiplegtDYN 2.3.3 via two new golden scenarios: - single_baseline_multi_path_by_path_trends_lin (custom DGP: F_g >= 4, cohort-single-path, n_periods=13) — per-path cumulated point estimates match R bit-exactly (POINT_RTOL=1e-9), cumulated SE within ±20% - multi_path_reversible_by_path_trends_nonparam — per-path point estimates AND placebos match R bit-exactly, SE within ±15% Placebo parity for trends_linear is intentionally skipped: R's per-path placebo re-runs on the path-restricted subsample with different control eligibility than Python's global-then-disaggregate architecture, so the divergence is methodological, not a tolerance issue. Internal regression covers placebo + trends_linear (finite values, bootstrap inheritance). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
igerber
added a commit
that referenced
this pull request
Apr 26, 2026
Wave 3 #6+#7 of the dCDH by_path follow-up sequence (after PR #378 shipped #5 by_path + controls). Removes the two NotImplementedError gates at chaisemartin_dhaultfoeuille.py:1014-1023 and adds: - New `path_cumulated_event_study` Results field surfacing the cumulated level effect delta_l per path under trends_linear=True (mirrors the global linear_trends_effects cumulation; this is what R returns under did_multiplegt_dyn(..., by_path, trends_lin)). - `set_ids` parameter threaded through the four per-path IF helpers so trends_nonparam's set-restricted control pool reaches per-path analytical SE, bootstrap, placebos, and sup-t bands automatically. - to_dataframe(level="by_path") gains cumulated_effect / cumulated_se columns (always present, NaN-when-None — mirrors cband_*). - summary() renders a per-path "Cumulated Level Effects" sub-section. Validated against R DIDmultiplegtDYN 2.3.3 via two new golden scenarios: - single_baseline_multi_path_by_path_trends_lin (custom DGP: F_g >= 4, cohort-single-path, n_periods=13) — per-path cumulated point estimates match R bit-exactly (POINT_RTOL=1e-9), cumulated SE within ±20% - multi_path_reversible_by_path_trends_nonparam — per-path point estimates AND placebos match R bit-exactly, SE within ±15% Placebo parity for trends_linear is intentionally skipped: R's per-path placebo re-runs on the path-restricted subsample with different control eligibility than Python's global-then-disaggregate architecture, so the divergence is methodological, not a tolerance issue. Internal regression covers placebo + trends_linear (finite values, bootstrap inheritance). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
HanomicsIMF
pushed a commit
to HanomicsIMF/diff-diff
that referenced
this pull request
Apr 27, 2026
Closes BR/DR foundation gap igerber#6 from project_br_dr_foundation.md: BusinessReport and DiagnosticReport now name what the headline scalar actually represents as an estimand, for each of the 16 result classes. Baker et al. (2025) Step 2 ("define the target parameter") was previously in BR's next_steps list but not done by BR itself — this PR closes that gap. New top-level ``target_parameter`` block (additive schema change; experimental per REPORTING.md stability policy): { "name": str, # stakeholder-facing name "definition": str, # plain-English description "aggregation": str, # machine-readable dispatch tag "headline_attribute": str, # which raw result attribute "reference": str, # REGISTRY.md citation pointer } Schema placement: top-level block (user preference, selected via AskUserQuestion in planning). Aggregation tags include "simple", "event_study", "group", "2x2", "twfe", "iw", "stacked", "ddd", "staggered_ddd", "synthetic", "factor_model", "M", "l", "l_x", "l_fd", "l_x_fd", "dose_overall", "pt_all_combined", "pt_post_single_baseline", "unknown". Per-estimator dispatch lives in the new ``diff_diff/_reporting_helpers.py::describe_target_parameter`` (own module rather than business_report / diagnostic_report to avoid circular-import risk — plan-review LOW igerber#7). All 17 result classes covered (16 from _APPLICABILITY + BaconDecompositionResults); exhaustiveness locked in by TestTargetParameterCoversEveryResultClass. Fit-time config reads: - ``EfficientDiDResults.pt_assumption`` branches the aggregation tag between pt_all_combined and pt_post_single_baseline. - ``StackedDiDResults.clean_control`` varies the definition clause (never_treated / strict / not_yet_treated). - ``ChaisemartinDHaultfoeuilleResults.L_max`` + ``covariate_residuals`` + ``linear_trends_effects`` branches the dCDH estimand between DID_M / DID_l / DID^X_l / DID^{fd}_l / DID^{X,fd}_l. Fixed-tag branches (per plan-review CRITICAL igerber#1 and igerber#2): - ``CallawaySantAnna`` / ``ImputationDiD`` / ``TwoStageDiD`` / ``WooldridgeDiD``: the fit-time ``aggregate`` kwarg does not change the ``overall_att`` scalar — it only populates additional horizon / group tables on the result object. Disambiguating those tables in prose is tracked under gap igerber#9. - ``ContinuousDiDResults``: the PT-vs-SPT regime is a user-level assumption, not a library setting. Emits a single "dose_overall" tag with disjunctive definition naming both regime readings (ATT^loc under PT, ATT^glob under SPT). Prose rendering: - BR ``_render_summary``: emits "Target parameter: <name>." after the headline sentence (short name only; full definition lives in the full_report and schema). - BR ``_render_full_report``: "## Target Parameter" section between "## Headline" and "## Identifying Assumption". - DR ``_render_overall_interpretation``: mirror sentence. - DR ``_render_dr_full_report``: "## Target Parameter" section with name, definition, aggregation tag, headline attribute, and reference. Cross-surface parity: both BR and DR consume the same helper (the single source of truth), so their ``target_parameter`` blocks are byte-identical (verified by TestTargetParameterCrossSurfaceParity). Tests: 37 new (TestTargetParameterPerEstimator + TestTargetParameterFitConfigReads + TestTargetParameterCoversEveryResultClass + TestTargetParameterCrossSurfaceParity + TestTargetParameterProseRendering). Existing BR/DR top-level-key contract tests updated to include ``target_parameter``. Total 319 tests pass (282 prior + 37 new). Docs: REPORTING.md gains a "Target parameter" section documenting the per-estimator dispatch and schema shape. business_report.rst and diagnostic_report.rst note the new field with a pointer to REPORTING.md. CHANGELOG entry under Unreleased. Out of scope: REGISTRY.md per-estimator "Target parameter" sub-sections (plan-review additional-note); the reporting-layer doc in REPORTING.md is the current source of truth. A follow-up docs PR can land those sub-sections if maintainers want the registry to own the canonical wording directly. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
igerber
added a commit
that referenced
this pull request
May 22, 2026
Closes the WooldridgeDiD (ETWFE) methodology-review-tracker promotion in METHODOLOGY_REVIEW.md (In Progress → Complete), following the primary-source review for Wooldridge (2025) merged in PR-A (#484). Adds two paper-driven implementation surfaces and extends R-parity goldens to the nonlinear paths. Implementation: - `aggregate(weights="cohort_share")` on WooldridgeDiDResults implements paper Eqs. 7.4 (simple-overall) and 7.6 (event-time, restricted to k>=0) cohort-share aggregation weights as an opt-in alternative to the default cell-count weighting (matching Stata `jwdid_estat`). Inference fields fail-closed to NaN with UserWarning per paper Section 7.5 conditional-on-shares semantics; raises on `survey_design` (design-consistent totals deferred); raises on `type ∈ {"group","calendar"}` (no paper closed-form); raises on bootstrap fits (no matching bootstrap variant). Closes TODO row 95. - `cohort_trends=True` on `WooldridgeDiD.__init__` adds linear `dg_i · t` cohort-specific trend interactions (paper Section 8 / Eq. 8.1) for the OLS path. Rejects on logit/poisson per paper Section 8 OLS scope; rejects on survey_design pending full-dummy/TSL validation; enforces per-cohort pre-period identification check (≥ 2 observed pre-periods per treated cohort). Auto-routes to full-dummy mode regardless of vcov_type. Closes the PR-A Requirements Checklist heterogeneous-trends gap. Tests: - `tests/test_methodology_wooldridge.py` extended with 6 paper-equation-numbered methodology classes (Theorem 3.1, Proposition 5.1, Section 6 event study, Section 7 aggregation paths, Section 8 heterogeneous trends, Section 10 unbalanced panels) + `TestW2025LibraryDeviations` consolidating 5 surviving deviations. Mirrors the HAD PR #473 precedent. - Two new R-parity surface classes (`TestWooldridgeParityRPoisson`, `TestWooldridgeParityRLogit`) lock the structural surface against R `etwfe(family=...)` log-link goldens. - 209 tests total (60 methodology + 149 R-parity + unit regressions). R Goldens: - `benchmarks/R/generate_wooldridge_golden.R` extended with Poisson + logit DGPs via R `etwfe`; augmented panel CSV retains the same seed-generated `y_pois` + `y_logit` columns for cross-language reproducibility. - `benchmarks/R/requirements.R` pins `etwfe >= 0.5.0`. Tracker promotion: - METHODOLOGY_REVIEW.md L52 status flip with merge date; detail section L583-605 rewritten to the Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns template mirroring HAD / ContinuousDiD / DCDH. L27 example re-pointed; priority queue items #7-#10 renumbered to #6-#9. - REGISTRY.md `## WooldridgeDiD (ETWFE)` extended with `### Deviations from the paper / from R / library extensions` block consolidating 7 surviving deviations + opt-in notes for cohort_share + cohort_trends + survey rejection + bootstrap cohort_share rejection contracts. - CHANGELOG.md `[Unreleased]` `### Added` documents the new parameters, R-parity extension, and tracker flip. - `docs/methodology/papers/wooldridge-2025-review.md` Requirements Checklist + Gaps & Uncertainties items 1 + 11 marked `**Status:** Closed in PR-B`. - `docs/api/wooldridge_etwfe.rst` updated with weighting-scheme notes alongside the existing aggregation table. Second of two PRs for the WooldridgeDiD methodology-review-tracker promotion. PR-A merged at e416aed (#484). Co-Authored-By: Claude Opus 4.7 <[email protected]>
HanomicsIMF
pushed a commit
to HanomicsIMF/diff-diff
that referenced
this pull request
May 22, 2026
Flip the ChaisemartinDHaultfoeuille (DCDH) row from In Progress to Complete. Adds the Verified Components / Test Coverage / Corrections Made / Deviations / Outstanding Concerns detail section mirroring the ContinuousDiD (PR igerber#476) and HAD (PR igerber#473) precedents. Consolidates 7 DCDH deviations from the paper, from R DIDmultiplegtDYN, and library extensions into a labeled REGISTRY surface per the AI-review "Documenting Deviations" convention. CHANGELOG [Unreleased] gains a new Added entry. L27 In Progress example re-pointed to WooldridgeDiD; L1289 priority-order queue item igerber#6 removed and items igerber#7-igerber#11 renumbered to igerber#6-igerber#10. No source code changes, no new tests, no new docstrings — documentation consolidation only. Co-Authored-By: Claude Opus 4.7 <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.