WooldridgeDiD: outcome-fit hint for OLS on binary/count outcomes #513
Merged
Overall Assessment ✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Review note: this was a static review. I could not execute |
Add a non-fatal UserWarning when WooldridgeDiD(method="ols") is used on a
binary ({0,1}) or non-negative integer-count outcome. Following Wooldridge
(2023), the warning notes that a matching nonlinear model (logit / Poisson)
is often the MORE APPROPRIATE specification for such outcomes: it imposes
parallel trends on the link/index scale rather than in levels (level-PT is
only valid for continuous/unbounded outcomes), and the paper's Section 5
simulations show the linear model both biased and less precise where the
nonlinear mean holds. It is a different identifying assumption than linear
OLS -- which fits depends on which parallel-trends restriction holds -- so
the hint is a recommended comparison, not an automatic switch or a free
efficiency upgrade. OLS remains a valid QMLE for any response (Table 1).
- _suggest_nonlinear_method: pure, non-fatal detector (binary -> logit,
non-negative count with >2 distinct -> poisson; fractional/continuous/
negative/non-numeric -> None). Bounded binomial-style integers are not
separately distinguished from unbounded counts (documented heuristic).
- fit() gate 0g (OLS-only, stacklevel=2) emits the hint on the full
outcome column before sample filtering; never alters the fit or raises
- TestOutcomeFitHint: detector units (incl. the bounded-support case) +
gate behavior + suppression + paper-faithful framing guard
(more-appropriate / biased / different assumption / recommended-comparison)
- Docs: REGISTRY Note (LPT-vs-IPT + Section 5 evidence + two-sided framing
provenance), method docstring, llms-full.txt, wooldridge_etwfe.rst, CHANGELOG
- TODO: remove the Tier A + Methodology/Correctness rows (#216)
Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
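The detector heuristic described in the commit message (binary -> logit, non-negative count with >2 distinct values -> poisson, everything else -> None) can be illustrated with a minimal sketch. This is not the library's `_suggest_nonlinear_method`: the function name `suggest_nonlinear_method_sketch`, the handling of missing values, and the requirement that a binary column contain both 0 and 1 are assumptions made for illustration.

```python
import math
from numbers import Real
from typing import Iterable, Optional


def suggest_nonlinear_method_sketch(values: Iterable[object]) -> Optional[str]:
    """Hypothetical stand-in for the detector; never raises on ordinary input."""
    distinct = set()
    for v in values:
        # Non-numeric entries give no suggestion. bool is numeric (True == 1).
        if not isinstance(v, Real):
            return None
        x = float(v)
        if math.isnan(x):
            continue  # assumption: missing values are skipped
        # Infinite, negative, or fractional values rule out both suggestions.
        if math.isinf(x) or x < 0 or x != math.floor(x):
            return None
        distinct.add(x)
    if distinct == {0.0, 1.0}:
        return "logit"  # binary outcome (assumption: both values must appear)
    if len(distinct) > 2:
        # Non-negative integers with >2 distinct values. Bounded binomial-style
        # integers are deliberately not separated from unbounded counts.
        return "poisson"
    return None


print(suggest_nonlinear_method_sketch([0, 1, 1, 0]))   # binary column
print(suggest_nonlinear_method_sketch([0, 2, 5, 9]))   # count column
print(suggest_nonlinear_method_sketch([0.3, 1.7]))     # continuous column
```

Because the sketch only inspects the support of the column, it shares the documented limitation: a bounded outcome such as 0..3 is treated like an unbounded count.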
4b3215a to a214168
🔁 AI review rerun (requested by @igerber)
Head SHA:
Overall Assessment ✅ Looks good
Executive Summary
Methodology
Code Quality
Performance
Maintainability
Tech Debt
Security
Documentation/Tests
Static review only; I could not execute |
Summary
- Add a non-fatal UserWarning (fit-time gate 0g, OLS-only) to WooldridgeDiD.fit() when method="ols" is used on a binary ({0, 1}) or non-negative integer-count outcome, noting that a matching nonlinear model (logit / Poisson) is often the more appropriate specification for such outcomes.
- Detection lives in _suggest_nonlinear_method (binary → logit; non-negative count with >2 distinct values → poisson; fractional / continuous / negative / non-numeric → None). Advisory only: it never alters the ATT/SE/inference paths and never raises.
- _filter_sample is row-preserving (the control group is expressed via the design matrix, not by dropping rows), so the full column and the estimation sample always share the same outcome support (pinned by a regression test).
- Remove the Tier A + Methodology/Correctness rows from TODO.md (Feature/wooldridge did #216), migrating the corrected-framing provenance (PR Wave 3 estimator observability: HonestDiD M=0 test, Wooldridge canonical-link warning, ARP vertex diagnostic #453 R1) into the REGISTRY note.

Methodology references (required if estimator / math changes)
- WooldridgeDiD (ETWFE): advisory outcome-fit hint only; no change to estimation or inference.
- docs/methodology/papers/wooldridge-2023-review.md.
- docs/methodology/REGISTRY.md § WooldridgeDiD "Nonlinear extensions": detection is a high-signal heuristic (binary {0, 1}; non-negative integer count with >2 distinct values → Poisson) that does not separate bounded binomial from unbounded counts or detect fractional outcomes; that would require user-supplied outcome-family metadata.

Validation
- tests/test_wooldridge.py: new TestOutcomeFitHint (detector unit cases incl. bool dtype / bounded-support / non-numeric; binary → logit and count → poisson emission; continuous + logit silence; cohort_trends smoke; warnings.filterwarnings suppression; paper-faithful framing guard; and a _filter_sample row-preserving invariant). Full tests/test_wooldridge.py passes.

Security / privacy
Generated with Claude Code
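
The advisory behavior described in this PR (a UserWarning with stacklevel=2 that never changes the fit, and that users can silence with warnings.filterwarnings) can be sketched with the standard library alone. The function `fit_ols_with_hint`, its placeholder "estimate", and the warning text are hypothetical stand-ins, not the WooldridgeDiD API or its actual message.

```python
import warnings


def fit_ols_with_hint(outcome, suggestion):
    """Hypothetical stand-in for an OLS fit that emits a non-fatal hint."""
    if suggestion is not None:
        warnings.warn(
            f"Outcome looks suited to method='{suggestion}'; "
            "consider comparing it with OLS.",  # illustrative wording only
            UserWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
    # Placeholder estimate: the hint never alters what is computed.
    return sum(outcome) / len(outcome)


# With warnings enabled, the hint is emitted once.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    estimate_with_hint = fit_ols_with_hint([0, 1, 1, 0], "logit")
n_emitted = len(caught)

# A standard filter silences it; the returned value is unchanged.
with warnings.catch_warnings(record=True) as caught_silenced:
    warnings.simplefilter("always")
    warnings.filterwarnings(
        "ignore", message="Outcome looks suited", category=UserWarning
    )
    estimate_silenced = fit_ols_with_hint([0, 1, 1, 0], "logit")
n_silenced = len(caught_silenced)

print(n_emitted, n_silenced, estimate_with_hint == estimate_silenced)
```

Filtering on a message prefix rather than on the whole UserWarning category keeps unrelated warnings visible, which is usually the safer choice when silencing an advisory hint.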