1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Added
- **New estimator: `SyntheticControl` — classic Synthetic Control Method (Abadie, Diamond & Hainmueller 2010; Abadie & Gardeazabal 2003).** Standalone estimator (`diff_diff/synthetic_control.py`) + `SyntheticControlResults` (`diff_diff/synthetic_control_results.py`) + `synthetic_control()` convenience function, exported from `diff_diff`. Builds a single treated unit's counterfactual as a convex combination of never-treated donor units — **donor (unit) weights only**, no time weights or ridge, distinct from `SyntheticDiD`. The inner simplex-constrained weighted-LS solve `W*(V)` reuses `utils._sc_weight_fw` (folding `V^½` into the predictor matrix, `intercept=False`, `zeta=0`); the diagonal predictor-importance matrix `V` is selected data-driven by minimizing pre-period outcome MSPE (`v_method="nested"`, softmax-on-simplex multistart Nelder-Mead + Powell polish) or supplied by the user (`v_method="custom"`). Predictors are built from `predictors`/`predictor_window`/`predictors_op`, `special_predictors`, and per-period outcome lags (`pre_period_outcomes`), in the R `Synth::dataprep` row order; per-row standardization (SD over donors+treated, ddof=1) matches the R `Synth::synth` source. Reports the gap path (`α̂_1t = Y_1t − Σ_j w_j Y_jt`), `att` (mean post-period gap), `pre_rmspe`, donor weights, `v_weights`, and a predictor-balance table. **No analytical standard error** — `se`/`t_stat`/`p_value`/`conf_int` are NaN (in-space placebo permutation inference with the post/pre RMSPE-ratio statistic is planned for a follow-up release; `_placebo_gaps`/`_rmspe_ratio`/`_fit_snapshot` are reserved on the results object). 
Ten validation gates baked in: predictor-period leakage, absorbing post-period suffix + no-anticipation cross-check against the treatment column, post-period canonicalization, donor-pool filtering before period derivation, empty-window rejection, poor-pre-fit `UserWarning` (RMSPE > SD of treated pre-outcomes), duplicate-predictor-label rejection, inner-solve non-convergence warning, order-independent gap-path rebuild, and the `standardize="none"` deviation; plus fail-closed `custom_v` cross-field rules and degenerate single-donor / single-pre-period handling. **R-`Synth` parity** (`tests/test_methodology_synthetic_control.py`, fixtures generated by `benchmarks/R/generate_synth_basque_golden.R` into `tests/data/`): two-tier on the Basque Country study — Tier-1 feeds R's `solution.v` via `custom_v` and reproduces the published donor weights (region 10 Cataluña 0.851 + region 14 Madrid 0.149) to `atol=1e-3` deterministically; Tier-2 (`@pytest.mark.slow`) checks the data-driven nested fit lands in a tolerance band (the nested `V` legitimately differs because the outer objective uses all pre periods, not R's `time.optimize.ssr` window). Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (with `**Deviation from R:** standardize="none"` and `**Note:**` labels for the standardization formula, objective window, softmax `V` parametrization, and 1×SD poor-fit threshold), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`.
- **ConleySpatialHAC methodology-review-tracker promotion: In Progress → Complete.** Closes the Conley (1999) *Journal of Econometrics* 92(1) primary-source review on the methodology-review tracker. The paper review on file at `docs/methodology/papers/conley-1999-review.md` was previously merged (2026-05-09); this PR is the F.L.I.P. consolidation — new `tests/test_methodology_conley.py` with paper-equation-numbered Verified Components walk-through (~1600 LoC; 10 classes; 60 tests, 5 of them `@pytest.mark.slow`). Coverage: Eq. 4.2 cross-sectional sandwich (pairwise-distance specialization; the project's paper review identifies Eq. 4.2 page 18 as the real-valued/pairwise form, with Eq. 3.13 reserved for the lattice-indexed form), Eq. 4.2 HC0 + rank-1 limits, Andrews (1991) HAC lag truncation matching `conleyreg::time_dist.cpp`, haversine convention with Earth radius 6371.01 km, Phase 2 panel block-decomposed sandwich at `atol=1e-12`, sparse k-d-tree dense-vs-sparse bit-identity (Wave A #120 numerical correctness), and R `conleyreg` v0.1.9 parity at `atol=1e-6` on 6 fixtures (3 cross-sectional + 3 panel) plus the sparse-forced and time-asymmetric kernel parity contracts. Three dedicated deviations-area classes: `TestConleyLibraryExtensions` (Wave A library extensions — combined spatial+cluster product kernel #119, callable conley_metric validation #123, sparse k-d-tree activation #120, indefiniteness guard), `TestConleyDeviationsFromR` (1-D radial Bartlett vs paper's 2-D separable Eq. 3.14, time-label normalization via `np.unique`, independent temporal kernel deferred), and `TestConleyDeferrals` (5 fail-closed `NotImplementedError`/`TypeError` contracts: LinearRegression + survey_design, DiD/MPD/TWFE + survey_design, Conley + weights, SyntheticDiD + Conley, wild_bootstrap + Conley). 
Methodology-anchored tests extracted from `tests/test_conley_vcov.py`: full classes `TestConleyDirectHelper`, `TestConleyReductions`, `TestConleyReductionsAddendum`, `TestConleyParityR`, `TestConleyParitySpacetime`, `TestConleyPanelHelper`, `TestConleySparseRParityForced`; plus methodology-anchored tests from `TestConleyKernels`, `TestConleyDistanceMetrics`, `TestConleySparse`. File drops 4248 → 3113 lines after extraction. Defensive surface preserved: input validation, NaN/inf guards, dispatch-level validity, estimator-level integration smoke tests, set_params atomicity, sparse-path activation thresholds + density-gate fallback. `METHODOLOGY_REVIEW.md` row L91 promoted to **Complete** with `Last Review = 2026-05-26`; detail block rewritten with Verified Components / Test Coverage / R Comparison Results inline table / Corrections Made / Deviations / Outstanding Concerns. Priority queue at L1386 pruned: PreTrendsPower removed (already Complete since 2026-05-19) and ConleySpatialHAC removed (this PR); substantive-review-blocked renumbered #2-#5 → #1-#4 and consolidation-pass-blocked renumbered #6-#8 → #5-#6.
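
The quantities the `SyntheticControl` entry reports (gap path, `att`, `pre_rmspe`, and the poor-pre-fit check) can be illustrated with a minimal sketch. This is not the library's implementation: `gap_path`, `summarize`, and the toy numbers are hypothetical, the donor weights are taken as given rather than solved for, and using the sample SD (ddof=1) in the poor-fit check is an assumption.

```python
import statistics
from math import sqrt


def gap_path(treated, donors, weights):
    """Per-period gap: Y_1t - sum_j w_j * Y_jt (hypothetical helper)."""
    # Donor weights are assumed to lie on the unit simplex.
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    return [
        y1 - sum(w * donor[t] for w, donor in zip(weights, donors))
        for t, y1 in enumerate(treated)
    ]


def summarize(gaps, n_pre):
    """Return (att, pre_rmspe): mean post-period gap and pre-period RMSPE."""
    pre, post = gaps[:n_pre], gaps[n_pre:]
    pre_rmspe = sqrt(sum(g * g for g in pre) / len(pre))
    att = sum(post) / len(post)
    return att, pre_rmspe


# Toy panel (made-up numbers): 3 pre-periods, 2 post-periods, 2 donors.
treated = [1.5, 2.0, 2.5, 5.0, 5.5]
donors = [[1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.0, 2.0, 2.0, 2.0]]
weights = [0.5, 0.5]

gaps = gap_path(treated, donors, weights)
att, pre_rmspe = summarize(gaps, n_pre=3)

# Poor-fit rule from the entry: pre-period RMSPE exceeding the SD of the
# treated unit's pre-period outcomes (ddof=1 here is an assumption).
poor_fit = pre_rmspe > statistics.stdev(treated[:3])
```

With these toy inputs the synthetic path matches the treated unit exactly before treatment, so the whole post-period gap is attributed to the intervention.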

### Added / Changed
1 change: 1 addition & 0 deletions README.md
@@ -108,6 +108,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`.
- [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance
- [SpilloverDiD](https://diff-diff.readthedocs.io/en/stable/api/spillover.html) - Butts (2021) ring-indicator spillover-aware DiD identifying direct effect on treated + per-ring spillover on near-control units; handles non-staggered and staggered timing; supports survey-design variance under `survey_design=` for HC1 / CR1 (Wave E.1 Binder TSL) and Conley (Wave E.2 panel-aware stratified-Conley sandwich on per-period PSU totals; extended in Wave E.2 follow-up to `conley_lag_cutoff > 0` via panel-block composition with within-PSU serial Bartlett HAC — `lag>0` requires an effective PSU via explicit `survey_design.psu` or injected `cluster=<col>`); `SurveyDesign.subpopulation()` preserves full-design `n_psu` / `df_survey` via zero-padded scores (Wave E.3, R `svyrecvar(subset())` form)
- [SyntheticDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - Synthetic DiD combining standard DiD and synthetic control for few treated units
- [SyntheticControl](https://diff-diff.readthedocs.io/en/stable/api/synthetic_control.html) - Abadie, Diamond & Hainmueller (2010) classic synthetic control for a single treated unit (donor-weight counterfactual, nested/custom V; no inference in this release — permutation/placebo planned)
- [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html) - triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility
- [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html) - Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves
- [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html) - de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲ = 0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲ > 0`, continuous-near-d̲ or mass-point), with a multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release - repeated cross-sections rejected by the validator. Alias `HAD`.
1 change: 1 addition & 0 deletions TODO.md
@@ -82,6 +82,7 @@ Deferred items from PR reviews that were not addressed before merge.
| ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) |
| Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium |
| Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low |
| SyntheticControl: `SyntheticControlResults` not wired into the practitioner / DiagnosticReport / BusinessReport routing, so routing SCM results through those tools yields generic parallel-trends/HonestDiD guidance that doesn't fit SCM. Add SCM to the native-routed rejection sets (mirror SDiD/TROP) and surface SCM-native diagnostics (pre-fit / in-space placebo / in-time placebo / leave-one-out). Deferred to PR-2, where it pairs with the placebo-inference layer those reports would surface. | `practitioner.py`, `diagnostic_report.py`, `business_report.py` | SCM PR-1 → PR-2 | Medium |
| ContinuousDiD deferred CGBS 2024 extensions: (a) `covariates=` kwarg not implemented (matches R `contdid` v0.1.0); (b) discrete-treatment saturated regression deferred (integer-valued dose currently warned, not routed to per-level coefficients); (c) lowest-dose-as-control per CGBS 2024 Remark 3.1 (when `P(D=0) = 0`) not implemented — estimator requires never-treated controls. REGISTRY `## ContinuousDiD` → Implementation Checklist marks these as deferred `[ ]` items. | `diff_diff/continuous_did.py` | — | Low |
| Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega* — `_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` | — | Low |
| TROP: extend Wave 4's `_setup_trop_data` helper to also cover the duplicated bootstrap resampling loop in `_bootstrap_variance` / `_bootstrap_variance_global` (~40 LoC dedup; mirrors the data-setup helper pattern with a `fit_callable` parameter for the per-draw refit step). | `trop_local.py`, `trop_global.py` | follow-up | Low |
127 changes: 127 additions & 0 deletions benchmarks/R/generate_synth_basque_golden.R
@@ -0,0 +1,127 @@
#!/usr/bin/env Rscript
# Generate the Basque Country (Abadie & Gardeazabal 2003) R `Synth` golden fixture
# for the SyntheticControl estimator's two-tier R-parity test.
#
# Run from the repo root:
#   Rscript benchmarks/R/generate_synth_basque_golden.R
#
# Writes (into tests/data/ so the deterministic Tier-1 parity test runs in
# isolated-install CI without R):
#   tests/data/synth_basque_panel.csv    verbatim Synth::basque, regions != 1
#                                        (Spain aggregate dropped), long format,
#                                        plus an absorbing `treated` indicator.
#   tests/data/synth_basque_golden.json  R Synth solution.v / solution.w, losses,
#                                        the standardization divisor, X1/X0, and
#                                        the treated/synthetic/gap paths.
#
# Provenance: the panel is a verbatim export of R `Synth::basque`; the V-selection
# numerics (standardization divisor, optimizer) are pinned from the `Synth` source,
# not from Abadie-Diamond-Hainmueller (2010) — see docs/methodology/REGISTRY.md.

suppressMessages({
  library(Synth)
  library(jsonlite)
})

data(basque)

predictors <- c(
  "school.illit", "school.prim", "school.med",
  "school.high", "school.post.high", "invest"
)
special <- list(
  list("gdpcap", 1960:1969, "mean"),
  list("sec.agriculture", seq(1961, 1969, 2), "mean"),
  list("sec.energy", seq(1961, 1969, 2), "mean"),
  list("sec.industry", seq(1961, 1969, 2), "mean"),
  list("sec.construction", seq(1961, 1969, 2), "mean"),
  list("sec.services.venta", seq(1961, 1969, 2), "mean"),
  list("sec.services.nonventa", seq(1961, 1969, 2), "mean"),
  list("popdens", 1969, "mean")
)
controls <- c(2:16, 18)

invisible(capture.output({
  dp <- dataprep(
    foo = basque,
    predictors = predictors,
    predictors.op = "mean",
    time.predictors.prior = 1964:1969,
    special.predictors = special,
    dependent = "gdpcap",
    unit.variable = "regionno",
    unit.names.variable = "regionname",
    time.variable = "year",
    treatment.identifier = 17,
    controls.identifier = controls,
    time.optimize.ssr = 1960:1969,
    time.plot = 1955:1997
  )
  so <- synth(dp)
}))

# Standardization divisor exactly as computed inside synth():
# divisor <- sqrt(apply(cbind(X0, X1), 1, var))
big <- cbind(dp$X0, dp$X1)
divisor <- sqrt(apply(big, 1, var))

pred_names <- rownames(dp$X1)
v <- as.numeric(so$solution.v)
w <- as.numeric(so$solution.w)

# X0 as predictor -> {control -> value} so Python can verify matrix construction.
X0_list <- setNames(
  lapply(seq_len(nrow(dp$X0)), function(i) as.list(setNames(dp$X0[i, ], colnames(dp$X0)))),
  pred_names
)

synthetic_path <- as.numeric(dp$Y0plot %*% so$solution.w)
treated_path <- as.numeric(dp$Y1plot)
years <- as.integer(rownames(dp$Y1plot))

golden <- list(
  config = list(
    treated_regionno = 17,
    controls = controls,
    treatment_year = 1970,
    predictors = predictors,
    predictors_op = "mean",
    predictor_window = 1964:1969,
    special = lapply(special, function(s) {
      list(var = s[[1]], periods = s[[2]], op = s[[3]])
    }),
    time_optimize_ssr = 1960:1969,
    time_plot = c(1955, 1997)
  ),
  predictor_names = pred_names,
  solution_v = setNames(v, pred_names),
  solution_w = as.list(setNames(w, colnames(dp$X0))),
  loss_v = as.numeric(so$loss.v),
  loss_w = as.numeric(so$loss.w),
  divisor = setNames(as.numeric(divisor), pred_names),
  X1 = setNames(as.numeric(dp$X1), pred_names),
  X0 = X0_list,
  years = years,
  treated_path = treated_path,
  synthetic_path = synthetic_path,
  gap = treated_path - synthetic_path
)

dir.create("tests/data", showWarnings = FALSE, recursive = TRUE)
write_json(
  golden, "tests/data/synth_basque_golden.json",
  auto_unbox = TRUE, digits = 12, pretty = TRUE
)

# Panel CSV: drop region 1 (Spain aggregate); long format + absorbing treated.
panel <- basque[basque$regionno != 1, ]
panel$treated <- as.integer(panel$regionno == 17 & panel$year >= 1970)
stopifnot(!any(is.na(panel$gdpcap))) # outcome must be complete (balanced panel)
write.csv(panel, "tests/data/synth_basque_panel.csv", row.names = FALSE)

cat("Wrote tests/data/synth_basque_golden.json and synth_basque_panel.csv\n")
cat("nvarsV:", length(v), " n_controls:", length(w), "\n")
cat("loss.v:", format(so$loss.v, digits = 6), " loss.w:", format(so$loss.w, digits = 6), "\n")
nz <- setNames(round(w, 4), colnames(dp$X0))
cat("solution.w (nonzero):\n")
print(nz[nz > 1e-4])
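
For readers following the Python side, the script's standardization divisor (`sqrt(apply(cbind(X0, X1), 1, var))`, i.e. a per-row sample SD over donors plus the treated unit) and the Tier-1 `atol=1e-3` weight comparison can be sketched as below. This is a hedged illustration only: `row_divisors`, `weights_close`, the toy predictor matrix, and the `illustrative_fit` weights are hypothetical and are not taken from the fixture or the test suite; only the two published weights (0.851 and 0.149) come from the changelog entry.

```python
import math
import statistics


def row_divisors(x0_rows, x1):
    """Per-predictor SD over donors + treated, with the n-1 denominator.

    x0_rows: one list of donor values per predictor row.
    x1: the treated unit's value for each predictor row.
    statistics.stdev uses the sample (n-1) formula, like R's var().
    """
    return [statistics.stdev([*row, treated]) for row, treated in zip(x0_rows, x1)]


def weights_close(actual, expected, atol=1e-3):
    """True if every donor weight agrees within atol (missing donors count as 0)."""
    keys = set(actual) | set(expected)
    return all(abs(actual.get(k, 0.0) - expected.get(k, 0.0)) <= atol for k in keys)


# Toy predictor matrix (made-up numbers): 2 predictors, 2 donors.
x0_rows = [[1.0, 2.0], [10.0, 30.0]]
x1 = [3.0, 20.0]
divisors = row_divisors(x0_rows, x1)
scaled_x1 = [value / d for value, d in zip(x1, divisors)]

# Published Basque weights from the changelog entry, keyed by region number.
published = {"10": 0.851, "14": 0.149}
# Hypothetical fitted weights, for illustration only.
illustrative_fit = {"10": 0.8508, "14": 0.1492, "2": 0.0}
ok = weights_close(illustrative_fit, published)
```

Dividing each predictor row by its divisor puts all predictors on a comparable scale before the weighted least-squares solve, which is why the fixture stores the divisor alongside `X1`/`X0`.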
1 change: 1 addition & 0 deletions benchmarks/R/requirements.R
@@ -17,6 +17,7 @@ required_packages <- c(
"DIDHAD", # de Chaisemartin et al. (2025) HAD estimator (HAD Phase 4 R-parity)
"YatchewTest", # Yatchew (1997) linearity test (HAD yatchew R-parity)
"nprobust", # Calonico-Cattaneo-Farrell local-linear (DIDHAD dependency)
"Synth", # Abadie-Diamond-Hainmueller (2010) synthetic control (SyntheticControl R-parity; ships data(basque))

# Utilities
"jsonlite", # JSON output for Python interop
8 changes: 8 additions & 0 deletions diff_diff/__init__.py
@@ -222,6 +222,11 @@
    TROPResults,
    trop,
)
from diff_diff.synthetic_control import (
    SyntheticControl,
    synthetic_control,
)
from diff_diff.synthetic_control_results import SyntheticControlResults
from diff_diff.wooldridge import WooldridgeDiD
from diff_diff.wooldridge_results import WooldridgeDiDResults
from diff_diff.utils import (
@@ -309,6 +314,7 @@
    "SpilloverDiD",
    "TripleDifference",
    "TROP",
    "SyntheticControl",
    "StackedDiD",
    # Estimator aliases (short names)
    "DiD",
@@ -355,6 +361,8 @@
    "StaggeredTripleDiffResults",
    "TROPResults",
    "trop",
    "SyntheticControlResults",
    "synthetic_control",
    "StackedDiDResults",
    "stacked_did",
    # EfficientDiD
4 changes: 4 additions & 0 deletions diff_diff/estimators.py
@@ -8,6 +8,7 @@
Additional estimators are in separate modules:
- TwoWayFixedEffects: See diff_diff.twfe
- SyntheticDiD: See diff_diff.synthetic_did
- SyntheticControl: See diff_diff.synthetic_control

For backward compatibility, all estimators are re-exported from this module.
"""
@@ -2042,6 +2043,8 @@ def summary(self) -> str:
# These can also be imported directly from their respective modules:
# - from diff_diff.twfe import TwoWayFixedEffects
# - from diff_diff.synthetic_did import SyntheticDiD
# - from diff_diff.synthetic_control import SyntheticControl
from diff_diff.synthetic_control import SyntheticControl # noqa: E402
from diff_diff.synthetic_did import SyntheticDiD # noqa: E402
from diff_diff.twfe import TwoWayFixedEffects # noqa: E402

@@ -2050,4 +2053,5 @@ def summary(self) -> str:
    "MultiPeriodDiD",
    "TwoWayFixedEffects",
    "SyntheticDiD",
    "SyntheticControl",
]