diff --git a/CHANGELOG.md b/CHANGELOG.md index 3783708a..2317bf34 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] ### Added +- **New estimator: `SyntheticControl` — classic Synthetic Control Method (Abadie, Diamond & Hainmueller 2010; Abadie & Gardeazabal 2003).** Standalone estimator (`diff_diff/synthetic_control.py`) + `SyntheticControlResults` (`diff_diff/synthetic_control_results.py`) + `synthetic_control()` convenience function, exported from `diff_diff`. Builds a single treated unit's counterfactual as a convex combination of never-treated donor units — **donor (unit) weights only**, no time weights or ridge, distinct from `SyntheticDiD`. The inner simplex-constrained weighted-LS solve `W*(V)` reuses `utils._sc_weight_fw` (folding `V^½` into the predictor matrix, `intercept=False`, `zeta=0`); the diagonal predictor-importance matrix `V` is selected data-driven by minimizing pre-period outcome MSPE (`v_method="nested"`, softmax-on-simplex multistart Nelder-Mead + Powell polish) or supplied by the user (`v_method="custom"`). Predictors are built from `predictors`/`predictor_window`/`predictors_op`, `special_predictors`, and per-period outcome lags (`pre_period_outcomes`), in the R `Synth::dataprep` row order; per-row standardization (SD over donors+treated, ddof=1) matches the R `Synth::synth` source. Reports the gap path (`α̂_1t = Y_1t − Σ_j w_j Y_jt`), `att` (mean post-period gap), `pre_rmspe`, donor weights, `v_weights`, and a predictor-balance table. **No analytical standard error** — `se`/`t_stat`/`p_value`/`conf_int` are NaN (in-space placebo permutation inference with the post/pre RMSPE-ratio statistic is planned for a follow-up release; `_placebo_gaps`/`_rmspe_ratio`/`_fit_snapshot` are reserved on the results object). Ten validation gates baked in: predictor-period leakage, absorbing post-period suffix + no-anticipation cross-check against the treatment column, post-period canonicalization, donor-pool filtering before period derivation, empty-window rejection, poor-pre-fit `UserWarning` (RMSPE > SD of treated pre-outcomes), duplicate-predictor-label rejection, inner-solve non-convergence warning, order-independent gap-path rebuild, and the `standardize="none"` deviation; plus fail-closed `custom_v` cross-field rules and degenerate single-donor / single-pre-period handling. **R-`Synth` parity** (`tests/test_methodology_synthetic_control.py`, fixtures generated by `benchmarks/R/generate_synth_basque_golden.R` into `tests/data/`): two-tier on the Basque Country study — Tier-1 feeds R's `solution.v` via `custom_v` and reproduces the published donor weights (region 10 Cataluña 0.851 + region 14 Madrid 0.149) to `atol=1e-3` deterministically; Tier-2 (`@pytest.mark.slow`) checks the data-driven nested fit lands in a tolerance band (the nested `V` legitimately differs because the outer objective uses all pre periods, not R's `time.optimize.ssr` window). Documented in `docs/methodology/REGISTRY.md` §SyntheticControl (with `**Deviation from R:** standardize="none"` and `**Note:**` labels for the standardization formula, objective window, softmax `V` parametrization, and 1×SD poor-fit threshold), `docs/api/synthetic_control.rst`, the LLM guides, and `README.md`. - **ConleySpatialHAC methodology-review-tracker promotion: In Progress → Complete.** Closes the Conley (1999) *Journal of Econometrics* 92(1) primary-source review on the methodology-review tracker. The paper review on file at `docs/methodology/papers/conley-1999-review.md` was previously merged (2026-05-09); this PR is the F.L.I.P. consolidation — new `tests/test_methodology_conley.py` with paper-equation-numbered Verified Components walk-through (~1600 LoC; 10 classes; 60 tests, 5 of them `@pytest.mark.slow`). Coverage: Eq. 4.2 cross-sectional sandwich (pairwise-distance specialization; the project's paper review identifies Eq. 4.2 page 18 as the real-valued/pairwise form, with Eq. 3.13 reserved for the lattice-indexed form), Eq. 4.2 HC0 + rank-1 limits, Andrews (1991) HAC lag truncation matching `conleyreg::time_dist.cpp`, haversine convention with Earth radius 6371.01 km, Phase 2 panel block-decomposed sandwich at `atol=1e-12`, sparse k-d-tree dense-vs-sparse bit-identity (Wave A #120 numerical correctness), and R `conleyreg` v0.1.9 parity at `atol=1e-6` on 6 fixtures (3 cross-sectional + 3 panel) plus the sparse-forced and time-asymmetric kernel parity contracts. Three dedicated deviations-area classes: `TestConleyLibraryExtensions` (Wave A library extensions — combined spatial+cluster product kernel #119, callable conley_metric validation #123, sparse k-d-tree activation #120, indefiniteness guard), `TestConleyDeviationsFromR` (1-D radial Bartlett vs paper's 2-D separable Eq. 3.14, time-label normalization via `np.unique`, independent temporal kernel deferred), and `TestConleyDeferrals` (5 fail-closed `NotImplementedError`/`TypeError` contracts: LinearRegression + survey_design, DiD/MPD/TWFE + survey_design, Conley + weights, SyntheticDiD + Conley, wild_bootstrap + Conley). Methodology-anchored tests extracted from `tests/test_conley_vcov.py`: full classes `TestConleyDirectHelper`, `TestConleyReductions`, `TestConleyReductionsAddendum`, `TestConleyParityR`, `TestConleyParitySpacetime`, `TestConleyPanelHelper`, `TestConleySparseRParityForced`; plus methodology-anchored tests from `TestConleyKernels`, `TestConleyDistanceMetrics`, `TestConleySparse`. File drops 4248 → 3113 lines after extraction. Defensive surface preserved: input validation, NaN/inf guards, dispatch-level validity, estimator-level integration smoke tests, set_params atomicity, sparse-path activation thresholds + density-gate fallback. `METHODOLOGY_REVIEW.md` row L91 promoted to **Complete** with `Last Review = 2026-05-26`; detail block rewritten with Verified Components / Test Coverage / R Comparison Results inline table / Corrections Made / Deviations / Outstanding Concerns. Priority queue at L1386 pruned: PreTrendsPower removed (already Complete since 2026-05-19) and ConleySpatialHAC removed (this PR); substantive-review-blocked renumbered #2-#5 → #1-#4 and consolidation-pass-blocked renumbered #6-#8 → #5-#6. ### Added / Changed diff --git a/README.md b/README.md index 1f38e2e3..a4ce9993 100644 --- a/README.md +++ b/README.md @@ -108,6 +108,7 @@ Full guide: `diff_diff.get_llm_guide("practitioner")`. - [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html) - Gardner (2022) two-stage estimator with GMM sandwich variance - [SpilloverDiD](https://diff-diff.readthedocs.io/en/stable/api/spillover.html) - Butts (2021) ring-indicator spillover-aware DiD identifying direct effect on treated + per-ring spillover on near-control units; handles non-staggered and staggered timing; supports survey-design variance under `survey_design=` for HC1 / CR1 (Wave E.1 Binder TSL) and Conley (Wave E.2 panel-aware stratified-Conley sandwich on per-period PSU totals; extended in Wave E.2 follow-up to `conley_lag_cutoff > 0` via panel-block composition with within-PSU serial Bartlett HAC — `lag>0` requires an effective PSU via explicit `survey_design.psu` or injected `cluster=`); `SurveyDesign.subpopulation()` preserves full-design `n_psu` / `df_survey` via zero-padded scores (Wave E.3, R `svyrecvar(subset())` form) - [SyntheticDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html) - Synthetic DiD combining standard DiD and synthetic control for few treated units +- [SyntheticControl](https://diff-diff.readthedocs.io/en/stable/api/synthetic_control.html) - Abadie, Diamond & Hainmueller (2010) classic synthetic control for a single treated unit (donor-weight counterfactual, nested/custom V; no inference in this release — permutation/placebo planned) - [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html) - triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility - [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html) - Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves - [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html) - de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲ = 0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲ > 0`, continuous-near-d̲ or mass-point), with a multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release - repeated cross-sections rejected by the validator. Alias `HAD`. diff --git a/TODO.md b/TODO.md index 6383036e..edcabcf1 100644 --- a/TODO.md +++ b/TODO.md @@ -82,6 +82,7 @@ Deferred items from PR reviews that were not addressed before merge. | ImputationDiD dense `(A0'A0).toarray()` scales O((U+T+K)^2), OOM risk on large panels | `imputation.py` | #141 | Medium (deferred — only triggers when sparse solver fails) | | Multi-absorb weighted demeaning needs iterative alternating projections for N > 1 absorbed FE with survey weights; unweighted multi-absorb also uses single-pass (pre-existing, exact only for balanced panels) | `estimators.py` | #218 | Medium | | Survey design resolution/collapse patterns are inconsistent across panel estimators — ContinuousDiD rebuilds unit-level design in SE code, EfficientDiD builds once in fit(), StackedDiD re-resolves on stacked data; extract shared helpers for panel-to-unit collapse, post-filter re-resolution, and metadata recomputation | `continuous_did.py`, `efficient_did.py`, `stacked_did.py` | #226 | Low | +| SyntheticControl: `SyntheticControlResults` not wired into the practitioner / DiagnosticReport / BusinessReport routing, so routing SCM results through those tools yields generic parallel-trends/HonestDiD guidance that doesn't fit SCM. Add SCM to the native-routed rejection sets (mirror SDiD/TROP) and surface SCM-native diagnostics (pre-fit / in-space placebo / in-time placebo / leave-one-out). Deferred to PR-2, where it pairs with the placebo-inference layer those reports would surface. | `practitioner.py`, `diagnostic_report.py`, `business_report.py` | SCM PR-1 → PR-2 | Medium | | ContinuousDiD deferred CGBS 2024 extensions: (a) `covariates=` kwarg not implemented (matches R `contdid` v0.1.0); (b) discrete-treatment saturated regression deferred (integer-valued dose currently warned, not routed to per-level coefficients); (c) lowest-dose-as-control per CGBS 2024 Remark 3.1 (when `P(D=0) = 0`) not implemented — estimator requires never-treated controls. REGISTRY `## ContinuousDiD` → Implementation Checklist marks these as deferred `[ ]` items. | `diff_diff/continuous_did.py` | — | Low | | Survey-weighted Silverman bandwidth in EfficientDiD conditional Omega* — `_silverman_bandwidth()` uses unweighted mean/std for bandwidth selection; survey-weighted statistics would better reflect the population distribution but is a second-order refinement | `efficient_did_covariates.py` | — | Low | | TROP: extend Wave 4's `_setup_trop_data` helper to also cover the duplicated bootstrap resampling loop in `_bootstrap_variance` / `_bootstrap_variance_global` (~40 LoC dedup; mirrors the data-setup helper pattern with a `fit_callable` parameter for the per-draw refit step). | `trop_local.py`, `trop_global.py` | follow-up | Low | diff --git a/benchmarks/R/generate_synth_basque_golden.R b/benchmarks/R/generate_synth_basque_golden.R new file mode 100644 index 00000000..3f524ff1 --- /dev/null +++ b/benchmarks/R/generate_synth_basque_golden.R @@ -0,0 +1,127 @@ +#!/usr/bin/env Rscript +# Generate the Basque Country (Abadie & Gardeazabal 2003) R `Synth` golden fixture +# for the SyntheticControl estimator's two-tier R-parity test. +# +# Run from the repo root: +# Rscript benchmarks/R/generate_synth_basque_golden.R +# +# Writes (into tests/data/ so the deterministic Tier-1 parity test runs in +# isolated-install CI without R): +# tests/data/synth_basque_panel.csv verbatim Synth::basque, regions != 1 +# (Spain aggregate dropped), long format, +# plus an absorbing `treated` indicator. +# tests/data/synth_basque_golden.json R Synth solution.v / solution.w, losses, +# the standardization divisor, X1/X0, and +# the treated/synthetic/gap paths. +# +# Provenance: the panel is a verbatim export of R `Synth::basque`; the V-selection +# numerics (standardization divisor, optimizer) are pinned from the `Synth` source, +# not from Abadie-Diamond-Hainmueller (2010) — see docs/methodology/REGISTRY.md. + +suppressMessages({ + library(Synth) + library(jsonlite) +}) + +data(basque) + +predictors <- c( + "school.illit", "school.prim", "school.med", + "school.high", "school.post.high", "invest" +) +special <- list( + list("gdpcap", 1960:1969, "mean"), + list("sec.agriculture", seq(1961, 1969, 2), "mean"), + list("sec.energy", seq(1961, 1969, 2), "mean"), + list("sec.industry", seq(1961, 1969, 2), "mean"), + list("sec.construction", seq(1961, 1969, 2), "mean"), + list("sec.services.venta", seq(1961, 1969, 2), "mean"), + list("sec.services.nonventa", seq(1961, 1969, 2), "mean"), + list("popdens", 1969, "mean") +) +controls <- c(2:16, 18) + +invisible(capture.output({ + dp <- dataprep( + foo = basque, + predictors = predictors, + predictors.op = "mean", + time.predictors.prior = 1964:1969, + special.predictors = special, + dependent = "gdpcap", + unit.variable = "regionno", + unit.names.variable = "regionname", + time.variable = "year", + treatment.identifier = 17, + controls.identifier = controls, + time.optimize.ssr = 1960:1969, + time.plot = 1955:1997 + ) + so <- synth(dp) +})) + +# Standardization divisor exactly as computed inside synth(): +# divisor <- sqrt(apply(cbind(X0, X1), 1, var)) +big <- cbind(dp$X0, dp$X1) +divisor <- sqrt(apply(big, 1, var)) + +pred_names <- rownames(dp$X1) +v <- as.numeric(so$solution.v) +w <- as.numeric(so$solution.w) + +# X0 as predictor -> {control -> value} so Python can verify matrix construction. +X0_list <- setNames( + lapply(seq_len(nrow(dp$X0)), function(i) as.list(setNames(dp$X0[i, ], colnames(dp$X0)))), + pred_names +) + +synthetic_path <- as.numeric(dp$Y0plot %*% so$solution.w) +treated_path <- as.numeric(dp$Y1plot) +years <- as.integer(rownames(dp$Y1plot)) + +golden <- list( + config = list( + treated_regionno = 17, + controls = controls, + treatment_year = 1970, + predictors = predictors, + predictors_op = "mean", + predictor_window = 1964:1969, + special = lapply(special, function(s) { + list(var = s[[1]], periods = s[[2]], op = s[[3]]) + }), + time_optimize_ssr = 1960:1969, + time_plot = c(1955, 1997) + ), + predictor_names = pred_names, + solution_v = setNames(v, pred_names), + solution_w = as.list(setNames(w, colnames(dp$X0))), + loss_v = as.numeric(so$loss.v), + loss_w = as.numeric(so$loss.w), + divisor = setNames(as.numeric(divisor), pred_names), + X1 = setNames(as.numeric(dp$X1), pred_names), + X0 = X0_list, + years = years, + treated_path = treated_path, + synthetic_path = synthetic_path, + gap = treated_path - synthetic_path +) + +dir.create("tests/data", showWarnings = FALSE, recursive = TRUE) +write_json( + golden, "tests/data/synth_basque_golden.json", + auto_unbox = TRUE, digits = 12, pretty = TRUE +) + +# Panel CSV: drop region 1 (Spain aggregate); long format + absorbing treated. +panel <- basque[basque$regionno != 1, ] +panel$treated <- as.integer(panel$regionno == 17 & panel$year >= 1970) +stopifnot(!any(is.na(panel$gdpcap))) # outcome must be complete (balanced panel) +write.csv(panel, "tests/data/synth_basque_panel.csv", row.names = FALSE) + +cat("Wrote tests/data/synth_basque_golden.json and synth_basque_panel.csv\n") +cat("nvarsV:", length(v), " n_controls:", length(w), "\n") +cat("loss.v:", format(so$loss.v, digits = 6), " loss.w:", format(so$loss.w, digits = 6), "\n") +nz <- setNames(round(w, 4), colnames(dp$X0)) +cat("solution.w (nonzero):\n") +print(nz[nz > 1e-4]) diff --git a/benchmarks/R/requirements.R b/benchmarks/R/requirements.R index fa516d2b..a5499687 100644 --- a/benchmarks/R/requirements.R +++ b/benchmarks/R/requirements.R @@ -17,6 +17,7 @@ required_packages <- c( "DIDHAD", # de Chaisemartin et al. (2025) HAD estimator (HAD Phase 4 R-parity) "YatchewTest", # Yatchew (1997) linearity test (HAD yatchew R-parity) "nprobust", # Calonico-Cattaneo-Farrell local-linear (DIDHAD dependency) + "Synth", # Abadie-Diamond-Hainmueller (2010) synthetic control (SyntheticControl R-parity; ships data(basque)) # Utilities "jsonlite", # JSON output for Python interop diff --git a/diff_diff/__init__.py b/diff_diff/__init__.py index ca3d53db..724eb944 100644 --- a/diff_diff/__init__.py +++ b/diff_diff/__init__.py @@ -222,6 +222,11 @@ TROPResults, trop, ) +from diff_diff.synthetic_control import ( + SyntheticControl, + synthetic_control, +) +from diff_diff.synthetic_control_results import SyntheticControlResults from diff_diff.wooldridge import WooldridgeDiD from diff_diff.wooldridge_results import WooldridgeDiDResults from diff_diff.utils import ( @@ -309,6 +314,7 @@ "SpilloverDiD", "TripleDifference", "TROP", + "SyntheticControl", "StackedDiD", # Estimator aliases (short names) "DiD", @@ -355,6 +361,8 @@ "StaggeredTripleDiffResults", "TROPResults", "trop", + "SyntheticControlResults", + "synthetic_control", "StackedDiDResults", "stacked_did", # EfficientDiD diff --git a/diff_diff/estimators.py b/diff_diff/estimators.py index aa8a3ef5..8c71dd76 100644 --- a/diff_diff/estimators.py +++ b/diff_diff/estimators.py @@ -8,6 +8,7 @@ Additional estimators are in separate modules: - TwoWayFixedEffects: See diff_diff.twfe - SyntheticDiD: See diff_diff.synthetic_did +- SyntheticControl: See diff_diff.synthetic_control For backward compatibility, all estimators are re-exported from this module. """ @@ -2042,6 +2043,8 @@ def summary(self) -> str: # These can also be imported directly from their respective modules: # - from diff_diff.twfe import TwoWayFixedEffects # - from diff_diff.synthetic_did import SyntheticDiD +# - from diff_diff.synthetic_control import SyntheticControl +from diff_diff.synthetic_control import SyntheticControl # noqa: E402 from diff_diff.synthetic_did import SyntheticDiD # noqa: E402 from diff_diff.twfe import TwoWayFixedEffects # noqa: E402 @@ -2050,4 +2053,5 @@ def summary(self) -> str: "MultiPeriodDiD", "TwoWayFixedEffects", "SyntheticDiD", + "SyntheticControl", ] diff --git a/diff_diff/guides/llms-full.txt b/diff_diff/guides/llms-full.txt index e89ee907..014bb073 100644 --- a/diff_diff/guides/llms-full.txt +++ b/diff_diff/guides/llms-full.txt @@ -572,6 +572,65 @@ results.print_summary() weights_df = results.get_unit_weights_df() ``` +### SyntheticControl + +Classic Synthetic Control Method (Abadie, Diamond & Hainmueller 2010; Abadie & Gardeazabal 2003). Builds a single treated unit's counterfactual as a convex combination of never-treated "donor" units. **Donor weights only** (no time weights, no ridge) — distinct from `SyntheticDiD`. + +```python +SyntheticControl( + v_method: str = "nested", # "nested" (data-driven V) or "custom" + custom_v=None, # diagonal V (len = #predictors); required iff v_method="custom" + optimizer_options: dict | None = None, # merged into scipy.optimize.minimize (outer V search) + n_starts: int = 4, # multistart count for the outer V search + inner_max_iter: int = 10000, # Frank-Wolfe inner-solve cap + inner_min_decrease: float = 1e-5, # inner-solve convergence scale (scale-aware) + standardize: str = "std", # "std" (per-row SD, ddof=1) or "none" (deviation from R) + alpha: float = 0.05, + seed: int | None = None, # seeds the multistart Dirichlet draws +) +``` + +**fit() parameters:** + +```python +scm.fit( + data: pd.DataFrame, + outcome: str, + treatment: str, # ABSORBING 0/1 indicator (treated unit, post periods) + unit: str, + time: str, + *, + post_periods: list | None = None, # inferred from D==1 periods if None + treated_unit=None, # inferred as the single ever-treated unit if None + predictors: list[str] | None = None, # columns averaged over predictor_window + predictors_op: str = "mean", # "mean" | "sum" (linear combinations only, ADH 2010 §2.3) + predictor_window: list | None = None, # pre-periods to average over (default: all pre) + special_predictors=None, # list of (var, periods, op) triples + pre_period_outcomes=None, # "all" or list; per-period outcome lags + donor_pool: list | None = None, # explicit never-treated donors (default: all) +) -> SyntheticControlResults +``` + +**Inference:** NONE analytical — `se`/`t_stat`/`p_value`/`conf_int` are NaN. `att` is the mean post-period gap; use placebo/permutation inference for significance (planned follow-up). Predictor periods must lie within the pre window; `post_periods` must be a contiguous suffix cross-checked against `D` (no anticipation). + +**Usage:** + +```python +from diff_diff import SyntheticControl + +scm = SyntheticControl(v_method='nested', seed=0) +# Set predictor_window explicitly when a covariate is observed on only a subset of +# the pre periods — the default averages over ALL pre periods and fails closed on +# non-finite cells. +results = scm.fit(data, outcome='gdpcap', treatment='treated', unit='region', time='year', + predictors=['invest', 'school.high'], + predictor_window=[1964, 1965, 1966, 1967, 1968, 1969], + special_predictors=[('gdpcap', [1960, 1965, 1969], 'mean')]) +results.print_summary() +gap_df = results.get_gap_df() # period, gap, phase +weights_df = results.get_weights_df() # unit, weight (descending) +``` + ### TripleDifference Triple Difference (DDD) estimator following Ortiz-Villavicencio & Sant'Anna (2025). @@ -1202,6 +1261,29 @@ Returned by `SyntheticDiD.fit()`. - `in_time_placebo()` - re-estimate on shifted fake treatment dates in the pre-period; near-zero placebo ATTs indicate a credible design - `sensitivity_to_zeta_omega()` - re-estimate across a grid of unit-weight regularization values; checks ATT robustness to the auto-selected zeta_omega +### SyntheticControlResults + +Returned by `SyntheticControl.fit()`. + +| Attribute | Type | Description | +|-----------|------|-------------| +| `att` | `float` | Mean post-period gap (reported point estimate) | +| `se`, `t_stat`, `p_value`, `conf_int` | `float` / tuple | Always NaN — no analytical SE (permutation inference planned) | +| `n_obs` | `int` | Treated + donor rows over all periods | +| `n_donors` | `int` | Donor units in the (post-filter) pool | +| `n_pre_periods`, `n_post_periods` | `int` | Period counts | +| `donor_weights` | `dict` | `{donor_id: weight}` on the simplex (near-zeros dropped) | +| `v_weights` | `dict` | `{predictor_label: v}`, trace-normalized | +| `predictor_balance` | `pd.DataFrame` | treated vs synthetic vs donor-mean per predictor | +| `gap_path` | `dict` | `{period: gap}` for all periods | +| `pre_rmspe` | `float` | Pre-treatment fit diagnostic | +| `mspe_v` | `float \| None` | Outer-objective MSPE (nested) | +| `treated_unit` | `Any` | Treated unit identifier | +| `pre_periods`, `post_periods` | `list` | Calendar-sorted periods | +| `v_method`, `standardize` | `str` | Echoed configuration | + +**Methods:** `summary()`, `print_summary()`, `to_dict()`, `to_dataframe()`, `get_gap_df()`, `get_weights_df()` + ### TripleDifferenceResults Returned by `TripleDifference.fit()`. diff --git a/diff_diff/guides/llms.txt b/diff_diff/guides/llms.txt index 754a72e6..f56e63cd 100644 --- a/diff_diff/guides/llms.txt +++ b/diff_diff/guides/llms.txt @@ -2,7 +2,7 @@ > A Python library for Difference-in-Differences (DiD) causal inference analysis. Provides sklearn-like estimators with statsmodels-style summary output for econometric analysis. -diff-diff offers 17 estimators covering basic 2x2 DiD, modern staggered adoption methods, reversible (non-absorbing) treatments, advanced panel estimators, nonlinear models, and diagnostic tools. It supports robust and cluster-robust standard errors, wild cluster bootstrap, formula and column-name interfaces, fixed effects (dummy and absorbed), complex survey designs (strata/PSU/FPC, replicate weights, design-based variance), and publication-ready output. The optional Rust backend accelerates compute-intensive estimators like Synthetic DiD and TROP. +diff-diff offers 18 estimators covering basic 2x2 DiD, modern staggered adoption methods, reversible (non-absorbing) treatments, advanced panel estimators, nonlinear models, and diagnostic tools. It supports robust and cluster-robust standard errors, wild cluster bootstrap, formula and column-name interfaces, fixed effects (dummy and absorbed), complex survey designs (strata/PSU/FPC, replicate weights, design-based variance), and publication-ready output. The optional Rust backend accelerates compute-intensive estimators like Synthetic DiD and TROP. - Install: `pip install diff-diff` - License: MIT @@ -60,6 +60,7 @@ Full practitioner guide: call `diff_diff.get_llm_guide("practitioner")` - [TwoStageDiD](https://diff-diff.readthedocs.io/en/stable/api/two_stage.html): Gardner (2022) two-stage estimator with GMM sandwich variance - [SpilloverDiD](https://diff-diff.readthedocs.io/en/stable/api/spillover.html): Butts (2021) ring-indicator spillover-aware DiD identifying direct effect on treated + per-ring spillover-on-control; reuses `conley_coords` for ring construction; handles non-staggered and staggered timing; supports `SurveyDesign(weights, strata, psu, fpc)` under `vcov_type="hc1"` with optional `cluster=` for CR1 via Gerber (2026) Binder TSL (Wave E.1) and under `vcov_type="conley"` via a panel-aware stratified-Conley sandwich on per-period PSU totals (Wave E.2 cross-sectional `conley_lag_cutoff=0`) extended in Wave E.2 follow-up to `conley_lag_cutoff > 0` via panel-block composition with within-PSU serial Bartlett HAC (Newey-West 1987 separable form; `lag>0` requires an effective PSU via explicit `survey_design.psu` or injected `cluster=`), both composed with the Wave D Gardner GMM correction; `SurveyDesign.subpopulation()` preserves full-design `n_psu` / `df_survey` via zero-padded scores at the meat-helper boundary (Wave E.3, R `svyrecvar(subset())` form) (replicate weights queued as follow-up) - [SyntheticDiD](https://diff-diff.readthedocs.io/en/stable/api/estimators.html): Synthetic DiD combining standard DiD and synthetic control methods for few treated units +- [SyntheticControl](https://diff-diff.readthedocs.io/en/stable/api/synthetic_control.html): Abadie, Diamond & Hainmueller (2010) classic synthetic control for ONE treated unit — donor-weight counterfactual, nested or custom predictor-importance V, gap path + pre-RMSPE; no inference in this release (inference fields are NaN — permutation/placebo planned) - [TripleDifference](https://diff-diff.readthedocs.io/en/stable/api/triple_diff.html): Triple difference (DDD) estimator for designs requiring two criteria for treatment eligibility - [ContinuousDiD](https://diff-diff.readthedocs.io/en/stable/api/continuous_did.html): Callaway, Goodman-Bacon & Sant'Anna (2024) continuous treatment DiD with dose-response curves - [HeterogeneousAdoptionDiD](https://diff-diff.readthedocs.io/en/stable/api/had.html): de Chaisemartin, Ciccia, D'Haultfœuille & Knau (2026) for designs where **no unit remains untreated**; local-linear estimator at the dose support boundary returning Weighted Average Slope (WAS) on Design 1' (`d̲=0` / QUG) or `WAS_{d̲}` on Design 1 (`d̲>0`, continuous-near-d̲ or mass-point), with multi-period event-study extension (last-treatment cohort, pointwise CIs). **Panel-only** in this release (repeated cross-sections rejected by the validator). Alias `HAD`. diff --git a/diff_diff/synthetic_control.py b/diff_diff/synthetic_control.py new file mode 100644 index 00000000..8f349124 --- /dev/null +++ b/diff_diff/synthetic_control.py @@ -0,0 +1,1143 @@ +""" +Classic Synthetic Control Method (SCM) estimator. + +Implements Abadie, Diamond & Hainmueller (2010), "Synthetic Control Methods for +Comparative Case Studies: Estimating the Effect of California's Tobacco Control +Program," *JASA* 105(490):493-505. The method originates in Abadie & +Gardeazabal (2003). + +A single treated unit's counterfactual is built as a convex combination of +"donor" (never-treated) units. Donor weights ``W*(V)`` solve a simplex-constrained +weighted least-squares fit of the treated unit's predictors; the predictor-importance +matrix ``V`` (diagonal, PSD) is either chosen data-driven by minimizing pre-period +outcome MSPE ("nested") or supplied by the user ("custom"). The treatment-effect +path is the gap ``α̂_1t = Y_1t − Σ_j w_j · Y_jt`` over the post periods. + +Distinct from :class:`~diff_diff.SyntheticDiD` (Arkhangelsky et al. 2021), which adds +time weights and ridge regularization: classic SCM uses **donor weights only** and a +level-matching estimator, plus the outer ``V`` search SyntheticDiD has no analog for. + +Inference: classic SCM has **no analytical standard error** — the paper proposes +permutation/placebo inference (a later PR). ``se``/``t_stat``/``p_value``/``conf_int`` +are always NaN here; ``att`` (mean post-period gap) is the reported estimate. + +Numerics provenance: the standardization divisor and the inner/outer optimization +scheme are NOT specified in ADH (2010) — they are pinned from the R ``Synth`` package +source / Abadie-Gardeazabal (2003) App. B. See ``docs/methodology/REGISTRY.md`` +§SyntheticControl for the deviation/Note labels. +""" + +import warnings +from dataclasses import dataclass +from typing import Any, Dict, List, Optional, Tuple, cast + +import numpy as np +import pandas as pd +from scipy.optimize import minimize + +from diff_diff.synthetic_control_results import SyntheticControlResults +from diff_diff.utils import _sc_weight_fw, safe_inference, warn_if_not_converged + +__all__ = ["SyntheticControl", "synthetic_control"] + +# Aggregation operators allowed for predictor / special-predictor rows. Restricted to +# LINEAR combinations of pre-period values, matching ADH (2010) §2.3 +# (`Ȳ_i^{K_m} = Σ_s k_s Y_is`): mean (k_s = 1/T0) and sum (k_s = 1). Non-linear +# aggregations (e.g. median) are intentionally NOT supported. +_OP_FUNCS: Dict[str, Any] = {"mean": np.mean, "sum": np.sum} + +# Interpretability floor: donor weights below this are dropped from the reported +# ``donor_weights`` dict (mirrors prep.py's 1e-6 sparsification). The full weight +# vector is still used for the gap path / ATT. +_MIN_REPORT_WEIGHT = 1e-6 + + +@dataclass +class _PredictorSpec: + """Normalized predictor specification (one matrix row of X1/X0).""" + + label: str + kind: str # "predictor_avg" | "special" | "lag" + var: str + periods: List[Any] + op: str # "mean" | "sum" | "identity" + + +def _softmax(theta: np.ndarray) -> np.ndarray: + """Map an unconstrained vector to the unit simplex (v >= 0, sum v = 1).""" + e = np.exp(theta - np.max(theta)) + return e / np.sum(e) + + +class SyntheticControl: + """ + Classic Synthetic Control Method estimator (Abadie-Diamond-Hainmueller 2010). + + Parameters + ---------- + v_method : {"nested", "custom"}, default "nested" + How the predictor-importance matrix V is chosen. ``"nested"`` selects V + data-driven by minimizing the pre-period outcome MSPE of ``W*(V)`` + (ADH 2010 §2.3). ``"custom"`` uses the user-supplied ``custom_v`` and + skips the outer search. + custom_v : array-like, optional + Diagonal of V (length = number of predictors). Required iff + ``v_method="custom"``; must be None when ``v_method="nested"``. Must be + finite and non-negative; trace-normalized internally. + optimizer_options : dict, optional + Extra options merged into every ``scipy.optimize.minimize`` call in the + outer V search (e.g. ``maxiter``, ``xatol``, ``fatol``). + n_starts : int, default 4 + Number of starting points for the multistart outer V search. + inner_max_iter : int, default 10000 + Max iterations for the inner Frank-Wolfe simplex solve. + inner_min_decrease : float, default 1e-5 + Inner-solve convergence scale (matches the SDID/Frank-Wolfe precedent in + ``prep.py``). The Frank-Wolfe stop threshold is + ``(inner_min_decrease * max(||b||, 1e-12))**2`` where ``b`` is the + V^½-scaled treated predictor vector — scale-aware so convergence is + meaningful at any data magnitude. 1e-5 reproduces R ``Synth``'s donor + weights to ~1e-4 on the Basque benchmark while still signalling + convergence; tighter values (e.g. 1e-6) can exhaust ``inner_max_iter``. + standardize : {"std", "none"}, default "std" + Predictor standardization. ``"std"`` divides each predictor row by its + standard deviation across donors+treated (ddof=1), matching R ``Synth``. + ``"none"`` is a deviation from R (see REGISTRY). + alpha : float, default 0.05 + Significance level recorded for downstream (placebo) inference. + seed : int, optional + Seed for the multistart random (Dirichlet) starting points. + """ + + def __init__( + self, + v_method: str = "nested", + custom_v: Optional[Any] = None, + optimizer_options: Optional[Dict[str, Any]] = None, + n_starts: int = 4, + inner_max_iter: int = 10000, + inner_min_decrease: float = 1e-5, + standardize: str = "std", + alpha: float = 0.05, + seed: Optional[int] = None, + ): + self.v_method = v_method + self.custom_v = custom_v + self.optimizer_options = optimizer_options + self.n_starts = n_starts + self.inner_max_iter = inner_max_iter + self.inner_min_decrease = inner_min_decrease + self.standardize = standardize + self.alpha = alpha + self.seed = seed + + self._validate_config() + + # Internal state + self.results_: Optional[SyntheticControlResults] = None + self.is_fitted_: bool = False + + # ========================================================================= + # sklearn-like API + # ========================================================================= + + def _validate_config(self) -> None: + """Validate hyperparameters; shared by ``__init__`` and ``set_params``.""" + if self.v_method not in ("nested", "custom"): + raise ValueError(f"v_method must be one of ('nested', 'custom'), got {self.v_method!r}") + if self.standardize not in ("std", "none"): + raise ValueError( + f"standardize must be one of ('std', 'none'), got {self.standardize!r}" + ) + # custom_v cross-field rules — fail-closed, never silently ignore. + if self.v_method == "custom" and self.custom_v is None: + raise ValueError("custom_v is required when v_method='custom'.") + if self.v_method == "nested" and self.custom_v is not None: + raise ValueError( + "custom_v must be None when v_method='nested' " + "(it would be silently ignored otherwise)." + ) + if self.custom_v is not None: + v = np.asarray(self.custom_v, dtype=float).ravel() + if v.size == 0: + raise ValueError("custom_v must be non-empty.") + if not np.all(np.isfinite(v)): + raise ValueError("custom_v must be finite.") + if np.any(v < 0): + raise ValueError("custom_v must be non-negative.") + if v.sum() <= 0: + raise ValueError("custom_v must have a positive sum.") + if not isinstance(self.n_starts, (int, np.integer)) or self.n_starts < 1: + raise ValueError(f"n_starts must be a positive integer, got {self.n_starts!r}") + if not isinstance(self.inner_max_iter, (int, np.integer)) or self.inner_max_iter < 1: + raise ValueError( + f"inner_max_iter must be a positive integer, got {self.inner_max_iter!r}" + ) + if not (np.isfinite(self.inner_min_decrease) and self.inner_min_decrease > 0): + raise ValueError( + f"inner_min_decrease must be a positive float, got {self.inner_min_decrease!r}" + ) + if not (0 < self.alpha < 1): + raise ValueError(f"alpha must be in (0, 1), got {self.alpha!r}") + + def get_params(self) -> Dict[str, Any]: + """Get estimator parameters.""" + return { + "v_method": self.v_method, + "custom_v": self.custom_v, + "optimizer_options": self.optimizer_options, + "n_starts": self.n_starts, + "inner_max_iter": self.inner_max_iter, + "inner_min_decrease": self.inner_min_decrease, + "standardize": self.standardize, + "alpha": self.alpha, + "seed": self.seed, + } + + def set_params(self, **params) -> "SyntheticControl": + """Set estimator parameters. + + Applies updates transactionally: if ``_validate_config()`` rejects the + post-update state, the instance is rolled back to its pre-call values so + a raised ``ValueError`` leaves the object consistent. + """ + _rollback: Dict[str, Any] = {} + for key in params: + if hasattr(self, key): + _rollback[key] = getattr(self, key) + try: + for key, value in params.items(): + if hasattr(self, key): + setattr(self, key, value) + else: + raise ValueError(f"Unknown parameter: {key}") + self._validate_config() + except (ValueError, TypeError): + for key, prev in _rollback.items(): + setattr(self, key, prev) + raise + return self + + # ========================================================================= + # fit + # ========================================================================= + + def fit( + self, + data: pd.DataFrame, + outcome: str, + treatment: str, + unit: str, + time: str, + *, + post_periods: Optional[List[Any]] = None, + treated_unit: Optional[Any] = None, + predictors: Optional[List[str]] = None, + predictors_op: str = "mean", + predictor_window: Optional[List[Any]] = None, + special_predictors: Optional[List[Tuple[str, List[Any], str]]] = None, + pre_period_outcomes: Optional[Any] = None, + donor_pool: Optional[List[Any]] = None, + survey_design: Any = None, + ) -> SyntheticControlResults: + """ + Fit the classic synthetic control model. + + Parameters + ---------- + data : pandas.DataFrame + Balanced panel. + outcome, treatment, unit, time : str + Column names. ``treatment`` is the ABSORBING treatment indicator + (0/1): 1 for the treated unit in its treated periods, 0 otherwise. + post_periods : list, optional + Explicit post-treatment period values. If None, inferred from the + treated unit's treatment column (the D==1 periods). + treated_unit : Any, optional + Identifier of the treated unit. If None, inferred as the single + ever-treated unit. + predictors : list of str, optional + Columns averaged over ``predictor_window`` (using ``predictors_op``) + to form predictor rows. + predictors_op : {"mean", "sum"}, default "mean" + Aggregation operator for ``predictors`` (linear combinations only, per + ADH 2010 §2.3). + predictor_window : list, optional + Pre-periods over which ``predictors`` are averaged. Defaults to all + pre periods. Must be a non-empty subset of the pre periods. + special_predictors : list of (var, periods, op), optional + Per-variable special predictors, each averaged over its own periods + with its own operator (mirrors R ``Synth`` ``special.predictors``). + pre_period_outcomes : "all" or list, optional + Use individual pre-period outcomes as predictor rows ("all" = every + pre period). When no predictor arguments are given at all, defaults + to all pre-period outcomes. + donor_pool : list, optional + Explicit donor unit identifiers (must be never-treated). Defaults to + all never-treated units. + survey_design : optional + Not yet supported — raises ``NotImplementedError`` if provided. + + Returns + ------- + SyntheticControlResults + """ + if survey_design is not None: + raise NotImplementedError( + "SyntheticControl does not yet support survey_design. " + "Survey integration is planned for a later release." + ) + + # --- column validation --- + required = [outcome, treatment, unit, time] + if predictors: + required += list(predictors) + if special_predictors: + required += [sp[0] for sp in special_predictors] + missing = [c for c in dict.fromkeys(required) if c not in data.columns] + if missing: + raise ValueError(f"Missing columns: {missing}") + + # Reject missing structural values up front: a NaN treatment status would be + # silently dropped by groupby(...).max() (a partially-missing donor history + # could be misclassified as never-treated), and NaN unit/time break the panel + # structure. Done BEFORE classification so donor-pool composition is honest. + for col in (treatment, unit, time): + if bool(data[col].isna().to_numpy().any()): + raise ValueError( + f"Column {col!r} contains missing (NaN) values; treatment, unit, and " + "time must be fully observed (a missing treatment status would " + "silently change donor classification)." + ) + + # Validate the treatment indicator on the FULL input BEFORE classifying + # units: a non-{0,1} history would otherwise be silently dropped from both + # the treated and never-treated sets by _resolve_treated_and_donors, quietly + # changing the donor pool / weights / ATT. + _check_binary(data[treatment].to_numpy(), treatment) + + # --- treated unit + donor pool --- + treated_id, donor_ids = _resolve_treated_and_donors( + data, treatment, unit, treated_unit, donor_pool + ) + + # --- restrict to treated + donors; all_periods derived AFTER donor filter --- + keep_ids = [treated_id] + donor_ids + sub = cast(pd.DataFrame, data[data[unit].isin(keep_ids)]) + all_periods = sorted(sub[time].unique()) + + # balanced panel check (on the analysis subset) + _check_balanced(sub, unit, time, keep_ids, all_periods) + + # --- pre/post period resolution + canonicalization + cross-checks --- + pre_periods, post_periods = _resolve_periods( + sub, time, treatment, unit, treated_id, all_periods, post_periods, self.v_method + ) + + # --- predictor specs (validations 1, 5, 7) --- + specs = _validate_predictors( + predictors, + predictors_op, + predictor_window, + special_predictors, + pre_period_outcomes, + outcome, + pre_periods, + ) + + # --- pivots, predictor matrices, outcome matrices --- + needed_vars = {outcome} | {s.var for s in specs} + pivots = { + v: sub.pivot(index=time, columns=unit, values=v).reindex(index=all_periods) + for v in needed_vars + } + Y = pivots[outcome] + Z1 = Y.loc[pre_periods, treated_id].to_numpy(dtype=float) + Z0 = Y.loc[pre_periods, donor_ids].to_numpy(dtype=float) # (n_pre, J) + + # Fail closed on non-finite outcomes for the treated/donor panel: NaN/inf in + # the outcome would silently propagate into the gap path / ATT (and into Z for + # the nested objective). + outcome_block = Y.loc[all_periods, [treated_id] + donor_ids].to_numpy(dtype=float) + if not np.all(np.isfinite(outcome_block)): + raise ValueError( + f"Outcome {outcome!r} contains non-finite (NaN/inf) values for the " + "treated unit or a donor over the analysis periods; synthetic control " + "requires a complete outcome panel." + ) + + X1, X0, labels = _build_predictor_matrix(pivots, specs, treated_id, donor_ids) + # Fail closed on non-finite predictor cells (e.g. a covariate that is only + # observed on a subset of the pre periods averaged over the full pre window). + bad = [ + labels[i] + for i in range(len(labels)) + if not (np.isfinite(X1[i]) and np.all(np.isfinite(X0[i, :]))) + ] + if bad: + raise ValueError( + f"Predictor(s) {bad} have non-finite (NaN/inf) values for the treated " + "unit or a donor over their selected periods. Restrict predictor_window / " + "special_predictors periods to where the variable is observed." + ) + X1s, X0s, _ = _standardize(X1, X0, self.standardize) + + k = X1s.shape[0] + J = len(donor_ids) + + # --- solve for V and donor weights --- + # mspe_v is the OUTER-objective value; it is populated only when a nested V + # search actually ran (None on the custom and single-donor paths). + mspe_v: Optional[float] = None + if self.v_method == "custom": + v = self._prepare_custom_v(k) + w, converged = _inner_solve_W(X1s, X0s, v, self.inner_max_iter, self.inner_min_decrease) + elif J == 1: + # Degenerate: a single donor forces w = [1.0]; V is irrelevant. + warnings.warn( + "Only one donor unit is available; the synthetic control is that " + "single donor (w = 1) and the V search is skipped. SCM is degenerate " + "with a single donor.", + UserWarning, + stacklevel=2, + ) + v = np.ones(k) / k + w = np.array([1.0]) + converged = True + else: + v, w, converged, mspe_v = _outer_solve_V( + X1, + X0, + X1s, + X0s, + Z1, + Z0, + self.n_starts, + self.seed, + self.optimizer_options, + self.inner_max_iter, + self.inner_min_decrease, + ) + + # --- gap path, fit diagnostics, ATT --- + gap_path = _compute_gap_path(Y, w, treated_id, donor_ids, all_periods) + pre_gaps = np.array([gap_path[p] for p in pre_periods], dtype=float) + post_gaps = np.array([gap_path[p] for p in post_periods], dtype=float) + pre_rmspe = float(np.sqrt(np.mean(pre_gaps**2))) + att = float(np.mean(post_gaps)) + + # Poor-fit warning (REGISTRY contract: warn when pre-RMSPE exceeds the SD of + # the treated unit's pre-period outcomes). This includes a FLAT treated pre-path + # (pre_sd == 0): any non-trivial RMSPE then means the synthetic cannot reproduce + # a constant series. A scale-aware absolute floor (`_fit_tol`) guards against a + # spurious warning on a near-perfect flat fit (RMSPE ~ roundoff). + pre_sd = float(np.std(Z1, ddof=1)) if Z1.size > 1 else 0.0 + _fit_tol = 1e-8 * max(float(np.max(np.abs(Z1))) if Z1.size else 0.0, 1.0) + if pre_rmspe > pre_sd and pre_rmspe > _fit_tol: + warnings.warn( + f"Pre-treatment fit is poor: RMSPE ({pre_rmspe:.4f}) exceeds the " + f"standard deviation of treated pre-treatment outcomes ({pre_sd:.4f}). " + f"The synthetic control may not adequately reproduce the treated unit's " + f"pre-treatment trajectory; consider a different donor pool or predictors.", + UserWarning, + stacklevel=2, + ) + + # Inner-solve non-convergence warning — fires on nested / custom / k==1 paths. + warn_if_not_converged( + converged, + "Synthetic control inner weight solver (Frank-Wolfe)", + self.inner_max_iter, + stacklevel=2, + ) + + # No analytical SE: all inference fields NaN. + t_stat, p_value, conf_int = safe_inference(att, np.nan, self.alpha) + + # --- reporting structures --- + donor_weights = {donor_ids[j]: float(w[j]) for j in range(J) if w[j] > _MIN_REPORT_WEIGHT} + v_weights = {labels[i]: float(v[i]) for i in range(k)} + synthetic_X = X0 @ w + donor_mean = X0.mean(axis=1) + predictor_balance = pd.DataFrame( + { + "predictor": labels, + "treated": X1, + "synthetic": synthetic_X, + "donor_mean": donor_mean, + } + ) + + results = SyntheticControlResults( + att=att, + se=np.nan, + t_stat=t_stat, + p_value=p_value, + conf_int=conf_int, + n_obs=int(len(sub)), + n_donors=J, + n_pre_periods=len(pre_periods), + n_post_periods=len(post_periods), + donor_weights=donor_weights, + v_weights=v_weights, + predictor_balance=predictor_balance, + gap_path=gap_path, + pre_rmspe=pre_rmspe, + treated_unit=treated_id, + pre_periods=list(pre_periods), + post_periods=list(post_periods), + v_method=self.v_method, + standardize=self.standardize, + alpha=self.alpha, + mspe_v=mspe_v, + ) + # Reserved for PR-2 (placebo inference) / PR-3 (conformal). Set as plain + # attributes so dataclasses.asdict/fields cannot reach them. + results._placebo_gaps = None + results._rmspe_ratio = None + results._fit_snapshot = None + + self.results_ = results + self.is_fitted_ = True + return results + + def _prepare_custom_v(self, k: int) -> np.ndarray: + """Validate ``custom_v`` against the predictor count and trace-normalize.""" + v = np.asarray(self.custom_v, dtype=float).ravel() + if v.shape[0] != k: + raise ValueError( + f"custom_v has length {v.shape[0]} but there are {k} predictors. " + "custom_v must have exactly one entry per predictor row." + ) + # Finiteness / non-negativity already enforced in _validate_config. + return v / v.sum() + + +def synthetic_control( + data: pd.DataFrame, + outcome: str, + treatment: str, + unit: str, + time: str, + **kwargs, +) -> SyntheticControlResults: + """ + Convenience function for classic synthetic control estimation. + + Constructor-only keyword arguments (``v_method``, ``custom_v``, ``n_starts``, + ``standardize``, ``alpha``, ``seed``, ``optimizer_options``, + ``inner_max_iter``, ``inner_min_decrease``) and ``fit`` keyword arguments + (``post_periods``, ``treated_unit``, ``predictors``, ``special_predictors``, + ...) may both be passed via ``**kwargs``. + + Examples + -------- + >>> from diff_diff import synthetic_control + >>> res = synthetic_control(data, "y", "treated", "unit", "year", + ... predictors=["x1", "x2"]) + >>> print(f"ATT: {res.att:.3f}, pre-RMSPE: {res.pre_rmspe:.3f}") + """ + ctor_keys = set(SyntheticControl().get_params().keys()) + ctor_kwargs = {k: v for k, v in kwargs.items() if k in ctor_keys} + fit_kwargs = {k: v for k, v in kwargs.items() if k not in ctor_keys} + estimator = SyntheticControl(**ctor_kwargs) + return estimator.fit(data, outcome, treatment, unit, time, **fit_kwargs) + + +# ============================================================================= +# module-private helpers +# ============================================================================= + + +def _check_binary(arr: np.ndarray, name: str) -> None: + """Raise ValueError if ``arr`` contains values other than 0/1.""" + uniq = np.unique(arr[~np.isnan(arr)]) if arr.dtype.kind == "f" else np.unique(arr) + if not np.all(np.isin(uniq, [0, 1])): + raise ValueError(f"{name} must be binary (0 or 1). Found values: {uniq}") + + +def _check_balanced( + sub: pd.DataFrame, unit: str, time: str, keep_ids: List[Any], all_periods: List[Any] +) -> None: + """Raise ValueError if the analysis subset is not a balanced panel.""" + n_periods = len(all_periods) + counts = sub.groupby(unit)[time].nunique() + bad = [u for u in keep_ids if counts.get(u, 0) != n_periods] + if bad: + raise ValueError( + f"Unbalanced panel: units {bad} are not observed at all {n_periods} " + "periods. Synthetic control requires a balanced panel." + ) + if len(sub) != len(keep_ids) * n_periods: + raise ValueError( + "Panel has duplicate (unit, time) rows; synthetic control requires " + "exactly one observation per unit-period." + ) + + +def _resolve_treated_and_donors( + data: pd.DataFrame, + treatment: str, + unit: str, + treated_unit: Optional[Any], + donor_pool: Optional[List[Any]], +) -> Tuple[Any, List[Any]]: + """Identify the single treated unit and the (never-treated) donor pool.""" + ever_treated = cast(pd.Series, data.groupby(unit)[treatment].max()) + treated_units = [u for u, v in ever_treated.items() if v == 1] + never_treated = [u for u, v in ever_treated.items() if v == 0] + + if treated_unit is not None: + if treated_unit not in set(data[unit]): + raise ValueError(f"treated_unit {treated_unit!r} not found in the data.") + if treated_unit not in treated_units: + raise ValueError( + f"treated_unit {treated_unit!r} has no treated (D==1) periods. " + "The treatment column must mark the treated unit's post periods." + ) + treated_id = treated_unit + else: + if len(treated_units) == 0: + raise ValueError( + "No treated unit found (no unit has any D==1 period). Synthetic " + "control requires exactly one treated unit." + ) + if len(treated_units) > 1: + raise ValueError( + f"Found {len(treated_units)} ever-treated units {treated_units}; " + "synthetic control requires exactly one. Pass treated_unit= to " + "select one, and exclude the others from the donor pool." + ) + treated_id = treated_units[0] + + if donor_pool is not None: + donor_ids = list(dict.fromkeys(donor_pool)) + if treated_id in donor_ids: + raise ValueError("donor_pool must not contain the treated unit.") + missing = [u for u in donor_ids if u not in set(data[unit])] + if missing: + raise ValueError(f"donor_pool units not found in the data: {missing}") + contaminated = [u for u in donor_ids if u not in set(never_treated)] + if contaminated: + raise ValueError( + f"donor_pool contains ever-treated units {contaminated}; donors " + "must be never-treated." + ) + else: + donor_ids = [u for u in never_treated if u != treated_id] + + if len(donor_ids) == 0: + raise ValueError("No donor units available (the donor pool is empty).") + + return treated_id, donor_ids + + +def _resolve_periods( + sub: pd.DataFrame, + time: str, + treatment: str, + unit: str, + treated_id: Any, + all_periods: List[Any], + post_periods: Optional[List[Any]], + v_method: str, +) -> Tuple[List[Any], List[Any]]: + """Resolve and validate pre/post periods (absorbing + no-anticipation).""" + period_index = {p: i for i, p in enumerate(all_periods)} + treated_rows = sub[sub[unit] == treated_id].set_index(time)[treatment] + treated_d = {p: int(treated_rows.loc[p]) for p in all_periods} + treated_active = [p for p in all_periods if treated_d[p] == 1] + + if post_periods is None: + if not treated_active: + raise ValueError( + "Cannot infer post periods: the treated unit has no D==1 periods. " + "Pass post_periods= explicitly." + ) + post = sorted(treated_active, key=lambda p: period_index[p]) + else: + # Canonicalize: unique + calendar-sorted (validation 3). + seen = set() + canon = [] + for p in post_periods: + if p not in period_index: + raise ValueError(f"post_periods value {p!r} is not a period in the data.") + if p not in seen: + seen.add(p) + canon.append(p) + if not canon: + raise ValueError("post_periods must be non-empty.") + post = sorted(canon, key=lambda p: period_index[p]) + + # Contiguous suffix (absorbing assignment): post must be the last len(post) + # periods of the calendar axis (validation 2). + suffix = all_periods[len(all_periods) - len(post) :] + if post != suffix: + raise ValueError( + "post_periods must be a contiguous suffix of the time axis (absorbing " + f"treatment). Got {post}, expected the trailing periods {suffix}." + ) + + pre_periods = all_periods[: len(all_periods) - len(post)] + + # Absorbing-treatment cross-check vs the treated unit's D column (validation 2), + # enforced on BOTH the inferred and explicit branches: + # - no anticipation: the treated unit must be untreated (D==0) in every pre period; + # - uninterrupted exposure: the treated unit must be treated (D==1) in every post + # period (an explicit suffix like [t2, t3] over a path 0,0,1,0 would otherwise + # average a treated period with an untreated one). + anticipated = [p for p in pre_periods if treated_d[p] == 1] + if anticipated: + raise ValueError( + f"Treated unit has D==1 in pre periods {anticipated}; this violates the " + "no-anticipation / absorbing-treatment assumption. Redefine post_periods " + "to begin at the first treated period." + ) + untreated_post = [p for p in post if treated_d[p] != 1] + if untreated_post: + raise ValueError( + f"Treated unit has D==0 in post periods {untreated_post}; absorbing " + "treatment requires uninterrupted exposure (D==1) in every post period. " + "Check post_periods against the treatment column." + ) + + if len(pre_periods) == 0: + raise ValueError( + "No pre-treatment periods: synthetic control cannot fit a counterfactual " + "without at least one pre period." + ) + if v_method == "nested" and len(pre_periods) == 1: + warnings.warn( + "Only one pre-treatment period: data-driven V selection (v_method='nested') " + "is unreliable with a single pre period (the outer MSPE degenerates to one " + "term). Consider v_method='custom' or more pre periods.", + UserWarning, + stacklevel=2, + ) + + return pre_periods, post + + +def _validate_predictors( + predictors: Optional[List[str]], + predictors_op: str, + predictor_window: Optional[List[Any]], + special_predictors: Optional[List[Tuple[str, List[Any], str]]], + pre_period_outcomes: Optional[Any], + outcome: str, + pre_periods: List[Any], +) -> List[_PredictorSpec]: + """Build the normalized, collision-checked predictor specification list. + + Canonical row ORDER: covariate/predictor averages -> special predictors -> + per-period outcome lags (the row order matches R ``Synth::dataprep``). NOTE: the + aggregation itself fails closed on any non-finite cell (handled in ``fit``), + whereas R ``dataprep`` uses ``na.rm=TRUE`` — a documented deviation (see REGISTRY). + """ + pre_set = set(pre_periods) + pre_index = {p: i for i, p in enumerate(pre_periods)} + specs: List[_PredictorSpec] = [] + labels: set = set() + + def _canon(periods: List[Any]) -> List[Any]: + # Unique + calendar-sorted, so reordered/duplicated period lists are + # equivalent (e.g. [c,b,a] == [a,b,c]; a repeated period does not + # re-weight a "mean"). Assumes membership already checked by _check_periods. + return sorted(dict.fromkeys(periods), key=lambda p: pre_index[p]) + + def _add(label: str, kind: str, var: str, periods: List[Any], op: str) -> None: + if label in labels: + raise ValueError( + f"Duplicate predictor label {label!r}. Each predictor (including " + "per-period outcome lags) must have a unique label." + ) + labels.add(label) + specs.append(_PredictorSpec(label=label, kind=kind, var=var, periods=periods, op=op)) + + def _check_periods(periods: List[Any], ctx: str) -> None: + if len(periods) == 0: + raise ValueError(f"{ctx} period list must not be empty.") + leak = [p for p in periods if p not in pre_set] + if leak: + raise ValueError( + f"{ctx} references periods {leak} outside the pre-treatment window " + "(would leak post-treatment information)." + ) + + # 1) predictor averages + if predictors: + if predictors_op not in _OP_FUNCS: + raise ValueError( + f"predictors_op must be one of {sorted(_OP_FUNCS)}, got {predictors_op!r}" + ) + if predictor_window is not None: + window = list(predictor_window) + _check_periods(window, "predictor_window") + window = _canon(window) + else: + window = list(pre_periods) + for var in predictors: + _add(var, "predictor_avg", var, window, predictors_op) + + # 2) special predictors + if special_predictors: + for entry in special_predictors: + if len(entry) != 3: + raise ValueError( + "Each special predictor must be a (var, periods, op) tuple; " f"got {entry!r}." + ) + var, periods, op = entry + if op not in _OP_FUNCS: + raise ValueError( + f"special predictor op must be one of {sorted(_OP_FUNCS)}, got {op!r}" + ) + periods = list(periods) + _check_periods(periods, f"special predictor {var!r}") + periods = _canon(periods) # reordered/duplicated -> same canonical spec + # Full ordered period list in the label so distinct period sets sharing + # the same endpoints/length (e.g. [2000,2002,2004] vs [2000,2003,2004]) + # do not collide — the label is also the v_weights / balance key. + label = f"{var}@{op}[{','.join(str(p) for p in periods)}]" + _add(label, "special", var, periods, op) + + # 3) per-period outcome lags + if pre_period_outcomes is not None: + if isinstance(pre_period_outcomes, str): + if pre_period_outcomes != "all": + raise ValueError( + f"pre_period_outcomes string must be 'all', got {pre_period_outcomes!r}" + ) + lag_periods = list(pre_periods) + else: + lag_periods = list(pre_period_outcomes) + _check_periods(lag_periods, "pre_period_outcomes") + lag_periods = _canon(lag_periods) + for p in lag_periods: + _add(f"{outcome}_{p}", "lag", outcome, [p], "identity") + + # default: nothing specified -> use all pre-period outcomes as lag predictors + if not specs: + for p in pre_periods: + _add(f"{outcome}_{p}", "lag", outcome, [p], "identity") + + return specs + + +def _build_predictor_matrix( + pivots: Dict[str, pd.DataFrame], + specs: List[_PredictorSpec], + treated_id: Any, + donor_ids: List[Any], +) -> Tuple[np.ndarray, np.ndarray, List[str]]: + """Build X1 (k,) and X0 (k, J) from the normalized predictor specs.""" + k = len(specs) + J = len(donor_ids) + X1 = np.empty(k, dtype=float) + X0 = np.empty((k, J), dtype=float) + labels: List[str] = [] + + for i, spec in enumerate(specs): + pv = pivots[spec.var] + block = pv.loc[spec.periods] # rows: periods, cols: units + if spec.op == "identity": + X1[i] = float(block[treated_id].iloc[0]) + X0[i, :] = block[donor_ids].iloc[0].to_numpy(dtype=float) + else: + op_func = _OP_FUNCS[spec.op] + X1[i] = float(op_func(block[treated_id].to_numpy(dtype=float))) + X0[i, :] = op_func(block[donor_ids].to_numpy(dtype=float), axis=0) + labels.append(spec.label) + + return X1, X0, labels + + +def _standardize( + X1: np.ndarray, X0: np.ndarray, mode: str +) -> Tuple[np.ndarray, np.ndarray, np.ndarray]: + """Row-standardize predictors by SD across donors+treated (ddof=1). + + Matches R ``Synth``: ``divisor <- sqrt(apply(cbind(X0, X1), 1, var))``. + Zero-variance rows keep divisor 1.0 (with a warning). + """ + k = X1.shape[0] + if mode == "none": + return X1.copy(), X0.copy(), np.ones(k) + combined = np.column_stack([X0, X1.reshape(-1, 1)]) # (k, J+1) + divisor = np.sqrt(np.var(combined, axis=1, ddof=1)) + zero = divisor <= 0 + if np.any(zero): + warnings.warn( + f"{int(np.sum(zero))} predictor row(s) have zero variance across " + "donors+treated; their standardization divisor is set to 1.0.", + UserWarning, + stacklevel=2, + ) + divisor = divisor.copy() + divisor[zero] = 1.0 + X1s = X1 / divisor + X0s = X0 / divisor[:, None] + return X1s, X0s, divisor + + +def _inner_solve_W( + X1s: np.ndarray, + X0s: np.ndarray, + v: np.ndarray, + inner_max_iter: int, + inner_min_decrease: float, +) -> Tuple[np.ndarray, bool]: + """Solve W*(V) = argmin_W (X1-X0 W)' diag(V) (X1-X0 W) on the unit simplex. + + Folds V^½ into the predictors and delegates to the Frank-Wolfe simplex solver. + Returns the RAW simplex weights (no sparsification) so the outer objective is + not perturbed; reporting-level cleanup happens in ``fit``. + """ + J = X0s.shape[1] + if J == 1: + return np.array([1.0]), True + vh = np.sqrt(v) + A = vh[:, None] * X0s # (k, J) + b = vh * X1s # (k,) + packed = np.column_stack([A, b]) # (k, J+1); last column is the target + min_decrease = inner_min_decrease * max(float(np.linalg.norm(b)), 1e-12) + with warnings.catch_warnings(): + warnings.filterwarnings("ignore", message=r".*did not converge.*", category=UserWarning) + w, converged = _sc_weight_fw( + packed, + zeta=0.0, + intercept=False, + min_decrease=min_decrease, + max_iter=inner_max_iter, + return_convergence=True, + ) + return np.asarray(w, dtype=float), bool(converged) + + +def _v_starts( + k: int, + X1: np.ndarray, + X0: np.ndarray, + X1s: np.ndarray, + X0s: np.ndarray, + Z1: np.ndarray, + Z0: np.ndarray, + n_starts: int, + rng: np.random.Generator, + inner_max_iter: int, + inner_min_decrease: float, +) -> Tuple[List[np.ndarray], int, int]: + """Build a list of DISTINCT starting ``theta`` vectors for the outer V search. + + Returns ``(candidates, n_inner_solves, n_inner_nonconverged)`` — the latter two + count the inner Frank-Wolfe solves run by the univariate-fit heuristic so the + caller can surface aggregate intermediate non-convergence. + + Heuristic starts: uniform V; inverse-row-variance V (computed from the + UNSTANDARDIZED predictors ``X1``/``X0`` — on the standardized rows every variance + is 1 by construction, so it would collapse to the uniform start); univariate-fit V + (v_i ∝ 1/MSPE_i from solving with mass concentrated on predictor i). Remaining + starts are random Dirichlet draws. Candidates are de-duplicated so the multistart + never runs the same Nelder-Mead seed twice; non-finite candidates are dropped + (validation 10); uniform is always retained. + """ + + def _to_theta(v: np.ndarray) -> Optional[np.ndarray]: + v = np.asarray(v, dtype=float) + # Keep only finite, positive mass; reject candidates with no usable mass + # (validation 10). Floor before log so a near-zero (or underflowed-to-0) + # Dirichlet draw cannot yield log(0) = -inf and force an infinite retry. + v = np.where(np.isfinite(v) & (v > 0), v, 0.0) + total = float(np.sum(v)) + if total <= 0: + return None + v = np.maximum(v / total, 1e-12) + theta = np.log(v) + theta = theta - np.mean(theta) + return theta if np.all(np.isfinite(theta)) else None + + def _add_unique(t: Optional[np.ndarray], pool: List[np.ndarray]) -> None: + # Append only DISTINCT, finite candidates so the multistart never runs the same + # Nelder-Mead seed twice (codex: a degenerate heuristic must not waste a start). + if t is not None and not any(np.allclose(t, e, atol=1e-9) for e in pool): + pool.append(t) + + # Candidates are generated lazily and we stop as soon as `target` DISTINCT starts are + # collected, so a small n_starts does not pay for heuristic starts it would only + # discard. In particular n_starts=1 returns the uniform start without running the + # O(k) univariate inner-solve loop below. + target = max(n_starts, 1) + candidates: List[np.ndarray] = [np.zeros(k)] # uniform V + + # inverse row variance of the UNSTANDARDIZED predictors over donors+treated. + # (On the standardized rows every variance is ~1, so this would collapse to the + # uniform start — using the raw scales makes it a genuinely different seed.) + if len(candidates) < target: + combined = np.column_stack([X0, X1.reshape(-1, 1)]) + row_var = np.var(combined, axis=1, ddof=1) + inv_var = np.where(row_var > 0, 1.0 / np.maximum(row_var, 1e-12), 0.0) + if np.sum(inv_var) > 0: + _add_unique(_to_theta(inv_var / np.sum(inv_var)), candidates) + + # univariate-fit start: v_i ∝ 1 / (pre-outcome MSPE of W solved with V=e_i). + # Skipped entirely when enough candidates are already collected (saves k inner solves). + inner_total = 0 + inner_nonconv = 0 + if len(candidates) < target: + uni_mspe = np.empty(k) + for i in range(k): + e = np.zeros(k) + e[i] = 1.0 + w_i, conv_i = _inner_solve_W(X1s, X0s, e, inner_max_iter, inner_min_decrease) + inner_total += 1 + if not conv_i: + # Don't trust a truncated solve: inf -> 0 inverse-MSPE weight, so this + # predictor doesn't shape the heuristic start. + inner_nonconv += 1 + uni_mspe[i] = np.inf + else: + uni_mspe[i] = float(np.mean((Z1 - Z0 @ w_i) ** 2)) + inv_mspe = np.where(uni_mspe > 0, 1.0 / np.maximum(uni_mspe, 1e-12), 0.0) + if np.sum(inv_mspe) > 0: + _add_unique(_to_theta(inv_mspe / np.sum(inv_mspe)), candidates) + + # random Dirichlet draws to fill the remaining slots with DISTINCT starts + attempts = 0 + max_attempts = 20 * n_starts + 20 + while len(candidates) < target and attempts < max_attempts: + attempts += 1 + _add_unique(_to_theta(rng.dirichlet(np.ones(k))), candidates) + + return candidates[:target], inner_total, inner_nonconv + + +def _outer_solve_V( + X1: np.ndarray, + X0: np.ndarray, + X1s: np.ndarray, + X0s: np.ndarray, + Z1: np.ndarray, + Z0: np.ndarray, + n_starts: int, + seed: Optional[int], + optimizer_options: Optional[Dict[str, Any]], + inner_max_iter: int, + inner_min_decrease: float, +) -> Tuple[np.ndarray, np.ndarray, bool, float]: + """Data-driven V selection: minimize pre-period outcome MSPE of W*(V). + + Multistart Nelder-Mead over ``theta`` (V = softmax(theta)) plus a derivative-free + Powell polish from the best point. Returns ``(v_star, w_star, converged, mspe)``. + """ + k = X1s.shape[0] + if k == 1: + v = np.array([1.0]) + w, converged = _inner_solve_W(X1s, X0s, v, inner_max_iter, inner_min_decrease) + return v, w, converged, float(np.mean((Z1 - Z0 @ w) ** 2)) + + # Track inner Frank-Wolfe non-convergence across ALL intermediate evaluations so + # the outer search cannot silently rank truncated W*(V) solves (codex). `_inner_solve_W` + # suppresses its own per-call warning during the search; we aggregate here. + _st = {"total": 0, "nonconv": 0} + + # Finite penalty for a non-converged evaluation: the objective is convex in w, so its + # maximum over the simplex is attained at a single-donor vertex. Penalizing above that + # bound guarantees a truncated W*(V) can never win the argmin, while staying FINITE + # (np.inf would flood scipy's simplex arithmetic with RuntimeWarnings). + _vertex_mspe = [float(np.mean((Z1 - Z0[:, j]) ** 2)) for j in range(Z0.shape[1])] + _penalty = 10.0 * (max(_vertex_mspe) + 1.0) if _vertex_mspe else 1.0 + + def objective(theta: np.ndarray) -> float: + v = _softmax(theta) + w, conv = _inner_solve_W(X1s, X0s, v, inner_max_iter, inner_min_decrease) + _st["total"] += 1 + if not conv: + # A truncated W*(V) is unusable for V ranking: in an argmin search even a + # single non-converged evaluation could win and silently flip the selected V. + # Penalize above the feasible objective bound so it can never be chosen (and + # is tallied for the aggregated warning below). + _st["nonconv"] += 1 + return _penalty + return float(np.mean((Z1 - Z0 @ w) ** 2)) + + nm_options = {"maxiter": 1000, "xatol": 1e-8, "fatol": 1e-8} + if optimizer_options: + nm_options.update(optimizer_options) + # Powell uses xtol/ftol rather than Nelder-Mead's xatol/fatol; translate so the + # same optimizer_options control both solvers without an OptimizeWarning. + powell_options = dict(nm_options) + if "xatol" in powell_options: + powell_options["xtol"] = powell_options.pop("xatol") + if "fatol" in powell_options: + powell_options["ftol"] = powell_options.pop("fatol") + + rng = np.random.default_rng(seed) + starts, start_total, start_nonconv = _v_starts( + k, X1, X0, X1s, X0s, Z1, Z0, n_starts, rng, inner_max_iter, inner_min_decrease + ) + _st["total"] += start_total + _st["nonconv"] += start_nonconv + + best_x: np.ndarray = starts[0] + best_fun = np.inf + outer_converged = False + for theta0 in starts: + res = minimize(objective, theta0, method="Nelder-Mead", options=nm_options) + outer_converged = outer_converged or bool(res.success) + if res.fun < best_fun: + best_fun = float(res.fun) + best_x = res.x + + # Derivative-free polish from the incumbent (best-of, mirrors R optimx). + res_p = minimize(objective, best_x, method="Powell", options=powell_options) + outer_converged = outer_converged or bool(res_p.success) + if res_p.fun < best_fun: + best_fun = float(res_p.fun) + best_x = res_p.x + + # Surface a silent under-optimized V: if neither the multistart Nelder-Mead nor + # the Powell polish reported success (e.g. optimizer_options={"maxiter": 1}), the + # selected V / donor weights / ATT may be sub-optimal. + if not outer_converged: + warnings.warn( + "Outer V-search (Nelder-Mead / Powell) did not converge; the selected " + "predictor-importance matrix V (and the resulting donor weights / ATT) may " + "be sub-optimal. Increase optimizer_options['maxiter'] or n_starts.", + UserWarning, + stacklevel=3, + ) + + # Aggregate intermediate inner Frank-Wolfe non-convergence across the whole nested + # search (univariate starts + every objective evaluation). Non-converged objective + # evaluations were excluded from V ranking (returned as +inf); warn on ANY such + # occurrence — unlike a bootstrap summary, an argmin search is sensitive to even one + # truncated solve, so no rate threshold is appropriate here. + if _st["nonconv"] > 0: + warnings.warn( + f"Inner Frank-Wolfe did not converge on {_st['nonconv']} of {_st['total']} " + f"weight solves during nested V selection (inner_max_iter={inner_max_iter}); " + "those evaluations were excluded from V ranking, but the search space was " + "effectively restricted, so the selected V / donor weights / ATT may be " + "sub-optimal. Increase inner_max_iter or relax inner_min_decrease.", + UserWarning, + stacklevel=3, + ) + + v_star = _softmax(best_x) + w_star, converged = _inner_solve_W(X1s, X0s, v_star, inner_max_iter, inner_min_decrease) + mspe = float(np.mean((Z1 - Z0 @ w_star) ** 2)) + return v_star, w_star, converged, mspe + + +def _compute_gap_path( + Y: pd.DataFrame, + w: np.ndarray, + treated_id: Any, + donor_ids: List[Any], + all_periods: List[Any], +) -> Dict[Any, float]: + """Period-keyed gap path ``α̂_1t = Y_1t − Σ_j w_j Y_jt`` over all periods.""" + treated_series = Y.loc[all_periods, treated_id].to_numpy(dtype=float) + donor_block = Y.loc[all_periods, donor_ids].to_numpy(dtype=float) # (T, J) + synthetic = donor_block @ w + gaps = treated_series - synthetic + return {period: float(g) for period, g in zip(all_periods, gaps)} diff --git a/diff_diff/synthetic_control_results.py b/diff_diff/synthetic_control_results.py new file mode 100644 index 00000000..e36c8352 --- /dev/null +++ b/diff_diff/synthetic_control_results.py @@ -0,0 +1,295 @@ +""" +Result container for the classic Synthetic Control Method (SCM) estimator. + +This module contains the ``SyntheticControlResults`` dataclass, extracted from +``synthetic_control.py`` to mirror the TROP estimator/results split. + +The classic synthetic control of Abadie, Diamond & Hainmueller (2010) produces a +gap path and donor/predictor weights but **no analytical standard error** — the +paper proposes permutation/placebo inference instead (a later PR). Accordingly +``se``/``t_stat``/``p_value``/``conf_int`` are always NaN on this object; the +point estimate ``att`` (average post-period gap) is the reported quantity. +""" + +from dataclasses import dataclass, field +from typing import Any, Dict, List, Optional, Tuple + +import numpy as np +import pandas as pd + +from diff_diff.results import _format_survey_block, _get_significance_stars + +__all__ = ["SyntheticControlResults"] + + +@dataclass +class SyntheticControlResults: + """ + Results from a classic Synthetic Control Method (SCM) estimation. + + Implements Abadie, Diamond & Hainmueller (2010), "Synthetic Control Methods + for Comparative Case Studies." A single treated unit's counterfactual is the + convex combination ``Σ_j w_j · Y_jt`` of donor units chosen to match the + treated unit's pre-period outcomes and predictors; the treatment effect path + is the gap ``α̂_1t = Y_1t − Σ_j w_j · Y_jt`` over the post periods. + + Attributes + ---------- + att : float + Average post-period gap (the reported point estimate). The per-period + gaps are in ``gap_path``. + se : float + Always NaN — classic SCM has no analytical standard error (inference is + permutation/placebo based; see Abadie-Diamond-Hainmueller 2010 §2.4). + t_stat, p_value : float + Always NaN (no analytical SE). + conf_int : tuple[float, float] + Always (NaN, NaN) (no analytical SE). + n_obs : int + Number of observations (treated + donor rows over all periods) used. + n_donors : int + Number of donor units in the (post-filter) donor pool. + n_pre_periods : int + Number of pre-treatment periods. + n_post_periods : int + Number of post-treatment periods. + donor_weights : dict + Mapping ``{donor_unit_id: weight}`` on the unit simplex. Weights below + the interpretability floor (1e-6) are dropped. + v_weights : dict + Mapping ``{predictor_label: v}`` — the diagonal predictor-importance + matrix V, trace-normalized to sum to 1. + predictor_balance : pandas.DataFrame + Predictor-balance table: for each predictor, the treated value, the + synthetic value (donor-weighted), and the donor-pool mean. + gap_path : dict + Mapping ``{period: gap}`` for ALL periods (pre periods carry the fit + residual used for ``pre_rmspe``; post periods carry the effect path). + pre_rmspe : float + Root mean squared prediction error over the pre-treatment periods (the + primary fit diagnostic). + mspe_v : float, optional + The outer-objective value (pre-period outcome MSPE of ``W*(V*)``), + populated only when the nested outer search actually runs; None on the + ``v_method="custom"`` path and on the degenerate single-donor path. + treated_unit : Any + The treated unit's identifier. + pre_periods, post_periods : list + Calendar-sorted pre / post period values. + v_method : str + ``"nested"`` (data-driven V) or ``"custom"`` (user-supplied V). + standardize : str + ``"std"`` (per-row SD scaling) or ``"none"``. + alpha : float + Significance level recorded for downstream (placebo) inference. + survey_metadata : Any, optional + Reserved; always None in this release. + """ + + att: float + se: float + t_stat: float + p_value: float + conf_int: Tuple[float, float] + n_obs: int + n_donors: int + n_pre_periods: int + n_post_periods: int + donor_weights: Dict[Any, float] + v_weights: Dict[str, float] + predictor_balance: pd.DataFrame + gap_path: Dict[Any, float] + pre_rmspe: float + treated_unit: Any + pre_periods: List[Any] + post_periods: List[Any] + v_method: str + standardize: str + alpha: float = 0.05 + mspe_v: Optional[float] = None + survey_metadata: Optional[Any] = field(default=None) + + # Reserved for PR-2 (placebo inference) / PR-3 (conformal). These are plain + # (un-annotated) class attributes, NOT dataclass fields, so dataclasses.fields() + # and dataclasses.asdict() cannot reach them; fit() sets them per instance. + _placebo_gaps = None + _rmspe_ratio = None + _fit_snapshot = None + + def __repr__(self) -> str: + """Concise string representation.""" + return ( + f"SyntheticControlResults(ATT={self.att:.4f}, " + f"pre_RMSPE={self.pre_rmspe:.4f}, " + f"n_donors={self.n_donors}, " + f"v_method={self.v_method!r})" + ) + + @property + def coef_var(self) -> float: + """Coefficient of variation: SE / abs(ATT). NaN here (SE is always NaN).""" + if not (np.isfinite(self.se) and self.se >= 0): + return np.nan + if not np.isfinite(self.att) or self.att == 0: + return np.nan + return self.se / abs(self.att) + + @property + def is_significant(self) -> bool: + """Always False — classic SCM produces no analytical p-value.""" + return bool(np.isfinite(self.p_value) and self.p_value < self.alpha) + + @property + def significance_stars(self) -> str: + """Significance stars based on p-value (empty here — p_value is NaN).""" + return _get_significance_stars(self.p_value) + + def summary(self, alpha: Optional[float] = None) -> str: + """ + Generate a formatted summary of the estimation results. + + Parameters + ---------- + alpha : float, optional + Significance level; defaults to the alpha used during estimation. + + Returns + ------- + str + Formatted summary table. + """ + alpha = alpha or self.alpha + + n_top = min(5, len(self.donor_weights)) + top_donors = sorted(self.donor_weights.items(), key=lambda kv: kv[1], reverse=True)[:n_top] + + lines = [ + "=" * 75, + "Synthetic Control Method (SCM) Estimation Results".center(75), + "Abadie, Diamond & Hainmueller (2010)".center(75), + "=" * 75, + "", + f"{'Observations:':<28} {self.n_obs:>10}", + f"{'Donor units:':<28} {self.n_donors:>10}", + f"{'Pre-treatment periods:':<28} {self.n_pre_periods:>10}", + f"{'Post-treatment periods:':<28} {self.n_post_periods:>10}", + f"{'Treated unit:':<28} {str(self.treated_unit):>10}", + "", + "-" * 75, + "Fit Diagnostics".center(75), + "-" * 75, + f"{'Pre-treatment RMSPE:':<28} {self.pre_rmspe:>10.4f}", + f"{'V selection:':<28} {self.v_method:>10}", + f"{'Standardization:':<28} {self.standardize:>10}", + ] + if self.mspe_v is not None and np.isfinite(self.mspe_v): + lines.append(f"{'Outer-objective MSPE:':<28} {self.mspe_v:>10.6f}") + + if self.survey_metadata is not None: + lines.extend(_format_survey_block(self.survey_metadata, 75)) + + lines.extend( + [ + "", + "-" * 75, + f"{'Top donor weights (w_j)':<40}", + "-" * 75, + ] + ) + for unit_id, w in top_donors: + lines.append(f"{' ' + str(unit_id):<40} {w:>10.4f}") + + lines.extend( + [ + "", + "-" * 75, + f"{'Parameter':<15} {'Estimate':>12} {'Std. Err.':>12} " + f"{'t-stat':>10} {'P>|t|':>10}", + "-" * 75, + f"{'ATT (avg gap)':<15} {self.att:>12.4f} {'n/a':>12} " f"{'n/a':>10} {'n/a':>10}", + "-" * 75, + "", + "Inference: classic SCM has no analytical standard error.", + "Use permutation / placebo inference for significance testing", + "(Abadie-Diamond-Hainmueller 2010, Section 2.4).", + "=" * 75, + ] + ) + + return "\n".join(lines) + + def print_summary(self, alpha: Optional[float] = None) -> None: + """Print the summary to stdout.""" + print(self.summary(alpha)) + + def to_dict(self) -> Dict[str, Any]: + """ + Convert scalar results to a dictionary. + + Returns + ------- + Dict[str, Any] + Dictionary of the scalar estimation results (weights/balance/gaps + are available via the ``get_*_df`` accessors). + """ + result = { + "att": self.att, + "se": self.se, + "t_stat": self.t_stat, + "p_value": self.p_value, + "conf_int_lower": self.conf_int[0], + "conf_int_upper": self.conf_int[1], + "n_obs": self.n_obs, + "n_donors": self.n_donors, + "n_pre_periods": self.n_pre_periods, + "n_post_periods": self.n_post_periods, + "pre_rmspe": self.pre_rmspe, + "mspe_v": self.mspe_v, + "treated_unit": self.treated_unit, + "v_method": self.v_method, + "standardize": self.standardize, + } + if self.survey_metadata is not None: + sm = self.survey_metadata + result["weight_type"] = sm.weight_type + result["effective_n"] = sm.effective_n + result["design_effect"] = sm.design_effect + return result + + def to_dataframe(self) -> pd.DataFrame: + """Convert scalar results to a single-row pandas DataFrame.""" + return pd.DataFrame([self.to_dict()]) + + def get_gap_df(self) -> pd.DataFrame: + """ + Get the gap (effect) path as a DataFrame, in calendar order. + + Rebuilt period-keyed from ``gap_path`` using the canonical + ``pre_periods + post_periods`` order so the row order is independent of + any dict-insertion order. Columns: ``period``, ``gap``, ``phase``. + + Returns + ------- + pandas.DataFrame + """ + rows = [] + for period in list(self.pre_periods) + list(self.post_periods): + if period in self.gap_path: + phase = "post" if period in self.post_periods else "pre" + rows.append({"period": period, "gap": self.gap_path[period], "phase": phase}) + return pd.DataFrame(rows, columns=["period", "gap", "phase"]) + + def get_weights_df(self) -> pd.DataFrame: + """ + Get donor weights as a DataFrame, sorted by weight descending. + + Returns + ------- + pandas.DataFrame + Columns: ``unit``, ``weight``. + """ + items = sorted(self.donor_weights.items(), key=lambda kv: kv[1], reverse=True) + return pd.DataFrame( + [{"unit": unit, "weight": w} for unit, w in items], + columns=["unit", "weight"], + ) diff --git a/docs/api/_autosummary/diff_diff.SyntheticControl.rst b/docs/api/_autosummary/diff_diff.SyntheticControl.rst new file mode 100644 index 00000000..3bb9c3b0 --- /dev/null +++ b/docs/api/_autosummary/diff_diff.SyntheticControl.rst @@ -0,0 +1,36 @@ +diff\_diff.SyntheticControl +=========================== + +.. currentmodule:: diff_diff + +.. autoclass:: SyntheticControl + :no-members: + + + .. rubric:: Methods + + .. autosummary:: + + ~SyntheticControl.__init__ + ~SyntheticControl.fit + ~SyntheticControl.get_params + ~SyntheticControl.set_params + + + + + .. rubric:: Attributes + + .. autosummary:: + + ~SyntheticControl.v_method + ~SyntheticControl.custom_v + ~SyntheticControl.optimizer_options + ~SyntheticControl.n_starts + ~SyntheticControl.inner_max_iter + ~SyntheticControl.inner_min_decrease + ~SyntheticControl.standardize + ~SyntheticControl.alpha + ~SyntheticControl.seed + ~SyntheticControl.results_ + ~SyntheticControl.is_fitted_ diff --git a/docs/api/_autosummary/diff_diff.SyntheticControlResults.rst b/docs/api/_autosummary/diff_diff.SyntheticControlResults.rst new file mode 100644 index 00000000..b39b5b90 --- /dev/null +++ b/docs/api/_autosummary/diff_diff.SyntheticControlResults.rst @@ -0,0 +1,53 @@ +diff\_diff.SyntheticControlResults +================================== + +.. currentmodule:: diff_diff + +.. autoclass:: SyntheticControlResults + :no-members: + + + .. rubric:: Methods + + .. autosummary:: + + ~SyntheticControlResults.__init__ + ~SyntheticControlResults.get_gap_df + ~SyntheticControlResults.get_weights_df + ~SyntheticControlResults.print_summary + ~SyntheticControlResults.summary + ~SyntheticControlResults.to_dataframe + ~SyntheticControlResults.to_dict + + + + + .. rubric:: Attributes + + .. autosummary:: + + ~SyntheticControlResults.alpha + ~SyntheticControlResults.coef_var + ~SyntheticControlResults.is_significant + ~SyntheticControlResults.mspe_v + ~SyntheticControlResults.significance_stars + ~SyntheticControlResults.survey_metadata + ~SyntheticControlResults.att + ~SyntheticControlResults.se + ~SyntheticControlResults.t_stat + ~SyntheticControlResults.p_value + ~SyntheticControlResults.conf_int + ~SyntheticControlResults.n_obs + ~SyntheticControlResults.n_donors + ~SyntheticControlResults.n_pre_periods + ~SyntheticControlResults.n_post_periods + ~SyntheticControlResults.donor_weights + ~SyntheticControlResults.v_weights + ~SyntheticControlResults.predictor_balance + ~SyntheticControlResults.gap_path + ~SyntheticControlResults.pre_rmspe + ~SyntheticControlResults.treated_unit + ~SyntheticControlResults.pre_periods + ~SyntheticControlResults.post_periods + ~SyntheticControlResults.v_method + ~SyntheticControlResults.standardize diff --git a/docs/api/index.rst b/docs/api/index.rst index 7ff6b19d..426e55f0 100644 --- a/docs/api/index.rst +++ b/docs/api/index.rst @@ -23,6 +23,7 @@ Core estimator classes for DiD analysis: diff_diff.StackedDiD diff_diff.TripleDifference diff_diff.TROP + diff_diff.SyntheticControl diff_diff.ContinuousDiD diff_diff.HeterogeneousAdoptionDiD diff_diff.EfficientDiD @@ -57,6 +58,7 @@ Result containers returned by estimators: diff_diff.TripleDifferenceResults diff_diff.StackedDiDResults diff_diff.TROPResults + diff_diff.SyntheticControlResults diff_diff.ContinuousDiDResults diff_diff.DoseResponseCurve diff_diff.HeterogeneousAdoptionDiDResults @@ -309,6 +311,7 @@ Estimators stacked_did triple_diff trop + synthetic_control continuous_did had efficient_did diff --git a/docs/api/synthetic_control.rst b/docs/api/synthetic_control.rst new file mode 100644 index 00000000..a9377898 --- /dev/null +++ b/docs/api/synthetic_control.rst @@ -0,0 +1,190 @@ +Synthetic Control Method (SCM) +============================== + +Classic synthetic control estimator for a single treated unit (Abadie, Diamond & +Hainmueller 2010; originating in Abadie & Gardeazabal 2003). + +The treated unit's counterfactual is a convex combination of "donor" (never-treated) +units. Donor weights ``W*(V)`` solve a simplex-constrained, predictor-importance-weighted +least-squares fit of the treated unit's pre-period predictors; the diagonal +predictor-importance matrix ``V`` is chosen either data-driven (minimizing pre-period +outcome MSPE, ``v_method="nested"``) or supplied by the user (``v_method="custom"``). The +treatment-effect path is the gap :math:`\hat{\alpha}_{1t} = Y_{1t} - \sum_j w_j Y_{jt}` +over the post periods. + +**When to use SCM:** + +- Exactly **one treated unit** with a long, well-fit pre-treatment period. +- A curated **donor pool** of comparable never-treated units. +- Aggregate / few-unit comparative case studies (states, regions, countries). + +**Inference:** classic SCM has **no analytical standard error**. ``se``, ``t_stat``, +``p_value`` and ``conf_int`` are NaN; ``att`` (the mean post-period gap) is the reported +estimate. The paper's in-space placebo permutation inference (post/pre RMSPE-ratio, +``rank/(J+1)`` p-value) is planned for a follow-up release. + +**Distinct from** :class:`~diff_diff.SyntheticDiD` (Arkhangelsky et al. 2021), which adds +time weights and ridge regularization; classic SCM uses **donor weights only** plus the +outer ``V`` search. + +**Reference:** Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic Control +Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco +Control Program. *Journal of the American Statistical Association*, 105(490), 493–505. +`doi:10.1198/jasa.2009.ap08746 `_ + +SyntheticControl +---------------- + +Main estimator class for classic synthetic control estimation. + +.. autoclass:: diff_diff.SyntheticControl + :no-index: + :members: + :undoc-members: + :show-inheritance: + :inherited-members: + + .. rubric:: Methods + + .. autosummary:: + + ~SyntheticControl.fit + ~SyntheticControl.get_params + ~SyntheticControl.set_params + +SyntheticControlResults +----------------------- + +Results container for synthetic control estimation. + +.. autoclass:: diff_diff.SyntheticControlResults + :no-index: + :members: + :undoc-members: + :show-inheritance: + + .. rubric:: Methods + + .. autosummary:: + + ~SyntheticControlResults.summary + ~SyntheticControlResults.print_summary + ~SyntheticControlResults.to_dict + ~SyntheticControlResults.to_dataframe + ~SyntheticControlResults.get_gap_df + ~SyntheticControlResults.get_weights_df + +Convenience Function +-------------------- + +.. autofunction:: diff_diff.synthetic_control + +Predictors and V selection +-------------------------- + +Predictor rows of ``X1`` (treated) / ``X0`` (donors) are built, in this canonical row +order (the ordering matches R ``Synth::dataprep``), from: + +.. list-table:: + :header-rows: 1 + :widths: 25 75 + + * - Argument + - Meaning + * - ``predictors`` + ``predictor_window`` + ``predictors_op`` + - Columns averaged over a pre-period window (default: all pre periods). + * - ``special_predictors`` + - ``(var, periods, op)`` triples, each averaged over its own periods/operator. + * - ``pre_period_outcomes`` + - Individual pre-period outcomes as predictor rows (``"all"`` or a list). When + no predictor arguments are given, defaults to all pre-period outcomes. + +``v_method="nested"`` selects the diagonal predictor-importance matrix ``V`` by minimizing +the pre-period **outcome** MSPE of ``W*(V)`` over a multistart Nelder-Mead search with a +derivative-free Powell polish. ``v_method="custom"`` takes a user-supplied ``custom_v`` +(one entry per predictor row, trace-normalized) and skips the outer search. + +.. note:: + + The predictor standardization (per-row SD over donors+treated, ddof=1) and the + optimizer are pinned from the R ``Synth`` source — they are not specified in + Abadie-Diamond-Hainmueller (2010). The outer objective uses all pre periods rather than + R's ``time.optimize.ssr`` window, so the nested ``V`` differs from R by an + efficiency-only choice. Predictor/outcome aggregation also **fails closed** on any + non-finite cell, whereas R ``dataprep`` uses ``na.rm=TRUE`` — restrict + ``predictor_window`` / ``special_predictors`` periods to where a variable is observed. + Predictor rows support only **equal-weight** linear combinations (``mean``, ``sum``, + per-period lags); ADH (2010) §2.3's general weighted form ``Σ_s k_s Y_is`` with + arbitrary ``k_s`` (and non-linear ops such as ``median``) is not accepted in this + release. See ``docs/methodology/REGISTRY.md`` §SyntheticControl for all deviation labels. + +Example Usage +------------- + +Basic usage with covariate and special predictors:: + + from diff_diff import SyntheticControl + + scm = SyntheticControl(v_method="nested", seed=0) + results = scm.fit( + data, + outcome="gdpcap", + treatment="treated", # absorbing 0/1 indicator + unit="region", + time="year", + predictors=["invest", "school.high"], + # Set predictor_window explicitly when a covariate is only observed on a + # subset of the pre periods — the default averages over ALL pre periods and + # fails closed if any selected cell is non-finite. + predictor_window=[1964, 1965, 1966, 1967, 1968, 1969], + special_predictors=[("gdpcap", [1960, 1965, 1969], "mean")], + ) + results.print_summary() + + # Effect path and donor weights + gap_df = results.get_gap_df() # period, gap, phase + weights_df = results.get_weights_df() # unit, weight (descending) + +Quick estimation with the convenience function:: + + from diff_diff import synthetic_control + + results = synthetic_control( + data, outcome="gdpcap", treatment="treated", + unit="region", time="year", + ) + print(f"ATT: {results.att:.3f}, pre-RMSPE: {results.pre_rmspe:.3f}") + +Supplying a fixed predictor-importance matrix (skips the outer V search):: + + import numpy as np + + scm = SyntheticControl(v_method="custom", custom_v=np.ones(n_predictors)) + results = scm.fit(data, outcome="gdpcap", treatment="treated", + unit="region", time="year", predictors=["invest"]) + +Comparison with Synthetic DiD +----------------------------- + +.. list-table:: + :header-rows: 1 + :widths: 25 35 40 + + * - Feature + - SyntheticControl + - SyntheticDiD + * - Unit (donor) weights + - Simplex, predictor-importance weighted + - Simplex, ridge-regularized + * - Time weights + - None (level matching) + - Simplex (double difference) + * - Predictor-importance ``V`` + - Nested / custom diagonal ``V`` + - No analog + * - Inference + - Placebo permutation (no analytical SE) + - Bootstrap / jackknife / placebo + +Use **SCM** for a single treated unit with a long pre-period and a curated donor pool; +use **SDID** when you have several treated units and parallel trends is plausible. diff --git a/docs/doc-deps.yaml b/docs/doc-deps.yaml index 09d8aa6d..7eb061a4 100644 --- a/docs/doc-deps.yaml +++ b/docs/doc-deps.yaml @@ -596,6 +596,40 @@ sources: - path: docs/performance-plan.md type: performance + # ── SyntheticControl ──────────────────────────────────────────────── + + diff_diff/synthetic_control.py: + drift_risk: medium + docs: + - path: docs/methodology/REGISTRY.md + section: "SyntheticControl" + type: methodology + - path: docs/api/synthetic_control.rst + type: api_reference + - path: README.md + section: "Estimators (one-line catalog entry)" + type: user_guide + - path: docs/references.rst + type: user_guide + - path: diff_diff/guides/llms-full.txt + section: "SyntheticControl" + type: user_guide + - path: diff_diff/guides/llms.txt + section: "Estimators" + type: user_guide + + diff_diff/synthetic_control_results.py: + drift_risk: low + docs: + - path: docs/methodology/REGISTRY.md + section: "SyntheticControl" + type: methodology + - path: docs/api/synthetic_control.rst + type: api_reference + - path: diff_diff/guides/llms-full.txt + section: "SyntheticControlResults" + type: user_guide + # ── HonestDiD ───────��────────────────────────────────────────────── diff_diff/honest_did.py: diff --git a/docs/methodology/REGISTRY.md b/docs/methodology/REGISTRY.md index a62ba301..ddafb3bf 100644 --- a/docs/methodology/REGISTRY.md +++ b/docs/methodology/REGISTRY.md @@ -21,6 +21,7 @@ This document provides the academic foundations and key implementation requireme - [WooldridgeDiD (ETWFE)](#wooldridgedid-etwfe) 3. [Advanced Estimators](#advanced-estimators) - [SyntheticDiD](#syntheticdid) + - [SyntheticControl](#syntheticcontrol) - [TripleDifference](#tripledifference) - [StaggeredTripleDifference](#staggeredtripledifference) - [TROP](#trop) @@ -1954,6 +1955,48 @@ Convergence criterion: stop when objective decrease < min_decrease² (default mi --- +## SyntheticControl + +**Primary source:** [Abadie, A., Diamond, A., & Hainmueller, J. (2010). "Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California's Tobacco Control Program." *JASA*, 105(490), 493–505.](https://doi.org/10.1198/jasa.2009.ap08746) Method originates in Abadie & Gardeazabal (2003). Paper reviews on file: `docs/methodology/papers/abadie-diamond-hainmueller-2010-review.md` (primary), `...-2015-review.md`, `abadie-2021-review.md`, `chernozhukov-wuthrich-zhu-2021-review.md`. + +Classic synthetic control (donor/unit weights only) for a single treated unit, distinct from `SyntheticDiD` (Arkhangelsky et al. 2021), which adds time weights and ridge regularization. Equation (1) of ADH 2010 shows classic SCM **generalizes the TWFE/DiD model** (recovered when the factor loadings `λ_t` are constant in time). + +**Assumption checks / warnings:** +- **One treated unit, block (absorbing) assignment** after `T0`; remaining units are the never-exposed donor pool (Section 2.2). Rejects >1 ever-treated unit (pass `treated_unit=` + curate `donor_pool=`); rejects an ever-treated unit in the donor pool (contamination). +- **No anticipation / absorbing treatment.** `post_periods` must be a contiguous suffix of the time axis, cross-checked against the treated unit's `D` column (`D==1` in any pre period → `ValueError`), on both the inferred and explicit branches. +- **Good pre-treatment fit is required, not assumed** (journal p. 495). Emits a `UserWarning` when pre-period RMSPE exceeds the SD of the treated unit's pre-period outcomes. +- **No interference / SUTVA across units; donor-pool curation** (exclude units with their own intervention/large shocks) — analyst-supplied via `donor_pool=`. + +**Estimator (Section 2.3):** +- Predictor matrices `X1` (k×1 treated) / `X0` (k×J donors) = covariates `Z` + linear combinations of pre-period outcomes (`predictors` averaged over `predictor_window`, `special_predictors`, and/or per-period outcome lags `pre_period_outcomes`). Canonical row order: predictor averages → special predictors → outcome lags (the row *order* matches R `Synth::dataprep`; aggregation semantics differ — see the `na.rm` deviation below). +- **Inner solve:** `W*(V) = argmin_W (X1 − X0 W)' diag(V) (X1 − X0 W)` s.t. `w_j ≥ 0, Σ w_j = 1`. Implemented by folding `V^½` into the predictors (`packed = [V^½·X0 | V^½·X1]`) and calling the Frank-Wolfe simplex solver `utils._sc_weight_fw(intercept=False, zeta=0)`. +- **Outer solve (`v_method="nested"`):** choose diagonal PSD `V` minimizing the pre-period **outcome** MSPE `mean((Z1 − Z0·W*(V))²)`. `v_method="custom"` skips the outer search and uses a user-supplied `custom_v` (trace-normalized). +- **Effect:** gap path `α̂_1t = Y_1t − Σ_j w_j·Y_jt`; `att` = mean post-period gap; `pre_rmspe` = pre-period fit diagnostic. + +**Inference:** **No analytical standard error** (Section 2.4) — `se`/`t_stat`/`p_value`/`conf_int` are NaN. The paper proposes in-space placebo permutation inference with the post/pre RMSPE-ratio statistic and `rank/(J+1)` p-value (deferred to a follow-up PR; `_placebo_gaps`/`_rmspe_ratio` are reserved on the results object). + +**Notes / deviations:** +- **Note:** The standardization divisor `divisor = sqrt(apply(cbind(X0,X1), 1, var))` (per-predictor SD over donors+treated, ddof=1) and the inner/outer optimizer are **not specified in ADH 2010** (which defers these numerics to Abadie & Gardeazabal 2003 App. B / the `Synth` software). The divisor is pinned from the R `Synth::synth` source; `solution.v` lives in this scaled predictor space, so the deterministic R-parity test feeds `custom_v` in the same scaled space. +- **Note:** The outer objective minimizes the pre-period outcome MSPE over **all** pre periods, whereas R `Synth` uses a `time.optimize.ssr` window (1960–1969 in the Basque example). The nested `V` therefore differs from R by an efficiency-only choice (the paper notes inferential validity holds for *any* `V`), so end-to-end nested parity is a tolerance band, not equality. +- **Note:** `V` is parametrized on the unit simplex via a softmax of an unconstrained vector (trace-normalization is identification-fixing, not a constraint loss); the multistart Nelder-Mead + derivative-free Powell polish approximates R's best-of-`optimx` behavior over the non-smooth outer objective. +- **Note:** The 1×SD poor-fit threshold is a defensive implementation choice in the spirit of the `SyntheticDiD` convention; ADH 2010 gives only the qualitative guidance "do not use SCM when the fit is poor" (no numeric cutoff). The warning fires whenever pre-period RMSPE exceeds the SD of the treated unit's pre-period outcomes — **including a flat treated pre-path** (`SD = 0`) with non-trivial RMSPE (a scale-aware roundoff floor suppresses the warning on a near-perfect flat fit). This is slightly broader than `SyntheticDiD`'s `SD > 0`-gated form, matching the literal RMSPE-exceeds-SD contract above. +- **Deviation from R:** `standardize="none"` disables predictor standardization entirely; R `Synth` always scales by the predictor SD. Provided for diagnostics; changes the geometry of the `V` objective. +- **Note:** predictor rows support only **equal-weight** linear combinations of pre-period values — `mean` (`k_s = 1/T0`), `sum` (`k_s = 1`), and per-period outcome lags (identity, a single `k_s = 1`). ADH (2010) §2.3 defines the general form `Ȳ_i^{K_m} = Σ_s k_s Y_is` with *arbitrary* weights `k_s`; this release does NOT accept user-supplied non-uniform `K_m` weight vectors (and `median` and other non-linear aggregations are intentionally excluded). The supported set still spans the standard `Synth::dataprep` `predictors.op` + `special.predictors` usage; arbitrary-weight `K_m` is a deferred extension. +- **Deviation from R:** predictor/outcome **aggregation fails closed on any non-finite (NaN/inf) cell**, whereas R `Synth::dataprep` hardwires `na.rm=TRUE` (aggregating over the observed cells of a partially-missing window). The fail-closed contract is deliberate: na-dropping silently aggregates different period subsets across units, yielding incomparable predictors with no warning. The analyst must restrict `predictor_window` / `special_predictors` / `pre_period_outcomes` periods (and the outcome panel) to where each variable is observed; both partially- and fully-missing windows raise `ValueError`. Only the row *ordering* matches `dataprep`, not the missing-data handling. + +**Reference implementation:** authors' `Synth` package for R/MATLAB/Stata (`Synth::synth`). **R-parity anchor:** the Basque Country study (Abadie-Gardeazabal 2003, `data("basque")`) — published synthetic = region 10 (Cataluña) 0.851 + region 14 (Madrid) 0.149, `loss.v` 0.0089. Two-tier test (`tests/test_methodology_synthetic_control.py`): Tier-1 feeds R's `solution.v` via `custom_v` → donor weights match to atol 1e-3 (deterministic); Tier-2 checks the nested fit in a band. + +**Requirements checklist:** +- [x] Donor weights on the unit simplex; exactly one treated unit, block assignment. +- [x] Predictors = covariates + linear combinations of pre-period outcomes (incl. "all pre-period outcomes" default). +- [x] Inner simplex-constrained weighted LS via `_sc_weight_fw` with diagonal PSD `V`. +- [x] Outer nested `V` (pre-period outcome MSPE) + user-supplied `custom_v`. +- [x] Gap path + pre-period RMSPE + predictor-balance table. +- [x] No analytical SE (NaN inference); placebo permutation inference deferred to a follow-up PR. +- [x] Predictor-leakage, absorbing-suffix/no-anticipation, empty-window, duplicate-label, and inner-non-convergence validation gates. + +--- + ## TripleDifference **Primary source:** [Ortiz-Villavicencio, M., & Sant'Anna, P.H.C. (2025). Better Understanding Triple Differences Estimators. arXiv:2505.09942v3.](https://arxiv.org/abs/2505.09942v3). Paper review on file: `docs/methodology/papers/ortiz-villavicencio-santanna-2025-review.md`. diff --git a/tests/data/synth_basque_golden.json b/tests/data/synth_basque_golden.json new file mode 100644 index 00000000..c9694aa1 --- /dev/null +++ b/tests/data/synth_basque_golden.json @@ -0,0 +1,336 @@ +{ + "config": { + "treated_regionno": 17, + "controls": [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 18], + "treatment_year": 1970, + "predictors": ["school.illit", "school.prim", "school.med", "school.high", "school.post.high", "invest"], + "predictors_op": "mean", + "predictor_window": [1964, 1965, 1966, 1967, 1968, 1969], + "special": [ + { + "var": "gdpcap", + "periods": [1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969], + "op": "mean" + }, + { + "var": "sec.agriculture", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "sec.energy", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "sec.industry", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "sec.construction", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "sec.services.venta", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "sec.services.nonventa", + "periods": [1961, 1963, 1965, 1967, 1969], + "op": "mean" + }, + { + "var": "popdens", + "periods": 1969, + "op": "mean" + } + ], + "time_optimize_ssr": [1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969], + "time_plot": [1955, 1997] + }, + "predictor_names": ["school.illit", "school.prim", "school.med", "school.high", "school.post.high", "invest", "special.gdpcap.1960.1969", "special.sec.agriculture.1961.1969", "special.sec.energy.1961.1969", "special.sec.industry.1961.1969", "special.sec.construction.1961.1969", "special.sec.services.venta.1961.1969", "special.sec.services.nonventa.1961.1969", "special.popdens.1969"], + "solution_v": [0.004190428953644, 0.002315632079895, 9.991756536063e-05, 0.0003209878506081, 0.0001849675435559, 0.01401235394405, 0.2115363120499, 0.09685164661694, 0.05728994603945, 0.08073585099101, 0.005087453822699, 0.006088182349676, 0.1323301567651, 0.3889561634281], + "solution_w": { + "2": 7.910422683473e-08, + "3": 9.279229329128e-08, + "4": 9.422082286706e-07, + "5": 6.401979219095e-08, + "6": 1.866746213847e-07, + "7": 1.257121355712e-06, + "8": 8.85623844536e-08, + "9": 1.582346123341e-07, + "10": 0.8508156971052, + "11": 7.439862860357e-07, + "12": 8.34954868822e-08, + "13": 2.238716692627e-07, + "14": 0.1491799735912, + "15": 1.361549083144e-07, + "16": 1.230732112502e-07, + "18": 1.500043873589e-07 + }, + "loss_v": 0.00886476977596, + "loss_w": 0.2547099107433, + "divisor": [203.157026666841, 896.481878143579, 74.131140341639, 20.224753349933, 14.411936438481, 3.246426201657, 1.174384747909, 10.295998784361, 4.201807222419, 9.685029613179, 1.180101475365, 7.370361548251, 1.749682658675, 104.149609523416], + "X1": [39.888464609782, 1031.742299397786, 90.35866800944, 25.727525075277, 13.47971979777, 24.647383054097, 5.285468451536, 6.843999958038, 4.106000041962, 45.081999969482, 6.15, 33.754000091553, 4.07200012207, 246.889999389648], + "X0": { + "school.illit": { + "2": 863.38916015625, + "3": 73.121225992839, + "4": 31.48842271169, + "5": 47.90390586853, + "6": 128.308287302653, + "7": 9.394910653432, + "8": 105.508144378662, + "9": 254.44970703125, + "10": 277.935236612956, + "11": 266.967600504557, + "12": 177.727966308594, + "13": 234.752001444499, + "14": 133.158962249756, + "15": 105.877480824788, + "16": 13.438049634298, + "18": 9.151831626892 + }, + "school.prim": { + "2": 3062.424886067708, + "3": 728.578928629557, + "4": 670.909393310547, + "5": 300.813618977865, + "6": 522.094955444336, + "7": 289.571731567383, + "8": 1766.928690592448, + "9": 1035.269368489583, + "10": 2883.361735026042, + "11": 1695.117126464844, + "12": 692.888549804688, + "13": 1666.930419921875, + "14": 1856.074279785156, + "15": 431.746805826823, + "16": 280.002395629883, + "18": 152.269465128581 + }, + "school.med": { + "2": 155.565317789714, + "3": 44.215388615926, + "4": 46.398482004801, + "5": 20.045204003652, + "6": 45.447489420573, + "7": 24.106413841248, + "8": 102.299929300944, + "9": 33.369543075562, + "10": 215.16547648112, + "11": 103.959093729655, + "12": 24.048449834188, + "13": 77.464477539062, + "14": 269.961647033691, + "15": 27.907433827718, + "16": 20.450935522715, + "18": 9.760034720103 + }, + "school.high": { + "2": 57.266496022542, + "3": 16.091675917308, + "4": 14.799357891083, + "5": 5.921604235967, + "6": 12.086848894755, + "7": 7.304419835409, + "8": 37.787317276001, + "9": 16.539727210999, + "10": 63.60831451416, + "11": 34.275772412618, + "12": 11.271675745646, + "13": 29.341940879822, + "14": 62.458658218384, + "15": 9.082539876302, + "16": 6.550352811813, + "18": 3.377169887225 + }, + "school.post.high": { + "2": 27.278923988342, + "3": 8.684415817261, + "4": 6.424504915873, + "5": 3.680154164632, + "6": 5.844122330348, + "7": 2.885214368502, + "8": 19.173426946004, + "9": 6.308164914449, + "10": 32.374610900879, + "11": 16.822105884552, + "12": 4.57056649526, + "13": 13.543060302734, + "14": 57.7049853007, + "15": 4.428527673086, + "16": 4.190367738406, + "18": 1.7327627937 + }, + "invest": { + "2": 19.320030848185, + "3": 21.577486038208, + "4": 22.769642512004, + "5": 24.441711902618, + "6": 25.954247156779, + "7": 29.071211496989, + "8": 19.652509053548, + "9": 17.97068977356, + "10": 22.520380338033, + "11": 23.70269648234, + "12": 20.239483833313, + "13": 20.606269041697, + "14": 16.234542687734, + "15": 20.70627117157, + "16": 19.955158869425, + "18": 18.055932044983 + }, + "special.gdpcap.1960.1969": { + "2": 2.560746934679, + "3": 3.699907151947, + "4": 3.733876023215, + "5": 5.215974016143, + "6": 3.051014006461, + "7": 3.87117252388, + "8": 2.807062253105, + "9": 2.240688396978, + "10": 5.147008081396, + "11": 3.843244777615, + "12": 1.926478157381, + "13": 2.516288179835, + "14": 5.976692424165, + "15": 2.899557297245, + "16": 3.996086850639, + "18": 3.809204559195 + }, + "special.sec.agriculture.1961.1969": { + "2": 24.193999862671, + "3": 21.725999832153, + "4": 12.362000083923, + "5": 13.130000114441, + "6": 19.943999862671, + "7": 15.921999549866, + "8": 29.718000030518, + "9": 36.086000823975, + "10": 6.9359998703, + "11": 18.581999969482, + "12": 37.948000335693, + "13": 28.862000274658, + "14": 1.86400001049, + "15": 19.352000427246, + "16": 24.704000091553, + "18": 30.32200050354 + }, + "special.sec.energy.1961.1969": { + "2": 2.774000024796, + "3": 6.277999973297, + "4": 18.647999954224, + "5": 2.076000022888, + "6": 7.818000030518, + "7": 2.894000053406, + "8": 7.818000030518, + "9": 5.180000019073, + "10": 2.880000019073, + "11": 2.698000001907, + "12": 2.646000027657, + "13": 5.327999973297, + "14": 2.074000000954, + "15": 10.22799987793, + "16": 2.740000009537, + "18": 2.88799996376 + }, + "special.sec.industry.1961.1969": { + "2": 18.276000213623, + "3": 22.779999923706, + "4": 24.125999832153, + "5": 18.25800037384, + "6": 9.816000175476, + "7": 36.529999542236, + "8": 16.474000167847, + "9": 17.17799987793, + "10": 40.055999755859, + "11": 28.045999908447, + "12": 10.8859998703, + "13": 17.881999969482, + "14": 23.834000396729, + "15": 19.643999862671, + "16": 28.398000335693, + "18": 26.611999893188 + }, + "special.sec.construction.1961.1969": { + "2": 8.129999923706, + "3": 7.83200006485, + "4": 9.006000137329, + "5": 8.293999958038, + "6": 8.669999885559, + "7": 5.975999927521, + "8": 7.143999958038, + "9": 5.80999994278, + "10": 6.706000041962, + "11": 6.170000076294, + "12": 8.770000076294, + "13": 7.03200006485, + "14": 8.357999992371, + "15": 6.289999961853, + "16": 6.98200006485, + "18": 5.238000011444 + }, + "special.sec.services.venta.1961.1969": { + "2": 38.186000061035, + "3": 34.289999389648, + "4": 30.151999664307, + "5": 51.751999664307, + "6": 45.278000640869, + "7": 33.484000396729, + "8": 30.844000244141, + "9": 29.238000106812, + "10": 39.068000030518, + "11": 39.064000701904, + "12": 31.720000076294, + "13": 33.714000701904, + "14": 52.713999938965, + "15": 36.17799911499, + "16": 30.468000030518, + "18": 28.29999961853 + }, + "special.sec.services.nonventa.1961.1969": { + "2": 8.444000053406, + "3": 7.095999908447, + "4": 5.707999897003, + "5": 6.493999958038, + "6": 8.48200006485, + "7": 5.197999954224, + "8": 7.999999904633, + "9": 6.505999851227, + "10": 4.356000041962, + "11": 5.444000053406, + "12": 8.038000106812, + "13": 7.188000011444, + "14": 11.162000083923, + "15": 8.311999893188, + "16": 6.709999847412, + "18": 6.643999958038 + }, + "special.popdens.1969": { + "2": 68.51000213623, + "3": 24.040000915527, + "4": 98.73999786377, + "5": 104.169998168945, + "6": 148.25, + "7": 87.389999389648, + "8": 28.770000457764, + "9": 22.379999160767, + "10": 153.119995117188, + "11": 128.699996948242, + "12": 28.969999313354, + "13": 91.019996643066, + "14": 442.450012207031, + "15": 73.360000610352, + "16": 44.169998168945, + "18": 46.580001831055 + } + }, + "years": [1955, 1956, 1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997], + "treated_path": [3.853184630005, 3.945658296151, 4.033561734873, 4.023421896897, 4.013781968405, 4.285918396223, 4.574336095797, 4.898957353563, 5.197014981629, 5.338902978753, 5.465153005252, 5.545915627064, 5.614895726639, 5.852184933072, 6.08140541737, 6.17009424135, 6.283633404546, 6.555555398653, 6.810768561103, 7.105184302811, 7.377891682176, 7.232933621923, 7.089831372119, 6.786703607145, 6.639817386857, 6.56283917137, 6.500785454993, 6.545058607, 6.595329801139, 6.761496750091, 6.937160671728, 7.332191151301, 7.742788123594, 8.120536640759, 8.509711162324, 8.776777889074, 9.025278666196, 8.873892824706, 8.718223539089, 9.018137849286, 9.440873861653, 9.686518137675, 10.170665872809], + "synthetic_path": [3.702941626236, 3.85396941359, 3.996388245904, 4.029401685261, 4.059672772757, 4.378924693221, 4.733052176452, 4.987634927316, 5.222027456975, 5.298503665995, 5.362138883475, 5.448530450656, 5.523111080498, 5.760711870916, 5.993100448641, 6.137916329668, 6.294338685153, 6.620780377198, 6.933000322789, 7.087051801242, 7.228035388368, 7.220668043084, 7.211129382309, 7.07464338848, 7.057308667347, 7.129300695567, 7.234425280207, 7.325333530431, 7.421847797791, 7.516347610321, 7.610133692242, 8.117951258459, 8.623595409787, 9.086804616765, 9.545547512504, 9.788247373834, 10.037692891258, 9.838212341386, 9.639052163101, 9.987880532374, 10.303885350264, 10.538434998253, 10.998744701142], + "gap": [0.1502430037689, 0.09168888256074, 0.03717348896852, -0.005979788364376, -0.04589080435155, -0.09300629699853, -0.1587160806545, -0.08867757375288, -0.02501247534587, 0.0403993127577, 0.1030141217766, 0.09738517640774, 0.0917846461413, 0.09147306215556, 0.08830496872896, 0.03217791168141, -0.01070528060644, -0.06522497854548, -0.1222317616855, 0.01813250156919, 0.1498562938071, 0.01226557883842, -0.1212980101894, -0.2879397813353, -0.4174912804894, -0.5664615241969, -0.7336398252139, -0.7802749234314, -0.8265179966512, -0.7548508602291, -0.6729730205141, -0.7857601071586, -0.8808072861931, -0.9662679760057, -1.035836350179, -1.01146948476, -1.012414225062, -0.96431951668, -0.9208286240117, -0.969742683088, -0.863011488611, -0.851916860578, -0.8280788283333] +} diff --git a/tests/data/synth_basque_panel.csv b/tests/data/synth_basque_panel.csv new file mode 100644 index 00000000..93a97289 --- /dev/null +++ b/tests/data/synth_basque_panel.csv @@ -0,0 +1,732 @@ +"regionno","regionname","year","gdpcap","sec.agriculture","sec.energy","sec.industry","sec.construction","sec.services.venta","sec.services.nonventa","school.illit","school.prim","school.med","school.high","school.post.high","popdens","invest","treated" +2,"Andalucia",1955,1.68873183014256,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1956,1.75849753289657,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1957,1.8276206972394,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1958,1.85275629430933,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1959,1.87803484718652,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1960,2.01013996001143,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1961,2.12917747377702,27.7199993133545,2.77999997138977,18.4500007629395,6.96999979019165,36.6300010681152,7.44999980926514,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1962,2.28034844571473,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1963,2.4310197260883,27.1299991607666,2.54999995231628,18.1000003814697,7.73000001907349,37.6699981689453,6.82000017166138,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1964,2.50885458681727,NA,NA,NA,NA,NA,NA,924.403015136719,3028.12719726562,129.512100219727,48.9593620300293,24.9379730224609,NA,17.4014949798584,0 +2,"Andalucia",1965,2.58469002752841,24.5200004577637,2.65000009536743,18.1800003051758,8.55000019073486,38.3199996948242,7.78000020980835,899.348999023438,3036.81640625,136.179748535156,51.7772178649902,25.7527732849121,NA,18.3089637756348,0 +2,"Andalucia",1966,2.69444434855945,NA,NA,NA,NA,NA,NA,874.787658691406,3049.4423828125,143.5947265625,54.983226776123,26.6312026977539,NA,18.4299926757812,0 +2,"Andalucia",1967,2.80234220537994,21.8700008392334,2.65000009536743,18.2199993133545,8.53999996185303,39.0800018310547,9.64999961853027,850.710571289062,3072.63598632812,152.235733032227,58.6103782653809,27.6725540161133,NA,19.1156578063965,0 +2,"Andalucia",1968,2.98736079765435,NA,NA,NA,NA,NA,NA,827.109252929688,3086.955078125,174.595489501953,62.4910278320312,28.2954711914062,NA,20.6084861755371,0 +2,"Andalucia",1969,3.17909177525729,19.7299995422363,3.24000000953674,18.4300003051758,8.85999965667725,39.2299995422363,10.5200004577637,803.975463867188,3100.572265625,197.274108886719,66.7777633666992,30.3835697174072,68.5100021362305,22.0555896759033,0 +2,"Andalucia",1970,3.35432726531259,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3137626647949,0 +2,"Andalucia",1971,3.52292206586344,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3175086975098,0 +2,"Andalucia",1972,3.75621265018031,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.1272926330566,0 +2,"Andalucia",1973,3.98771781234937,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8019485473633,0 +2,"Andalucia",1974,4.05184220872719,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.6468963623047,0 +2,"Andalucia",1975,4.11218219878026,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.7574157714844,0 +2,"Andalucia",1976,4.22179356400403,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5302619934082,0 +2,"Andalucia",1977,4.33169084084235,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.8983554840088,0 +2,"Andalucia",1978,4.30519868904643,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.8129272460938,0 +2,"Andalucia",1979,4.2269351788161,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.6916809082031,0 +2,"Andalucia",1980,4.21151120606166,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.9232215881348,0 +2,"Andalucia",1981,4.20701245654143,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8307266235352,0 +2,"Andalucia",1982,4.24907153689906,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.1853771209717,0 +2,"Andalucia",1983,4.29163096258212,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1268634796143,0 +2,"Andalucia",1984,4.35868333796371,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.9248065948486,0 +2,"Andalucia",1985,4.42659257650716,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.382495880127,0 +2,"Andalucia",1986,4.66323891764741,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.3726501464844,0 +2,"Andalucia",1987,4.90067107988677,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.4932041168213,0 +2,"Andalucia",1988,5.15959717075813,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3585662841797,0 +2,"Andalucia",1989,5.41773787637126,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.7622928619385,0 +2,"Andalucia",1990,5.5852611621289,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.3183250427246,0 +2,"Andalucia",1991,5.74921447527269,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.49245262146,0 +2,"Andalucia",1992,5.64124535846901,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.3187294006348,0 +2,"Andalucia",1993,5.53491849008542,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1836738586426,0 +2,"Andalucia",1994,5.63881728894981,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.2683906555176,0 +2,"Andalucia",1995,5.72072268553851,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.1725749969482,0 +2,"Andalucia",1996,5.99592961232617,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +2,"Andalucia",1997,6.30098553733532,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1955,2.28877455736698,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1956,2.44515862202038,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1957,2.60339893296391,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1958,2.63903175752798,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1959,2.67709221577038,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1960,2.8814623608517,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1961,3.09954297028416,25.0699996948242,6.90999984741211,21.6599998474121,6.80000019073486,32.8499984741211,6.71000003814697,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1962,3.35918318643055,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1963,3.61418169724945,25.3400001525879,6.07999992370605,21.7299995422363,7.53000020980835,33.3499984741211,5.96000003814697,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1964,3.68009129785463,NA,NA,NA,NA,NA,NA,77.6694641113281,722.28515625,36.2118072509766,14.2490568161011,8.3411340713501,NA,21.2724781036377,0 +3,"Aragon",1965,3.74528699110522,21.0200004577637,6.26999998092651,23.6000003814697,8.28999996185303,34.2299995422363,6.59999990463257,75.8038024902344,723.77685546875,38.2993583679199,14.8717565536499,8.45247268676758,NA,22.7678966522217,0 +3,"Aragon",1966,3.88331910400042,NA,NA,NA,NA,NA,NA,73.9733581542969,726.22314453125,40.6244239807129,15.5857191085815,8.57572555541992,NA,21.4764251708984,0 +3,"Aragon",1967,4.01613812417992,18.4799995422363,6.07999992370605,23.3799991607666,8.43000030517578,35.7099990844727,7.92999982833862,72.1775436401367,730.974853515625,43.3277626037598,16.3962135314941,8.74120998382568,NA,19.729175567627,0 +3,"Aragon",1968,4.24364488215421,NA,NA,NA,NA,NA,NA,70.4157562255859,733.101989746094,49.993724822998,17.2527370452881,8.76598262786865,NA,20.5848808288574,0 +3,"Aragon",1969,4.47622090535517,18.7199993133545,6.05000019073486,23.5300006866455,8.10999965667725,35.310001373291,8.27999973297119,68.6874313354492,735.111572265625,56.8352546691895,18.1945724487305,9.22996997833252,24.0400009155273,23.6340599060059,0 +3,"Aragon",1970,4.59625802016946,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.9269580841064,0 +3,"Aragon",1971,4.72372186350775,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.1613140106201,0 +3,"Aragon",1972,5.00128538191097,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.9001197814941,0 +3,"Aragon",1973,5.28349060564169,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.707145690918,0 +3,"Aragon",1974,5.44237378187705,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.394905090332,0 +3,"Aragon",1975,5.60097104649788,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.4223594665527,0 +3,"Aragon",1976,5.73657508987388,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.0618801116943,0 +3,"Aragon",1977,5.86639552482773,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.859992980957,0 +3,"Aragon",1978,5.92502135271619,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.286657333374,0 +3,"Aragon",1979,5.95279923509578,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.492244720459,0 +3,"Aragon",1980,5.97472159530871,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.7298030853271,0 +3,"Aragon",1981,6.01113958721057,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.8206253051758,0 +3,"Aragon",1982,6.13360477154407,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.4890670776367,0 +3,"Aragon",1983,6.26085418117145,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.5623760223389,0 +3,"Aragon",1984,6.37289361591443,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.1013278961182,0 +3,"Aragon",1985,6.49550132021431,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.533239364624,0 +3,"Aragon",1986,6.92652092426717,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.3258762359619,0 +3,"Aragon",1987,7.3586118251928,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.5995044708252,0 +3,"Aragon",1988,7.80277028828817,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.0823040008545,0 +3,"Aragon",1989,8.24264530725596,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.157751083374,0 +3,"Aragon",1990,8.45829762924879,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.2104244232178,0 +3,"Aragon",1991,8.6682378207231,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.0179557800293,0 +3,"Aragon",1992,8.46686626086743,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.8654041290283,0 +3,"Aragon",1993,8.25692694107487,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5592422485352,0 +3,"Aragon",1994,8.57397851450925,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.9211864471436,0 +3,"Aragon",1995,8.84675824345946,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.6667442321777,0 +3,"Aragon",1996,9.09668683529036,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +3,"Aragon",1997,9.51870894030277,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1955,2.50292780466742,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1956,2.61553840887816,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1957,2.72579263939377,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1958,2.75185657752919,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1959,2.77742082410048,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1960,2.96729511957242,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1961,3.14388738227415,14.1199998855591,21.3600006103516,22.0799999237061,9.02999973297119,28.4599990844727,4.94999980926514,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1962,3.37353608023266,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1963,3.59725799604131,15.1700000762939,19.0400009155273,22.3600006103516,8.96000003814697,29.7099990844727,4.76999998092651,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1964,3.67259352726788,NA,NA,NA,NA,NA,NA,29.012674331665,668.60009765625,37.5539970397949,12.8383855819702,5.93828773498535,NA,21.048454284668,0 +4,"Principado De Asturias",1965,3.74335883107059,12.1000003814697,18.4699993133545,24.4099998474121,9.5600004196167,30.0300006866455,5.42999982833862,30.0442886352539,668.96533203125,39.8879127502441,13.5026836395264,6.10620307922363,NA,22.6595020294189,0 +4,"Principado De Asturias",1966,3.9093828242154,NA,NA,NA,NA,NA,NA,31.0444412231445,670.01220703125,42.4897613525391,14.2605600357056,6.28775262832642,NA,22.340934753418,0 +4,"Principado De Asturias",1967,4.07312213948916,10.5600004196167,18.3099994659424,24.7999992370605,8.68000030517578,31.1700000762939,6.48000001907349,32.0137672424316,672.922668457031,45.5107574462891,15.1190309524536,6.50611162185669,NA,21.4814567565918,0 +4,"Principado De Asturias",1968,4.30862614169389,NA,NA,NA,NA,NA,NA,32.9529037475586,672.704772949219,52.7372436523438,16.0334854125977,6.62471008300781,NA,23.0380535125732,0 +4,"Principado De Asturias",1969,4.54970019029161,9.85999965667725,16.0599994659424,26.9799995422363,8.80000019073486,31.3899993896484,6.90999984741211,33.8624610900879,672.251281738281,60.2112197875977,17.0420017242432,7.08396434783936,98.7399978637695,26.0494537353516,0 +4,"Principado De Asturias",1970,4.63160515103944,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.9371795654297,0 +4,"Principado De Asturias",1971,4.69965734539016,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.7478637695312,0 +4,"Principado De Asturias",1972,5.02256487683207,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0096817016602,0 +4,"Principado De Asturias",1973,5.34561536408124,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.8314018249512,0 +4,"Principado De Asturias",1974,5.50249933821921,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.3478260040283,0 +4,"Principado De Asturias",1975,5.65195658383319,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0620956420898,0 +4,"Principado De Asturias",1976,5.59225945907196,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.9734210968018,0 +4,"Principado De Asturias",1977,5.53870333225105,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.2392463684082,0 +4,"Principado De Asturias",1978,5.61461025086582,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.3666744232178,0 +4,"Principado De Asturias",1979,5.70429889381494,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.1282520294189,0 +4,"Principado De Asturias",1980,5.79698655783058,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.1302757263184,0 +4,"Principado De Asturias",1981,5.93566110017674,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.4835338592529,0 +4,"Principado De Asturias",1982,5.84975729939325,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.848123550415,0 +4,"Principado De Asturias",1983,5.76906615548459,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.3350830078125,0 +4,"Principado De Asturias",1984,5.88731806607152,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.7501068115234,0 +4,"Principado De Asturias",1985,6.01128254301784,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3055534362793,0 +4,"Principado De Asturias",1986,6.23479002511559,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.2548580169678,0 +4,"Principado De Asturias",1987,6.4652958041564,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5103187561035,0 +4,"Principado De Asturias",1988,6.68816042096231,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1311721801758,0 +4,"Principado De Asturias",1989,6.91352458519106,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.3636207580566,0 +4,"Principado De Asturias",1990,6.98314755005824,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.2122764587402,0 +4,"Principado De Asturias",1991,7.04063103901118,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.6604309082031,0 +4,"Principado De Asturias",1992,6.92280756000518,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.5400199890137,0 +4,"Principado De Asturias",1993,6.79834317357429,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.5378379821777,0 +4,"Principado De Asturias",1994,6.95415585083949,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.6073780059814,0 +4,"Principado De Asturias",1995,7.11646691556319,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.4219799041748,0 +4,"Principado De Asturias",1996,7.21722373755378,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +4,"Principado De Asturias",1997,7.4757213968442,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1955,3.14395886017778,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1956,3.34775783579134,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1957,3.54962865137026,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1958,3.64267348698363,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1959,3.73486167735558,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1960,4.05884050567026,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1961,4.36025410848017,16.3400001525879,2.14000010490417,22.1599998474121,7.05999994277954,45.3800010681152,6.92999982833862,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1962,4.64617269647288,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1963,4.91152526105822,15.8000001907349,1.94000005722046,20.0900001525879,8.14999961853027,48.2900009155273,5.73000001907349,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1964,5.05069971288895,NA,NA,NA,NA,NA,NA,48.0786170959473,294.38134765625,14.7882413864136,5.30359840393066,3.35199999809265,NA,14.1790342330933,0 +5,"Baleares (Islas)",1965,5.18466150784486,12.5100002288818,2.14000010490417,18.7900009155273,8.88000011444092,52.0999984741211,5.57999992370605,48.0171203613281,296.678833007812,16.258171081543,5.51208782196045,3.46646857261658,NA,19.0025005340576,0 +5,"Baleares (Islas)",1966,5.46679525367194,NA,NA,NA,NA,NA,NA,47.9492034912109,299.291717529297,17.9040660858154,5.75198602676392,3.58977746963501,NA,26.2534103393555,0 +5,"Baleares (Islas)",1967,5.73764638674664,11.0100002288818,2.11999988555908,16.5300006866455,8.60000038146973,54.7799987792969,6.96000003814697,47.8750381469727,302.806091308594,19.802547454834,6.02474308013916,3.7353618144989,NA,29.701530456543,0 +5,"Baleares (Islas)",1968,6.16145369598642,NA,NA,NA,NA,NA,NA,47.7947998046875,304.936309814453,23.6700248718262,6.31135892868042,3.82472586631775,NA,30.7616958618164,0 +5,"Baleares (Islas)",1969,6.58169103261233,9.98999977111816,2.03999996185303,13.7200002670288,8.77999973297119,58.2099990844727,7.26999998092651,47.7086563110352,306.787414550781,27.8481731414795,6.62585115432739,4.11259126663208,104.169998168945,26.7520999908447,0 +5,"Baleares (Islas)",1970,6.88703243339515,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.4437465667725,0 +5,"Baleares (Islas)",1971,7.16866626973767,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.7412166595459,0 +5,"Baleares (Islas)",1972,7.5706937387309,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.8075218200684,0 +5,"Baleares (Islas)",1973,7.95629785184144,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.1038875579834,0 +5,"Baleares (Islas)",1974,7.97272207940588,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3105583190918,0 +5,"Baleares (Islas)",1975,7.97414989411508,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3685703277588,0 +5,"Baleares (Islas)",1976,8.03484683784544,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.8859233856201,0 +5,"Baleares (Islas)",1977,8.0805482404023,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.0879421234131,0 +5,"Baleares (Islas)",1978,8.04198765475489,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.650239944458,0 +5,"Baleares (Islas)",1979,8.00485575549843,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.8388481140137,0 +5,"Baleares (Islas)",1980,8.25978257049325,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.2605838775635,0 +5,"Baleares (Islas)",1981,8.53113361305252,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.8526191711426,0 +5,"Baleares (Islas)",1982,8.7225078548986,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.5585250854492,0 +5,"Baleares (Islas)",1983,8.92530722946346,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,11.6211080551147,0 +5,"Baleares (Islas)",1984,9.27592116538132,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,10.1354694366455,0 +5,"Baleares (Islas)",1985,9.65224186783687,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,11.3290481567383,0 +5,"Baleares (Islas)",1986,10.2577834904313,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.519567489624,0 +5,"Baleares (Islas)",1987,10.8233358409874,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.2754974365234,0 +5,"Baleares (Islas)",1988,11.1203945217661,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.1139583587646,0 +5,"Baleares (Islas)",1989,11.4081689202081,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.8808898925781,0 +5,"Baleares (Islas)",1990,11.5124246727498,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.360876083374,0 +5,"Baleares (Islas)",1991,11.6795199627673,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.8074073791504,0 +5,"Baleares (Islas)",1992,11.3196226161945,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.028039932251,0 +5,"Baleares (Islas)",1993,10.9697225876312,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.9990978240967,0 +5,"Baleares (Islas)",1994,11.4195940529268,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.0898628234863,0 +5,"Baleares (Islas)",1995,11.7737792689812,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.896915435791,0 +5,"Baleares (Islas)",1996,11.9265920534981,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +5,"Baleares (Islas)",1997,12.3500428449015,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1955,1.91438157910519,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1956,2.07183672271092,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1957,2.22607819361879,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1958,2.22086553674397,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1959,2.21343902614041,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1960,2.35768361487957,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1961,2.44572979148815,26.2099990844727,9.53999996185303,9.78999996185303,6.17999982833862,40.2299995422363,8.0600004196167,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1962,2.64824325443846,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1963,2.84475867525573,23.9099998474121,8.42000007629395,9.85000038146973,6.80999994277954,43.9900016784668,7.01999998092651,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1964,2.95115680362251,NA,NA,NA,NA,NA,NA,134.977569580078,505.942596435547,33.2646560668945,10.2709188461304,5.27583169937134,NA,17.0259399414062,0 +6,"Canarias",1965,3.05419873932504,19.4300003051758,8,10.0500001907349,8.68000030517578,46.1500015258789,7.69999980926514,132.246612548828,512.223693847656,36.6822166442871,10.8871955871582,5.47498607635498,NA,20.9989643096924,0 +6,"Canarias",1966,3.2317908209959,NA,NA,NA,NA,NA,NA,129.563690185547,518.962585449219,40.5098114013672,11.587664604187,5.68915796279907,NA,26.6728916168213,0 +6,"Canarias",1967,3.40338464261327,16.1800003051758,6.92000007629395,9.82999992370605,10.3999996185303,47.0900001525879,9.59000015258789,126.92797088623,527.237548828125,44.9232215881348,12.3797988891602,5.93983459472656,NA,30.120325088501,0 +6,"Canarias",1968,3.66031153138724,NA,NA,NA,NA,NA,NA,124.338722229004,532.043151855469,53.8288078308105,13.2286462783813,6.10211324691772,NA,32.150463104248,0 +6,"Canarias",1969,3.91288219060737,13.9899997711182,6.21000003814697,9.5600004196167,11.2799997329712,48.9300003051758,10.039999961853,121.79515838623,536.16015625,63.4762229919434,14.1668691635132,6.5828104019165,148.25,28.7568988800049,0 +6,"Canarias",1970,4.22493554087783,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.9603977203369,0 +6,"Canarias",1971,4.49357347398511,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.4238739013672,0 +6,"Canarias",1972,4.78191925981528,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.6821231842041,0 +6,"Canarias",1973,5.05491298663552,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.5648536682129,0 +6,"Canarias",1974,4.967794933172,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0051422119141,0 +6,"Canarias",1975,4.88153374287034,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.5993461608887,0 +6,"Canarias",1976,4.94844316244466,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.8343353271484,0 +6,"Canarias",1977,5.00792628933586,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.2992210388184,0 +6,"Canarias",1978,5.04091682859027,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.8448696136475,0 +6,"Canarias",1979,5.23350488727551,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.7241382598877,0 +6,"Canarias",1980,5.43708921125772,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.553279876709,0 +6,"Canarias",1981,5.65824053758078,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.641674041748,0 +6,"Canarias",1982,5.68730371470317,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3411331176758,0 +6,"Canarias",1983,5.71986582237664,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.9887752532959,0 +6,"Canarias",1984,5.80134235154353,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.7615900039673,0 +6,"Canarias",1985,5.88560390390692,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.1585693359375,0 +6,"Canarias",1986,6.25678386323216,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.6497135162354,0 +6,"Canarias",1987,6.61268193392848,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.7193813323975,0 +6,"Canarias",1988,6.97700655211792,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.3144397735596,0 +6,"Canarias",1989,7.33790328181903,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.8593711853027,0 +6,"Canarias",1990,7.34504409872849,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.5112228393555,0 +6,"Canarias",1991,7.34718669247403,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.8962173461914,0 +6,"Canarias",1992,7.22007980281304,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.1320972442627,0 +6,"Canarias",1993,7.09218796373469,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.6260604858398,0 +6,"Canarias",1994,7.41074013730452,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.9668464660645,0 +6,"Canarias",1995,7.61639514128776,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.6725215911865,0 +6,"Canarias",1996,7.81705192210708,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +6,"Canarias",1997,8.06055447606487,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1955,2.55941169257165,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1956,2.69387317909169,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1957,2.8203369855404,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1958,2.87903450925294,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1959,2.94373029301896,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1960,3.1370322590588,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1961,3.32762067980577,16.7099990844727,3.1800000667572,37.9199981689453,5.61999988555908,31.6800003051758,4.90000009536743,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1962,3.55534143565008,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1963,3.77142262506471,18.2399997711182,2.94000005722046,36.1800003051758,5.75,32.5999984741211,4.28999996185303,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1964,3.83940290567092,NA,NA,NA,NA,NA,NA,9.68474006652832,284.568359375,20.7413806915283,6.54473781585693,2.73772668838501,NA,25.9189758300781,0 +7,"Cantabria",1965,3.90609832737521,16.2099990844727,2.77999997138977,36.75,6.28999996185303,33.2400016784668,4.73999977111816,9.56692123413086,286.330200195312,21.5585098266602,6.80100536346436,2.78703331947327,NA,28.1631698608398,0 +7,"Cantabria",1966,4.03213348432256,NA,NA,NA,NA,NA,NA,9.45053100585938,288.415008544922,22.4632587432861,7.09592056274414,2.84099197387695,NA,29.7192420959473,0 +7,"Cantabria",1967,4.15595544130248,14.9700002670288,2.69000005722046,35.8699989318848,6.17999982833862,34.5800018310547,5.71999979019165,9.33555126190186,291.335388183594,23.5245227813721,7.43125104904175,2.90978312492371,NA,28.967601776123,0 +7,"Cantabria",1968,4.37589251494551,NA,NA,NA,NA,NA,NA,9.2219648361206,292.683197021484,26.6404342651367,7.78354120254517,2.93244957923889,NA,29.9379234313965,0 +7,"Cantabria",1969,4.61082556560291,13.4799995422363,2.88000011444092,35.9300003051758,6.03999996185303,35.3199996948242,6.34000015258789,9.10975551605225,294.098236083984,29.710376739502,8.17006301879883,3.10330152511597,87.3899993896484,31.7203559875488,0 +7,"Cantabria",1970,4.7914166683403,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.2976818084717,0 +7,"Cantabria",1971,4.96965161530299,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.1462631225586,0 +7,"Cantabria",1972,5.15538433285244,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.6463279724121,0 +7,"Cantabria",1973,5.33876045878633,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.9357929229736,0 +7,"Cantabria",1974,5.50835486038587,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.2517414093018,0 +7,"Cantabria",1975,5.67523571669255,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.4183330535889,0 +7,"Cantabria",1976,5.79605821676508,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3222694396973,0 +7,"Cantabria",1977,5.90295647253686,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.7218837738037,0 +7,"Cantabria",1978,5.91659524106394,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.854923248291,0 +7,"Cantabria",1979,5.90645540308796,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.3349609375,0 +7,"Cantabria",1980,5.8991001524676,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.4796085357666,0 +7,"Cantabria",1981,5.91488151474021,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.029748916626,0 +7,"Cantabria",1982,5.9168807168376,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.2042331695557,0 +7,"Cantabria",1983,5.94165957815066,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.771089553833,0 +7,"Cantabria",1984,6.02884910951782,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.2284927368164,0 +7,"Cantabria",1985,6.13831751893432,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.6516265869141,0 +7,"Cantabria",1986,6.42045126476141,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.644136428833,0 +7,"Cantabria",1987,6.71322475804904,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.1474628448486,0 +7,"Cantabria",1988,7.02342186202938,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.3745632171631,0 +7,"Cantabria",1989,7.33361896600971,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.8378524780273,0 +7,"Cantabria",1990,7.45072853766112,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.0300846099854,0 +7,"Cantabria",1991,7.59640137695034,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.6100959777832,0 +7,"Cantabria",1992,7.46215367037989,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.6292610168457,0 +7,"Cantabria",1993,7.32790596380945,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0652160644531,0 +7,"Cantabria",1994,7.55069997439348,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.568395614624,0 +7,"Cantabria",1995,7.77706352175048,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.591121673584,0 +7,"Cantabria",1996,7.9077408198662,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +7,"Cantabria",1997,8.22693498704611,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1955,1.72914877104031,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1956,1.83833192260359,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1957,1.9476578120537,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1958,1.97136537649399,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1959,1.99514420091748,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1960,2.13881740226844,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1961,2.23950296427583,32.5499992370605,9.01000022888184,16.0499992370605,6.17000007629395,29.1399993896484,7.07999992370605,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1962,2.45422738104403,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1963,2.67223651257286,33.2299995422363,8.17000007629395,15.75,6.6399998664856,29.7099990844727,6.5,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1964,2.77777777777778,NA,NA,NA,NA,NA,NA,110.669097900391,1800.78466796875,90.2709884643555,33.6689147949219,19.031759262085,NA,18.7428321838379,0 +8,"Castilla Y Leon",1965,2.88217648612673,30.5200004577637,7.84999990463257,16.3600006103516,7.36999988555908,30.6100006103516,7.28000020980835,108.557182312012,1784.60620117188,93.0149154663086,35.0594444274902,19.0506000518799,NA,19.8590602874756,0 +8,"Castilla Y Leon",1966,2.98807492292939,NA,NA,NA,NA,NA,NA,106.481353759766,1770.78649902344,96.0362777709961,36.6567306518555,19.0829524993896,NA,19.3095703125,0 +8,"Castilla Y Leon",1967,3.09454431127937,27.0200004577637,7.23999977111816,16.7600002288818,7.69999980926514,32.1300010681152,9.14999961853027,104.441024780273,1762.45178222656,99.6097412109375,38.4714508056641,19.1937732696533,NA,18.868968963623,0 +8,"Castilla Y Leon",1968,3.30227086694538,NA,NA,NA,NA,NA,NA,102.435623168945,1748.35729980469,111.663963317871,40.3835639953613,18.9824352264404,NA,19.4028263092041,0 +8,"Castilla Y Leon",1969,3.5209939058288,25.2700004577637,6.82000017166138,17.4500007629395,7.84000015258789,32.6300010681152,9.98999977111816,100.464584350586,1734.58569335938,123.203689575195,42.4837989807129,19.6990413665771,28.7700004577637,21.7317962646484,0 +8,"Castilla Y Leon",1970,3.67052284726685,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3235569000244,0 +8,"Castilla Y Leon",1971,3.82183677501183,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8736476898193,0 +8,"Castilla Y Leon",1972,4.08176224901144,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.7254333496094,0 +8,"Castilla Y Leon",1973,4.34540152311393,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.2864246368408,0 +8,"Castilla Y Leon",1974,4.46472429473276,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.2827758789062,0 +8,"Castilla Y Leon",1975,4.58111952318869,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.2781448364258,0 +8,"Castilla Y Leon",1976,4.7327193625482,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.5485191345215,0 +8,"Castilla Y Leon",1977,4.88467571974414,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.1691036224365,0 +8,"Castilla Y Leon",1978,4.94680091402457,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.4022541046143,0 +8,"Castilla Y Leon",1979,4.92037980429141,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.975700378418,0 +8,"Castilla Y Leon",1980,4.89374469668822,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.6389312744141,0 +8,"Castilla Y Leon",1981,4.88103383338577,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.1625480651855,0 +8,"Castilla Y Leon",1982,4.92309334958427,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.6147747039795,0 +8,"Castilla Y Leon",1983,4.97072247633489,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.4617786407471,0 +8,"Castilla Y Leon",1984,5.11346777246122,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.6302833557129,0 +8,"Castilla Y Leon",1985,5.26199667700969,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.7101421356201,0 +8,"Castilla Y Leon",1986,5.6473148785057,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.7041282653809,0 +8,"Castilla Y Leon",1987,6.04463003594955,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.0892887115479,0 +8,"Castilla Y Leon",1988,6.35439870834895,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.9405784606934,0 +8,"Castilla Y Leon",1989,6.67402174295067,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.784252166748,0 +8,"Castilla Y Leon",1990,6.87032273005704,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.4983215332031,0 +8,"Castilla Y Leon",1991,7.06319626451594,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.53342628479,0 +8,"Castilla Y Leon",1992,7.04548717804958,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.0742378234863,0 +8,"Castilla Y Leon",1993,7.02763513577594,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.9429492950439,0 +8,"Castilla Y Leon",1994,7.07426444355742,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.298885345459,0 +8,"Castilla Y Leon",1995,7.28291934028894,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3803024291992,0 +8,"Castilla Y Leon",1996,7.61139691812384,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +8,"Castilla Y Leon",1997,7.88846009120162,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1955,1.32776351357759,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1956,1.41509567387136,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1957,1.50357039102109,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1958,1.5314196423441,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1959,1.5593401536503,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1960,1.66752359520466,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1961,1.75242790389967,42.1300010681152,5.09000015258789,16.1299991607666,4.78999996185303,26.3899993896484,5.46999979019165,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1962,1.92045131706231,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1963,2.09190240079285,41.5400009155273,4.57999992370605,16.2399997711182,5.1100001335144,27.3700008392334,5.15999984741211,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1964,2.18259064479066,NA,NA,NA,NA,NA,NA,274.36865234375,1067.63903808594,26.89621925354,13.6807374954224,5.95148801803589,NA,14.950511932373,0 +9,"Castilla-La Mancha",1965,2.27470713933854,35.060001373291,5.32000017166138,17.5200004577637,5.96999979019165,30.1299991607666,6,266.183166503906,1052.43518066406,28.6108989715576,14.6527147293091,6.07189273834229,NA,16.5078983306885,0 +9,"Castilla-La Mancha",1966,2.3783919403329,NA,NA,NA,NA,NA,NA,258.163299560547,1038.83850097656,30.5229873657227,15.7534484863281,6.20317840576172,NA,17.8852577209473,0 +9,"Castilla-La Mancha",1967,2.48236221710092,31.5799999237061,5.34999990463257,18.2000007629395,6.26999998092651,31.1000003814697,7.5,250.306091308594,1028.95056152344,32.7421226501465,16.996208190918,6.3677134513855,NA,18.6468353271484,0 +9,"Castilla-La Mancha",1968,2.7090831539761,NA,NA,NA,NA,NA,NA,242.608703613281,1017.61560058594,37.9977378845215,18.335750579834,6.43204116821289,NA,18.7182598114014,0 +9,"Castilla-La Mancha",1969,2.94744365728096,30.1200008392334,5.55999994277954,17.7999992370605,6.90999984741211,31.2000007629395,8.39999961853027,235.068328857422,1006.13732910156,43.4472923278809,19.8195037841797,6.82267570495605,22.3799991607666,21.1153755187988,0 +9,"Castilla-La Mancha",1970,3.13688952117196,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.894416809082,0 +9,"Castilla-La Mancha",1971,3.31962299973445,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.1898212432861,0 +9,"Castilla-La Mancha",1972,3.62917745634339,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.9155044555664,0 +9,"Castilla-La Mancha",1973,3.94608672773181,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.1244258880615,0 +9,"Castilla-La Mancha",1974,4.0281346442869,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.9158058166504,0 +9,"Castilla-La Mancha",1975,4.11196820091023,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.2131309509277,0 +9,"Castilla-La Mancha",1976,4.26071110332874,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.0932846069336,0 +9,"Castilla-La Mancha",1977,4.41216798688098,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.3084964752197,0 +9,"Castilla-La Mancha",1978,4.44630086507092,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.2669353485107,0 +9,"Castilla-La Mancha",1979,4.40874053423352,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.8906097412109,0 +9,"Castilla-La Mancha",1980,4.32876329767946,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.1016082763672,0 +9,"Castilla-La Mancha",1981,4.26128249071694,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.6895370483398,0 +9,"Castilla-La Mancha",1982,4.34318745146476,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.0412845611572,0 +9,"Castilla-La Mancha",1983,4.42466441647253,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.0229206085205,0 +9,"Castilla-La Mancha",1984,4.55005714396891,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.228910446167,0 +9,"Castilla-La Mancha",1985,4.67766350727359,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.167423248291,0 +9,"Castilla-La Mancha",1986,4.98064831644084,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.0725765228271,0 +9,"Castilla-La Mancha",1987,5.2955586036523,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.3752861022949,0 +9,"Castilla-La Mancha",1988,5.67787778408178,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.907299041748,0 +9,"Castilla-La Mancha",1989,6.06533857932332,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.87522315979,0 +9,"Castilla-La Mancha",1990,6.27942013079968,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.8845691680908,0 +9,"Castilla-La Mancha",1991,6.47450730106688,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,29.4713535308838,0 +9,"Castilla-La Mancha",1992,6.33069114390866,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,29.3668060302734,0 +9,"Castilla-La Mancha",1993,6.18858914891504,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.2604522705078,0 +9,"Castilla-La Mancha",1994,6.2309341408872,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.7345809936523,0 +9,"Castilla-La Mancha",1995,6.3287634197149,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.9419326782227,0 +9,"Castilla-La Mancha",1996,6.61439609609308,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +9,"Castilla-La Mancha",1997,6.86539554895588,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1955,3.54662963030373,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1956,3.69044556954152,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1957,3.82683499817574,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1958,3.8756783776064,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1959,3.92173673384055,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1960,4.24178820002321,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1961,4.57533547892566,8.23999977111816,2.89000010490417,41.6100006103516,5.55000019073486,37.2999992370605,4.42000007629395,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1962,4.83804641196265,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1963,5.08133409636867,8.27999973297119,2.72000002861023,39.939998626709,6.42999982833862,38.8300018310547,3.78999996185303,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1964,5.15809787814531,NA,NA,NA,NA,NA,NA,262.816864013672,2749.12109375,168.848922729492,55.7905082702637,29.5606575012207,NA,16.5614433288574,0 +10,"Cataluna",1965,5.22365052507319,6.76999998092651,2.85999989509583,40.2900009155273,7.28000020980835,38.8499984741211,3.96000003814697,269.135833740234,2798.28125,181.376449584961,58.4356880187988,30.5408153533936,NA,20.0388870239258,0 +10,"Cataluna",1966,5.33247650503874,NA,NA,NA,NA,NA,NA,275.247894287109,2851.96044921875,195.368789672852,61.4609680175781,31.5972309112549,NA,23.8684711456299,0 +10,"Cataluna",1967,5.42944892070458,5.69000005722046,2.9300000667572,39.4799995422363,7.09999990463257,40.1300010681152,4.67000007629395,281.157318115234,2915.71752929688,211.568862915039,64.8915252685547,32.8478584289551,NA,24.7314453125,0 +10,"Cataluna",1968,5.67437885353069,NA,NA,NA,NA,NA,NA,286.868377685547,2966.74462890625,247.832916259766,68.5314025878906,33.6025543212891,NA,25.75998878479,0 +10,"Cataluna",1969,5.91552394419117,5.69999980926514,3,38.9599990844727,7.17000007629395,40.2299995422363,4.94000005722046,292.385131835938,3018.34545898438,285.996917724609,72.539794921875,36.0985488891602,153.119995117188,24.1620464324951,0 +10,"Cataluna",1970,6.06683787193614,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5039901733398,0 +10,"Cataluna",1971,6.22764920820614,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.9517269134521,0 +10,"Cataluna",1972,6.53906012902564,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3312225341797,0 +10,"Cataluna",1973,6.83797505609446,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.6595363616943,0 +10,"Cataluna",1974,6.98736082380481,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.100830078125,0 +10,"Cataluna",1975,7.12489302721544,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1886425018311,0 +10,"Cataluna",1976,7.13538981886872,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.4112911224365,0 +10,"Cataluna",1977,7.1429590673591,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.5070133209229,0 +10,"Cataluna",1978,7.01935154409008,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.5064487457275,0 +10,"Cataluna",1979,7.01099691034147,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.6833438873291,0 +10,"Cataluna",1980,7.07883467098128,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.899486541748,0 +10,"Cataluna",1981,7.1822335603611,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.670280456543,0 +10,"Cataluna",1982,7.28720365609826,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.7905778884888,0 +10,"Cataluna",1983,7.3978863181948,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.8484268188477,0 +10,"Cataluna",1984,7.48429002846285,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.6417980194092,0 +10,"Cataluna",1985,7.5699798313763,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.1108713150024,0 +10,"Cataluna",1986,8.07769173930216,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.9053049087524,0 +10,"Cataluna",1987,8.58397583251883,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.1826725006104,0 +10,"Cataluna",1988,9.05741234228836,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.6598262786865,0 +10,"Cataluna",1989,9.52584975721223,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.4189586639404,0 +10,"Cataluna",1990,9.78506175969812,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.870548248291,0 +10,"Cataluna",1991,10.0506998000571,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.4772415161133,0 +10,"Cataluna",1992,9.83790310748268,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0941829681396,0 +10,"Cataluna",1993,9.62510728658999,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.3188209533691,0 +10,"Cataluna",1994,10.0064270838912,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.8029975891113,0 +10,"Cataluna",1995,10.33990288489,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.3809680938721,0 +10,"Cataluna",1996,10.5762637502566,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +10,"Cataluna",1997,11.0454159442168,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1955,2.57597822218206,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1956,2.73850328477578,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1957,2.89988579051354,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1958,2.96351049532723,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1959,3.02620685907542,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1960,3.21929439140435,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1961,3.36246790119118,23.6499996185303,3.01999998092651,27.6800003051758,4.53000020980835,36.1800003051758,4.96000003814697,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1962,3.56998002314629,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1963,3.76521014922076,21.7800006866455,2.75,27.2099990844727,5.42000007629395,38.3300018310547,4.5,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1964,3.82369302130195,NA,NA,NA,NA,NA,NA,265.352386474609,1625.75366210938,84.9697952270508,29.6055355072021,15.6832036972046,NA,19.6991195678711,0 +11,"Comunidad Valenciana",1965,3.8741786491527,18.2700004577637,2.59999990463257,28.1800003051758,6.67000007629395,39.2400016784668,5.05000019073486,266.080993652344,1650.5283203125,89.9331130981445,31.188289642334,16.0734405517578,NA,21.8730087280273,0 +11,"Comunidad Valenciana",1966,3.97814892592072,NA,NA,NA,NA,NA,NA,266.74658203125,1677.83056640625,95.4620513916016,32.992431640625,16.4965915679932,NA,24.0267810821533,0 +11,"Comunidad Valenciana",1967,4.07340761526283,15.4799995422363,2.4300000667572,28.6399993896484,6.88000011444092,40.5,6.07000017166138,267.350677490234,1711.10729980469,101.888916015625,35.0352592468262,17.0129070281982,NA,24.6811904907227,0 +11,"Comunidad Valenciana",1968,4.27977739828241,NA,NA,NA,NA,NA,NA,267.894744873047,1738.71472167969,117.650833129883,37.2143325805664,17.2655029296875,NA,25.7572746276855,0 +11,"Comunidad Valenciana",1969,4.48628970126839,13.7299995422363,2.69000005722046,28.5200004577637,7.34999990463257,41.0699996948242,6.6399998664856,268.380218505859,1766.76818847656,133.849853515625,39.6187858581543,18.4009895324707,128.699996948242,26.178804397583,0 +11,"Comunidad Valenciana",1970,4.65474132809153,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.8410549163818,0 +11,"Comunidad Valenciana",1971,4.81712387071886,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.1996517181396,0 +11,"Comunidad Valenciana",1972,5.13888906322524,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.7017269134521,0 +11,"Comunidad Valenciana",1973,5.44937164297924,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.9421882629395,0 +11,"Comunidad Valenciana",1974,5.55798362507476,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.4274501800537,0 +11,"Comunidad Valenciana",1975,5.65595542386885,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.3052940368652,0 +11,"Comunidad Valenciana",1976,5.76135395118694,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.5974979400635,0 +11,"Comunidad Valenciana",1977,5.86103991214564,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.1891632080078,0 +11,"Comunidad Valenciana",1978,5.81091123797219,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.0904788970947,0 +11,"Comunidad Valenciana",1979,5.77549262919858,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.7497692108154,0 +11,"Comunidad Valenciana",1980,5.79098807985665,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.2112274169922,0 +11,"Comunidad Valenciana",1981,5.8996715398558,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.3309440612793,0 +11,"Comunidad Valenciana",1982,6.01813788415364,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.8458118438721,0 +11,"Comunidad Valenciana",1983,6.13981724738802,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.3196887969971,0 +11,"Comunidad Valenciana",1984,6.23686070511662,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.531343460083,0 +11,"Comunidad Valenciana",1985,6.33611823449439,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.1207828521729,0 +11,"Comunidad Valenciana",1986,6.73936039200853,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.4207916259766,0 +11,"Comunidad Valenciana",1987,7.14438731790917,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.7567386627197,0 +11,"Comunidad Valenciana",1988,7.56069729240306,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3779983520508,0 +11,"Comunidad Valenciana",1989,7.96915167095116,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.738317489624,0 +11,"Comunidad Valenciana",1990,8.13838868303253,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.6734676361084,0 +11,"Comunidad Valenciana",1991,8.30619788040471,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.7543029785156,0 +11,"Comunidad Valenciana",1992,8.0805482404023,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.4616050720215,0 +11,"Comunidad Valenciana",1993,7.85704119414542,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.1841773986816,0 +11,"Comunidad Valenciana",1994,8.06840920032892,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.6151676177979,0 +11,"Comunidad Valenciana",1995,8.28906061716742,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.7886962890625,0 +11,"Comunidad Valenciana",1996,8.42973436161097,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +11,"Comunidad Valenciana",1997,8.72536435599873,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1955,1.24343048331057,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1956,1.3325478478317,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1957,1.42245070657128,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1958,1.4402313799015,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1959,1.45808342217514,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1960,1.53584691396069,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1961,1.59625816399695,46.5,1.60000002384186,10.6999998092651,7.98999977111816,26.6100006103516,6.6100001335144,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1962,1.70558416240728,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1963,1.81769496609367,42.4500007629395,1.97000002861023,10.9700002670288,8.18000030517578,30.1299991607666,6.30999994277954,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1964,1.88281929040085,NA,NA,NA,NA,NA,NA,198.069030761719,713.536682128906,20.4298152923584,9.42423725128174,4.34637928009033,NA,16.0566520690918,0 +12,"Extremadura",1965,1.9488718468133,37.5800018310547,2.9300000667572,10.8599996566772,9.39000034332275,32,7.26000022888184,189.69157409668,703.732543945312,21.3291072845459,10.0519132614136,4.4210057258606,NA,18.0707759857178,0 +12,"Extremadura",1966,2.03263348969212,NA,NA,NA,NA,NA,NA,181.497375488281,695.042297363281,22.3267841339111,10.7636795043945,4.50280570983887,NA,18.9422512054443,0 +12,"Extremadura",1967,2.11760916733054,33.0499992370605,3.13000011444092,11.0299997329712,9.25,34.1100006103516,9.43000030517578,173.483093261719,688.875,23.4936294555664,11.5677585601807,4.60787534713745,NA,21.0937271118164,0 +12,"Extremadura",1968,2.24550122432932,NA,NA,NA,NA,NA,NA,165.645462036133,681.676818847656,26.7379055023193,12.4326190948486,4.6397008895874,NA,23.2899532318115,0 +12,"Extremadura",1969,2.38196234878762,30.1599998474121,3.59999990463257,10.8699998855591,9.03999996185303,35.75,10.5799999237061,157.981262207031,674.467956542969,29.9734573364258,13.3898468017578,4.90563201904297,28.9699993133545,23.9835433959961,0 +12,"Extremadura",1970,2.51849473322912,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.9763240814209,0 +12,"Extremadura",1971,2.65402751662192,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.9220218658447,0 +12,"Extremadura",1972,2.84640092367582,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5212917327881,0 +12,"Extremadura",1973,3.04562988978595,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.5006084442139,0 +12,"Extremadura",1974,3.10304190083525,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.9990043640137,0 +12,"Extremadura",1975,3.15674098346343,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.8751811981201,0 +12,"Extremadura",1976,3.26256794236245,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.0869369506836,0 +12,"Extremadura",1977,3.36539566227451,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.5106010437012,0 +12,"Extremadura",1978,3.4870750255089,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.2648887634277,0 +12,"Extremadura",1979,3.51585250893718,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.247013092041,0 +12,"Extremadura",1980,3.53956007337747,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,36.3258094787598,0 +12,"Extremadura",1981,3.57276482842235,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,39.4098014831543,0 +12,"Extremadura",1982,3.61096824247202,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,36.0924377441406,0 +12,"Extremadura",1983,3.64895744073122,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.8806972503662,0 +12,"Extremadura",1984,3.73714635522663,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.9459609985352,0 +12,"Extremadura",1985,3.82847768243671,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.619384765625,0 +12,"Extremadura",1986,4.16173948556551,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.492696762085,0 +12,"Extremadura",1987,4.49857169714903,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0931091308594,0 +12,"Extremadura",1988,4.76949430812737,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0579814910889,0 +12,"Extremadura",1989,5.05141405608442,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.2050437927246,0 +12,"Extremadura",1990,5.23407583882284,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.0556392669678,0 +12,"Extremadura",1991,5.39831462774029,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,29.6530513763428,0 +12,"Extremadura",1992,5.36546704429315,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.14089012146,0 +12,"Extremadura",1993,5.33261946084601,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.5728073120117,0 +12,"Extremadura",1994,5.43987423445422,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8011627197266,0 +12,"Extremadura",1995,5.50135656344281,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.2228889465332,0 +12,"Extremadura",1996,5.90581253779612,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +12,"Extremadura",1997,6.22457870923598,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1955,1.63467579383708,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1956,1.72557836258558,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1957,1.81648104029429,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1958,1.84090251208918,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1959,1.86539567970814,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1960,1.98329041869734,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1961,2.00578416629847,33.25,5.26000022888184,17.4400005340576,5.82999992370605,31.4099998474121,6.80999994277954,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1962,2.18566114376082,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1963,2.36639542022592,32.3800010681152,5.23999977111816,17.4400005340576,6.44000005722046,32.4300003051758,6.07000017166138,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1964,2.45879739054746,NA,NA,NA,NA,NA,NA,235.295440673828,1684.7021484375,63.6056671142578,25.5104503631592,13.0256786346436,NA,15.5867547988892,0 +13,"Galicia",1965,2.54970006825617,29.1700000762939,5.32999992370605,17.8400001525879,7.40000009536743,33.6300010681152,6.65000009536743,235.123489379883,1674.84020996094,67.2105178833008,26.8081111907959,13.1926946640015,NA,19.1050872802734,0 +13,"Galicia",1966,2.66966570516683,NA,NA,NA,NA,NA,NA,234.916793823242,1667.1708984375,71.2246398925781,28.2892608642578,13.3779163360596,NA,21.5290145874023,0 +13,"Galicia",1967,2.78784635577056,26.5300006866455,5.17999982833862,18.2299995422363,7.3600001335144,34.7900009155273,7.92000007629395,234.676239013672,1664.88940429688,75.8933563232422,29.9673519134521,13.6285676956177,NA,22.2095241546631,0 +13,"Galicia",1968,2.9783632986139,NA,NA,NA,NA,NA,NA,234.402770996094,1658.34289550781,87.4872055053711,31.753547668457,13.6594495773315,NA,22.6949615478516,0 +13,"Galicia",1969,3.17737783101313,22.9799995422363,5.63000011444092,18.4599990844727,8.13000011444092,36.310001373291,8.48999977111816,234.097274780273,1651.63696289062,99.365478515625,33.7229232788086,14.3740549087524,91.0199966430664,22.5122718811035,0 +13,"Galicia",1970,3.36196799170661,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0963020324707,0 +13,"Galicia",1971,3.53477584808358,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0592098236084,0 +13,"Galicia",1972,3.78320471146079,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.6944522857666,0 +13,"Galicia",1973,4.0299913264179,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0349578857422,0 +13,"Galicia",1974,4.13239083266946,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.3730659484863,0 +13,"Galicia",1975,4.23100549675539,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5215072631836,0 +13,"Galicia",1976,4.359040291641,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.5404987335205,0 +13,"Galicia",1977,4.48564683597655,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.460054397583,0 +13,"Galicia",1978,4.52170787420112,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.8565540313721,0 +13,"Galicia",1979,4.55748343665203,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.8971176147461,0 +13,"Galicia",1980,4.58169091057689,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.4147644042969,0 +13,"Galicia",1981,4.61875176777059,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.2603149414062,0 +13,"Galicia",1982,4.71079700233527,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.9320793151855,0 +13,"Galicia",1983,4.80634116745104,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.9381408691406,0 +13,"Galicia",1984,4.89931430724034,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.6235446929932,0 +13,"Galicia",1985,4.99778645135988,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.136173248291,0 +13,"Galicia",1986,5.2779205592487,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.2311058044434,0 +13,"Galicia",1987,5.56576687143517,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.396068572998,0 +13,"Galicia",1988,5.90952590205812,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.4397640228271,0 +13,"Galicia",1989,6.25449874952023,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.4401321411133,0 +13,"Galicia",1990,6.45322780614579,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5044841766357,0 +13,"Galicia",1991,6.64160259108447,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5099925994873,0 +13,"Galicia",1992,6.54448721961136,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.2633037567139,0 +13,"Galicia",1993,6.44751480394553,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.2256679534912,0 +13,"Galicia",1994,6.55648373971834,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.7619285583496,0 +13,"Galicia",1995,6.68866033044688,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.0208110809326,0 +13,"Galicia",1996,6.86246800579299,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +13,"Galicia",1997,7.13853179574251,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1955,4.5944728159421,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1956,4.78663244304641,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1957,4.96343913945904,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1958,4.90616964837613,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1959,4.84640104571127,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1960,5.16109689921183,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1961,5.63260481310585,2.41000008583069,2.45000004768372,23.0799999237061,8.14000034332275,53.3699989318848,10.5600004196167,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1962,5.84083127825644,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1963,6.02449287996398,2.35999989509583,2.19000005722046,23.1900005340576,8.23999977111816,54.7400016784668,9.27999973297119,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1964,6.09932893754686,NA,NA,NA,NA,NA,NA,122.863471984863,1709.73864746094,220.303680419922,55.36328125,51.7932586669922,NA,14.7288694381714,0 +14,"Madrid (Comunidad De)",1965,6.15202820120591,1.76999998092651,2.07999992370605,24.9300003051758,8.67000007629395,51.7000007629395,10.8500003814697,127.15397644043,1765.89123535156,233.303817749023,57.7607727050781,53.8703651428223,NA,15.7073917388916,0 +14,"Madrid (Comunidad De)",1966,6.11046859449198,NA,NA,NA,NA,NA,NA,131.313262939453,1826.0283203125,247.787292480469,60.510440826416,56.1019859313965,NA,16.8277263641357,0 +14,"Madrid (Comunidad De)",1967,6.057340899252,1.46000003814697,1.87000000476837,24.2800006866455,8.56999969482422,51.8199996948242,12.0100002288818,135.343994140625,1893.69494628906,264.619781494141,63.6322860717773,58.7012596130371,NA,16.7447471618652,0 +14,"Madrid (Comunidad De)",1968,6.2531419768738,NA,NA,NA,NA,NA,NA,139.248809814453,1944.68811035156,305.730285644531,66.9298782348633,60.433292388916,NA,15.9584579467773,0 +14,"Madrid (Comunidad De)",1969,6.43558976174218,1.32000005245209,1.77999997138977,23.6900005340576,8.17000007629395,51.939998626709,13.1099996566772,143.030258178711,1996.40441894531,348.025024414062,70.555290222168,65.3297500610352,442.450012207031,17.4400634765625,0 +14,"Madrid (Comunidad De)",1970,6.54334488067583,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.7359714508057,0 +14,"Madrid (Comunidad De)",1971,6.67473565030527,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3634643554688,0 +14,"Madrid (Comunidad De)",1972,7.08690382895624,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.7570724487305,0 +14,"Madrid (Comunidad De)",1973,7.47500748948961,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.0428199768066,0 +14,"Madrid (Comunidad De)",1974,7.65566963428976,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.3821659088135,0 +14,"Madrid (Comunidad De)",1975,7.81633801475248,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.2960662841797,0 +14,"Madrid (Comunidad De)",1976,7.70708403904688,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.8248138427734,0 +14,"Madrid (Comunidad De)",1977,7.59997178540506,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.1333360671997,0 +14,"Madrid (Comunidad De)",1978,7.39003159393075,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.932092666626,0 +14,"Madrid (Comunidad De)",1979,7.32147992593634,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.4670705795288,0 +14,"Madrid (Comunidad De)",1980,7.41716617517763,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,12.1411733627319,0 +14,"Madrid (Comunidad De)",1981,7.53213402476525,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,10.5835418701172,0 +14,"Madrid (Comunidad De)",1982,7.54284525012943,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,11.082971572876,0 +14,"Madrid (Comunidad De)",1983,7.55855469865753,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,10.7892055511475,0 +14,"Madrid (Comunidad De)",1984,7.69922844310108,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,9.3316707611084,0 +14,"Madrid (Comunidad De)",1985,7.83918915187179,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,10.2985944747925,0 +14,"Madrid (Comunidad De)",1986,8.34761496715224,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,11.6767740249634,0 +14,"Madrid (Comunidad De)",1987,8.84961474455959,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,12.7537307739258,0 +14,"Madrid (Comunidad De)",1988,9.25449871465296,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.9098491668701,0 +14,"Madrid (Comunidad De)",1989,9.65795487003713,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.5876665115356,0 +14,"Madrid (Comunidad De)",1990,9.80648421042649,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.4658737182617,0 +14,"Madrid (Comunidad De)",1991,9.96358218243448,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.876277923584,0 +14,"Madrid (Comunidad De)",1992,9.84004570122822,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3416538238525,0 +14,"Madrid (Comunidad De)",1993,9.7186518137675,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.5834465026855,0 +14,"Madrid (Comunidad De)",1994,9.88217669533035,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.076150894165,0 +14,"Madrid (Comunidad De)",1995,10.0985429246778,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.1340103149414,0 +14,"Madrid (Comunidad De)",1996,10.322764749971,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +14,"Madrid (Comunidad De)",1997,10.73264781491,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1955,1.67952011531164,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1956,1.76428168611981,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1957,1.85032844271056,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1958,1.88738929990427,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1959,1.92409320342067,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1960,2.11860898629967,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1961,2.30548404278465,23.8500003814697,9.85999965667725,19.4699993133545,5.23999977111816,34.7099990844727,6.88000011444092,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1962,2.52142249431245,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1963,2.73907445424354,22.3400001525879,9.40999984741211,19.8899993896484,5.73999977111816,35.8199996948242,6.78999996185303,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1964,2.85125684479379,NA,NA,NA,NA,NA,NA,110.026252746582,429.20654296875,24.4162006378174,8.23727226257324,4.22677516937256,NA,16.8633575439453,0 +15,"Murcia (Region de)",1965,2.96593834692599,18.5200004577637,11.1999998092651,19.8099994659424,6.51000022888184,36.2299995422363,7.73999977111816,108.333282470703,429.5390625,25.2321701049805,8.52175331115723,4.29338550567627,NA,18.2309856414795,0 +15,"Murcia (Region de)",1966,3.09918601660686,NA,NA,NA,NA,NA,NA,106.665687561035,430.465148925781,26.1326198577881,8.85069179534912,4.36662483215332,NA,24.4878005981445,0 +15,"Murcia (Region de)",1967,3.22729228939611,17.3700008392334,10.7700004577637,18.8600006103516,6.36999988555908,36.939998626709,9.69999980926514,105.023094177246,432.902587890625,27.1941242218018,9.22547721862793,4.46204042434692,NA,24.7873210906982,0 +15,"Murcia (Region de)",1968,3.46115404318074,NA,NA,NA,NA,NA,NA,103.405128479004,433.731048583984,30.591402053833,9.61625862121582,4.48619937896729,NA,19.002290725708,0 +15,"Murcia (Region de)",1969,3.70615545391049,14.6800003051758,9.89999961853027,20.1900005340576,7.59000015258789,37.189998626709,10.4499998092651,101.81143951416,434.636444091797,33.8780860900879,10.0437860488892,4.73614072799683,73.3600006103516,20.8658714294434,0 +15,"Murcia (Region de)",1970,3.90638380314888,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.7672672271729,0 +15,"Murcia (Region de)",1971,4.08647543224257,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0152168273926,0 +15,"Murcia (Region de)",1972,4.37946292340023,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.2000560760498,0 +15,"Murcia (Region de)",1973,4.66695228190941,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0234603881836,0 +15,"Murcia (Region de)",1974,4.69780066325916,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.4522171020508,0 +15,"Murcia (Region de)",1975,4.72272204453861,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.6638603210449,0 +15,"Murcia (Region de)",1976,4.78434732933448,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.5997066497803,0 +15,"Murcia (Region de)",1977,4.84454436358027,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.6332473754883,0 +15,"Murcia (Region de)",1978,4.8646100416622,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8282051086426,0 +15,"Murcia (Region de)",1979,4.83640372770168,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.6917476654053,0 +15,"Murcia (Region de)",1980,4.76321035437978,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.4614429473877,0 +15,"Murcia (Region de)",1981,4.70808345704241,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5449638366699,0 +15,"Murcia (Region de)",1982,4.81112539274493,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.812520980835,0 +15,"Murcia (Region de)",1983,4.90688355573073,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.5108871459961,0 +15,"Murcia (Region de)",1984,5.03199080745345,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.9793586730957,0 +15,"Murcia (Region de)",1985,5.15459851175334,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.2883014678955,0 +15,"Murcia (Region de)",1986,5.4715080010622,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.2614326477051,0 +15,"Murcia (Region de)",1987,5.78856044617832,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5431385040283,0 +15,"Murcia (Region de)",1988,6.12489274827728,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.6007595062256,0 +15,"Murcia (Region de)",1989,6.45044278294929,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.3808155059814,0 +15,"Murcia (Region de)",1990,6.57819210206124,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.4517707824707,0 +15,"Murcia (Region de)",1991,6.71036825694891,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.5779037475586,0 +15,"Murcia (Region de)",1992,6.66288208600556,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.1358852386475,0 +15,"Murcia (Region de)",1993,6.61632382028684,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1626472473145,0 +15,"Murcia (Region de)",1994,6.78484736085449,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.1073265075684,0 +15,"Murcia (Region de)",1995,6.88581818071511,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.9480438232422,0 +15,"Murcia (Region de)",1996,7.04541570014594,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +15,"Murcia (Region de)",1997,7.29505838036231,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1955,2.55512715884189,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1956,2.69815771282145,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1957,2.83983149415457,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1958,2.88189079243263,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1959,2.93087690975012,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1960,3.16352462877515,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1961,3.33590405357119,29.2399997711182,3.45000004768372,24.7199993133545,6.23000001907349,29.9099998474121,6.44999980926514,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1962,3.62339341208037,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1963,3.89481571462283,29.7900009155273,2.79999995231628,26.0400009155273,6.46000003814697,29.5400009155273,5.36999988555908,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1964,3.98514722286378,NA,NA,NA,NA,NA,NA,13.3898134231567,274.247863769531,15.8491640090942,5.54356098175049,3.86828541755676,NA,19.1858196258545,0 +16,"Navarra (Comunidad Foral De)",1965,4.0729791836819,24.3999996185303,2.6800000667572,28.8500003814697,7.34999990463257,30.5699996948242,6.15999984741211,13.4128046035767,276.363555908203,17.1040096282959,5.88534259796143,3.97963047027588,NA,21.209753036499,0 +16,"Navarra (Comunidad Foral De)",1966,4.21001147760796,NA,NA,NA,NA,NA,NA,13.4329719543457,278.747283935547,18.5064468383789,6.27357196807861,4.09997081756592,NA,19.8447246551514,0 +16,"Navarra (Comunidad Foral De)",1967,4.35239938421612,22.5200004577637,2.35999989509583,29.8199996948242,7.09000015258789,30.7600002288818,7.44999980926514,13.4503841400146,281.893890380859,20.1286563873291,6.71248388290405,4.24443912506104,NA,17.9839763641357,0 +16,"Navarra (Comunidad Foral De)",1968,4.55698352716746,NA,NA,NA,NA,NA,NA,13.4651098251343,283.630767822266,23.6790332794189,7.18329238891602,4.32393074035645,NA,18.7410049438477,0 +16,"Navarra (Comunidad Foral De)",1969,4.76570990180261,17.5699996948242,2.41000008583069,32.560001373291,7.78000020980835,31.5599994659424,8.11999988555908,13.4772138595581,285.131011962891,27.4383029937744,7.70386505126953,4.62594985961914,44.1699981689453,22.7656745910645,0 +16,"Navarra (Comunidad Foral De)",1970,4.97943449960168,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.24440574646,0 +16,"Navarra (Comunidad Foral De)",1971,5.19965704901836,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.5905246734619,0 +16,"Navarra (Comunidad Foral De)",1972,5.46679525367194,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.1721096038818,0 +16,"Navarra (Comunidad Foral De)",1973,5.73050556983719,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.518383026123,0 +16,"Navarra (Comunidad Foral De)",1974,5.98478995538105,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.0085678100586,0 +16,"Navarra (Comunidad Foral De)",1975,6.22293646081588,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.484489440918,0 +16,"Navarra (Comunidad Foral De)",1976,6.29777208255789,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.7473335266113,0 +16,"Navarra (Comunidad Foral De)",1977,6.38446170444047,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.1056308746338,0 +16,"Navarra (Comunidad Foral De)",1978,6.32419319229104,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.790994644165,0 +16,"Navarra (Comunidad Foral De)",1979,6.23964572831312,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.8123836517334,0 +16,"Navarra (Comunidad Foral De)",1980,6.20844039328607,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.2924785614014,0 +16,"Navarra (Comunidad Foral De)",1981,6.17787792355086,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.8062381744385,0 +16,"Navarra (Comunidad Foral De)",1982,6.36203943474298,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.8658895492554,0 +16,"Navarra (Comunidad Foral De)",1983,6.54470165332227,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.9711418151855,0 +16,"Navarra (Comunidad Foral De)",1984,6.79720083463876,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,13.3959035873413,0 +16,"Navarra (Comunidad Foral De)",1985,7.04777185592063,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.864387512207,0 +16,"Navarra (Comunidad Foral De)",1986,7.44929985127017,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.9428825378418,0 +16,"Navarra (Comunidad Foral De)",1987,7.87917755222838,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.5691547393799,0 +16,"Navarra (Comunidad Foral De)",1988,8.34975756089778,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.5950679779053,0 +16,"Navarra (Comunidad Foral De)",1989,8.80391334200273,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.3473205566406,0 +16,"Navarra (Comunidad Foral De)",1990,9.19737217937732,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.6747093200684,0 +16,"Navarra (Comunidad Foral De)",1991,9.59154492410651,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.5686073303223,0 +16,"Navarra (Comunidad Foral De)",1992,9.34518674073033,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.5794525146484,0 +16,"Navarra (Comunidad Foral De)",1993,9.11739537866413,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.4554958343506,0 +16,"Navarra (Comunidad Foral De)",1994,9.3658952841041,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.159740447998,0 +16,"Navarra (Comunidad Foral De)",1995,9.75864021412409,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.0543174743652,0 +16,"Navarra (Comunidad Foral De)",1996,10.0606971180667,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +16,"Navarra (Comunidad Foral De)",1997,10.5227076234357,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1955,3.85318463000527,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1956,3.94565829615088,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1957,4.03356173487263,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1958,4.02342189689665,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1959,4.01378196840523,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1960,4.28591839622273,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1961,4.57433609579741,7.96000003814697,4.73000001907349,44.2599983215332,5.53000020980835,33.5900001525879,3.94000005722046,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1962,4.89895735356304,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1963,5.19701498162913,8.10999965667725,4.25,44.4300003051758,5.92000007629395,33.8699989318848,3.4300000667572,NA,NA,NA,NA,NA,NA,NA,0 +17,"Basque Country (Pais Vasco)",1964,5.33890297875272,NA,NA,NA,NA,NA,NA,36.3632392883301,969.140014648438,67.9733352661133,21.6917819976807,12.2613000869751,NA,23.8191528320312,0 +17,"Basque Country (Pais Vasco)",1965,5.46515300525185,6.73000001907349,4.05000019073486,46.2200012207031,6.36999988555908,33,3.64000010490417,37.8310432434082,992.827880859375,74.1777114868164,23.062183380127,12.6866369247437,NA,25.4604396820068,0 +17,"Basque Country (Pais Vasco)",1966,5.54591562706414,NA,NA,NA,NA,NA,NA,39.2548980712891,1018.14312744141,81.1203002929688,24.6179656982422,13.1446952819824,NA,25.0173454284668,0 +17,"Basque Country (Pais Vasco)",1967,5.61489572663949,5.8899998664856,3.75,45.1100006103516,6.48000001907349,34.2200012207031,4.55000019073486,40.6356964111328,1046.98718261719,89.1361541748047,26.3764228820801,13.684739112854,NA,23.4360332489014,0 +17,"Basque Country (Pais Vasco)",1968,5.85218493307158,NA,NA,NA,NA,NA,NA,41.9743118286133,1070.11999511719,105.888786315918,28.264331817627,14.0191659927368,NA,23.7783908843994,0 +17,"Basque Country (Pais Vasco)",1969,6.08140541736959,5.53000020980835,3.75,45.3899993896484,6.44999980926514,34.0900001525879,4.80000019073486,43.271598815918,1093.23559570312,123.85572052002,30.3524646759033,15.0817813873291,246.889999389648,26.3729362487793,0 +17,"Basque Country (Pais Vasco)",1970,6.17009424134957,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.4970436096191,1 +17,"Basque Country (Pais Vasco)",1971,6.28363340454625,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.7163124084473,1 +17,"Basque Country (Pais Vasco)",1972,6.55555539865284,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,24.5688056945801,1 +17,"Basque Country (Pais Vasco)",1973,6.81076856110308,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.0518627166748,1 +17,"Basque Country (Pais Vasco)",1974,7.1051843028108,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.7866611480713,1 +17,"Basque Country (Pais Vasco)",1975,7.37789168217563,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.8656044006348,1 +17,"Basque Country (Pais Vasco)",1976,7.23293362192275,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.3656425476074,1 +17,"Basque Country (Pais Vasco)",1977,7.08983137211913,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.0888938903809,1 +17,"Basque Country (Pais Vasco)",1978,6.78670360714461,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.5496139526367,1 +17,"Basque Country (Pais Vasco)",1979,6.6398173868571,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3017597198486,1 +17,"Basque Country (Pais Vasco)",1980,6.56283917136956,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.5340042114258,1 +17,"Basque Country (Pais Vasco)",1981,6.50078545499277,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.7956075668335,1 +17,"Basque Country (Pais Vasco)",1982,6.54505860699956,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.4020538330078,1 +17,"Basque Country (Pais Vasco)",1983,6.59532980113941,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.0416736602783,1 +17,"Basque Country (Pais Vasco)",1984,6.76149675009149,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.7125797271729,1 +17,"Basque Country (Pais Vasco)",1985,6.93716067172772,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.6840753555298,1 +17,"Basque Country (Pais Vasco)",1986,7.33219115130052,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.3246994018555,1 +17,"Basque Country (Pais Vasco)",1987,7.74278812359415,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15.5889568328857,1 +17,"Basque Country (Pais Vasco)",1988,8.12053664075889,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.8475723266602,1 +17,"Basque Country (Pais Vasco)",1989,8.50971116232416,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.7838096618652,1 +17,"Basque Country (Pais Vasco)",1990,8.7767778890741,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.9347801208496,1 +17,"Basque Country (Pais Vasco)",1991,9.02527866619582,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.0072422027588,1 +17,"Basque Country (Pais Vasco)",1992,8.87389282470633,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.9394264221191,1 +17,"Basque Country (Pais Vasco)",1993,8.71822353908928,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.9752731323242,1 +17,"Basque Country (Pais Vasco)",1994,9.01813784928637,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.893705368042,1 +17,"Basque Country (Pais Vasco)",1995,9.44087386165337,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.266321182251,1 +17,"Basque Country (Pais Vasco)",1996,9.68651813767495,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,1 +17,"Basque Country (Pais Vasco)",1997,10.1706658728087,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,1 +18,"Rioja (La)",1955,2.39045993834351,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1956,2.53520421864679,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1957,2.68001997685371,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1958,2.72643528676517,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1959,2.77285059667662,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1960,2.96986570905801,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1961,3.1531705750087,32.7200012207031,3.08999991416931,27.5699996948242,4.34000015258789,26.2299995422363,6.05000019073486,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1962,3.4043844615824,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1963,3.66923755252406,34.4300003051758,2.99000000953674,25.3899993896484,4.73999977111816,27.0699996948242,5.38000011444092,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1964,3.80398473273819,NA,NA,NA,NA,NA,NA,10.23792552948,151.320892333984,8.60982704162598,3.06339836120605,1.81167531013489,NA,16.6301879882812,0 +18,"Rioja (La)",1965,3.92180821174419,31.9500007629395,3.01999998092651,25.9599990844727,5.34999990463257,27.6399993896484,6.07999992370605,9.79054164886475,151.513656616211,8.87244129180908,3.16899657249451,1.77959680557251,NA,17.7282848358154,0 +18,"Rioja (La)",1966,4.03270487171076,NA,NA,NA,NA,NA,NA,9.35300159454346,151.885833740234,9.16163349151611,3.29110646247864,1.74683582782745,NA,17.6105575561523,0 +18,"Rioja (La)",1967,4.16031123501544,27.6000003814697,2.78999996185303,27.0300006866455,5.65000009536743,29.5100002288818,7.42999982833862,8.92512512207031,152.711868286133,9.50362682342529,3.43024039268494,1.71896767616272,NA,16.5963726043701,0 +18,"Rioja (La)",1968,4.37303644968625,NA,NA,NA,NA,NA,NA,8.50673580169678,152.956436157227,10.6550092697144,3.57529520988464,1.66027390956879,NA,18.9667510986328,0 +18,"Rioja (La)",1969,4.60354179288618,24.9099998474121,2.54999995231628,27.1100006103516,6.1100001335144,31.0499992370605,8.27999973297119,8.09766006469727,153.228103637695,11.7576704025269,3.73398232460022,1.67922723293304,46.5800018310547,20.8034381866455,0 +18,"Rioja (La)",1970,4.79341630627856,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.5658359527588,0 +18,"Rioja (La)",1971,4.9833618617337,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.5932922363281,0 +18,"Rioja (La)",1972,5.23007699878718,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.8908863067627,0 +18,"Rioja (La)",1973,5.47464997793599,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.7731380462646,0 +18,"Rioja (La)",1974,5.57583566734839,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,23.3648471832275,0 +18,"Rioja (La)",1975,5.67609257985442,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.5327281951904,0 +18,"Rioja (La)",1976,5.81655232642794,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,31.1758613586426,0 +18,"Rioja (La)",1977,5.9557982561623,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,36.6737174987793,0 +18,"Rioja (La)",1978,6.06655239616248,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,39.0718536376953,0 +18,"Rioja (La)",1979,6.10104266387059,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,31.2611522674561,0 +18,"Rioja (La)",1980,6.10868339026461,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,25.4913959503174,0 +18,"Rioja (La)",1981,6.13996020319529,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.7470569610596,0 +18,"Rioja (La)",1982,6.30976860266486,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,19.5771312713623,0 +18,"Rioja (La)",1983,6.5021422276392,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.3251037597656,0 +18,"Rioja (La)",1984,6.62689252568463,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.8476257324219,0 +18,"Rioja (La)",1985,6.7755639501995,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.5573043823242,0 +18,"Rioja (La)",1986,7.16509586128294,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.6409587860107,0 +18,"Rioja (La)",1987,7.58069105674049,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.5728778839111,0 +18,"Rioja (La)",1988,8.00271316175289,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.577579498291,0 +18,"Rioja (La)",1989,8.45329940608487,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,22.7909698486328,0 +18,"Rioja (La)",1990,8.85889728353283,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.0927810668945,0 +18,"Rioja (La)",1991,9.22950585546987,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.9119243621826,0 +18,"Rioja (La)",1992,9.18094795181288,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,21.8620376586914,0 +18,"Rioja (La)",1993,9.13239091983764,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.7657871246338,0 +18,"Rioja (La)",1994,9.498000396929,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.4694519042969,0 +18,"Rioja (La)",1995,9.75221330456923,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,20.2756500244141,0 +18,"Rioja (La)",1996,10.0564128022574,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 +18,"Rioja (La)",1997,10.4762923135243,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,0 diff --git a/tests/test_methodology_synthetic_control.py b/tests/test_methodology_synthetic_control.py new file mode 100644 index 00000000..9584f12e --- /dev/null +++ b/tests/test_methodology_synthetic_control.py @@ -0,0 +1,764 @@ +"""Methodology + R-parity tests for the classic Synthetic Control estimator. + +Covers Abadie-Diamond-Hainmueller (2010) ``SyntheticControl``: + +* **Validation gates** (10 baked-in checks): predictor-period leakage, absorbing + post-period suffix + no-anticipation cross-check, post canonicalization, donor + filtering, empty windows, poor-fit warning, duplicate predictor labels, + inner-solve non-convergence warning, order-independent gap path, and the + ``standardize="none"`` deviation. +* **custom_v cross-field** + degenerate ``J==1`` / ``T0`` paths + ``get_params`` + round-trip + the NaN-inference contract. +* **Two-tier R `Synth` parity** on the Basque dataset (Abadie-Gardeazabal 2003): + Tier-1 feeds R's ``solution.v`` through ``custom_v`` and asserts the donor + weights match deterministically (optimizer-independent); Tier-2 checks the + data-driven nested fit lands in a tolerance band (the nested V legitimately + differs because our outer objective uses all pre periods, not R's + ``time.optimize.ssr`` window). + +The Basque fixtures live in ``tests/data/`` (not ``benchmarks/data/``) so the +deterministic Tier-1 test runs in isolated-install CI without R; regenerate via +``Rscript benchmarks/R/generate_synth_basque_golden.R``. +""" + +from __future__ import annotations + +import json +from pathlib import Path + +import numpy as np +import pandas as pd +import pytest + +from diff_diff import SyntheticControl, SyntheticControlResults, synthetic_control +from tests.conftest import assert_nan_inference + +DATA_DIR = Path(__file__).parent / "data" +GOLDEN_PATH = DATA_DIR / "synth_basque_golden.json" +PANEL_PATH = DATA_DIR / "synth_basque_panel.csv" + +PREDICTORS = [ + "school.illit", + "school.prim", + "school.med", + "school.high", + "school.post.high", + "invest", +] + + +# --------------------------------------------------------------------------- +# Synthetic panel builders (fast; no R needed) +# --------------------------------------------------------------------------- + + +def _make_panel(n_donors=4, T=8, T0=6, effect=3.0, seed=0): + """Balanced panel where the treated unit is a convex mix of two donors.""" + rng = np.random.default_rng(seed) + years = list(range(2000, 2000 + T)) + donors = {} + for j in range(n_donors): + base = rng.normal(10, 2) + trend = rng.normal(0, 0.3) + donors[j] = base + trend * np.arange(T) + rng.normal(0, 0.15, T) + if n_donors >= 2: + treated = 0.6 * donors[0] + 0.4 * donors[1] + rng.normal(0, 0.08, T) + else: + treated = donors[0] + rng.normal(0, 0.08, T) + treated = treated.copy() + treated[T0:] += effect + rows = [] + for j in range(n_donors): + for t in range(T): + rows.append( + {"unit": f"d{j}", "year": years[t], "y": donors[j][t], "treated": 0, "x": float(j)} + ) + for t in range(T): + rows.append( + { + "unit": "treated", + "year": years[t], + "y": treated[t], + "treated": int(t >= T0), + "x": 0.5, + } + ) + return pd.DataFrame(rows), years, T0 + + +# --------------------------------------------------------------------------- +# Validation 1: predictor periods must be within the pre window (no leakage) +# --------------------------------------------------------------------------- + + +def test_predictor_window_outside_pre_rejected(): + df, years, T0 = _make_panel() + post_year = years[T0] + with pytest.raises(ValueError, match="outside the pre-treatment window"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + predictors=["y"], + predictor_window=[years[0], post_year], + ) + + +def test_special_predictor_period_outside_pre_rejected(): + df, years, T0 = _make_panel() + with pytest.raises(ValueError, match="outside the pre-treatment window"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + special_predictors=[("y", [years[T0]], "mean")], + ) + + +def test_pre_period_outcomes_outside_pre_rejected(): + df, years, T0 = _make_panel() + with pytest.raises(ValueError, match="outside the pre-treatment window"): + SyntheticControl(seed=0).fit( + df, "y", "treated", "unit", "year", pre_period_outcomes=[years[T0]] + ) + + +# --------------------------------------------------------------------------- +# Validation 2: post must be a contiguous suffix + no-anticipation +# --------------------------------------------------------------------------- + + +def test_non_contiguous_post_rejected(): + df, years, T0 = _make_panel() + # Drop a MIDDLE post period -> the remaining set is not a suffix of the axis. + bad_post = [years[T0]] + years[T0 + 2 :] + with pytest.raises(ValueError, match="contiguous suffix"): + SyntheticControl(seed=0).fit(df, "y", "treated", "unit", "year", post_periods=bad_post) + + +def test_anticipation_in_pre_rejected(): + df, years, T0 = _make_panel() + # Mark a pre period as treated for the treated unit, but declare the standard + # post window -> D==1 appears inside the pre window (anticipation). + df = df.copy() + mask = (df["unit"] == "treated") & (df["year"] == years[T0 - 1]) + df.loc[mask, "treated"] = 1 + with pytest.raises(ValueError, match="no-anticipation"): + SyntheticControl(seed=0).fit(df, "y", "treated", "unit", "year", post_periods=years[T0:]) + + +def test_untreated_period_in_post_rejected(): + # Absorbing exposure: a D==0 period inside the (contiguous) post suffix must be + # rejected, not averaged into the ATT (treated path 0,...,1,0 with post=[T0:]). + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "treated") & (df["year"] == years[-1]), "treated"] = 0 + with pytest.raises(ValueError, match="uninterrupted exposure|D==0 in post"): + SyntheticControl(seed=0).fit(df, "y", "treated", "unit", "year", post_periods=years[T0:]) + + +def test_non_binary_treatment_rejected(): + # A non-{0,1} treatment code must fail closed (else the unit is silently dropped + # from both treated and donor sets, changing the donor pool / weights / ATT). + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"] == years[0]), "treated"] = 2 + with pytest.raises(ValueError, match="binary"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_missing_treatment_value_rejected(): + # A donor with a missing treatment cell would be silently classified by + # groupby(...).max() (NaN dropped) — must fail closed before classification. + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"] == years[0]), "treated"] = np.nan + with pytest.raises(ValueError, match="missing"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_estimators_module_reexport(): + # Backward-compat import surface (mirrors SyntheticDiD / TwoWayFixedEffects). + from diff_diff.estimators import SyntheticControl as SC + + assert SC is SyntheticControl + + +# --------------------------------------------------------------------------- +# Validation 3 + 9: explicit post canonicalized; gap path order-independent +# --------------------------------------------------------------------------- + + +def test_post_periods_canonicalized_and_gap_order_independent(): + df, years, T0 = _make_panel() + ordered = years[T0:] + scrambled = list(reversed(ordered)) + [ordered[-1]] # unsorted + duplicate + r1 = synthetic_control(df, "y", "treated", "unit", "year", post_periods=ordered, seed=0) + r2 = synthetic_control(df, "y", "treated", "unit", "year", post_periods=scrambled, seed=0) + assert r1.post_periods == r2.post_periods == ordered + assert abs(r1.att - r2.att) < 1e-12 + gdf = r2.get_gap_df() + # Calendar-sorted regardless of input order. + assert list(gdf["period"]) == sorted(gdf["period"]) + assert (gdf[gdf["phase"] == "post"]["period"].tolist()) == ordered + + +# --------------------------------------------------------------------------- +# Validation 4: donor pool filtering +# --------------------------------------------------------------------------- + + +def test_donor_pool_restricts_donors(): + df, years, T0 = _make_panel(n_donors=4) + res = synthetic_control(df, "y", "treated", "unit", "year", donor_pool=["d0", "d1"], seed=0) + assert res.n_donors == 2 + assert set(res.get_weights_df()["unit"]) <= {"d0", "d1"} + + +def test_contaminated_donor_pool_rejected(): + df, years, T0 = _make_panel() + # The treated unit itself must never appear in the donor pool. + with pytest.raises(ValueError, match="treated unit|ever-treated|never-treated"): + synthetic_control(df, "y", "treated", "unit", "year", donor_pool=["d0", "treated"], seed=0) + + +def test_ever_treated_donor_rejected(): + # A second ever-treated unit (not the designated treated) cannot be a donor. + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"] >= years[T0]), "treated"] = 1 + with pytest.raises(ValueError, match="ever-treated|never-treated"): + synthetic_control( + df, + "y", + "treated", + "unit", + "year", + treated_unit="treated", + donor_pool=["d0", "d1"], + seed=0, + ) + + +# --------------------------------------------------------------------------- +# Validation 5: empty windows rejected +# --------------------------------------------------------------------------- + + +def test_empty_predictor_window_rejected(): + df, _, _ = _make_panel() + with pytest.raises(ValueError, match="must not be empty"): + SyntheticControl(seed=0).fit( + df, "y", "treated", "unit", "year", predictors=["y"], predictor_window=[] + ) + + +def test_empty_special_period_list_rejected(): + df, _, _ = _make_panel() + with pytest.raises(ValueError, match="must not be empty"): + SyntheticControl(seed=0).fit( + df, "y", "treated", "unit", "year", special_predictors=[("y", [], "mean")] + ) + + +# --------------------------------------------------------------------------- +# Fail-closed on non-finite data entering the matching problem +# --------------------------------------------------------------------------- + + +def test_non_finite_predictor_rejected(): + # PARTIAL missingness in a predictor window: fail closed (deliberate deviation + # from R Synth's na.rm=TRUE — see REGISTRY). All-NA windows behave identically. + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"] == years[0]), "x"] = np.nan + with pytest.raises(ValueError, match="non-finite"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + predictors=["x"], + predictor_window=[years[0], years[1]], + ) + + +def test_all_na_predictor_window_rejected(): + # FULLY-missing predictor window: same fail-closed contract as partial (no na.rm). + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"].isin([years[0], years[1]])), "x"] = np.nan + with pytest.raises(ValueError, match="non-finite"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + predictors=["x"], + predictor_window=[years[0], years[1]], + ) + + +def test_outer_v_nonconvergence_warning(): + # Outer V-search non-convergence must not be silent (optimizer capped at 1 iter). + df, _, _ = _make_panel() + with pytest.warns(UserWarning, match="Outer V-search"): + synthetic_control( + df, "y", "treated", "unit", "year", seed=0, optimizer_options={"maxiter": 1} + ) + + +def test_inner_v_search_nonconvergence_warning(): + # Intermediate inner solves during the nested V search must not be silent: forcing + # inner_max_iter=1 makes them truncate, and the estimator emits an aggregated warning. + df, _, _ = _make_panel() + with pytest.warns(UserWarning, match="during nested V selection"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0, inner_max_iter=1) + + +def test_single_inner_nonconvergence_excluded_from_v_ranking(monkeypatch): + # A single LOW-RATE non-converged objective evaluation must be EXCLUDED from V + # ranking (penalized out of the argmin), not merely warned about: force exactly one + # objective eval (the uniform-start eval, max(v) < 0.9) to report conv=False and + # assert (a) the any-occurrence warning fires, and (b) the selected V is a genuine + # small-MSPE fit (mspe_v << penalty), i.e. the truncated candidate did not win. + import importlib + + # NB: ``diff_diff.synthetic_control`` the attribute is the convenience *function* + # (it shadows the submodule, same as ``diff_diff.trop``), so reach the module via + # importlib to monkeypatch its module-global _inner_solve_W. + sc = importlib.import_module("diff_diff.synthetic_control") + + df, _, _ = _make_panel() + real_solve = sc._inner_solve_W + state = {"failed": False} + + def patched(X1s, X0s, v, max_iter, min_decrease): + w, conv = real_solve(X1s, X0s, v, max_iter, min_decrease) + if not state["failed"] and float(np.max(v)) < 0.9: # a spread V => an objective eval + state["failed"] = True + return w, False + return w, conv + + monkeypatch.setattr(sc, "_inner_solve_W", patched) + with pytest.warns(UserWarning, match="during nested V selection"): + res = synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + assert state["failed"] # the patch actually fired on an objective evaluation + assert np.isfinite(res.att) + # Exclusion proof: the chosen V's outer-objective MSPE is a real (small) value, not + # the large penalty a truncated candidate would have carried. + assert res.mspe_v is not None and res.mspe_v < 1.0 + + +def test_n_starts_one_runs(): + # n_starts=1 uses only the uniform start (short-circuits the heuristic candidates) + # and still produces a valid nested fit. + df, _, _ = _make_panel() + res = synthetic_control(df, "y", "treated", "unit", "year", seed=0, n_starts=1) + assert np.isfinite(res.att) + assert abs(sum(res.donor_weights.values()) - 1.0) < 1e-6 + + +def test_non_finite_outcome_rejected(): + df, years, T0 = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d1") & (df["year"] == years[2]), "y"] = np.nan + with pytest.raises(ValueError, match="non-finite"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_distinct_special_period_sets_not_duplicate(): + # Same var/op, same endpoints + length, different intermediate period -> distinct + # predictors, must NOT be rejected as duplicates. + df, years, T0 = _make_panel(T=8, T0=6) + res = SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + special_predictors=[ + ("y", [years[0], years[2], years[4]], "mean"), + ("y", [years[0], years[3], years[4]], "mean"), + ], + ) + assert len(res.v_weights) == 2 + assert len(set(res.v_weights)) == 2 # two distinct labels + + +def test_reordered_special_periods_are_duplicates(): + # Same var/op with reordered periods canonicalize to the same spec -> duplicate. + df, years, T0 = _make_panel(T=8, T0=6) + with pytest.raises(ValueError, match="Duplicate predictor label"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + special_predictors=[ + ("y", [years[0], years[1], years[2]], "mean"), + ("y", [years[2], years[1], years[0]], "mean"), + ], + ) + + +def test_duplicate_predictor_window_periods_deduped(): + # A repeated period in predictor_window must not re-weight the mean: the + # deduped window [y0,y0,y1] matches the explicit [y0,y1]. + df, years, T0 = _make_panel() + r_dup = synthetic_control( + df, + "y", + "treated", + "unit", + "year", + predictors=["y"], + predictor_window=[years[0], years[0], years[1]], + seed=0, + ) + r_uniq = synthetic_control( + df, + "y", + "treated", + "unit", + "year", + predictors=["y"], + predictor_window=[years[0], years[1]], + seed=0, + ) + assert abs(r_dup.att - r_uniq.att) < 1e-9 + + +def test_median_op_rejected(): + # median is a non-linear aggregation, not an ADH linear combination. + df, _, _ = _make_panel() + with pytest.raises(ValueError, match="must be one of"): + SyntheticControl(seed=0).fit( + df, "y", "treated", "unit", "year", predictors=["x"], predictors_op="median" + ) + + +# --------------------------------------------------------------------------- +# Validation 6: poor pre-fit warning +# --------------------------------------------------------------------------- + + +def test_poor_fit_warning(): + # Donors are all ~constant near 10; treated is centred near 50 with a trend, + # so no convex combination can reproduce it -> RMSPE >> SD(treated pre). + rng = np.random.default_rng(1) + years = list(range(2000, 2010)) + T0 = 7 + rows = [] + for j in range(4): + for t, yr in enumerate(years): + rows.append({"unit": f"d{j}", "year": yr, "y": 10 + rng.normal(0, 0.1), "treated": 0}) + for t, yr in enumerate(years): + rows.append({"unit": "treated", "year": yr, "y": 50 + 2.0 * t, "treated": int(t >= T0)}) + df = pd.DataFrame(rows) + with pytest.warns(UserWarning, match="Pre-treatment fit is poor"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_poor_fit_warning_flat_treated_pre_path(): + # Flat treated pre-path (SD == 0) that donors near 10 cannot reproduce: RMSPE > 0 + # must still warn (the former `pre_sd > 0` gate suppressed this case). + rng = np.random.default_rng(2) + years = list(range(2000, 2010)) + T0 = 7 + rows = [] + for j in range(4): + for yr in years: + rows.append({"unit": f"d{j}", "year": yr, "y": 10 + rng.normal(0, 0.1), "treated": 0}) + for i, yr in enumerate(years): + rows.append( + {"unit": "treated", "year": yr, "y": (5.0 if i < T0 else 8.0), "treated": int(i >= T0)} + ) + df = pd.DataFrame(rows) + with pytest.warns(UserWarning, match="Pre-treatment fit is poor"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +# --------------------------------------------------------------------------- +# Validation 7: duplicate predictor labels rejected +# --------------------------------------------------------------------------- + + +def test_duplicate_predictor_label_rejected(): + df, years, T0 = _make_panel() + pre = years[:T0] + with pytest.raises(ValueError, match="Duplicate predictor label"): + SyntheticControl(seed=0).fit( + df, + "y", + "treated", + "unit", + "year", + special_predictors=[("y", pre, "mean"), ("y", pre, "mean")], + ) + + +def test_duplicate_regular_predictor_rejected(): + df, _, _ = _make_panel() + with pytest.raises(ValueError, match="Duplicate predictor label"): + SyntheticControl(seed=0).fit(df, "y", "treated", "unit", "year", predictors=["x", "x"]) + + +# --------------------------------------------------------------------------- +# Validation 8: inner-solve non-convergence warning +# --------------------------------------------------------------------------- + + +def test_inner_nonconvergence_warning(): + df, _, _ = _make_panel(n_donors=4) + with pytest.warns(UserWarning, match="did not converge"): + SyntheticControl(seed=0, v_method="nested", inner_max_iter=1).fit( + df, "y", "treated", "unit", "year" + ) + + +# --------------------------------------------------------------------------- +# Validation 10: standardize="none" deviation runs +# --------------------------------------------------------------------------- + + +def test_standardize_none_runs(): + df, _, _ = _make_panel() + res = synthetic_control(df, "y", "treated", "unit", "year", standardize="none", seed=0) + assert res.standardize == "none" + assert np.isfinite(res.att) + + +# --------------------------------------------------------------------------- +# custom_v cross-field rules (fail-closed) +# --------------------------------------------------------------------------- + + +def test_custom_v_required_when_method_custom(): + with pytest.raises(ValueError, match="custom_v is required"): + SyntheticControl(v_method="custom") + + +def test_custom_v_rejected_when_method_nested(): + with pytest.raises(ValueError, match="must be None when v_method='nested'"): + SyntheticControl(v_method="nested", custom_v=[1.0, 1.0]) + + +def test_custom_v_negative_rejected(): + with pytest.raises(ValueError, match="non-negative"): + SyntheticControl(v_method="custom", custom_v=[1.0, -1.0]) + + +def test_custom_v_wrong_length_rejected(): + df, _, _ = _make_panel() + # 3 entries but the default (all-pre-outcomes) predictor count differs. + with pytest.raises(ValueError, match="custom_v has length"): + SyntheticControl(v_method="custom", custom_v=[1.0, 1.0, 1.0]).fit( + df, "y", "treated", "unit", "year" + ) + + +# --------------------------------------------------------------------------- +# Degenerate paths: J==1, T0==0, T0==1 +# --------------------------------------------------------------------------- + + +def test_single_donor_degenerate_warns(): + df, _, _ = _make_panel(n_donors=1) + with pytest.warns(UserWarning, match="single donor"): + res = synthetic_control(df, "y", "treated", "unit", "year", seed=0) + assert res.n_donors == 1 + assert abs(sum(res.donor_weights.values()) - 1.0) < 1e-9 + + +def test_no_pre_period_rejected(): + # All periods treated for the treated unit -> no pre period. + rows = [] + years = [2000, 2001] + for j in range(3): + for yr in years: + rows.append({"unit": f"d{j}", "year": yr, "y": 10.0 + yr, "treated": 0}) + for yr in years: + rows.append({"unit": "treated", "year": yr, "y": 12.0 + yr, "treated": 1}) + df = pd.DataFrame(rows) + with pytest.raises(ValueError, match="No pre-treatment periods|Cannot infer"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_single_pre_period_nested_warns(): + rows = [] + years = [2000, 2001, 2002] + rng = np.random.default_rng(0) + for j in range(3): + for yr in years: + rows.append({"unit": f"d{j}", "year": yr, "y": 10.0 + rng.normal(), "treated": 0}) + for i, yr in enumerate(years): + rows.append({"unit": "treated", "year": yr, "y": 11.0 + i, "treated": int(i >= 1)}) + df = pd.DataFrame(rows) + with pytest.warns(UserWarning, match="single pre period"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +def test_multiple_treated_units_rejected(): + df, _, _ = _make_panel() + df = df.copy() + df.loc[(df["unit"] == "d0") & (df["year"] >= 2006), "treated"] = 1 + with pytest.raises(ValueError, match="exactly one"): + synthetic_control(df, "y", "treated", "unit", "year", seed=0) + + +# --------------------------------------------------------------------------- +# sklearn-like API: get_params round-trip + transactional set_params +# --------------------------------------------------------------------------- + + +def test_get_set_params_roundtrip(): + est = SyntheticControl(n_starts=3, standardize="none", alpha=0.1, seed=7) + params = est.get_params() + assert set(params) == { + "v_method", + "custom_v", + "optimizer_options", + "n_starts", + "inner_max_iter", + "inner_min_decrease", + "standardize", + "alpha", + "seed", + } + est2 = SyntheticControl().set_params(**params) + assert est2.get_params() == params + + +def test_set_params_rolls_back_on_invalid(): + est = SyntheticControl(alpha=0.05) + with pytest.raises(ValueError): + est.set_params(alpha=1.5) + assert est.alpha == 0.05 # unchanged after failed update + + +# --------------------------------------------------------------------------- +# NaN-inference contract + result accessors +# --------------------------------------------------------------------------- + + +def test_nan_inference_contract(): + df, _, _ = _make_panel() + res = synthetic_control(df, "y", "treated", "unit", "year", seed=0) + assert_nan_inference( + {"se": res.se, "t_stat": res.t_stat, "p_value": res.p_value, "conf_int": res.conf_int} + ) + assert np.isfinite(res.att) + + +def test_result_accessors_render(): + df, _, _ = _make_panel() + res = synthetic_control(df, "y", "treated", "unit", "year", seed=0) + assert isinstance(res, SyntheticControlResults) + assert isinstance(res.summary(), str) and "Synthetic Control" in res.summary() + assert "att" in res.to_dict() + assert res.to_dataframe().shape[0] == 1 + gdf = res.get_gap_df() + assert set(gdf.columns) == {"period", "gap", "phase"} + wdf = res.get_weights_df() + assert list(wdf.columns) == ["unit", "weight"] + # Reserved PR-2/3 attributes present and None. + assert res._placebo_gaps is None and res._rmspe_ratio is None and res._fit_snapshot is None + + +def test_inferred_post_matches_explicit(): + df, years, T0 = _make_panel() + r_inf = synthetic_control(df, "y", "treated", "unit", "year", seed=0) + r_exp = synthetic_control(df, "y", "treated", "unit", "year", post_periods=years[T0:], seed=0) + assert r_inf.post_periods == r_exp.post_periods == years[T0:] + assert abs(r_inf.att - r_exp.att) < 1e-12 + + +# --------------------------------------------------------------------------- +# R-parity (Basque / Abadie-Gardeazabal 2003 via R `Synth`) +# --------------------------------------------------------------------------- + + +def _load_golden(): + if not GOLDEN_PATH.exists() or not PANEL_PATH.exists(): + pytest.skip( + "Basque golden fixtures missing — regenerate via " + "`Rscript benchmarks/R/generate_synth_basque_golden.R`." + ) + return json.load(open(GOLDEN_PATH)), pd.read_csv(PANEL_PATH) + + +def _basque_kwargs(golden): + special = [ + ( + s["var"], + list(s["periods"]) if isinstance(s["periods"], list) else [s["periods"]], + s["op"], + ) + for s in golden["config"]["special"] + ] + return dict( + treated_unit=golden["config"]["treated_regionno"], + donor_pool=list(golden["config"]["controls"]), + predictors=PREDICTORS, + predictors_op="mean", + predictor_window=list(range(1964, 1970)), + special_predictors=special, + ) + + +def test_basque_tier1_custom_v_parity(): + """Tier-1 (hard gate): given R's solution.v, donor weights match R deterministically.""" + golden, df = _load_golden() + custom_v = np.asarray(golden["solution_v"], dtype=float) + res = SyntheticControl(v_method="custom", custom_v=custom_v).fit( + df, "gdpcap", "treated", "regionno", "year", **_basque_kwargs(golden) + ) + # Predictor matrix + ordering: X1 matches R's dataprep exactly. + X1_py = res.predictor_balance["treated"].to_numpy(dtype=float) + X1_r = np.asarray(golden["X1"], dtype=float) + np.testing.assert_allclose(X1_py, X1_r, atol=1e-6) + + # Donor weights match R's solution.w (the published Cataluna/Madrid mix). + controls = sorted(int(c) for c in golden["config"]["controls"]) + w_r = {int(k): v for k, v in golden["solution_w"].items()} + w_py = {int(k): v for k, v in res.donor_weights.items()} + wr = np.array([w_r.get(c, 0.0) for c in controls]) + wp = np.array([w_py.get(c, 0.0) for c in controls]) + np.testing.assert_allclose(wp, wr, atol=1e-3) + # Published anchor: region 10 ~ 0.85, region 14 ~ 0.15. + assert w_py.get(10, 0) > 0.80 and w_py.get(14, 0) > 0.10 + + +@pytest.mark.slow +def test_basque_tier2_nested_band(): + """Tier-2 (band): the data-driven nested fit lands near R's solution. + + Loose by design — our outer objective minimizes MSPE over all pre periods, + while R uses the ``time.optimize.ssr`` (1960-1969) window, so the nested V + legitimately differs; multistart Nelder-Mead/Powell is also BLAS/platform + sensitive. We check fit quality and that the dominant donors agree. + """ + golden, df = _load_golden() + res = SyntheticControl(v_method="nested", seed=0).fit( + df, "gdpcap", "treated", "regionno", "year", **_basque_kwargs(golden) + ) + r_sqrt_loss = golden["loss_v"] ** 0.5 + assert res.pre_rmspe <= r_sqrt_loss * 1.5 # comparable pre-fit quality + + years = np.asarray(golden["years"]) + r_att = float(np.asarray(golden["gap"])[years >= 1970].mean()) + assert abs(res.att - r_att) < 0.2 # avg post-gap within band + + # Dominant donors agree with R (Cataluna region 10, Madrid region 14). + top2 = [u for u, _ in sorted(res.donor_weights.items(), key=lambda kv: -kv[1])[:2]] + assert set(top2) == {10, 14} + assert res.donor_weights.get(10, 0) + res.donor_weights.get(14, 0) > 0.7