nsw_long contains the same data as nsw, arranged in long format.

nsw_long

Format

A data frame in "long" format with 38408 observations on the following 15 variables:

id

unique identifier for each cross-sectional unit (worker).

year

year. 1975 is the pre-treatment period and 1978 is the post-treatment period.

treated

an indicator variable for treatment status. Missing if not part of the NSW experimental sample.

age

age in years.

educ

years of schooling.

black

indicator variable for blacks.

married

indicator variable for marital status.

nodegree

indicator variable for not having a high school diploma.

dwincl

indicator variable for inclusion in the Dehejia and Wahba sample. Missing if not part of the experimental sample.

re74

real earnings in 1974 (pre-treatment).

hisp

indicator variable for Hispanics.

early_ra

indicator variable for inclusion in the early random assignment sample in Smith and Todd (2005). Missing if not part of the experimental sample.

sample

1 if NSW (experimental sample), 2 if CPS comparison group, 3 if PSID comparison group.

re

real earnings (outcome of interest).

experimental

1 if in experimental sample, 0 otherwise.
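
Because each worker appears once per year in the long format, a common first step is to reshape to one row per worker and compare earnings changes between groups. The following is a minimal sketch using base R only. It assumes a toy data frame (`toy`) with made-up values that borrows a few of the column names listed above; it is not the real nsw_long data, and the object names `exper`, `wide`, and `did` are hypothetical. Since `treated` is missing outside the experimental sample, the sketch filters on `experimental` first.

```r
# Toy stand-in for nsw_long: hypothetical values, NOT the real data.
toy <- data.frame(
  id           = rep(1:4, each = 2),
  year         = rep(c(1975, 1978), times = 4),
  treated      = rep(c(1, 1, 0, 0), each = 2),
  experimental = 1,
  re           = c(1000, 3000, 2000, 5000, 1500, 2500, 2500, 4500)
)

# Keep the experimental sample only (every toy row qualifies; shown for the pattern).
exper <- toy[toy$experimental == 1, ]

# Long -> wide: one row per worker, one earnings column per year.
wide <- reshape(
  exper[, c("id", "treated", "year", "re")],
  idvar     = "id",
  timevar   = "year",
  v.names   = "re",
  direction = "wide"
)

# Change in real earnings between the pre- and post-treatment years.
wide$delta_re <- wide$re.1978 - wide$re.1975

# Difference in mean earnings changes, treated minus untreated.
did <- mean(wide$delta_re[wide$treated == 1]) -
  mean(wide$delta_re[wide$treated == 0])
```

With the actual data, the same pattern would apply after replacing `toy` with nsw_long; a raw difference in mean changes like this is only a descriptive starting point, not a substitute for a full estimator.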

Source

https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/23407/DYEWLO&version=1.0.

References

Diamond, Alexis, and Sekhon, Jasjeet S. (2013), 'Genetic Matching for Estimating Causal Effects: A General Multivariate Matching Method for Achieving Balance in Observational Studies', Review of Economics and Statistics, vol. 95, pp. 932-945, doi: 10.1162/REST_a_00318

Smith, Jeffrey, and Todd, Petra (2005), 'Does matching overcome LaLonde's critique of nonexperimental estimators?', Journal of Econometrics, vol. 125, pp. 305-353, doi: 10.1016/j.jeconom.2004.04.011