Economics 475:  Econometrics

Finding Data
 

Perhaps the single thing that slows students down the most in 475 is finding data to use in their final project.  The purpose of this handout is to suggest a few places to search for easy-to-use data.

1.  US Census
The US Census Bureau has a method for disseminating many large cross-sectional data sets.  These data include the American Community Survey, the American Housing Survey, Behavioral Risk Factor Surveillance System, Consumer Expenditure Survey, the Current Population Survey, Decennial Census of Population and Housing, Decennial Public Use Microdata Samples (the Census), Mortality, National Ambulatory Medical Care Survey, National Survey of Fishing, Hunting, and Wildlife-Associated Recreation, Social Security Administration, Survey of Income and Program Participation, and the Survey of Program Dynamics.  Each of these is a large-scale, important survey in its own right (often containing thousands of observations of hundreds of variables).

2.  ICPSR
The Inter-University Consortium for Political and Social Research (http://www.icpsr.org) is a service provided through Western’s computer network that supplies a huge range of mostly cross-sectional data sets.  Included in this collection are a large number of responses to telephone questionnaires sponsored by news organizations, famous surveys used by economists (like the Panel Study of Income Dynamics), and surveys used by other fields such as political science and sociology (one interesting sociology survey is the General Social Survey).  The easiest way to navigate this web site is to have a topic in mind and then search for data that might be suitable.  If you do nothing else with my data sources, open this up and search for a while.  If more Western students use ICPSR, it becomes easier to justify its large expense to the administration.

3.  Create your own
Perhaps the most enjoyable way of answering a question is to create your own questionnaire and give it to fellow students and/or members of the community.  If this is something you hope to pursue, be sure to have your survey proofread before administering it.  We can arrange for you to survey some introductory economics courses to get data.

4.  The Harvard Public Health Survey focuses on college drinking and associated social and educational issues.  A codebook for the survey can be found here.

5.  The Freshmen 2002 Data set observes all freshmen who began at WWU during the fall quarter of 2002.  These data include pre-WWU information (high school GPA, hours transferred, SAT, gender and demographics) and their WWU GPA during the fall quarter of 2002.

6.  2010 Undergraduate Exit Data:  This data set observes 1,707 respondents to the 2010 undergraduate exit survey at Western Washington University.  This survey is detailed here.

7.  9th/10th Grade Data:  This data set observes one complete year of Washington 9th graders.  These students took the ITBS tests in 9th grade and were then followed into the 10th grade, when they took the WASL.  Also included are demographics and questions about the student's high school activities.  60,296 observations are included.

8.  The Angrist and Krueger Data is a compilation of the 1970 and 1980 census data used in their work examining the impact of compulsory school attendance laws on earnings.  The paper can be found here and a description of the data with Stata .do files is here.

9.  The Mroz Data observes only women.  This data set was originally used to analyze why some women enter the labor force while others do not.  The first variable in the data set (inlf) is equal to zero if the observation is not in the labor force.  Because women who are not in the labor force have no observed wage, using those observations to determine the impact of education on wages is not a great idea.

10.  Wage2 Data:  This data set comes from Jeffrey Wooldridge and includes 935 observations of men's wages, IQ, education, experience and a number of other important variables.

11.  The Tennessee STAR Data:  The Tennessee Student Teacher Achievement Ratio (STAR) project was a large-scale project in which young school-aged children were randomly assigned to classrooms of different sizes.  A description of this can be found here.
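
A note on the Mroz Data (item 9):  the advice about inlf can be sketched in a few lines of Python.  This is only an illustration under stated assumptions.  The rows below are made-up values, not the actual Mroz data, and the column names educ and wage are assumed for the example; only inlf is named in this handout.  The sketch drops observations that are not in the labor force and then fits a one-regressor least squares line of wage on education.

```python
# Hypothetical rows for illustration only; NOT the real Mroz data.
# "inlf" is named in the handout; "educ" and "wage" are assumed column names.
rows = [
    {"inlf": 1, "educ": 12, "wage": 10.0},
    {"inlf": 1, "educ": 14, "wage": 12.0},
    {"inlf": 1, "educ": 16, "wage": 14.0},
    {"inlf": 0, "educ": 12, "wage": None},  # not working, so no wage observed
    {"inlf": 0, "educ": 17, "wage": None},
]


def keep_in_labor_force(data):
    """Return only the observations with inlf == 1."""
    return [r for r in data if r["inlf"] == 1]


def ols_intercept_slope(x, y):
    """Fit y = a + b*x by ordinary least squares with a single regressor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Filter first, then estimate using only women who are in the labor force.
workers = keep_in_labor_force(rows)
educ = [r["educ"] for r in workers]
wage = [r["wage"] for r in workers]
intercept, slope = ols_intercept_slope(educ, wage)
print(f"n = {len(workers)}, intercept = {intercept:.2f}, slope = {slope:.2f}")
```

In Stata, the same idea amounts to restricting the regression to observations with inlf equal to one.  Keep in mind that the filtered sample describes only working women, so the estimate should be interpreted for that group.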