

Cowles Library

 

Data Analytics


General Data Analytics Resources

Government Data Sets

Consumer Expenditure Survey

A Bureau of Labor Statistics (BLS) survey that collects information on the buying habits of U.S. consumers. The program consists of two components — the Interview Survey and the Diary Survey — each with its own sample. The surveys collect data on expenditures, income, and consumer unit characteristics. Dates:  1996 - 2014.  (Website)

General Social Survey

The GSS aims to gather data on contemporary American society in order to monitor and explain trends and constants in attitudes, behaviors, and attributes.  Dates:  1972 - 2014.  (Website)

National Longitudinal Study of Adolescent Health, Waves I-IV, 1994-2008 (Add Health)

A longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States during the 1994-95 school year with four in-home interviews, the most recent in 2008, when the sample was aged 24-32.  Dates:  1994 - 2008.  (Website)

Panel Study of Income Dynamics

A longitudinal panel survey of American families, which measures economic, social, and health factors over the life course of families over multiple generations. Dates: 1968 - 2013.  (Website)

Survey of Consumer Finances

A triennial statistical survey of the balance sheet, pension, income and other demographic characteristics of families in the United States; the survey also gathers information on the use of financial institutions.  Dates:  1992 - 2013.  (Website)

Survey of Income and Program Participation

A survey designed to provide accurate and comprehensive information about the incomes of American individuals and households and their participation in income transfer programs.  Dates:  1984 - 2008.  (Website)

World Values Survey

A global research project that explores people’s values and beliefs, how they change over time, and what social and political impact they have.  Dates:  1981 - 2012.  (Website)

Terra Populus:  Integrated Data on Population and Environment

TerraPop integrates population census data from around the world with global environmental data, allowing users to obtain customized datasets that incorporate data from multiple sources in a single coherent structure.  Dates:  Varies.   (Website)

 

Sports Data Sets

Baseball - Lahman’s Baseball Database

Updated yearly.  Database contains complete batting and pitching statistics. 1871 - Current complete season.  Additional data includes: fielding, standings, team stats, managerial records, post-season data and more.

Basketball - Doug's NBA & MLB Statistics Home Page

Updated yearly.  Contains player, team, and opposing-team statistics, plus current draft position.  1988 - Current complete season.  For data, select the year of interest, then "Data in 'Raw Form' for Fantasy Leagues."  For documentation, select the data set of interest, then select "Format."  For MLB statistics, try Lahman's Baseball Database first.

Bicycling - Union Cycliste Internationale

Updated yearly.  Contains race results for Road, Track, Mountain Bike, BMX Racing, BMX Freestyle, Trials, Cyclo-Cross and Indoor.  2014 - Current.

Instructions:  Select type of cycling from top menu; Select "Results"; Select Race; Select General Classification; Select "Export Results"

 

Golf - PGA Tour Golf Data (kaggle)

Updated regularly.  Contains (2010 - current):  Off the Tee; Approach the green; Around the green; Putting; Scoring; Streaks; Money/finishes; and Points/Rankings. 

Instructions:  Use the "Data Sources" box to download current and historical data.  This is a very large database and will not load completely into Excel.  Free registration with Kaggle is required for download.
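Because the data will not load completely into Excel, one option is to read the file one row at a time with a short script. The following is a minimal sketch, assuming the download is a CSV file with a header row. The function name, file name, and column name are hypothetical placeholders, not names taken from the Kaggle page.

```python
import csv
from collections import Counter


def count_rows_by_value(csv_path, column):
    """Count how many rows share each value of `column`.

    The file is read one row at a time, so it never has to fit in
    memory all at once (unlike opening it in a spreadsheet).
    """
    counts = Counter()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)  # uses the first row as column names
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError(f"Column {column!r} not found in {csv_path}")
        for row in reader:
            counts[row[column]] += 1
    return counts


# Hypothetical usage; replace the file and column names with the ones
# in the file you actually downloaded:
#
#     counts = count_rows_by_value("pga_tour_data.csv", "season")
#     for value, n in sorted(counts.items()):
#         print(value, n)
```

A summary like this can help decide which subset of the data is small enough to export and open in Excel.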

Hockey - MoneyPuck

Updated regularly.  Contains (1940 - current):  Player scoring comparison; individual offense contribution; and NHL league standings.

Instructions:  Select the dataset or sub-dataset desired.  Click "Download" in the top right corner of the page.  Variables are listed in the purple menu bar.  Kaggle also has a dataset available for 2010 - Current.  Free registration required.

Soccer - European Soccer Database (kaggle)

Contains (2008 - 2016):  25 000 matches; 10 000 players; 11 European countries; Player and team attributes; Team line up; Betting odds; Detailed match events for 10 000 matches.

Instructions:  In the box titled "database.sqlite (34.45MB)", select the download icon on the right side of the box.  The data is available only in SQLite format.
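Since the data comes only as a SQLite file, it cannot be opened directly in a spreadsheet. The following is a minimal sketch of one way to explore such a file using Python's built-in sqlite3 module. The helper names are hypothetical, and the table names are discovered from the file itself rather than assumed.

```python
import sqlite3
from contextlib import closing


def list_tables(db_path):
    """Return the names of the user tables in a SQLite file, sorted by name."""
    query = (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    )
    with closing(sqlite3.connect(db_path)) as conn:
        return [name for (name,) in conn.execute(query)]


def preview_rows(db_path, table, limit=5):
    """Return (column_names, rows) for the first `limit` rows of `table`."""
    # Only accept table names that really exist in the file.
    if table not in list_tables(db_path):
        raise ValueError(f"No table named {table!r} in {db_path}")
    quoted = '"' + table.replace('"', '""') + '"'
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(f"SELECT * FROM {quoted} LIMIT ?", (limit,))
        columns = [description[0] for description in cursor.description]
        return columns, cursor.fetchall()


# Hypothetical usage, assuming the download was saved as database.sqlite:
#
#     for table in list_tables("database.sqlite"):
#         columns, rows = preview_rows("database.sqlite", table, limit=3)
#         print(table, columns)
```

Note that sqlite3.connect creates an empty file if the path does not exist, so an empty table list usually means the path is wrong.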

Tennis - ATP World Tour Tennis Data

Contains (1877 - 2017):  Tournaments; Match scores; Rankings; Player overviews.

Instructions:  Scroll down for field information.

 

Open Source Data Repositories

Kaggle

Online community of data scientists and machine learners.  16 000 public datasets.  Contains both quantitative and qualitative data.  Owned by Google.  Requires free registration.

Data Hub

A project by Datopian and Open Knowledge International. 

FiveThirtyEight

Data sets on Culture, Politics, Sports, Science & Health, and Economics