
Data Resources for Sociology and Sociologists

Census Data, Demographics, and Population Surveys

Current Population Survey (CPS)
https://www.census.gov/programs-surveys/cps.html
The Current Population Survey is a joint project between the Bureau of Labor Statistics and the U.S. Census Bureau. The CPS is a monthly survey that collects basic socio-demographic information, labor force characteristics, and economic status. To access CPS data files, users can download the data from an FTP site or use various tools for creating tables from the microdata. Alternatively, they can go to the National Bureau of Economic Research's CPS site or to IPUMS-CPS.

Data.Census.gov
https://data.census.gov/
Data.Census.gov is the successor to the now-decommissioned American FactFinder and is an extensive source for census statistics from the U.S. Census Bureau. Users can create data tables from the Decennial Census (2000 and 2010), the American Community Survey (2000-present), and other Census Bureau data collections and download those tables into spreadsheet files. See https://ask.census.gov/prweb/PRServletCustom?pyActivity=pyMobileSnapStart&ArticleID=KCP-5489# for a list of available data collections. There are also multiple tools in R and in Stata for grabbing data from the Census Bureau's various Application Programming Interfaces (APIs).
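For users who prefer to query the Census Bureau's APIs directly rather than through an R or Stata package, the following is a minimal Python sketch of how a request can be assembled and its response reshaped. It assumes the Census Data API's general URL pattern (https://api.census.gov/data/{year}/{dataset}) and its list-of-lists JSON responses with a header row first. The function names, the choice of the 2019 ACS 5-year dataset, and the variable B01003_001E (total population) are illustrative assumptions, and the sample response is made up. Check the Census Bureau's API documentation for the datasets, variables, and API key requirements that apply to your request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.census.gov/data"


def build_query_url(year, dataset, variables, geography):
    """Assemble a request URL (hypothetical helper, not part of any Census library)."""
    params = {"get": ",".join(variables), "for": geography}
    # Keep commas, colons, and asterisks unescaped so the URL stays readable.
    return f"{BASE_URL}/{year}/{dataset}?{urlencode(params, safe=',:*')}"


def rows_to_records(rows):
    """Turn a list-of-lists response (header row first) into a list of dicts."""
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def fetch_records(url):
    """Download and reshape a response. Defined here but not called in this sketch."""
    with urlopen(url) as response:
        return rows_to_records(json.load(response))


# Illustrative request: total population for every state, 2019 ACS 5-year estimates.
url = build_query_url(2019, "acs/acs5", ["NAME", "B01003_001E"], "state:*")

# Made-up sample shaped like an API response, so the sketch runs without a network call.
sample = [["NAME", "B01003_001E", "state"], ["Example State", "12345", "99"]]
records = rows_to_records(sample)
```

Calling fetch_records(url) would send the request; the resulting list of dictionaries can then be written to a spreadsheet file or loaded into a statistical package.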

IPUMS (Integrated Public Use Microdata Series) USA
https://usa.ipums.org/usa/
The IPUMS project at the University of Minnesota is an excellent source for Census data in the form of microdata samples from each decennial Census from 1850-2000 and from the American Community Survey for 2001 and onwards. See https://usa.ipums.org/usa/sampdesc.shtml for a list of the samples available via IPUMS. The data include standardized/harmonized variables for topics such as industry, occupation, race and ethnicity, and educational attainment, for easier comparisons of data over time. The microdata come with geographic tools and boundary files, to assist with geographic analyses of the individual-level data. IPUMS-USA is one of many IPUMS efforts - see https://www.ipums.org/ for the full list.

National Neighborhood Data Archive (NaNDA)
https://www.openicpsr.org/openicpsr/nanda
NaNDA provides access to datasets that "include socioeconomic disadvantage and affluence, walkability, crime, land use, recreational centers, libraries, fast food, climate, healthcare, housing, public transit, and more." Depending on the underlying source(s) for the data, they are available for counties, Census tracts, or ZIP Code Tabulation Areas (ZCTAs). The data are collected as part of a project at the University of Michigan on health and neighborhood context. Note that the data will be moving to a new location at https://www.icpsr.umich.edu/web/ICPSR/series/1920

Social Explorer
http://www.socialexplorer.com
Social Explorer provides quick and easy access to current and historical census data and demographic information. Its contents include the entire U.S. Census from 1790 to 2020, annual updates from the American Community Survey, data on religious congregations for the United States for 2009, decennial religious congregation data for 1980-2010, economic data on businesses, crime data, health indicators, and carbon emissions data for 2002. Users can create reports and maps at various levels of geography, including counties, Census tracts, Census block groups, and ZIP codes, depending on data availability. Social Explorer is also available via Databases at Emory.

Courts, Criminal Justice, and Violence

Bureau of Justice Statistics
https://bjs.ojp.gov/
The BJS provides a wealth of crime and criminal justice data compiled by the U.S. government via a variety of data-collection programs. The BJS also provides various tools to produce and download tables on topics such as crime rates, crime victimization, and corrections populations. Many of the BJS' data collections are available via the National Archive of Criminal Justice Data.

Historical Violence Database
https://cjrc.osu.edu/research/interdisciplinary/hvd
The Historical Violence Database, which is hosted at the Criminal Justice Research Center at Ohio State University, is (to quote the website) "a collaborative research project on the history of violent crime, violent death, and collective violence." Data from the project are available for the United States and other countries.

National Archive of Criminal Justice Data (NACJD)
https://www.icpsr.umich.edu/web/pages/NACJD/index.html
The NACJD provides online access to and analysis of crime and justice data from federal and state agencies. The data cover topics such as attitudes towards crime, crime stats from official agencies, and the functioning of the criminal justice system.

Education

College and Beyond II: Outcomes of a Liberal Arts Education (CBII)
https://www.icpsr.umich.edu/web/pages/about/cbII/index.html
The CBII project is an attempt to assess the merits of liberal arts educations and higher education more broadly by collecting data on students at various public institutions during their time in college and, for a subset of those students, their post-graduation experiences in terms of health, labor markets, civic participation, and inequality. The data from the project are being released via the ICPSR at the University of Michigan. Note that the data are not available for download and are instead available via application only.

Higher Education Research Institute (HERI)
https://heri.ucla.edu/
UCLA's Higher Education Research Institute conducts recurring surveys on college freshmen, college seniors, and faculty. Data from many of their surveys are available for researchers upon application.

IPUMS-Higher Ed
https://highered.ipums.org/highered/index.shtml
The IPUMS Project at the University of Minnesota has taken data from the National Science Foundation's surveys of college graduates, graduate students, and doctorates and harmonized them for easier analysis and comparisons over time.

National Center for Education Statistics (NCES)
https://nces.ed.gov/
The NCES is the primary federal entity for collecting and analyzing data and statistics on education in the United States and in various other countries. The NCES conducts or participates in various data-gathering programs, such as this collection of surveys pertaining to higher education and this collection of data for international comparisons. For quick reference, researchers should check out https://nces.ed.gov/quicktables/ and the Digest of Education Statistics.

National Longitudinal Survey of Freshmen (NLSF)
http://nlsf.princeton.edu/
The NLSF, which is housed at Princeton University's Office of Population Research, follows a cohort of college freshmen over time, with the intent of testing competing theories/explanations for underperformance of minorities in higher education. The data are available upon registration from the OPR.

US Schools - Desegregation Court Cases and School Demographic Data
https://s4.ad.brown.edu/Projects/USSchools/index.html
This project at Brown University "includes data for the period 1970-2010 about desegregation court cases, trends in racial composition and segregation of elementary schools, and additional information about poverty, teacher-student ratios, and performance on standardized state tests. For every U.S. school district, information on racial and ethnic composition of the elementary student population for available years, and summary indices of school segregation and disparities are posted." This project is part of the American Communities Project's collection of projects on urbanization, demographics, education, and inequalities.

Health Data

Data Resources for SARS-CoV-2
https://guides.libraries.emory.edu/SARS_CoV2
This guide is a compilation of sources for data on Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and COVID-19. The guide includes links to data on different facets of the virus and the disease it causes: cases, deaths, testing, projections, policy responses, public opinion and reactions, economic consequences, and some tools for working with data in different statistical applications. The focus is on data that are directly about the virus and the disease it causes or have been re-framed to focus on some aspect of their effects.

IPUMS Health Surveys
https://www.ipums.org/healthsurveys.shtml
The University of Minnesota's IPUMS project has created harmonized microdata from the National Health Interview Survey (NHIS) series for easier comparisons over time. The integrated data consist of samples from different iterations of the NHIS dating back to the 1960s. IPUMS has a similar project for data from the Medical Expenditure Panel Survey series. IPUMS also provides harmonized microdata from the Youth Risk Behavior Surveillance System (YRBSS) and the National Youth Tobacco Survey (NYTS).

National Center for Health Statistics (NCHS)
http://www.cdc.gov/nchs/
The NCHS, which is part of the CDC, contains much data on various health indicators at both national and state levels. The Data Warehouse may be of particular interest, as it provides access to public-use microdata from surveys such as the National Health Interview Survey. The National Vital Statistics System may also be of interest for data and historical reports on births, deaths, and marriages.

National Survey of Family Growth (NSFG)
https://www.cdc.gov/nchs/nsfg/
The NSFG is an on-going survey series focusing on matters of family history and reproductive history (e.g. marital history, childbirth, usage of contraceptives) as well as more general matters such as employment history and demographics. Earlier NSFG waves are available via the Social Science Electronic Data Library (see below), via the National Center for Health Statistics, and in harmonized form via the Integrated Fertility Survey Series.

Social Determinants of Health (SDOH) Database
https://www.ahrq.gov/sdoh/data-analytics/sdoh-data.html
The SDOH Database from the Agency for Healthcare Research and Quality provides data for "five key SDOH domains: social context (e.g., age, race/ethnicity, veteran status), economic context (e.g., income, unemployment rate), education, physical infrastructure (e.g, housing, crime, transportation), and healthcare context (e.g., health insurance). The files can be linked to other data by geography (county, ZIP Code, and census tract). The database includes data files and codebooks by year at three levels of geography, as well as a documentation file." The data are available for 2009 onward, with separate Excel files for separate years.

Social Science Electronic Data Library (SSEDL)
http://www.socio.com/members/memonly.htm
The Social Science Electronic Data Library is an archive of over 300 datasets covering a variety of topic areas, including Adolescent Pregnancy, Aging, AIDS/STD's, the American Family, Disability in the US, and Maternal Drug Abuse. The archive is well-indexed and allows variable-level searches. This resource is also available via Databases at Emory. Many of the older studies are also available on CD-ROMs in the Data Center.

Public Opinion and Survey Data

"Black America and Public Opinion"
https://ropercenter.cornell.edu/black_america_public_opinion
This is a project from the Roper Center for Public Opinion Research focusing on polling data related to Black Americans. The available resources include guidance for finding polls with samples sufficient for analyzing public opinion among Black Americans. There are also overviews of historic polling data on public opinion and Black Americans and research into social attitudes among the Black population.

Gallup Analytics
https://analyticscampus.gallup.com/
Gallup Analytics is a portal for trends in public opinion drawn from Gallup surveys, covering topics such as health, economic well-being, political attitudes, and religious views. The trends can be broken down by demographics or by geographic area and can be exported into spreadsheet-friendly formats. Our subscription also provides access to respondent-level microdata from Gallup's Daily Tracking Polls, Social Series Polls, and the World Poll. Please contact Dr. Robert O'Reilly for questions about accessing Gallup microdata. The Roper Center for Public Opinion Research also has microdata for many Gallup polls. Gallup Analytics is also available via Databases at Emory.

General Social Survey (GSS)
https://www.norc.org/research/projects/gss.html
The GSS measures public opinion in the United States on a wide variety of topics of interest to social scientists. The survey, which began in the early 1970s, provides a biennial perspective on American attitudes toward government, life, race, religion, and other social issues. The link here is to the GSS homepage within the National Opinion Research Center. Sites where researchers can extract and download specific variables of interest are listed here. The SDA Archive at Berkeley also holds GSS data from 1972 onward in an interface that allows for basic on-line data analysis and the creation of subsets of GSS data. GSS data are also available via the Roper Center.

National Archive of Data on Arts & Culture (NADAC)
https://www.icpsr.umich.edu/web/pages/NADAC/index.html
The ICPSR's National Archive of Data on Arts & Culture contains many datasets pertaining to topics such as participation in the arts (e.g. attending exhibits or concerts), "cultural policy" (e.g. support for public funding of the arts), the vitality of the arts in local areas, and media coverage of the arts and culture. Much of the contents of this archive formerly resided at the Cultural Policy and the Arts National Data Archive (CPANDA).

Pew Research Center
https://www.pewresearch.org/
The Pew Research Center is a major center for the study of public opinion and regularly conducts polls about various social and political topics and contemporary issues. They make many of their studies available for download here. Many of the more recent datafiles are in SPSS format. Users are required to register before downloading a dataset, but registration is free. Users should also check out Pew's various research projects and topics, such as Hispanic Trends, the Global Attitudes Project, and Religion and Public Life.

Roper Center for Public Opinion Research
https://ropercenter.cornell.edu/
The Roper Center is one of the country's premier centers for polling data, with holdings dating back to 1935 and data from a large and growing list of providers. While the bulk of the Center's data are for national polls, it also includes many state-level polls. The iPOLL interface may be of particular use because it allows users to search through surveys at the question level. Roper also has a large compilation of Presidential approval ratings and various other collections of contemporary and historic opinion polls. See https://ropercenter.cornell.edu/data-highlights/featured-projects for other Roper projects related to public-opinion data. The Roper Center is also available via Databases at Emory.

Religion

Association of Religion Data Archives (ARDA)
https://www.thearda.com/
The ARDA contains many datasets pertaining to religion, including surveys of the public's religious attitudes and practices, surveys of church leaders, and studies on the provision of social services by individual congregations. ARDA also provides geographic profiles of congregations and demographic profiles of denominations.

Pew Research Center's Religion and Public Life Project
https://www.pewresearch.org/topic/religion//
The Religion and Public Life Project is devoted to the intersections between religious faith and public affairs, including topics such as "shifting religious composition to the influence of religion on politics to the extent of government and social restrictions on religion." In addition to reports and analyses, it provides access to a variety of data-related resources, such as individual datasets and interactive databases. The Project's undertakings range from large-scale surveys such as the U.S. Religious Landscape Survey to summaries of state-level legislation on religious law.