
Lost in Time, Like Tears in Rain: Access to At-Risk Public Data

Sites for Accessing Federal Data

DataLumos
https://archive.icpsr.umich.edu/datalumos/home
DataLumos is hosted by the ICPSR at the University of Michigan and is an archive for valuable government data resources that have been taken down or are considered at risk for being taken down. The data are a mix of administrative data from federal sources, population surveys conducted by federal agencies, and other data produced or collected by the federal government. The data come from agencies such as the CDC, the Bureau of Labor Statistics, the National Center for Education Statistics, and many others. For a full list of government agencies from which data have been deposited in DataLumos, see https://www.datalumos.org/datalumos/browse/facet/studies/agencies?start=0&ARCHIVE=datalumos&sort=DATEUPDATED%20desc&rows=25. For a full list of the data available here, see https://www.datalumos.org/datalumos/search/studies.

Dataverse
https://dataverse.harvard.edu/
Harvard's Dataverse platform mainly hosts replication datasets for publications, data from ongoing research projects, data from various NGOs, and data from other sources. The platform is also being used to provide new homes for federal data that have been taken down or are at risk of being taken down, mainly data related to topics such as the environment, public health, and civil rights. See in particular https://dataverse.harvard.edu/dataverse/cafe-extracted-data for data extracted by the CAFE Climate and Health Research Coordinating Center and https://dataverse.harvard.edu/dataverse/held-extracted-data for data from the Harvard Environment and Law Data (HELD) Collection.
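For readers who would rather check programmatically what has been deposited in one of these collections, the following is a minimal Python sketch that assumes the Dataverse Search API at /api/search with its q, type, per_page, and subtree parameters. The collection alias is taken from the URL above; the helper names search_url and dataset_titles are hypothetical, and the response fields read here (data, items, name, global_id) are assumptions about the JSON layout that should be checked against a live response.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Search API of the Harvard Dataverse installation.
SEARCH_ENDPOINT = "https://dataverse.harvard.edu/api/search"


def search_url(query, collection=None, per_page=10):
    """Build a Search API URL for datasets, optionally limited to one collection."""
    params = {"q": query, "type": "dataset", "per_page": per_page}
    if collection:
        # "subtree" restricts the search to a collection alias,
        # e.g. the last path segment of a collection URL.
        params["subtree"] = collection
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def dataset_titles(response_text):
    """Extract (title, persistent identifier) pairs from a Search API JSON response."""
    payload = json.loads(response_text)
    items = payload.get("data", {}).get("items", [])
    return [(item.get("name"), item.get("global_id")) for item in items]


def search_collection(query, collection, per_page=10):
    """Run the search over the network and return the matching datasets."""
    with urlopen(search_url(query, collection, per_page)) as resp:
        return dataset_titles(resp.read().decode("utf-8"))
```

Calling search_collection("heat", "cafe-extracted-data") would list matching datasets in that collection, provided the API behaves as assumed here.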

Federal Reserve Economic Data
https://fred.stlouisfed.org/
FRED, which is hosted by the St. Louis Federal Reserve, includes time-series data for variables such as GDP, interest rates, exchange rates, consumer prices, labor markets, and money/finance. Most of the data are from the 1950s onwards, though some series extend back prior to WWII. While most of the data are national, a substantial amount of state/local and international data is also available here. FRED draws data from multiple sources - public, private, and academic. See https://fred.stlouisfed.org/sources for a full list of sources that provide data to FRED. Add-ins are available for accessing and using FRED data in Excel, R, and Stata.
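FRED series can also be pulled without an add-in. The sketch below is a minimal Python example that assumes the CSV download link behind each FRED graph (https://fred.stlouisfed.org/graph/fredgraph.csv?id=SERIES_ID) remains available and returns a header row followed by date/value rows. The helper names fred_csv_url, parse_fred_csv, and fetch_series are hypothetical, and treating "." or an empty cell as a missing value is an assumption about the file format.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the CSV download behind a FRED graph page.
FRED_CSV_ENDPOINT = "https://fred.stlouisfed.org/graph/fredgraph.csv"


def fred_csv_url(series_id):
    """Build the CSV download URL for one series ID, e.g. "GDP"."""
    return f"{FRED_CSV_ENDPOINT}?{urlencode({'id': series_id})}"


def parse_fred_csv(text):
    """Parse CSV text into (date, value) tuples, skipping missing observations."""
    reader = csv.reader(io.StringIO(text))
    next(reader, None)  # skip the header row
    observations = []
    for row in reader:
        if len(row) < 2 or row[1] in ("", "."):
            continue  # blank line or missing value (assumed markers)
        observations.append((row[0], float(row[1])))
    return observations


def fetch_series(series_id):
    """Download one series over the network and return its observations."""
    with urlopen(fred_csv_url(series_id)) as resp:
        return parse_fred_csv(resp.read().decode("utf-8"))
```

For example, fetch_series("GDP") would return the GDP series as a list of (date, value) pairs if the endpoint responds as assumed. Heavier use is better served by the official FRED API, which requires an API key.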

IPUMS CPS
https://cps.ipums.org/cps/
The Current Population Survey (CPS) is a joint project between the Bureau of Labor Statistics and the U.S. Census Bureau. The CPS is a monthly survey that collects basic socio-demographic information, labor force characteristics, and economic status. The monthly surveys also include periodic modules on topics like voter registration, food security, participation in the arts, education, disability, and contingent work. The IPUMS version of the data includes standardized/harmonized variables for topics such as industry, occupation, race and ethnicity, and educational attainment, for easier comparisons of data over time. See https://tech.popdata.org/ipumsr/ for the ipumsr package for working with IPUMS microdata.

IPUMS Global Health
https://globalhealth.ipums.org/
IPUMS Global Health provides access to "harmonized international survey data on maternal, child, and reproductive health." The IPUMS-DHS data are taken from DHS data collected in African and South Asian countries. The IPUMS-PMA data cover family planning, water/sanitation, and menstrual hygiene and are taken from the Performance Monitoring for Action data. The IPUMS-MICS data cover the health and well-being of children, adolescents, and mothers and are taken from the UNICEF Multiple Indicator Cluster Surveys. Note that access to the IPUMS-DHS data is managed by the DHS Program, which is not currently processing new requests for data access. Previously approved users can still access data they have been approved to use. See https://youtu.be/EBirvFdF7Lo for a webinar from IPUMS Global Health on the current state of access to DHS data and similarities among the DHS surveys, PMA surveys, and MICS surveys.

IPUMS Health Surveys
https://healthsurveys.ipums.org/
The University of Minnesota's IPUMS project has created harmonized microdata from the National Health Interview Survey (NHIS) series for easier comparisons over time. The integrated data consist of samples from different iterations of the NHIS dating back to the 1960s. IPUMS has a similar project for data from the Medical Expenditure Panel Survey series. The data include standardized/harmonized variables for topics such as industry, occupation, race and ethnicity, and educational attainment. IPUMS also provides harmonized microdata from the Youth Risk Behavior Surveillance System (YRBSS) and the National Youth Tobacco Survey (NYTS). See https://tech.popdata.org/ipumsr/ for the ipumsr package for working with IPUMS microdata.

IPUMS USA
https://usa.ipums.org/usa/
IPUMS USA is an excellent source for Census data in the form of microdata samples from each decennial Census from 1850 to 2000 and from the American Community Survey from 2001 onward. See https://usa.ipums.org/usa/sampdesc.shtml for a list of the samples available via IPUMS. The data include standardized/harmonized variables for topics such as industry, occupation, race and ethnicity, and educational attainment, for easier comparisons of data over time. The microdata come with geographic tools and boundary files to assist with geographic analyses of the individual-level data. See https://tech.popdata.org/ipumsr/ for the ipumsr package for working with IPUMS microdata.

PolicyMap
https://www.policymap.com/
PolicyMap allows you to map data on population demographics, income, poverty, housing, crime, health, the environment, education, and other topics. Users can map data at various levels of geography - states, counties, ZIP codes, or Congressional districts, depending on the data. You can also download data in spreadsheet-friendly formats. For federal data that have been taken offline, see https://www.policymap.com/blog/purged-federal-agency-data-available for which of those data are available in PolicyMap. PolicyMap is also available via Databases at Emory.

Public Environmental Data Partners (PEDP)
https://screening-tools.com/
The PEDP is a voluntary and collaborative effort to capture data that have been taken down from federal agencies and provide alternative means of accessing them. Its efforts have included both captures of the underlying data and, in some cases, re-creations of sites and tools that were available with the data. The PEDP has mainly focused on data from the CDC, the EPA, and FEMA. See https://screening-tools.com/archived-data for a full list of data that have been archived by the PEDP.

Social Determinants of Health and Place Project
https://sdohplace.org/
The SDOH and Place Project at the University of Illinois Urbana-Champaign has been capturing data from agencies such as the CDC, the EPA, the Health Resources and Services Administration (HRSA), and the Substance Abuse and Mental Health Services Administration (SAMHSA). See https://emails.illinois.edu/newsletter/02/615978402.html for access to the data that the project has been capturing. The long-term plan is for the data that have been captured to be discoverable via the project's Data Discovery Platform at https://sdohplace.org/search.

Social Explorer
https://www.socialexplorer.com/home
Social Explorer provides quick and easy access to current and historical census data and demographic information. Its contents include the U.S. Census from 1790 to 2020, annual updates from the American Community Survey, data on religious congregations for the United States for 2009, decennial religious congregation data for 1980-2020, economic data on businesses, crime data, health indicators, and carbon emissions data for 2002. Users can create reports and maps at various levels of geography, including counties, Census tracts, Census block groups, and ZIP codes, depending on data availability. Social Explorer is also available via Databases at Emory.