Data Resources and Support

Support for locating and working with datasets, statistical information, and geographic data.

Health

Area Resource File (ARF) -- The ARF is a database of health resources data, measured at the county level for over 6,000 indicators. The information includes measures of employment in various health professions, availability of health facilities, frequency of utilization, and hospital and Medicare expenditures. The database is freely available to download.

Behavioral Risk Factor Surveillance System (BRFSS) -- The BRFSS was established by the Centers for Disease Control and Prevention (CDC) to provide data on personal behaviors that present health risks (e.g. alcohol and tobacco consumption, exercise patterns, dietary issues). The site provides both time-series data at the national and state levels for various categories of "behavioral risk" and the microdata files from which the national and state estimates are produced.

Centers for Disease Control and Prevention (CDC) -- The website for the CDC includes a section devoted to data and statistics at both the national and state levels. The site provides access to data on topics such as mortality, chronic illnesses and conditions, reproductive health, and health behaviors. There are multiple tools for accessing statistical information. The CDC's Morbidity and Mortality Weekly Report (MMWR) is also useful for locating quantitative information on various topics.

CDC WONDER -- CDC WONDER is a portal to numerous databases concerning health-related topics such as AIDS/STDs and other communicable diseases, risk behaviors (the Behavioral Risk Factor Surveillance System), mortality and natality statistics, and population estimates. There are several online data tools as well as a link to the SEER database.

County Health Rankings and Roadmaps -- The County Health Rankings data are a compilation of county-level data on health outcomes, health behavior, health facilities, and socio-economic and environmental measures. There are data for annual rankings for 2010 onward and trend data for select indicators that cover longer ranges of years.

Data Resources for SARS-CoV-2 -- This guide is a compilation of sources for data on Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) and COVID-19. The guide includes links to data on different facets of the virus and the disease it causes: cases, deaths, testing, projections, policy responses, public opinion and reactions, economic consequences, and some tools for working with data in different statistical applications. The focus is on data that are directly about the virus and the disease it causes or have been re-framed to focus on some aspect of their effects.

Demographic and Health Surveys (DHS) -- The DHS is a project funded by the United States Agency for International Development to promote better gathering of survey data on health issues pertaining to family life and reproductive health in developing countries. Aggregated data from DHS surveys are readily accessible via the STATcompiler interface, which allows for online visualization and for download in tabular form. Microdata files from the surveys are also available upon registration and application for access. The DHS Program also has a code library on GitHub that includes sample code to calculate common DHS indicators in R, SPSS, and Stata.

Global Health Data Exchange (GHDx) -- The Global Health Data Exchange is an undertaking by the Institute for Health Metrics and Evaluation (IHME) and is a "comprehensive catalog of surveys, censuses, vital statistics, and other health-related data." While it is not a data archive and thus does not necessarily provide access to specific data resources, it is a very useful tool for locating health/demographic data at both the micro- and macro- levels. The IHME has various other data-related projects, including its estimates of Global Burden of Disease.

Global Health Impact Project -- This project from SUNY-Binghamton's Human Rights Institute provides indices for the impacts of drugs and pharmaceutical companies on the severity of diseases in terms of disability-adjusted life years, in terms of both losses from specific diseases and losses overall by country.

Health in the United States -- Health in the United States is produced by the National Center for Health Statistics and is a yearbook of statistics on health indicators. Data from tables in the HUS are available in both .pdf and Excel. Note that data coverage at the state level is more limited than that for the national level.

HIV/AIDS Surveillance Data Base -- The HIV/AIDS Surveillance Database contains data on estimates of HIV/AIDS prevalence and is, to quote the website, "a compilation of information from those studies appearing in the medical and scientific literature, presented at international conferences, and appearing in the press."

IPUMS Contextual Determinants of Health (CDOH) -- IPUMS Contextual Determinants of Health "provides access to measures of disparities, policies, and counts, by state and county, for historically marginalized populations in the United States, including Black, Asian, Hispanic/Latina/o/e/x, LGBTQ+ persons and women." Depending on the topic, the data are available by state or by county; time coverage likewise varies by topic. IPUMS CDOH data are also available via the ICPSR: https://www.icpsr.umich.edu/web/DSDR/series/2200.

IPUMS Global Health -- IPUMS Global Health provides access to "harmonized international survey data on maternal, child, and reproductive health." The IPUMS-DHS data are taken from Demographic and Health Surveys data collected in African and South Asian countries. The IPUMS-PMA data cover family planning, water/sanitation, and menstrual hygiene and are taken from the Performance Monitoring for Action data.

IPUMS Health Surveys -- The University of Minnesota's IPUMS project has created harmonized microdata from the National Health Interview Survey (NHIS) series for easier comparisons over time. The integrated data consist of samples from different iterations of the NHIS dating back to the 1960s. IPUMS has a similar project for data from the Medical Expenditure Panel Survey series. IPUMS also provides harmonized microdata from the Youth Risk Behavior Surveillance System (YRBSS) and the National Youth Tobacco Survey (NYTS).

Kaiser Family Foundation State Health Facts -- This site is an excellent source for state-level health data for the U.S. It covers topics such as health insurance coverage, health policy reform, health status and conditions, health care providers, and indicators focusing specifically on women's health and the health of ethnic/racial minorities.

National Center for Health Statistics (NCHS) -- The NCHS, which is part of the CDC, contains much data on various health indicators at both national and state levels. The Data Warehouse may be of particular interest, as it provides access to public-use microdata from surveys such as the National Health Interview Survey. The NCHS also has various databases and interactive tools for accessing and visualizing health data. The National Vital Statistics System may also be of interest for data and historical reports on births, deaths, and marriages.

National Neighborhood Data Archive (NaNDA) -- NaNDA provides access to datasets that "include socioeconomic disadvantage and affluence, walkability, crime, land use, recreational centers, libraries, fast food, climate, healthcare, housing, public transit, and more." Depending on the underlying source(s) for the data, they are available for counties, Census tracts, or ZIP Code Tabulation Areas (ZCTAs). The data are collected as part of a project at the University of Michigan on health and neighborhood context. Note that the data will be moving to a new location at https://www.icpsr.umich.edu/web/ICPSR/series/1920.

National Survey of Family Growth (NSFG) -- The NSFG is an ongoing survey series focusing on matters of family history and reproductive history (e.g. marital history, childbirth, usage of contraceptives) as well as more general matters such as employment history and demographics. The latest wave of the NSFG is also available on CD-ROM in the Data Center. Older waves are also available via the Social Science Electronic Data Library and in harmonized form via the Integrated Fertility Survey Series.

PolicyMap -- PolicyMap allows you to map out data on population demographics, income, poverty, housing, crime, health, the environment, education, and other topics. Users can map out data at various levels of geography: states, counties, ZIP codes, or Congressional districts, depending on the data. You can also download data in spreadsheet-friendly formats. PolicyMap is also available via Databases at Emory.

Roper Center Health Poll Database -- The Health Poll Database from the Roper Center for Public Opinion Research is "the most comprehensive database for health-related U.S. survey questions, covering eighty years of national polling. Searchable questions and results, demographic crosstabs, and trends are available on every topic related to health, from social determinants and influences on health to insurance, costs and healthcare utilization." The database will let you search for topline results from individual questions from surveys and, when available, download the respondent-level microdata for those surveys. There is also a visual tool for an overview of the general topics covered by questions in the database and the individual sub-topics that make up the larger topics.

Social Determinants of Health (SDOH) Database -- The SDOH Database from the Agency for Healthcare Research and Quality provides data for "five key SDOH domains: social context (e.g., age, race/ethnicity, veteran status), economic context (e.g., income, unemployment rate), education, physical infrastructure (e.g, housing, crime, transportation), and healthcare context (e.g., health insurance). The files can be linked to other data by geography (county, ZIP Code, and census tract). The database includes data files and codebooks by year at three levels of geography, as well as a documentation file." The data are available for 2009 onward, with separate Excel files for separate years.
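
Linking the county-level files to other data usually comes down to matching on a county identifier. The following is a minimal sketch in Python, assuming the Excel sheets have first been exported to CSV and that both files carry a five-digit county FIPS code. The column names (COUNTYFIPS, MEASURE_A, MEASURE_B) and all values are hypothetical placeholders for illustration, not actual SDOH variable names or figures; check the codebook for the real ones.

```python
import csv
import io

# Hypothetical extracts. Column names and values are placeholders,
# not real SDOH variables or real statistics.
sdoh_csv = """COUNTYFIPS,MEASURE_A
13089,100
13121,200
"""
other_csv = """COUNTYFIPS,MEASURE_B
13089,1.5
01001,2.5
"""


def normalize_fips(value):
    """Pad a county FIPS code to five digits so '1001' matches '01001'."""
    return str(value).strip().zfill(5)


def link_by_county(left_text, right_text, key="COUNTYFIPS"):
    """Inner-join two CSV texts on a county FIPS column."""
    # Index the right-hand file by its normalized FIPS code.
    right_rows = {
        normalize_fips(row[key]): row
        for row in csv.DictReader(io.StringIO(right_text))
    }
    linked = []
    for row in csv.DictReader(io.StringIO(left_text)):
        fips = normalize_fips(row[key])
        if fips in right_rows:
            # Keep only counties present in both files.
            merged = {**right_rows[fips], **row}
            merged[key] = fips
            linked.append(merged)
    return linked


linked_rows = link_by_county(sdoh_csv, other_csv)
```

Padding the code to five digits matters because spreadsheet software often strips leading zeros from FIPS codes, which would otherwise cause counties in states such as Alabama (01) to drop out of the match.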

Social Explorer -- Social Explorer provides quick and easy access to current and historical census data and demographic information. Its contents include the entire U.S. Census from 1790 to 2020, annual updates from the American Community Survey, data on religious congregations for the United States for 2009, decennial religious congregation data for 1980-2010, economic data on businesses, crime data, health indicators, and carbon emissions data for 2002. Users can create reports and maps at various levels of geography, including counties, Census tracts, Census block groups, and ZIP codes, depending on data availability. Social Explorer is also available via Databases at Emory.

Social Science Electronic Data Library (SSEDL) -- The Social Science Electronic Data Library is an archive of over 300 datasets covering a variety of topic areas, including Adolescent Pregnancy, Aging, AIDS/STD's, the American Family, Disability in the US, and Maternal Drug Abuse. The archive is well-indexed and allows variable-level searches. This resource is also available via Databases at Emory.

Social Vulnerability Index -- The CDC and the Agency for Toxic Substances and Disease Registry have compiled the Social Vulnerability Index, which measures "the resilience of communities (the ability to survive and thrive) when confronted by external stresses on human health, stresses such as natural or human-caused disasters, or disease outbreaks." The data are available in both .csv and GIS formats for counties and Census tracts and are constructed from American Community Survey data for topics such as socio-economic status, household composition, minority populations, housing types, and transportation access.

Sources for Data on Social Determinants of Health (SDOH) -- As part of its collection of resources on social determinants of health, the CDC has compiled a list of relevant data sources covering topics like chronic health conditions, environmental conditions and hazards, health care access and disparities, and socio-economic conditions/vulnerabilities. An archived copy of an older version of this site at https://web.archive.org/web/20221005233047/https://www.cdc.gov/socialdeterminants/data/ has additional relevant sites for data.

United Nations Children's Fund (UNICEF) -- UNICEF is a great source for cross-national indicators on the health and well-being of children and their mothers. Many of UNICEF's indicators are derived from the Multiple Indicator Cluster Surveys, data from which are available upon application.

Wharton Research Data Services (WRDS) -- WRDS is an excellent source for data on both company financials (via COMPUSTAT) and stock prices (via CRSP). The university's WRDS subscription also provides access to data for measures of market volatility, balance sheets of financial institutions, and hospital-level data on health services and finances. Access to the university's WRDS subscription requires registration to request an account.

World Bank Health, Nutrition and Population (HNP) Data and Statistics -- HealthStats is "the World Bank’s comprehensive database of Health, Nutrition and Population (HNP) statistics," covering topics such as reproductive health, population growth, communicable and non-communicable diseases, and health facilities. The data are accessible through various means, such as via an interface that allows for both queries and bulk downloads and via tools to break health data down by household wealth. The database is part of the Bank's Data Catalog of statistical databases and other data collections. There are also multiple tools available for importing WDI data directly into programs such as R and Stata.

World Health Organization (WHO) Global Health Expenditure Database -- The WHO's Global Health Expenditure Database provides data for 2000 onward on health expenditure from both public and private sources and on consumption of health care services. You can download the data in bulk or query the data to create extracts for particular countries, years, and indicators.

World Health Organization (WHO) Health Statistics and Information Systems -- The WHO's Health Statistics and Information Systems compiles various resources related to health data and indicators, including a registry of health-related indicators with definitions and sources and links to various WHO databases for health measures.

WHO Multi-Country Studies Data Archive -- The WHO provides access to a number of its international health surveys, covering topics such as ageing and adult health, HIV, responsiveness of health systems, reproductive health and history, and risk factors.

World Population Prospects -- This database is produced by the United Nations' Population Division and contains cross-national data on basic population demographics and vital statistics (e.g. birth and death rates, maternal mortality, population by age and gender and urban/rural areas, median age of population). Data are available in five-year increments back to 1950 and with projections up to 2050. See https://www.un.org/en/development/desa/population/publications/dataset/index.asp for other data from the Population Division.

Youth Risk Behavior Surveillance System (YRBSS) -- The YRBSS is similar to the Behavioral Risk Factor Surveillance System, but focuses specifically on adolescents. Harmonized microdata from the YRBSS for the years 1991-2013 are also available via the IPUMS YRBSS Data.