Data Resources for SARS-CoV-2

Socio-Economic Contexts

Bringing Greater Precision to the COVID-19 Response
The Surgo Foundation, a development-oriented NGO, has assembled datasets on different facets of the pandemic, including case counts, various estimates and measures of social distancing, Twitter data, and a community vulnerability index that estimates which areas may need the most assistance in facing the virus based on characteristics such as healthcare capacity, transportation access, underlying health conditions, and household composition. Data are available by state, by county, and by census tract.

CDC Social Vulnerability Index
The CDC has produced state-, county-, and Census-tract-level estimates of "social vulnerability," i.e. "the resilience of communities when confronted by external stresses on human health, stresses such as natural or human-caused disasters, or disease outbreaks." The index is constructed from American Community Survey data on topics such as poverty, employment status, household composition, population demographics, and access to transportation. See for an example of empirical research on the relationship between this index and COVID-19 incidence.

COVID-19 Pandemic Vulnerability Index (PVI)
The National Institute of Environmental Health Sciences has created a county-level index of pandemic vulnerability that is a weighted aggregation of data on infection rates, population density and demographics, social/policy interventions, environmental conditions, and health conditions. There is a quick-start guide for the index and its dashboard, and the underlying data that make up the index are available via GitHub.

Demographic and Health Surveys (DHS)
The DHS is a project funded by the United States Agency for International Development to promote better gathering of survey data on health issues pertaining to family life and reproductive health in developing countries. Many of the data and indicators collected from these surveys, such as data on households' water supplies and sanitation practices, household composition, and household members' underlying health conditions, may be useful for assessing potential vulnerability to pandemics. Aggregated data from DHS surveys are readily accessible via the STATcompiler interface, which allows for online visualization and for download in tabular form. Microdata files from the surveys are also available upon registration and application for access. The DHS Program also has a code library on GitHub that includes sample code to calculate common DHS indicators in R, SPSS, and Stata. Harmonized extracts of selected DHS modules are available via

Kaiser Family Foundation State Health Facts
This resource from the Kaiser Family Foundation is an excellent source for state-level health data for the U.S. It covers topics such as health insurance coverage, health policy reform, health status and conditions, health care providers, and indicators focusing specifically on women's health and the health of ethnic/racial minorities. The "Health Status" section includes COVID-19-related data on cases and deaths (including breakdowns by ethnicity/race), testing, and adults at higher risk for serious cases.

National Neighborhood Data Archive (NaNDA)
NaNDA provides access to datasets that "include socioeconomic disadvantage and affluence, walkability, crime, land use, recreational centers, libraries, fast food, climate, healthcare, housing, public transit, and more." Depending on the underlying source(s) for the data, they are available for counties, Census tracts, or ZIP Code Tabulation Areas (ZCTAs). The data are collected as part of a project at the University of Michigan on health and neighborhood context.

PolicyMap
PolicyMap allows you to map out data on population demographics, income, poverty, housing, crime, health, the environment, education, and other topics. Users can map out data at various levels of geography (states, counties, ZIP codes, or Congressional districts), depending on the data. You can also download data in spreadsheet-friendly formats. The database has also been adding COVID-19-related data on cases/deaths/testing, health conditions, healthcare access and capacity, food access, and internet access. PolicyMap is also available via Databases at Emory.

Social Explorer
Social Explorer provides quick and easy access to current and historical census data and demographic information. Its contents include the entire U.S. Census from 1790 to 2010, annual updates from the American Community Survey, data on religious congregations for the United States for 2009, decennial religious congregation data for 1980-2010, economic data on businesses, crime data, health indicators, and carbon emissions data for 2002. Users can create reports and maps at various levels of geography, including counties, Census tracts, Census block groups, and ZIP codes, depending on data availability. See the Social Explorer blog for examples of the use of the data for COVID-19-related questions such as the distribution of people working in health care occupations or in services. Social Explorer is also available via Databases at Emory.

World Bank Data Catalog
As part of its Open Data Initiative, the World Bank has opened up access to dozens of its data collections and compiled them into a single data catalog. The holdings here range from larger databases covering multiple topics to more narrowly focused collections associated with particular research projects. The Bank has flagged a collection of "Coronavirus (COVID-19) Related Datasets" that may be of particular relevance for analysis of the pandemic. You can also search for World Bank data by country or by looking for specific indicators.