Data Resources for SARS-CoV-2

Cases, Deaths, and Testing

"The Challenges of Using Real-Time Epidemiological Data in a Public Health Crisis"
https://medium.com/pew-research-center-decoded/the-challenges-of-using-real-time-epidemiological-data-in-a-public-health-crisis-c7a6c2e9c950
The Pew Research Center has been researching and analyzing public attitudes about and responses to the pandemic. In this post, Pew analysts discuss working with COVID-19 case data from different sources, how results from those sources sometimes differ, and how differing definitions and collection methods contribute to those discrepancies. The post also discusses how Pew uses aggregated case counts in its analyses of public opinion about COVID-19.

Novel Coronavirus (COVID-19) Cases, provided by JHU CSSE
https://github.com/CSSEGISandData/COVID-19
Johns Hopkins has been making available the datasets collected for its widely cited site for tracking new cases, fatalities, and recoveries.

An Ongoing Repository of Data on Coronavirus Cases and Deaths in the U.S.
https://github.com/nytimes/covid-19-data
The New York Times has likewise been making available county-level data for its maps of the spread of the disease in the United States.

COVID Tracking Project
https://covidtracking.com/
The COVID Tracking Project provides state-level data on tests conducted and test results. In recognition of the virus's disparate effects across racial and ethnic groups, the Project has also begun providing breakdowns of data by race/ethnicity and scoring states on whether they provide such breakdowns in their reporting.

Coronavirus (COVID-19) in Prisons in the United States, April - June 2020
https://doi.org/10.3886/E119901V1
"This is a collection of publicly reported data relevant to the COVID-19 pandemic scraped from state and federal prisons in the United States. Data are collected each night from every state and federal correctional agency’s site that has data available ... The data primarily cover the number of people incarcerated in these facilities who have tested positive, negative, recovered, and have died from COVID-19. Many - but not all - states also provide this information for staff members."

Covid-19 Superspreading Events Database
https://medium.com/@codecodekoen/covid-19-superspreading-events-database-4c0a7aa2342b
This database is an attempt to code data on superspreader events that have exerted a disproportionate effect on growth in COVID-19 cases. The events are coded by location, date, number of people infected, and type of event. Note the author's caveats about limitations of the data.

China Data Lab
https://dataverse.harvard.edu/dataverse/cdl_dataverse
The China Data Lab has focused on research related to the outbreak in China, but the site has added data from the U.S. and data on global policy measures in response to outbreaks. The non-China-related data are not always updated on a regular basis.

COVID-19 Coronavirus Europe Data
https://dataverse.harvard.edu/dataverse/covid-19-eu
This dataverse focuses on a handful of EU members, including the "hot spot" countries of Italy and Spain.

Casos de coronavírus (Covid-19) nos municípios do Amazonas (Brasil)
https://dataverse.harvard.edu/dataverse/covid19-amazonia
Municipal-level data for cities in the Brazilian state of Amazonas. See https://blog.brasil.io/2020/03/23/dados-coronavirus-por-municipio-mais-atualizados/ and https://brasil.io/dataset/covid19/caso/. Note that the sites are all in Portuguese.

Development Data Lab COVID India
http://www.devdatalab.org/covid
This site from the Development Data Lab focuses on India and "includes estimates of hospital and clinic doctor and bed capacity (district level, and soon subdistrict), CFR predictions based on variation in local population age distribution (subdistrict level), urbanization rates and population density (subdistrict level and lower), as well as deaths and infections at the highest resolution possible." See https://github.com/devdatalab/covid for additional information about the data.

SIG Litoral Norte
https://dataverse.harvard.edu/dataverse/siglitoral
More Brazil-themed COVID-19 data from SIG Litoral Norte/Geographic Information System of the North Coast of Rio Grande do Sul. Once again, the site is in Portuguese.

Hospital Bed Capacity and COVID-19
https://www.propublica.org/datastore/dataset/hospital-bed-capacity-and-covid-19
"A dataset of hospital bed capacity data for each of 306 U.S. hospital markets, including data for nine different models of COVID-19 infection scenarios ... [They] modeled various scenarios, in which 20%, 40% and 60% of the adult population would be infected with the novel coronavirus, many of whom would have no or few symptoms, and examined whether hospitals had the capacity to handle them if the cases came in over six months, 12 months and 18 months." Note that the data are from March of 2020 and may not reflect the various ways by which hopsitals have been building up capacity. The data are sourced from Harvard's Global Data Institute.

University of Maryland COVID-19 Impact Analysis Platform
https://data.covid.umd.edu/
This project provides visualizations of state- and county-level data for the U.S. on a variety of metrics in four categories: Mobility and Social Distancing, COVID and Health, Economic Impact, and Vulnerable Population. The data are a mix of publicly available data and estimates calculated by the project. See https://data.covid.umd.edu/about/index.html for a complete list of available indicators and for how to request access to the data for those indicators. See "Replication Data for: Quantifying Human Mobility Behavior Changes During the COVID-19 Outbreak in the United States" for a replication dataset that uses some data from this project.