Data Resources

The CAFÉ data management objective is to aid the climate and health community of practice by identifying climate and health research data needs, defining common data elements, developing and promoting data science and software tools for processing data, and providing data management and dissemination guidance. 

What we provide:

  • Data Collection

    A collection of climate and health data on Harvard Dataverse allowing researchers to contribute, deposit, share, and reuse data. Commonly used datasets will be curated and deposited into the collection by the CAFÉ RCC team.

  • Processing & Analysis Guides

    A collection of code, software, and tutorials on GitHub allowing researchers to contribute, share, and reuse existing code and software for data processing and analysis to facilitate reproducibility and reusability. This will include commonly used data processing and analysis tasks such as spatial aggregations, data harmonization, and analysis.

  • Data Guidelines

    Data management and dissemination guidelines for the community based on CAFÉ RCC standards of practice.

  • Additional Data Assistance

    For pilot awardees and/or collaborators in low- or middle-income countries, additional services and guidance on utilizing cloud computing infrastructure will be provided by the CAFÉ RCC team.

CAFÉ Climate Health Data Sharing with Community of Practice

Harvard Dataverse is an open-source generalist data repository where we are building a collection of commonly used climate and health data and linkages, including spatial data. In keeping with the FAIR principles (findable, accessible, interoperable, reusable), we strongly encourage the community of practice to help expand the Climate and Health Dataverse collection by contributing data for sharing and reuse.

To learn more about Harvard Dataverse, click here.

How to Upload Data to the CAFÉ Dataverse

Guidelines for Dataset Contributions

We strongly encourage the community of practice to contribute to the expansion of the CAFÉ Dataverse Collection. Emphasizing open access and collaborative research, the CAFÉ Collection invites contributions from a diverse array of stakeholders, including government agencies, NGOs, community-based organizations, industry partners, and academics.

The criteria for which datasets are and are not appropriate for the CAFÉ Dataverse are described below:

General Guidance

  • Contributions should be relevant to climate and health research. 

  • Contributions should not be identical to data stored in other repositories. Submitting processed derivatives or expansions of data accessible through existing sharing resources (e.g., SEDAC, Google Earth Engine) is encouraged.

  • Contributions should be in line with the licensing of source data.

  • Data contributors should post only data that they own, have generated, or have been granted permission to reshare in a manipulated version (e.g., census data).

  • No restricted-access data (e.g., data including personally identifiable information) should be shared through the CAFÉ Dataverse. Contributions will be widely accessible to Harvard Dataverse users.

File Formatting and Size Limitations

  • All file types are supported for upload and download.

  • A maximum of 1,000 files are allowed per upload.

  • The file upload limit is 300 GB per file.

  • Dataverse can ingest files in certain formats specifically as tabular data, which allows the data to be explored and manipulated with external tools. Tabular file ingest is limited to 143.1 MB. For more information, see: Dataverse Tabular Data File Guide
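
The limits above can be checked before starting an upload. The following is a minimal Python sketch, not an official CAFÉ or Dataverse tool. The function names, the list of tabular file extensions, and the reading of MB and GB as decimal units are all assumptions made for illustration; consult the Dataverse Tabular Data File Guide for the formats that are actually ingested.

```python
from pathlib import Path

# Limits quoted in the list above. Treating GB/MB as decimal units is an
# assumption; it is the more conservative reading of the stated limits.
MAX_FILES_PER_UPLOAD = 1_000
MAX_FILE_BYTES = 300 * 10**9            # 300 GB per file
MAX_TABULAR_INGEST_BYTES = 143_100_000  # 143.1 MB tabular ingest limit

# Hypothetical, illustrative list of extensions that may be ingested as tabular data.
TABULAR_SUFFIXES = (".csv", ".tsv", ".xlsx", ".dta", ".sav")


def check_upload(sizes):
    """Return a list of warnings for a mapping of file name -> size in bytes."""
    warnings = []
    if len(sizes) > MAX_FILES_PER_UPLOAD:
        warnings.append(
            f"{len(sizes)} files exceeds the {MAX_FILES_PER_UPLOAD}-file limit per upload"
        )
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            warnings.append(f"{name}: larger than the 300 GB per-file limit")
        elif size > MAX_TABULAR_INGEST_BYTES and name.lower().endswith(TABULAR_SUFFIXES):
            # The file can still be uploaded; it just may not be ingested as tabular data.
            warnings.append(f"{name}: above the 143.1 MB tabular ingest limit")
    return warnings


def sizes_in_folder(folder):
    """Collect file sizes (in bytes) for every file under a local folder."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }
```

For a local folder, `check_upload(sizes_in_folder("my_dataset"))` returns an empty list when nothing exceeds the stated limits; `my_dataset` is a placeholder path.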

Climate & Health Tutorials and Code Sharing

CAFÉ RCC has established a community GitHub organization to share code, examples, and tutorials on working with common data formats and sources. We are assembling a collection of standard operating procedures (SOPs) for data processing, integration, harmonization, and analysis, with accompanying code and tutorials, to facilitate reproducibility and reusability for the climate and health Community of Practice. We encourage community members to contribute, share, and reuse code and software in the CAFÉ GitHub organization. Code and tutorials will cover tasks such as spatial aggregations, data harmonization, and analysis, in languages including R and Python and on platforms such as ArcGIS.

Need help getting started? Common Places to Find Climate & Health Data

Help us make climate and health research more accessible.