Data sharing guidelines

Data Sharing Feeds Critical Zone Science

CZO investigators and data managers are expected to share their data with other critical zone scientists and the public for scientific synthesis and for outreach. Such sharing includes submitting metadata to CriticalZone.org, which enables data searching across observatories. 

To make data sharing more effective and less burdensome, tools and guides have been developed for CZO investigators, CZO data managers and the broader critical zone science community.


Benefits and Outcomes

Data sharing is a foundation for integrating and synthesizing our understanding of critical zone processes, both within a CZO and across CZOs. Sharing data in a way that best facilitates discovery, integration, analysis and synthesis can, however, be a challenging if not overwhelming task for critical zone scientists. It is also an obligation: all CZO investigators and collaborators who receive material or logistical support from a CZO agree to share data within 1-3 years, according to the CZO Data Sharing Policy.

For NSF-funded CZOs, the Guidelines & Instructions developed to support this sharing are grouped into seven expectations and two options:


CZO Data Sharing Guidelines 

CZOs are expected to share data using the following approaches:
  1. List Datasets at CriticalZone.org
    1. The CriticalZone.org website is the starting point for CZOs to share data and for the public to access it. CZO data managers only need to enter metadata and the URL pointing to the actual data files, which are stored on the CZO's data server or at a data center. Metadata entry is flexible and can handle any data type, file format or level of data quality. Once a dataset is listed, it becomes discoverable on the main CriticalZone.org website as well as through the more powerful CZO Data Search Portal.
    2. Instructions: http://criticalzone.org/national/data/list-datasets/
  2. Share Data as YODA files, as much as possible (not fully operational)
    1. The new YAML Observations Data Archive and Exchange (YODA) file format is being developed to extend the original CZO Display File specification to accommodate the full diversity of critical zone science data -- such as hydrological time series, soil profile geochemistry, biodiversity transects, etc. -- that can be organized with the Observations Data Model v2 (ODM2). As an implementation of ODM2, YODA will serve as a text-file encoding both for archiving observational data at recommended data centers and for integrating diverse data from multiple sources using ODM2-based cyberinfrastructure at these data centers and the CZO Central system.
    2. Instructions: http://criticalzone.org/national/data/yoda-files/
  3. Use Controlled Vocabularies, starting with ODM2 vocabularies
    1. Describing data in clear and consistent terms is crucial for better data sharing, discovery and integration. Several systems are available to help CZO investigators select appropriate terms from Controlled Vocabularies (CVs) shared by the broader scientific community.
    2. Instructions: http://criticalzone.org/national/data/vocabularies/
  4. Obtain IGSNs for Sites and Specimens
    1. CZOs benefit from consistent use of unambiguous identifiers for integrating data from repeated or diverse measurements at the same location or on the same specimen. The CZOs have adopted the International Geo Sample Number (IGSN) system of globally unique and persistent identifiers for tracking, sharing and citing information and data associated with sites, specimens and other sampling features.
    2. Instructions: http://criticalzone.org/national/data/igsn/
  5. List and share geospatial data on CriticalZone.org
    1. The geospatial data used by critical zone observatories has tremendous value, especially to those who are already interested in your other datasets. By sharing your geospatial data files, you give fellow researchers the spatial context they need to understand your observatory.
    2. Instructions: http://criticalzone.org/national/data/geospatial/
  6. Share LiDAR via OpenTopography
    1. The Critical Zone Observatories program partners with OpenTopography for LiDAR data. OpenTopography is an NSF-funded project whose specific purpose is to host a centralized LiDAR repository. It allows users to access and process LiDAR point cloud data on the fly for an area of interest. The goal of the system is to provide a web-based toolset that can democratize access to massive and potentially computationally challenging LiDAR topography datasets.
    2. Instructions: http://criticalzone.org/national/data/lidar/
  7. Publish Datasets to Archival Data Centers with DOI
    1. CZO investigators benefit from publishing datasets to Data Centers with a mandate to maintain persistent data archives that can be individually cited and retrieved with a permanent digital object identifier (DOI). These datasets are legitimate, internationally recognized, citable contributions to the scientific record. A CZO Dataset Listing that uses a DOI to point to data archived with a Data Center increases the discoverability of, and long-term access to, that data.
    2. Instructions: http://criticalzone.org/national/data/data-doi/
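
As a rough illustration of how several of these expectations fit together, the sketch below models a dataset listing as a small record that holds metadata plus a pointer to the data, either a URL or a DOI. The class, its field names, the example vocabulary terms and the sample values are all hypothetical; they are not the CriticalZone.org listing schema or the ODM2 vocabularies. The only outside convention assumed is that a DOI can be resolved through https://doi.org/.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for a controlled vocabulary; the real ODM2
# vocabularies are far larger and are maintained by the community.
EXAMPLE_VARIABLE_TERMS = {"Discharge", "Temperature", "Precipitation"}


@dataclass
class DatasetListing:
    """Illustrative metadata record; field names are assumptions."""
    title: str
    observatory: str
    variables: List[str]
    data_url: Optional[str] = None  # where the data files actually live
    doi: Optional[str] = None       # set once archived at a data center

    def access_url(self) -> str:
        """Prefer the persistent DOI link over a plain server URL."""
        if self.doi:
            return "https://doi.org/" + self.doi
        if self.data_url:
            return self.data_url
        raise ValueError("a listing needs a data URL or a DOI")

    def unknown_terms(self) -> List[str]:
        """Return variable names that are not in the example vocabulary."""
        return [v for v in self.variables if v not in EXAMPLE_VARIABLE_TERMS]


# Made-up listing: one variable uses a free-text name instead of a CV term.
listing = DatasetListing(
    title="Example stream discharge record",
    observatory="Example CZO",
    variables=["Discharge", "Stream temp"],
    data_url="https://data.example.org/discharge.csv",
)
print(listing.access_url())
print(listing.unknown_terms())
```

Flagging `"Stream temp"` here shows why controlled terms matter: a search for `"Temperature"` across observatories would otherwise miss this dataset.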

CZOs may implement these optional approaches:
  1. Use ODM2 Database for Local Data Management
    1. Observations Data Model Version 2 (ODM2) is a new information model aimed at facilitating greater interoperability across scientific disciplines and domain cyberinfrastructures. ODM2 was specifically designed to better integrate diverse types of critical zone data, from hydrological time series to soil geochemistry. At present, ODM2 is serving as a foundation for new cyberinfrastructure at partner data centers (i.e. IEDA & CUAHSI). However, ODM2 and associated software tools can also be used by CZO data managers as an option for local management of data collection, quality assurance and publication workflows for data derived from both their CZO’s sensors and physical specimens.
    2. Instructions: http://criticalzone.org/national/data/odm2/
  2. Stream Live Data via WaterOneFlow Web Services
    1. Near-real-time streaming sensor data can feed the CZO Data Visualization Portal for interactive public assessment of current conditions. CZO investigators can benefit from such streaming data services to better plan storm sampling, to better maintain their sensor networks, and to assimilate observations into near-real-time model predictions. Several implementation options are available for CZOs that choose to provide access to near-real-time streaming data.
    2. Instructions: http://criticalzone.org/national/data/streaming-data/
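
To make the benefits of streaming data more concrete, here is a minimal sketch of two checks an investigator might run on recent sensor readings: one that flags a sensor that has stopped reporting (network maintenance) and one that flags a value above a threshold (storm sampling). It does not call any web service; the readings, the one-hour gap and the threshold are invented for illustration, and a real workflow would first fetch the readings from the CZO's streaming service.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Reading = Tuple[datetime, float]  # (timestamp, value)


def is_stale(readings: List[Reading], now: datetime,
             max_gap: timedelta) -> bool:
    """True if the newest reading is older than max_gap (or none exist)."""
    if not readings:
        return True
    newest = max(t for t, _ in readings)
    return now - newest > max_gap


def storm_alert(readings: List[Reading], threshold: float) -> bool:
    """True if the most recent value meets or exceeds the threshold."""
    if not readings:
        return False
    _, latest_value = max(readings, key=lambda r: r[0])
    return latest_value >= threshold


# Made-up stream stage readings at 15-minute intervals.
now = datetime(2020, 1, 1, 12, 0)
stage = [
    (now - timedelta(minutes=30), 0.42),
    (now - timedelta(minutes=15), 0.58),
    (now, 0.91),
]
print(is_stale(stage, now, max_gap=timedelta(hours=1)))  # False
print(storm_alert(stage, threshold=0.80))                # True
```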