Elag et al., 2017

Paper/Book

Identification and characterization of information-networks in long-tail data collections

Elag, M.M., Kumar, P., Marini, L., Myers, J.D., Hedstrom, M., and Plale, B.A. (2017)
Environmental Modelling & Software  

Abstract

The geographic locations of data nodes available in Lake Superior from SGGLM.

Scientists' ability to synthesize and reuse long-tail scientific data lags far behind their ability to collect and produce these data. Many Earth Science Cyberinfrastructures enable sharing and publishing their data over the web using metadata standards. While profiling data attributes advances the Linked Data approach, it has become clear that building information-networks among distributed data silos is essential to increase their integration and reusability. In this research, we developed a Long-Tail Information-Network (LTIN) model, which uses a metadata-driven approach to build semantic information networks among datasets published over the web and aggregate them around environmental events. The model identifies and characterizes the spatial and temporal contextual association links and dependencies among datasets. This paper presents the design and application of the LTIN model, and an evaluation of its performance. The model capabilities were demonstrated by inferring the information-network of a stream discharge located at the downstream end of the Illinois River.

Citation

Elag, M.M., Kumar, P., Marini, L., Myers, J.D., Hedstrom, M., and Plale, B.A. (2017): Identification and characterization of information-networks in long-tail data collections. Environmental Modelling & Software. DOI: 10.1016/j.envsoft.2017.03.032

This Paper/Book acknowledges NSF CZO grant support.