Presented at the Fall AGU Meeting, New Orleans, LA, 11-15 December 2017
As cross-disciplinary geoscience research increasingly relies on machines to discover and access data, one of the critical
questions facing data repositories is how data and supporting materials should be packaged for consumption.
Traditionally, data repositories have relied on a human's involvement throughout discovery and access workflows. This
human could assess fitness for purpose by reading loosely coupled, unstructured information from web pages and
documentation. In attempts to shorten the time to science and access data resources across many disciplines,
expectations for machines to mediate the process of discovery and access are challenging data repository infrastructure.
The challenge is to deliver data and information in a form that lets machines make better decisions by
understanding the data and metadata of many data types. Additionally, once machines have
recommended a data resource as relevant to an investigator's needs, the data resource should be easy to integrate into
that investigator's toolkits for analysis and visualization.
The Biological and Chemical Oceanography Data Management Office (BCO-DMO) supports NSF-funded OCE and PLR
investigators with their projects' data management needs. These needs involve a variety of data types, some of
which require multiple files in differing formats. Presently, BCO-DMO describes these data types and the
important relationships among each type's data files through human-readable documentation on web pages. For
machines directly accessing data files from BCO-DMO, this documentation could be overlooked and lead to
misinterpreting the data. Instead, BCO-DMO is exploring the idea of data containerization, or packaging data and related
information for easier transport, interpretation, and use. Among the data containerization approaches surveyed, the
Frictionlessdata Data Package (http://frictionlessdata.io/) offers a number of valuable advantages over similar
solutions. This presentation will focus on these advantages and how the Frictionlessdata Data Package addresses a
number of real-world use cases in data discovery, access, analysis, and visualization.
National Science Foundation Award #1435578, Award #163971
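As an illustration of the data containerization idea described above, the following is a minimal sketch of how a multi-file data type might be described in a Data Package descriptor (conventionally a `datapackage.json` file stored alongside the data). It is not BCO-DMO's implementation: the dataset name, file names, and field names are hypothetical. The descriptor keys (`name`, `resources`, `path`, `schema`, `fields`, `foreignKeys`) follow the Data Package and Table Schema specifications. The sketch uses only the Python standard library.

```python
import json


def build_descriptor(name, resources):
    """Assemble a Data Package style descriptor as a plain dictionary."""
    return {"name": name, "resources": resources}


# Hypothetical two-file data type: a station log and the profile
# measurements that refer back to it. All names are assumptions.
station_log = {
    "name": "station-log",
    "path": "station_log.csv",
    "schema": {
        "fields": [
            {"name": "station_id", "type": "string"},
            {"name": "latitude", "type": "number"},
            {"name": "longitude", "type": "number"},
        ],
        "primaryKey": "station_id",
    },
}

profile = {
    "name": "profile",
    "path": "profile.csv",
    "schema": {
        "fields": [
            {"name": "station_id", "type": "string"},
            {"name": "depth", "type": "number"},
            {"name": "temperature", "type": "number"},
        ],
        # The relationship between the two files is stated in
        # machine-readable form rather than only on a web page.
        "foreignKeys": [
            {
                "fields": "station_id",
                "reference": {"resource": "station-log", "fields": "station_id"},
            }
        ],
    },
}

descriptor = build_descriptor("example-cast", [station_log, profile])

# Serialize to the text that would be saved as datapackage.json.
descriptor_json = json.dumps(descriptor, indent=2)
print(descriptor_json)
```

Because the relationship between the files travels with the data, a machine reading the package does not depend on separate web documentation to learn that `profile.csv` must be interpreted together with `station_log.csv`.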