27 research outputs found

    Data Provenance for Distributed Data Sets

    No abstract available

    Perl Modules for Constructing Iterators

    The Iterator Perl module provides a general-purpose framework for constructing iterator objects within Perl, and a standard API for interacting with those objects. Iterators are an object-oriented design pattern in which a description of a series of values is passed to a constructor, and subsequent queries request values in that series. These Perl modules build on the standard Iterator framework and provide iterators for some other types of values. Iterator::DateTime constructs iterators from DateTime objects or Date::Parse descriptions and iCal/RFC 2445 style recurrence descriptions. It supports a variety of input parameters, including a start to the sequence, an end to the sequence, an iCal/RFC 2445 recurrence describing the frequency of the values in the series, and a format description that can refine how the DateTime is presented. Iterator::String constructs iterators from string representations. This module is useful in contexts where the API consists of supplying a string and getting back an iterator, and where the specific iteration desired is opaque to the caller. It is of particular value to the Iterator::Hash module, which provides nested iterations. Iterator::Hash constructs iterators from Perl hashes that can include multiple iterators. The constructed iterators return all the permutations of the iterations of the hash by nested iteration of the embedded iterators. A hash is simply a set of keys mapped to values, and it is a very common data structure throughout Perl programming. The Iterator::Hash module allows a hash to include strings defining iterators (parsed and dispatched with Iterator::String) that are used to construct an overall series of hash values.
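    The nested iteration described for Iterator::Hash can be illustrated with a minimal Python sketch. This is an analogue of the idea only, not the Perl modules' API: the function name `iterate_hash`, the sample keys, and the rule that lists, tuples, and ranges count as embedded iterators are all assumptions made for illustration.

```python
from itertools import product


def iterate_hash(template):
    """Yield one plain dict per combination of the series embedded in template.

    Assumption: lists, tuples, and ranges stand in for embedded iterators;
    any other value is treated as a constant (a one-item series).
    """
    keys = list(template)
    series = [
        list(value) if isinstance(value, (list, tuple, range)) else [value]
        for value in template.values()
    ]
    # product() performs the nested iteration; the last key varies fastest.
    for combination in product(*series):
        yield dict(zip(keys, combination))


# Hypothetical template: one constant and two embedded series.
results = list(
    iterate_hash({"name": "run", "year": range(2010, 2012), "region": ["north", "south"]})
)
for row in results:
    print(row)
```

    With two years and two regions, the sketch yields four dictionaries, one for each combination, while the constant `name` is repeated in every one.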

    Formal Provenance Representation of the Data and Information Supporting the National Climate Assessment

    The Global Change Information System (GCIS) provides a framework for the formal representation of structured metadata about data and information about global change. The pilot deployment of the system supports the National Climate Assessment (NCA), a major report of the U.S. Global Change Research Program (USGCRP). A consumer of that report can use the system to browse and explore the supporting information. Additionally, by capturing that information in a structured data model and presenting it in standard formats through well-defined open interfaces, including query interfaces suitable for data mining and linking with other databases, the system makes the information valuable for other analytic uses as well.

    Linked Open Data in the Global Change Information System (GCIS)

    The U.S. Global Change Research Program (http://globalchange.gov) coordinates and integrates federal research on changes in the global environment and their implications for society. The USGCRP is developing a Global Change Information System (GCIS) that will centralize access to data and information related to global change across the U.S. federal government. The first implementation will focus on the 2013 National Climate Assessment (NCA) (http://assessment.globalchange.gov). The NCA integrates, evaluates, and interprets the findings of the USGCRP; analyzes the effects of global change on the natural environment, agriculture, energy production and use, land and water resources, transportation, human health and welfare, human social systems, and biological diversity; and analyzes current trends in global change, both human-induced and natural, and projects major trends for the subsequent 25 to 100 years. The NCA has received over 500 distinct technical inputs to the process, many of which are reports distilling and synthesizing even more information, coming from thousands of individuals across federal, state, and local governments, academic institutions, and non-governmental organizations. The GCIS will present a web-based version of the NCA, including annotations linking the findings and content of the NCA with the scientific research, datasets, models, observations, etc. that led to its conclusions. It will use semantic tagging and a linked data approach, assigning globally unique, persistent, resolvable identifiers to all of the related entities and capturing and presenting the relationships between them, both internally and referencing out to other linked data sources and back to agency data centers. The developing W3C PROV Data Model and ontology will be used to capture the provenance trail and present it in both human-readable web pages and machine-readable forms such as RDF, queryable with SPARQL. This will improve visibility into the assessment process, increase understanding and reproducibility, and ultimately increase the credibility of and trust in the resulting report. Building on the foundation of the NCA, longer-term plans for the GCIS include extending these capabilities throughout the U.S. Global Change Research Program, centralizing access to global change data and information across the thirteen agencies that comprise the program.

    Provenance Challenges for Earth Science Dataset Publication

    No abstract available

    ESIP Federation Preservation and Stewardship: Use Case Workshop

    No abstract available

    Preservation Strategies: Intro to the OAIS Reference Model

    No abstract available

    U.S. Global Change Research Program National Climate Assessment Global Change Information System

    The program: a) Coordinates Federal research to better understand and prepare the nation for global change. b) Prioritizes and supports cutting-edge scientific work in global change. c) Assesses the state of scientific knowledge and the Nation's readiness to respond to global change. d) Communicates research findings to inform, educate, and engage the global community.

    Data Identifiers and Citations Enable Reproducible Science

    No abstract available

    Distinguishing Provenance Equivalence of Earth Science Data

    Reproducibility of scientific research relies on accurate and precise citation of data and of the provenance of that data. Earth science data are often the result of applying complex data transformation and analysis workflows to vast quantities of data. Provenance information about data processing is used for a variety of purposes, including understanding the process, auditing, and reproducibility. Certain provenance information is essential for producing scientifically equivalent data. Capturing and representing that provenance information, and assigning identifiers suitable for precisely distinguishing data granules and datasets, is needed for accurate comparisons. This paper discusses scientific equivalence and essential provenance for scientific reproducibility. We use the example of an operational earth science data processing system to illustrate the application of cascading digital signatures, or hash chains, to precisely identify sets of granules and to serve as provenance equivalence identifiers that distinguish data made in an equivalent manner.
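    The hash-chain idea in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's scheme: the function names, the choice of SHA-256, the seed value, the decision to sort granule identifiers, and the selection of algorithm name and version as the "essential" provenance fields are all hypothetical.

```python
import hashlib


def chain_digest(items, seed=b"provenance-chain"):
    """Fold a sequence of strings into one digest, each step hashing the previous digest."""
    digest = hashlib.sha256(seed).digest()
    for item in items:
        digest = hashlib.sha256(digest + item.encode("utf-8")).digest()
    return digest.hex()


def granule_set_id(granule_ids):
    """Identify a set of granules. Sorting (an assumption) makes the id order-independent."""
    return chain_digest(sorted(granule_ids))


def equivalence_id(algorithm, version, input_ids):
    """Chain the assumed essential provenance: algorithm, version, and input set."""
    return chain_digest([algorithm, version, granule_set_id(input_ids)])


# Hypothetical runs: a and b share essential provenance; c uses a different version.
run_a = equivalence_id("regrid", "1.2", ["g001", "g002"])
run_b = equivalence_id("regrid", "1.2", ["g002", "g001"])
run_c = equivalence_id("regrid", "1.3", ["g001", "g002"])
print(run_a == run_b, run_a == run_c)
```

    In this sketch, two runs receive the same identifier only when every chained field matches, so a change in algorithm version or in the input granule set yields a different identifier.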