
    Data Provenance for Distributed Data Sets

    No abstract available.

    Perl Modules for Constructing Iterators

    The Iterator Perl module provides a general-purpose framework for constructing iterator objects within Perl, and a standard API for interacting with those objects. Iterators are an object-oriented design pattern in which a description of a series of values is given to a constructor; subsequent queries request values from that series. These Perl modules build on the standard Iterator framework and provide iterators for some other types of values. Iterator::DateTime constructs iterators from DateTime objects or Date::Parse descriptions and iCal/RFC 2445 style recurrence descriptions. It supports a variety of input parameters, including a start to the sequence, an end to the sequence, an iCal/RFC 2445 recurrence describing the frequency of the values in the series, and a format description that can refine how the DateTime is presented. Iterator::String constructs iterators from string representations. This module is useful in contexts where the API consists of supplying a string and getting back an iterator, and where the specific iteration desired is opaque to the caller. It is of particular value to the Iterator::Hash module, which provides nested iterations. Iterator::Hash constructs iterators from Perl hashes that can include multiple iterators. The constructed iterators return all the permutations of the iterations of the hash by nested iteration of embedded iterators. A hash is simply a set of keys mapped to values, and is a very common data structure throughout Perl programming. The Iterator::Hash module allows a hash to include strings defining iterators (parsed and dispatched with Iterator::String) that are used to construct an overall series of hash values.
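
    The modules themselves are Perl; as a rough illustrative analogue of the nested-iteration idea behind Iterator::Hash (not the modules' actual API), the following Python sketch expands a hash whose values may be series into the full set of hash permutations:

        from itertools import product

        def hash_permutations(spec):
            """Yield one dict per combination of the series embedded in spec.

            Scalar values are treated as one-element series, mirroring a hash
            that mixes plain values with embedded iterators.
            """
            keys = list(spec)
            series = [v if isinstance(v, (list, tuple)) else [v] for v in spec.values()]
            # Nested iteration over every embedded series yields all permutations.
            for combo in product(*series):
                yield dict(zip(keys, combo))

        # Example: two embedded series produce 2 x 3 = 6 hash values.
        for h in hash_permutations({"host": ["a", "b"], "port": [80, 443, 8080], "proto": "tcp"}):
            print(h)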

    Formal Provenance Representation of the Data and Information Supporting the National Climate Assessment

    The Global Change Information System (GCIS) provides a framework for the formal representation of structured metadata about data and information related to global change. The pilot deployment of the system supports the National Climate Assessment (NCA), a major report of the U.S. Global Change Research Program (USGCRP). A consumer of that report can use the system to browse and explore its supporting information. Additionally, by capturing that information in a structured data model and presenting it in standard formats through well-defined open interfaces, including query interfaces suitable for data mining and linking with other databases, the information becomes valuable for other analytic uses as well.

    Linked Open Data in the Global Change Information System (GCIS)

    The U.S. Global Change Research Program (http://globalchange.gov) coordinates and integrates federal research on changes in the global environment and their implications for society. The USGCRP is developing a Global Change Information System (GCIS) that will centralize access to data and information related to global change across the U.S. federal government. The first implementation will focus on the 2013 National Climate Assessment (NCA) (http://assessment.globalchange.gov). The NCA integrates, evaluates, and interprets the findings of the USGCRP; analyzes the effects of global change on the natural environment, agriculture, energy production and use, land and water resources, transportation, human health and welfare, human social systems, and biological diversity; and analyzes current trends in global change, both human-induced and natural, projecting major trends for the subsequent 25 to 100 years. The NCA has received over 500 distinct technical inputs, many of them reports that distill and synthesize still more information, contributed by thousands of individuals across federal, state, and local governments, academic institutions, and non-governmental organizations. The GCIS will present a web-based version of the NCA with annotations linking the findings and content of the NCA to the scientific research, datasets, models, observations, and other resources that led to its conclusions. It will use semantic tagging and a linked data approach, assigning globally unique, persistent, resolvable identifiers to all of the related entities and capturing and presenting the relationships between them, both internally and in references out to other linked data sources and back to agency data centers. The developing W3C PROV Data Model and ontology will be used to capture the provenance trail and present it both in human-readable web pages and in machine-readable formats such as RDF, with SPARQL query access. This will improve visibility into the assessment process, increase understanding and reproducibility, and ultimately increase the credibility of and trust in the resulting report. Building on the foundation of the NCA, longer-term plans for the GCIS include extending these capabilities throughout the U.S. Global Change Research Program, centralizing access to global change data and information across the thirteen agencies that comprise the program.
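
    As a minimal sketch of the linked data approach described above, the following Python fragment uses rdflib to assert a W3C PROV derivation between a report finding and a supporting dataset; the identifier base and resource names are hypothetical, not the production GCIS scheme:

        from rdflib import Graph, Namespace

        PROV = Namespace("http://www.w3.org/ns/prov#")
        EX = Namespace("http://example.org/gcis/")  # hypothetical identifier base

        g = Graph()
        g.bind("prov", PROV)

        finding = EX["nca2013/finding/sea-level-rise"]  # hypothetical resources
        dataset = EX["dataset/tide-gauge-records"]

        # W3C PROV: the finding was derived from the supporting dataset.
        g.add((finding, PROV.wasDerivedFrom, dataset))

        # Serialize the provenance trail in a machine-readable RDF format.
        print(g.serialize(format="turtle"))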

    Distinguishing Provenance Equivalence of Earth Science Data

    Reproducibility of scientific research relies on accurate and precise citation of data and of the provenance of that data. Earth science data are often the result of applying complex data transformation and analysis workflows to vast quantities of data. Provenance information about data processing is used for a variety of purposes, including understanding the process and auditing, as well as reproducibility. Certain provenance information is essential for producing scientifically equivalent data. Capturing and representing that provenance information, and assigning identifiers suitable for precisely distinguishing data granules and datasets, is needed for accurate comparisons. This paper discusses scientific equivalence and the provenance essential for scientific reproducibility. We use the example of an operational Earth science data processing system to illustrate the application of cascading digital signatures, or hash chains, to precisely identify sets of granules and to serve as provenance equivalence identifiers distinguishing data made in an equivalent manner.
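
    The paper defines its own identifier scheme; as a minimal sketch of the general hash-chain technique it names (granule contents and ordering here are illustrative assumptions), each granule digest can be cascaded into a single equivalence identifier for the set:

        import hashlib

        def granule_digest(data: bytes) -> str:
            """Digest of a single granule's bytes."""
            return hashlib.sha256(data).hexdigest()

        def chain_identifier(digests) -> str:
            """Cascade an ordered sequence of granule digests into one identifier.

            Each step hashes the previous link together with the next digest, so
            any change to a granule or to the set's order changes the result.
            """
            link = ""
            for d in digests:
                link = hashlib.sha256((link + d).encode("ascii")).hexdigest()
            return link

        # Illustrative granules; real inputs would be granule files.
        granules = [b"granule-1 bytes", b"granule-2 bytes"]
        print(chain_identifier(granule_digest(g) for g in granules))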

    Impact of biogenic very short-lived bromine on the Antarctic ozone hole during the 21st century

    Active bromine released from the photochemical decomposition of biogenic very short-lived bromocarbons (VSLBr) enhances stratospheric ozone depletion. Based on a dual set of 1960-2100 coupled chemistry-climate simulations (i.e. with and without VSLBr), we show that the maximum Antarctic ozone hole depletion increases by up to 14% when natural VSLBr are considered, in better agreement with ozone observations. The impact of the additional 5 pptv of VSLBr on Antarctic ozone is most evident at the periphery of the ozone hole, producing an expansion of the ozone hole area of ~5 million km², equivalent in magnitude to the recently estimated Antarctic ozone healing attributed to the implementation of the Montreal Protocol. We find that the inclusion of VSLBr in CAM-Chem does not introduce a significant delay in the modelled return of ozone to October 1980 levels, but instead affects the depth and duration of the simulated ozone hole. Our analysis further shows that total bromine-catalysed ozone destruction in the lower stratosphere surpasses that of chlorine by the year 2070, and indicates that natural VSLBr chemistry would dominate Antarctic ozone seasonality before the end of the 21st century. This work suggests a large influence of biogenic bromine on the future Antarctic ozone layer.

    Author affiliations: Fernandez, Rafael Pedro (Consejo Superior de Investigaciones Científicas, Instituto de Química Física, Spain; Universidad Tecnológica Nacional, Facultad Regional Mendoza, Secretaría de Ciencia, Tecnología y Postgrado, Argentina; Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Científico Tecnológico CONICET - Mendoza, Argentina); Kinnison, Douglas E. (National Center for Atmospheric Research, Atmospheric Chemistry Division, United States); Lamarque, Jean-Francois (National Center for Atmospheric Research, Atmospheric Chemistry Division, United States); Tilmes, Simone (National Center for Atmospheric Research, Atmospheric Chemistry Division, United States); Saiz-Lopez, Alfonso (Consejo Superior de Investigaciones Científicas, Instituto de Química Física, Spain).

    Lessons Learned From Developing Three Generations of Remote Sensing Science Data Processing Systems

    The Biospheric Information Systems Branch at NASA's Goddard Space Flight Center has developed three generations of Science Investigator-led Processing Systems for use with various remote sensing instruments. The first system is used for data from the MODIS instruments flown on NASA's Earth Observing System (EOS) Terra and Aqua spacecraft, launched in 1999 and 2002 respectively. The second generation is for the Ozone Monitoring Instrument (OMI) flying on the EOS Aura spacecraft, launched in 2004. We are now developing a third generation of the system for evaluation science data processing for the Ozone Mapping and Profiler Suite (OMPS) to be flown by the NPOESS Preparatory Project (NPP) in 2006. The initial system was based on large-scale proprietary hardware, operating, and database systems. The current OMI system and the OMPS system under development are based on commodity hardware, the Linux operating system, and PostgreSQL, an open source RDBMS. The new system distributes its data archive across multiple server hosts and processes jobs on multiple processor boxes. We have created several instances of this system, including one for operational processing, one for testing and reprocessing, and one for applications development and scientific analysis. Prior to receiving the first data from OMI, we applied the system to reprocessing data from the Solar Backscatter Ultraviolet (SBUV) and Total Ozone Mapping Spectrometer (TOMS) instruments flown from 1978 onward. The system was able to process 25 years (108,000 orbits) of data and produce 800,000 files (400 GiB) of level 2 and level 3 products in less than a week. We describe the lessons we have learned and the tradeoffs between system design, hardware, operating systems, operational staffing, user support, and operational procedures. With each generation, the system has become more generic and reusable. While the system is not currently shrink-wrapped, we believe it is at the point where it could be readily adopted, with substantial cost savings, for other similar tasks.
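
    As a back-of-envelope check on the reprocessing figures quoted above (assuming a full seven-day week, an upper bound on the actual run time):

        # Throughput implied by the SBUV/TOMS reprocessing figures.
        orbits, files, gib, days = 108_000, 800_000, 400, 7

        print(f"{orbits / days:,.0f} orbits/day")        # ~15,429 orbits/day
        print(f"{files / days:,.0f} files/day")          # ~114,286 files/day
        print(f"{gib * 1024 / files:.2f} MiB/file avg")  # ~0.51 MiB per product file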

    Provenance Challenges for Earth Science Dataset Publication

    No abstract available.