564,054 research outputs found

    OpenAIREplus

    Get PDF
    OpenAIREplus builds on the outcomes of the OpenAIRE project, which implements the EC Open Access (OA) pilot. Capitalizing on the OpenAIRE infrastructure, built for managing FP7 and ERC funded articles, and the associated supporting mechanism of the European Helpdesk System, OpenAIREplus will “develop an open access, participatory infrastructure for scientific information”. It will significantly expand its base of harvested publications to also include all OA publications indexed by the DRIVER infrastructure (more than 270 validated institutional repositories) and any other repository containing “peer-reviewed literature” that complies with certain standards. It will also generically harvest and index the metadata of scientific datasets in selected diverse OA thematic data repositories. It will support the concept of linked publications by deploying novel services for “linking peer-reviewed literature and associated data sets and collections”, from link discovery based on diverse forms of mining (textual, usage, etc.) to storage, visual representation, and on-line exploration. It will offer user-level services to experts and “non-scientists” alike, as well as programming interfaces for “providers of value-added services” to build applications on its content. Deposited articles and data will be openly accessible through an enhanced version of the OpenAIRE portal, together with any available relevant information on associated project funding and usage statistics. OpenAIREplus will retain its European footprint, engaging people and scientific repositories in almost all 27 EU member states and beyond. The technical work will be complemented by a suite of studies and associated research efforts that will partly proceed in collaboration with “different European initiatives” and investigate issues of “intellectual property rights, efficient financing models, and standards”. Acknowledgments: this work was supported in part by the Open Access Infrastructure for Research in Europe (OpenAIRE) EU project and by the Bulgarian National Science Fund under Project D002-308 "Automated Metadata Generating for e-Documents Specifications and Standards".
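
The programming interfaces mentioned above suggest how third-party services might query the infrastructure. Below is a minimal, hypothetical sketch in Python; the endpoint, parameter names, and response layout are assumptions based on OpenAIRE's publicly documented search API, not details taken from this abstract.

```python
# Hypothetical sketch: query an OpenAIRE-style search API for publications.
# The endpoint, parameters, and response structure are assumptions, not
# details from the abstract above; adjust them to the real service.
import requests

API_URL = "https://api.openaire.eu/search/publications"  # assumed endpoint

def find_publications(keywords: str, page_size: int = 10) -> list[dict]:
    """Return raw result records for a keyword search."""
    params = {"keywords": keywords, "size": page_size, "format": "json"}
    response = requests.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    payload = response.json()
    # The nesting below is an assumption about the JSON layout.
    results = payload.get("response", {}).get("results") or {}
    return results.get("result", [])

if __name__ == "__main__":
    for record in find_publications("open access infrastructure"):
        print(record)
```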

    Hydrological modelling in a "big data" era: a proof of concept of hydrological models as web services

    No full text
    Dealing with the massive increase in the global availability of data of all sorts is increasingly known as "big data" science. Indeed, largely leveraged by the internet, a new resource of data sets is emerging that are so large and heterogeneous that they become awkward to work with. New algorithms, methods and models are needed to filter such data to find trends, test hypotheses, make predictions and quantify uncertainties. As a considerable share of the data relate to environmental processes (e.g., satellite images, distributed sensor networks), this evolution provides exciting challenges for the environmental sciences, and hydrology in particular. Web-enabled models are a promising approach to process large and distributed data sets and to provide tailored products for a variety of end-users. They will also allow hydrological models to be used as building blocks in larger earth system simulation systems. However, in order to do so we need to reconsider the ways that hydrological models are built, results are made available, and uncertainties are quantified. We present the results of an experimental proof of concept of a hydrological modelling web service to process heterogeneous hydrological data sets. The hydrological model itself consists of a set of conceptual model routines implemented on a common platform. This framework is linked to global and local data sets through web standards provided by the Open Geospatial Consortium, as well as to a web interface that enables an end-user to request stream flow simulations for a self-defined location. In essence, the proof of concept can be seen as an implementation of the Models of Everywhere concept introduced by Beven in 2007. Although the setup is operational and effectively simulates stream flow, we identify several bottlenecks for optimal hydrological simulation in a web context. The major challenges we identify are related to (1) model selection, (2) uncertainty quantification, and (3) user interaction and scenario analysis. Model selection is inherent to hydrological modelling because of the large spatial and temporal variability of processes, which inhibits the use of one optimal model structure; in a web context, however, it becomes paramount that such selection is automatic, yet objective and transparent. Similarly, uncertainty quantification is a mainstream practice in hydrological modelling, but in a web context uncertainty analysis faces unprecedented challenges in terms of tracking uncertainties throughout a possibly geographically distributed workflow, as well as dealing with an extreme heterogeneity of data availability. Lastly, the ability of end-users to interact directly with hydrological models poses specific challenges in terms of mapping user scenarios (e.g., a scenario of land-use change) into the model parameter space for prediction and uncertainty quantification. The setup has been used in several scientific experiments, including the large-scale UK consortium project on an Environmental Virtual Observatory pilot.
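
To make the idea of a hydrological model as a web service concrete, here is a minimal illustrative sketch: a toy linear-reservoir routine exposed behind an HTTP endpoint that accepts a user-chosen location. Flask, the route, and the toy model are assumptions for illustration, not the platform described in the abstract.

```python
# Illustrative sketch only: expose a toy conceptual runoff routine as a web
# service that an end-user could query for a self-chosen location. Flask, the
# route name, and the linear-reservoir model are assumptions for illustration.
from flask import Flask, jsonify, request

app = Flask(__name__)

def linear_reservoir(rainfall_mm: list[float], k: float = 0.2) -> list[float]:
    """Very simple conceptual routine: storage drains at a constant rate k."""
    storage, flows = 0.0, []
    for p in rainfall_mm:
        storage += p
        q = k * storage
        storage -= q
        flows.append(q)
    return flows

@app.route("/simulate")
def simulate():
    # In a real service, lat/lon would select catchment data from OGC web
    # services; here we only echo them and run the toy model on dummy rain.
    lat = float(request.args.get("lat", 0.0))
    lon = float(request.args.get("lon", 0.0))
    rainfall = [5.0, 0.0, 12.0, 3.0, 0.0]  # placeholder forcing data
    return jsonify({"lat": lat, "lon": lon, "streamflow": linear_reservoir(rainfall)})

if __name__ == "__main__":
    app.run(port=5000)
```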

    NASA GeneLab Concept of Operations

    Get PDF
    NASA's GeneLab aims to greatly increase the number of scientists using data from space biology investigations on board the ISS, emphasizing a systems biology approach to the science. When completed, GeneLab will provide the integrated software and hardware infrastructure, analytical tools and reference datasets for an assortment of model organisms. GeneLab will also provide an environment for scientists to collaborate, thereby increasing the possibility for data to be reused for future experimentation. To maximize the value of data from life science experiments performed in space and to make the most advantageous use of the remaining ISS research window, GeneLab will apply an open access approach to conducting spaceflight experiments by generating and sharing the datasets derived from these biological studies in space. Onboard the ISS, a wide variety of model organisms will be studied and returned to Earth for analysis. Laboratories on the ground will analyze these samples and provide genomic, transcriptomic, metabolomic and proteomic data. Upon receipt, NASA will conduct data quality control tasks and format raw data returned from the omics centers into standardized, annotated information sets that can be readily searched and linked to spaceflight metadata. Once prepared, the biological datasets, as well as any completed analyses, will be made public through the GeneLab Space Bioinformatics System web-based portal. These efforts will support a collaborative research environment for spaceflight studies that will closely resemble environments created by the Department of Energy (DOE), the National Center for Biotechnology Information (NCBI), and other institutions in additional areas of study, such as cancer and environmental biology. The results will allow for comparative analyses that will help scientists around the world take a major leap forward in understanding the effects of microgravity, radiation, and other aspects of the space environment on model organisms. These efforts will speed the process of scientific sharing, iteration, and discovery.
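
The step of turning raw omics results into standardized, annotated records linked to spaceflight metadata could look roughly like the sketch below; all field names and values are invented for illustration and do not reflect GeneLab's actual schema.

```python
# Hypothetical sketch of a standardized, annotated dataset record that links
# omics results to spaceflight metadata. Field names and values are
# illustrative only, not the actual GeneLab schema.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class SpaceflightMetadata:
    mission: str
    organism: str
    hardware: str
    duration_days: int

@dataclass
class OmicsDataset:
    accession: str
    assay_type: str            # e.g. "transcriptomics", "proteomics"
    raw_files: list[str]
    metadata: SpaceflightMetadata
    qc_passed: bool = False
    annotations: dict = field(default_factory=dict)

record = OmicsDataset(
    accession="GLDS-EXAMPLE-001",          # hypothetical identifier
    assay_type="transcriptomics",
    raw_files=["sample_01.fastq.gz"],
    metadata=SpaceflightMetadata("ISS-EXAMPLE", "Mus musculus", "Rodent Habitat", 30),
    qc_passed=True,
)

print(json.dumps(asdict(record), indent=2))  # searchable, annotated record
```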

    Viewpoints: A high-performance high-dimensional exploratory data analysis tool

    Full text link
    Scientific data sets continue to increase in both size and complexity. In the past, dedicated graphics systems at supercomputing centers were required to visualize large data sets, but as the price of commodity graphics hardware has dropped and its capability has increased, it is now possible, in principle, to view large complex data sets on a single workstation. To do this in practice, an investigator needs software written to take advantage of the relevant graphics hardware. The Viewpoints visualization package described herein is an example of such software. Viewpoints is an interactive tool for exploratory visual analysis of large, high-dimensional (multivariate) data. It leverages the capabilities of modern graphics boards (GPUs) to run on a single workstation or laptop. Viewpoints is minimalist: it attempts to do a small set of useful things very well (or at least very quickly) in comparison with similar packages today. Its basic feature set includes linked scatter plots with brushing, dynamic histograms, normalization and outlier detection/removal. Viewpoints was originally designed for astrophysicists, but it has since been used in a variety of fields, ranging from astronomy, quantum chemistry, fluid dynamics, machine learning, bioinformatics, and finance to information technology server log mining. In this article, we describe the Viewpoints package and show examples of its usage. Comment: 18 pages, 3 figures, PASP in press; this version corresponds more closely to that to be published.
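
Two of the features listed, normalization and outlier detection/removal, have simple generic analogues; the z-score rule sketched below is a stand-in for illustration, not the algorithm Viewpoints itself implements.

```python
# Generic sketch of two of the listed features, normalization and outlier
# removal, applied to a multivariate data set. This is a plain z-score rule,
# not the method Viewpoints itself uses.
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 5))          # 10k points, 5 dimensions
data[::500] += 8.0                           # inject a few artificial outliers

# Normalize each column to zero mean and unit variance.
normalized = (data - data.mean(axis=0)) / data.std(axis=0)

# Flag rows whose value in any dimension lies more than 4 standard deviations out.
outlier_mask = (np.abs(normalized) > 4.0).any(axis=1)
cleaned = normalized[~outlier_mask]

print(f"removed {outlier_mask.sum()} of {len(data)} points")
```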

    LAGOVirtual: A Collaborative Environment for the Large Aperture GRB Observatory

    Full text link
    We present the LAGOVirtual Project: an ongoing project to develop a platform for collaboration within the Large Aperture GRB Observatory (LAGO). This continental-wide observatory is devised to detect the high-energy (around 100 GeV) component of Gamma Ray Bursts, using the single particle technique in arrays of Water Cherenkov Detectors (WCD) at high mountain sites (Chacaltaya, Bolivia, 5300 m a.s.l.; Pico Espejo, Venezuela, 4750 m a.s.l.; Sierra Negra, Mexico, 4650 m a.s.l.). This platform will allow the LAGO collaboration to share data and computer resources through its different sites. The environment can generate synthetic data by simulating showers with the AIRES application and can store/preserve distributed data files collected by the WCD at the LAGO sites. The present article concerns the implementation of a prototype of LAGO-DR by adapting DSpace, with a hierarchical structure (i.e. country, institution, followed by collections that contain the metadata and data files) for the captured/simulated data. This structure was generated using the community, sub-community, collection, item model available in the DSpace software. Each member institution-country of the project has the appropriate permissions on the system to publish information (descriptive metadata and associated data files). The platform can also associate multiple files with each item of data (data from the instruments, graphics, post-processed data, etc.). Comment: Second EELA-2 Conference, Choroni, Venezuela, November 25th to 27th 200
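
The hierarchical community/sub-community/collection/item layout described above can be pictured with a small data-model sketch; the names, fields, and example values are invented for illustration and are not the project's actual DSpace configuration.

```python
# Illustrative sketch of the hierarchical layout described above
# (community -> sub-community -> collection -> item), with multiple files per
# item. Names and metadata fields are invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Item:
    title: str
    metadata: dict
    files: list[str] = field(default_factory=list)  # raw data, graphics, post-processed data

@dataclass
class Collection:
    name: str
    items: list[Item] = field(default_factory=list)

@dataclass
class Community:
    name: str                       # e.g. a country or an institution
    subcommunities: list["Community"] = field(default_factory=list)
    collections: list[Collection] = field(default_factory=list)

# Hypothetical example of one branch of such a hierarchy.
lago = Community(
    name="Bolivia",
    subcommunities=[Community(
        name="Chacaltaya site",
        collections=[Collection(
            name="WCD raw data",
            items=[Item(
                title="Run 0001",
                metadata={"detector": "WCD-1", "altitude_m": 5300},
                files=["run0001.dat", "run0001_rates.png"],
            )],
        )],
    )],
)

print(lago.subcommunities[0].collections[0].items[0].title)
```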

    Linked Data - the story so far

    No full text
    The term “Linked Data” refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions: the Web of Data. In this article, the authors present the concept and technical principles of Linked Data, and situate these within the broader context of related technological developments. They describe progress to date in publishing Linked Data on the Web, review applications that have been developed to exploit the Web of Data, and map out a research agenda for the Linked Data community as it moves forward.
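
The core practices, naming things with HTTP URIs, describing them with structured statements, and linking them into other data sets, can be shown in a few lines; the use of rdflib and the example resources below are illustrative assumptions, not material from the article.

```python
# Minimal sketch of the Linked Data idea: identify things with HTTP URIs,
# describe them with triples, and link them to resources in other data sets.
# The resources below and the use of rdflib are illustrative assumptions.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF, RDFS

EX = Namespace("http://example.org/")        # hypothetical namespace

g = Graph()
author = URIRef("http://example.org/person/alice")
g.add((author, RDF.type, FOAF.Person))
g.add((author, FOAF.name, Literal("Alice Example")))
g.add((author, EX.interest, Literal("Web of Data")))
# Link out to a resource in another data space (here DBpedia).
g.add((author, RDFS.seeAlso, URIRef("http://dbpedia.org/resource/Linked_data")))

print(g.serialize(format="turtle"))
```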

    Creating information delivery specifications using linked data

    Get PDF
    The use of Building Information Management (BIM) has become mainstream in many countries. Exchanging data in open standards like the Industry Foundation Classes (IFC) is seen as the only workable solution for collaboration. To define information needs for collaboration, many organizations are now documenting what kind of data they need for their purposes. Currently, practitioners often define their requirements (a) in a format that cannot be read by a computer and (b) by creating their own definitions that are not shared. This paper proposes a bottom-up solution for the definition of new building concepts and properties. The authors have created a prototype implementation and will elaborate on the capturing of information specifications in the future.
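
One way to picture a machine-readable information delivery specification is as a mapping from concepts to required properties against which delivered object data can be checked; the simplified structure below is a hypothetical stand-in for the linked-data approach the paper proposes, with invented concept and property names.

```python
# Simplified, hypothetical sketch of a machine-readable information delivery
# specification: each concept lists the properties a delivered object must
# carry. A stand-in for the paper's linked-data approach, not its actual model.
specification = {
    "Door": {"FireRating", "Width", "AcousticRating"},
    "Wall": {"FireRating", "ThermalTransmittance"},
}

def missing_properties(concept: str, properties: dict) -> set[str]:
    """Return the required properties absent from a delivered object."""
    required = specification.get(concept, set())
    return required - set(properties)

delivered = {"FireRating": "EI30", "Width": 0.9}   # e.g. parsed from an IFC file
print(f"missing properties: {missing_properties('Door', delivered) or 'none'}")
```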

    A-posteriori provenance-enabled linking of publications and datasets via crowdsourcing

    No full text
    This paper aims to share with the digital library community different opportunities to leverage crowdsourcing for the a-posteriori capturing of dataset citation graphs. We describe a practical approach that exploits one possible crowdsourcing technique to collect these graphs from domain experts and proposes their publication as Linked Data using the W3C PROV standard. Based on our findings from a study we ran during the USEWOD 2014 workshop, we propose a semi-automatic approach that adds information extraction as a step alongside crowdsourcing in order to generate high-quality data citation graphs. Furthermore, we consider the design implications for our crowdsourcing approach when non-expert participants are involved in the process.
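
A provenance-enabled link between a publication and a dataset, of the kind the paper proposes to publish as Linked Data, could be expressed roughly as follows with the W3C PROV-O vocabulary; the URIs and the use of rdflib are assumptions for illustration, not the paper's own code.

```python
# Illustrative sketch: record that a publication builds on a dataset using
# PROV-O terms. The URIs and the use of rdflib are assumptions, not code or
# identifiers from the paper above.
from rdflib import Graph, URIRef
from rdflib.namespace import PROV, RDF

g = Graph()
publication = URIRef("http://example.org/paper/123")   # hypothetical IDs
dataset = URIRef("http://example.org/dataset/456")

g.add((publication, RDF.type, PROV.Entity))
g.add((dataset, RDF.type, PROV.Entity))
g.add((publication, PROV.wasDerivedFrom, dataset))     # the citation link
# In practice the crowdsourced assertion itself would also be attributed to a
# contributor (e.g. via PROV qualified attribution); omitted here for brevity.

print(g.serialize(format="turtle"))
```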