6,987 research outputs found
Data Representation Model for Management and Distribution of Scientific Data
Scientific tools and computer simulations enable rapid creation of various types of data and a number of studies have been conducted on data provenance and web-based data representation models to enhance the distribution, reproduction and reusability of scientific data. Ontology is a knowledge representation model, which is also used as data and workflow technology for data provenance. In this study, as part of managing and distributing for scientific data studies, metadata and data representation model were defined for the management and distribution of Visible Korean online. In addition, additional metadata required for re-distributing the user data created through the Visible Korean study is defined using an ontology-based data representation model, and an RDFa-based web page generation method is proposed to search and extract data from existing web pages. This study enables to manage and distribute the Visible Korean online, which has been managed and distributed offline, and a virtuous recycling of distributing research results as wells
The Research Object Suite of Ontologies: Sharing and Exchanging Research Data and Methods on the Open Web
Research in life sciences is increasingly being conducted in a digital and
online environment. In particular, life scientists have been pioneers in
embracing new computational tools to conduct their investigations. To support
the sharing of digital objects produced during such research investigations, we
have witnessed in the last few years the emergence of specialized repositories,
e.g., DataVerse and FigShare. Such repositories provide users with the means to
share and publish datasets that were used or generated in research
investigations. While these repositories have proven their usefulness,
interpreting and reusing evidence for most research results is a challenging
task. Additional contextual descriptions are needed to understand how those
results were generated and/or the circumstances under which they were
concluded. Because of this, scientists are calling for models that go beyond
the publication of datasets to systematically capture the life cycle of
scientific investigations and provide a single entry point to access the
information about the hypothesis investigated, the datasets used, the
experiments carried out, the results of the experiments, the people involved in
the research, etc. In this paper we present the Research Object (RO) suite of
ontologies, which provide a structured container to encapsulate research data
and methods along with essential metadata descriptions. Research Objects are
portable units that enable the sharing, preservation, interpretation and reuse
of research investigation results. The ontologies we present have been designed
in the light of requirements that we gathered from life scientists. They have
been built upon existing popular vocabularies to facilitate interoperability.
Furthermore, we have developed tools to support the creation and sharing of
Research Objects, thereby promoting and facilitating their adoption.Comment: 20 page
Towards Exascale Scientific Metadata Management
Advances in technology and computing hardware are enabling scientists from
all areas of science to produce massive amounts of data using large-scale
simulations or observational facilities. In this era of data deluge, effective
coordination between the data production and the analysis phases hinges on the
availability of metadata that describe the scientific datasets. Existing
workflow engines have been capturing a limited form of metadata to provide
provenance information about the identity and lineage of the data. However,
much of the data produced by simulations, experiments, and analyses still need
to be annotated manually in an ad hoc manner by domain scientists. Systematic
and transparent acquisition of rich metadata becomes a crucial prerequisite to
sustain and accelerate the pace of scientific innovation. Yet, ubiquitous and
domain-agnostic metadata management infrastructure that can meet the demands of
extreme-scale science is notable by its absence.
To address this gap in scientific data management research and practice, we
present our vision for an integrated approach that (1) automatically captures
and manipulates information-rich metadata while the data is being produced or
analyzed and (2) stores metadata within each dataset to permeate
metadata-oblivious processes and to query metadata through established and
standardized data access interfaces. We motivate the need for the proposed
integrated approach using applications from plasma physics, climate modeling
and neuroscience, and then discuss research challenges and possible solutions
Enhancing Workflow with a Semantic Description of Scientific Intent
Peer reviewedPreprin
A Linked Data Approach to Sharing Workflows and Workflow Results
A bioinformatics analysis pipeline is often highly elaborate, due to the inherent complexity of biological systems and the variety and size of datasets. A digital equivalent of the âMaterials and Methodsâ section in wet laboratory publications would be highly beneficial to bioinformatics, for evaluating evidence and examining data across related experiments, while introducing the potential to find associated resources and integrate them as data and services. We present initial steps towards preserving bioinformatics âmaterials and methodsâ by exploiting the workflow paradigm for capturing the design of a data analysis pipeline, and RDF to link the workflow, its component services, run-time provenance, and a personalized biological interpretation of the results. An example shows the reproduction of the unique graph of an analysis procedure, its results, provenance, and personal interpretation of a text mining experiment. It links data from Taverna, myExperiment.org, BioCatalogue.org, and ConceptWiki.org. The approach is relatively âlight-weightâ and unobtrusive to bioinformatics users
The lifecycle of provenance metadata and its associated challenges and opportunities
This chapter outlines some of the challenges and opportunities associated
with adopting provenance principles and standards in a variety of disciplines,
including data publication and reuse, and information sciences
- âŠ