    SCOPE - A Scientific Compound Object Publishing and Editing System

    This paper presents the SCOPE (Scientific Compound Object Publishing and Editing) system, which is designed to enable scientists to easily author, publish, and edit scientific compound objects. Scientific compound objects enable scientists to encapsulate the various datasets and resources generated or utilized during a scientific experiment or discovery process within a single compound object, for publishing and exchange. The adoption of “named graphs” to represent these compound objects enables provenance information to be captured via the typed relationships between the components. This approach is also endorsed by the OAI-ORE initiative and hence ensures that the system generates OAI-ORE-compliant scientific compound objects. The SCOPE system is an extension of the Provenance Explorer tool, which enables access-controlled viewing of scientific provenance trails. Provenance Explorer provides dynamic rendering of RDF graphs of scientific discovery processes, showing the lineage from raw data to publication. Views of different granularity can be inferred automatically using SWRL (Semantic Web Rule Language) rules and an inferencing engine. SCOPE extends the Provenance Explorer tool and GUI by: 1) adding an embedded web browser that can be used to incorporate objects discoverable via the Web; 2) representing compound objects as named graphs that can be saved in RDF, TriX, TriG, or as an Atom syndication feed; 3) enabling scientists to attach Creative Commons licenses to the compound objects to specify how they may be re-used; and 4) enabling compound objects to be published as Fedora Object XML (FOXML) files within a Fedora digital library.
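The named-graph representation described above can be sketched in a few lines: a compound object is a graph with its own URI, containing typed relationships between component resources, serializable to TriG. This is a minimal illustrative sketch, not SCOPE's actual data model; all URIs and the `wasProcessedInto` predicate are invented for the example.

```python
from dataclasses import dataclass, field

@dataclass
class CompoundObject:
    """A scientific compound object modeled as a named graph:
    the graph URI identifies the object, and the triples inside it
    capture typed (provenance) relationships between components."""
    uri: str
    triples: list = field(default_factory=list)  # (subject, predicate, object) URIs

    def add(self, subject, predicate, obj):
        self.triples.append((subject, predicate, obj))

    def to_trig(self):
        """Serialize as a TriG graph block: <graph-uri> { triples }."""
        lines = [f"<{self.uri}> {{"]
        for s, p, o in self.triples:
            lines.append(f"    <{s}> <{p}> <{o}> .")
        lines.append("}")
        return "\n".join(lines)

# Hypothetical example: one derivation step inside an experiment's graph.
co = CompoundObject("http://example.org/experiment/42")
co.add("http://example.org/raw-data",
       "http://example.org/vocab#wasProcessedInto",
       "http://example.org/derived-data")
print(co.to_trig())
```

Because the relationships live inside a graph that is itself a named resource, provenance metadata (creator, license, date) can be attached to the graph URI without touching the component triples.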

    From Artifacts to Aggregations: Modeling Scientific Life Cycles on the Semantic Web

    In the process of scientific research, many information objects are generated, all of which may remain valuable indefinitely. However, artifacts such as instrument data and associated calibration information may have little value in isolation; their meaning is derived from their relationships to each other. Individual artifacts are best represented as components of a life cycle that is specific to a scientific research domain or project. Current cataloging practices do not describe objects at a sufficient level of granularity, nor do they offer the globally persistent identifiers necessary to discover and manage scholarly products with World Wide Web standards. The Open Archives Initiative's Object Reuse and Exchange data model (OAI-ORE) meets these requirements. We demonstrate a conceptual implementation of OAI-ORE to represent the scientific life cycles of embedded networked sensor applications in seismology and environmental sciences. By establishing relationships between publications, data, and contextual research information, we illustrate how to obtain a richer and more realistic view of scientific practices. That view can facilitate new forms of scientific research and learning. Our analysis is framed by studies of scientific practices in a large, multi-disciplinary, multi-university science and engineering research center, the Center for Embedded Networked Sensing (CENS). (28 pages; to appear in the Journal of the American Society for Information Science and Technology (JASIST).)
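The core of the OAI-ORE model used above is the Aggregation: a single identified resource that groups publications, data, and contextual artifacts via the `ore:aggregates` relationship. The sketch below is a hypothetical illustration of that idea in plain Python (the ORE term URI is real; all aggregated resource URIs are invented):

```python
# The ore:aggregates term from the OAI-ORE vocabulary.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"

# A hypothetical Aggregation grouping the artifacts of one life cycle:
# raw sensor data, its calibration context, and the resulting publication.
aggregation = {
    "uri": "http://example.org/aggregation/seismic-study-1",
    ORE_AGGREGATES: [
        "http://example.org/data/sensor-readings.csv",
        "http://example.org/data/calibration.json",
        "http://example.org/pub/article.pdf",
    ],
}

def aggregated_resources(agg):
    """Return the globally identified resources grouped by an Aggregation."""
    return agg[ORE_AGGREGATES]

print(len(aggregated_resources(aggregation)))
```

Giving the aggregation its own URI is what lets the life cycle as a whole, not just its parts, be cited, discovered, and exchanged with Web standards.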

    Provenance explorer: A graphical interface for constructing scientific publication packages from provenance trails

    Scientific communities are under increasing pressure from funding organizations to publish their raw data, in addition to their traditional publications, in open archives. Many scientists would be willing to do this if they had tools that streamlined the process and exposed simple provenance information, i.e., enough to explain the methodology and validate the results without compromising the author's intellectual property or competitive advantage. This paper presents Provenance Explorer, a tool that enables the provenance trail associated with a scientific discovery process to be visualized and explored through a graphical user interface (GUI). Based on RDF graphs, it displays the sequence of data, states, and events associated with a scientific workflow, illustrating the methodology that led to the published results. The GUI also allows permitted users to expand selected links between nodes to reveal more fine-grained information and sub-workflows. More importantly, the system enables scientists to selectively construct "scientific publication packages" by choosing particular nodes from the visual provenance trail and dragging-and-dropping them into an RDF package, which can be uploaded to an archive or repository for publication or e-learning. The provenance relationships between the individual components in the package are automatically inferred using a rules-based inferencing engine.
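The inference step described above can be sketched as a transitive closure over fine-grained provenance edges: when a user selects only non-adjacent nodes for a package, the relationship between them is recoverable by walking the trail. This is a minimal illustrative sketch, not Provenance Explorer's actual rule engine; the node names and the single `derivedFrom` rule are invented.

```python
# Fine-grained provenance trail: each node maps to what it was derived from.
edges = {
    "figure": "analysis",
    "analysis": "calibrated-data",
    "calibrated-data": "raw-data",
}

def infer_lineage(node, trail):
    """Follow derivedFrom edges transitively to reconstruct a node's lineage.
    If a user packages only 'figure' and 'raw-data', the derivedFrom link
    between them can be inferred because 'raw-data' appears in this lineage."""
    lineage = []
    while node in trail:
        node = trail[node]
        lineage.append(node)
    return lineage

print(infer_lineage("figure", edges))
```

A real rules engine (the paper mentions rules-based inferencing) generalizes this to many relationship types, but the principle is the same: coarse links between selected nodes are derived from the fine-grained trail rather than asserted by hand.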

    Open Data, Grey Data, and Stewardship: Universities at the Privacy Frontier

    As universities recognize the inherent value in the data they collect and hold, they encounter unforeseen challenges in stewarding those data in ways that balance accountability, transparency, and protection of privacy, academic freedom, and intellectual property. Two parallel developments in academic data collection are converging: (1) open access requirements, whereby researchers must provide access to their data as a condition of obtaining grant funding or publishing results in journals; and (2) the vast accumulation of 'grey data' about individuals in their daily activities of research, teaching, learning, services, and administration. The boundaries between research and grey data are blurring, making it more difficult to assess the risks and responsibilities associated with any data collection. Many sets of data, both research and grey, fall outside privacy regulations such as HIPAA, FERPA, and PII. Universities are exploiting these data for research, learning analytics, faculty evaluation, strategic decisions, and other sensitive matters. Commercial entities are besieging universities with requests for access to data or for partnerships to mine them. The privacy frontier facing research universities spans open access practices, uses and misuses of data, public records requests, cyber risk, and curating data for privacy protection. This paper explores the competing values inherent in data stewardship and makes recommendations for practice, drawing on the pioneering work of the University of California in privacy and information security, data governance, and cyber risk. (Final published version, Sept 30, 201)

    Provenance-aware knowledge representation: A survey of data models and contextualized knowledge graphs

    Expressing machine-interpretable statements in the form of subject-predicate-object triples is a well-established practice for capturing the semantics of structured data. However, the standard used for representing these triples, RDF, inherently lacks a mechanism to attach provenance data, which would be crucial to make automatically generated and/or processed data authoritative. This paper is a critical review of data models, annotation frameworks, knowledge organization systems, serialization syntaxes, and algebras that enable provenance-aware RDF statements. The various approaches are assessed in terms of standard compliance, formal semantics, tuple type, vocabulary term usage, blank nodes, provenance granularity, and scalability. This can be used to advance existing solutions and help implementers select the most suitable approach (or a combination of approaches) for their applications. Moreover, the analysis of the mechanisms and their limitations highlighted in this paper can serve as the basis for novel approaches in RDF-powered applications with increasing provenance needs.
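One of the mechanisms this kind of survey covers is extending triples to quads, where a fourth element names the graph a statement belongs to, and provenance metadata is attached per graph. The sketch below illustrates that idea only; the `ex:` names, dates, and provenance fields are invented, and real systems would use an RDF store rather than Python dicts.

```python
from collections import namedtuple

# A quad = triple + graph name: the graph is the unit provenance attaches to.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "graph"])

# Two conflicting statements, distinguishable because they live in
# different named graphs with different provenance.
statements = [
    Quad("ex:alice", "ex:worksAt", "ex:lab1", "ex:graph-2020"),
    Quad("ex:alice", "ex:worksAt", "ex:lab2", "ex:graph-2023"),
]

# Per-graph provenance: who asserted each graph, and when.
provenance = {
    "ex:graph-2020": {"assertedBy": "ex:hr-system", "date": "2020-01-15"},
    "ex:graph-2023": {"assertedBy": "ex:hr-system", "date": "2023-06-01"},
}

def with_provenance(quads, prov):
    """Pair each statement with the provenance of the graph asserting it."""
    return [(q, prov[q.graph]) for q in quads]

for quad, meta in with_provenance(statements, provenance):
    print(quad.subject, quad.predicate, quad.obj, "->", meta["date"])
```

The trade-offs the paper assesses (tuple type, granularity, blank-node handling) show up even here: per-graph provenance is coarser than per-triple provenance, but far cheaper than reifying every statement.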

    Developing Materials Informatics Workbench for Expediting the Discovery of Novel Compound Materials


    Enhanced Publication Management System: a systemic approach towards modern scientific communication.

    The impact of the digital revolution and the mass adoption of ICT have only partially affected the scientific communication workflow. Scientists today are accustomed to scientific workflows, electronic data, software, and e-science infrastructures for carrying out their daily research activities, but the dissemination of research results still relies on the bare scientific article, which has simply shifted from print to digital. The scientific article alone, however, cannot support an effective assessment of research results or enable the reproducibility of science: to achieve this goal, all products related to a research activity should be shared and disseminated. In recent decades, on the wave of Open Science, the scientific community has approached the problem of publishing research products other than the scientific article. One of the solutions is the paradigm of enhanced publications (EPs). EPs are digital objects that aggregate a digital scientific article with the other research products that have been used and produced during the research investigation described by the article, and are useful to: (i) better interpret the article, (ii) enable more effective peer review, and (iii) facilitate or support the reproducibility of science. The theory and practice of EPs are still not advanced, and most Enhanced Publication Information Systems (EPISs) are custom implementations serving community-specific needs. EPIS designers and developers have little or no technological support oriented to EPs; in fact, they realize EP-oriented software with a “from scratch” approach, addressing the peculiarities of the community to be served.
    Approach: The aim of this thesis is to propose a systemic approach to the realisation of EPISs, inspired by lessons learned from the database domain. The state of the art of information systems and data models for EPs has been analyzed to identify the common features across different domains. Those common features have served as building blocks for the definition of a data model and functionalities for the representation and manipulation of EPs. The notion of an Enhanced Publication Management System (EPMS) is introduced to denote information systems that provide EPIS designers and developers with EP-oriented tools for the setup, operation, and maintenance of EPISs. The requirements of EPMSs have been identified, and a reference software architecture that satisfies them is presented.
    Contributions: The main contributions of this thesis relate to the fields of information science and scientific communication. The analysis of the state of the art of EPISs results in a terminology and a classification that can serve as a reference for the comparison and discussion of such systems. A systemic approach, based on the novel notion of the Enhanced Publication Management System (EPMS), is proposed as a more cost-effective solution to the realization of EPISs, compared to the current “from scratch” strategy. A reference architecture for EPMSs and a general-purpose data model for EPs are proposed with the intent of contributing to building structured foundations of what is today becoming an area of research in its own right.
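The EP data model described above, an article aggregating the research products used or produced by its investigation, can be sketched minimally as follows. This is an illustrative sketch only; the class and field names are invented and are not the thesis's actual model.

```python
from dataclasses import dataclass, field

@dataclass
class ResearchProduct:
    """A product of a research activity, e.g. a dataset, software, or workflow."""
    uri: str
    kind: str

@dataclass
class EnhancedPublication:
    """An EP: a scientific article aggregated with its related research products."""
    article_uri: str
    products: list = field(default_factory=list)

    def add_product(self, uri, kind):
        self.products.append(ResearchProduct(uri, kind))

    def kinds(self):
        """The distinct kinds of products this EP aggregates, sorted."""
        return sorted({p.kind for p in self.products})

# Hypothetical EP: an article plus the dataset and code behind it.
ep = EnhancedPublication("http://example.org/article/1")
ep.add_product("http://example.org/data/run-1.csv", "dataset")
ep.add_product("http://example.org/code/pipeline.py", "software")
print(ep.kinds())
```

An EPMS, in the thesis's terms, would supply a general-purpose model like this (plus storage, manipulation, and dissemination tools) so that each community's EPIS need not rebuild it from scratch.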