
    The NASA Astrophysics Data System: Data Holdings

    Since its inception in 1993, the ADS Abstract Service has become an indispensable research tool for astronomers and astrophysicists worldwide. In those seven years, much effort has been directed toward improving both the quantity and the quality of references in the database. From the original database of approximately 160,000 astronomy abstracts, our dataset has grown almost tenfold to approximately 1.5 million references covering astronomy, astrophysics, planetary sciences, physics, optics, and engineering. We collect and standardize data from approximately 200 journals and present the resulting information in a uniform, coherent manner. With the cooperation of journal publishers worldwide, we have been able to place scans of full journal articles on-line back to the first volumes of many astronomical journals, and we are able to link to the current versions of articles, abstracts, and datasets for essentially all of the current astronomy literature. The trend toward electronic publishing in the field, the use of electronic submission of abstracts for journal articles and conference proceedings, and the increasingly prominent use of the World Wide Web to disseminate information have enabled the ADS to build a database unparalleled in other disciplines. The ADS can be accessed at http://adswww.harvard.edu
    Comment: 24 pages, 1 figure, 6 tables, 3 appendices

    Automatic Metadata Generation using Associative Networks

    In spite of its tremendous value, metadata is generally sparse and incomplete, thereby hampering the effectiveness of digital information services. Many of the existing mechanisms for the automated creation of metadata rely primarily on content analysis, which can be costly and inefficient. The automatic metadata generation system proposed in this article leverages resource relationships generated from existing metadata as a medium for propagation from metadata-rich to metadata-poor resources. Because of its independence from content analysis, it can be applied to a wide variety of resource media types and is shown to be computationally inexpensive. The proposed method operates through two distinct phases. First, occurrence and co-occurrence algorithms generate an associative network of repository resources, leveraging existing repository metadata. Second, using the associative network as a substrate, metadata associated with metadata-rich resources is propagated to metadata-poor resources by means of a discrete-form spreading activation algorithm. This article discusses the general framework for building associative networks, an algorithm for disseminating metadata through such networks, and the results of an experiment and validation of the proposed method using a standard bibliographic dataset.
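The two-phase idea above can be illustrated with a minimal sketch of discrete-form spreading activation over a pre-built associative network. The network layout, edge weights, decay factor, and activation threshold below are illustrative assumptions, not the article's actual parameters or implementation.

```python
# Hedged sketch: propagate metadata terms from metadata-rich to
# metadata-poor resources over a weighted associative network.

def spread_metadata(network, metadata, decay=0.5, threshold=0.1, steps=2):
    """network: {resource: {neighbor: edge_weight}}
    metadata: {resource: {term: activation}}
    Returns an enriched copy of metadata after `steps` rounds of
    spreading; activations below `threshold` are pruned."""
    enriched = {r: dict(terms) for r, terms in metadata.items()}
    for _ in range(steps):
        updates = {}
        for resource, terms in enriched.items():
            for neighbor, weight in network.get(resource, {}).items():
                for term, activation in terms.items():
                    new_act = activation * weight * decay
                    if new_act >= threshold:
                        bucket = updates.setdefault(neighbor, {})
                        bucket[term] = max(bucket.get(term, 0.0), new_act)
        # Apply all updates at once (discrete, synchronous step).
        for resource, terms in updates.items():
            target = enriched.setdefault(resource, {})
            for term, activation in terms.items():
                target[term] = max(target.get(term, 0.0), activation)
    return enriched

# Resource "B" has no metadata; it inherits a damped copy of its
# metadata-rich neighbor's term.
net = {"A": {"B": 0.8}, "B": {"A": 0.8}}
meta = {"A": {"digital-libraries": 1.0}, "B": {}}
result = spread_metadata(net, meta)
```

After two steps, "B" carries the term "digital-libraries" with activation 1.0 × 0.8 × 0.5 = 0.4, while "A" keeps its original activation because propagation only ever raises a term to the maximum incoming value.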

    Extracting, Transforming and Archiving Scientific Data

    It is becoming common to archive research datasets that are not only large but also numerous. In addition, their corresponding metadata and the software required to analyse or display them need to be archived. Yet the manual curation of research data can be difficult and expensive, particularly in very large digital repositories, hence the importance of models and tools for automating digital curation tasks. The automation of these tasks faces three major challenges: (1) research data and data sources are highly heterogeneous, (2) future research needs are difficult to anticipate, and (3) data is hard to index. To address these problems, we propose the Extract, Transform and Archive (ETA) model for managing and mechanizing the curation of research data. Specifically, we propose a scalable strategy for addressing the research-data problem, ranging from the extraction of legacy data to its long-term storage. We review some existing solutions and propose novel avenues of research.
    Comment: 8 pages, Fourth Workshop on Very Large Digital Libraries, 201
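The ETA flow can be sketched as a three-stage pipeline. The stage functions, record schema, and in-memory store below are invented for illustration; the paper's model targets real repositories and long-term storage, not a Python dict.

```python
# Hedged sketch of an Extract-Transform-Archive pipeline for legacy data.

def extract(source):
    """Pull raw records from a heterogeneous data source."""
    return [{"raw": item} for item in source]

def transform(records):
    """Normalise raw records into a uniform, indexable schema."""
    return [{"id": i, "payload": r["raw"].strip().lower()}
            for i, r in enumerate(records)]

def archive(records, store):
    """Persist transformed records, indexed by id for later retrieval."""
    for r in records:
        store[r["id"]] = r["payload"]
    return store

store = {}
archive(transform(extract(["  Legacy Dataset A ", "Legacy Dataset B"])), store)
```

Keeping the three stages as separate functions mirrors the model's point that extraction from heterogeneous sources, normalisation, and long-term storage are independent concerns that can each be mechanized and scaled on their own.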

    An Infrastructure for acquiring high quality semantic metadata

    Because metadata that underlies semantic web applications is gathered from distributed and heterogeneous data sources, it is important to ensure its quality (i.e., reduce duplicates, spelling errors, and ambiguities). However, current infrastructures that acquire and integrate semantic data have only marginally addressed the issue of metadata quality. In this paper we present our metadata acquisition infrastructure, ASDI, which pays special attention to ensuring that high quality metadata is derived. Central to the architecture of ASDI is a verification engine that relies on several semantic web tools to check the quality of the derived data. We tested our prototype in the context of building a semantic web portal for our lab, KMi. An experimental evaluation comparing the automatically extracted data against manual annotations indicates that the verification engine enhances the quality of the extracted semantic metadata.

    A posteriori metadata from automated provenance tracking: Integration of AiiDA and TCOD

    In order to make results of computational scientific research findable, accessible, interoperable and re-usable, it is necessary to decorate them with standardised metadata. However, there are a number of technical and practical challenges that make this process difficult to achieve in practice. Here the implementation of a protocol is presented to tag crystal structures with their computed properties, without the need of human intervention to curate the data. This protocol leverages the capabilities of AiiDA, an open-source platform to manage and automate scientific computational workflows, and TCOD, an open-access database storing computed materials properties using a well-defined and exhaustive ontology. Based on these, the complete procedure to deposit computed data in the TCOD database is automated. All relevant metadata are extracted from the full provenance information that AiiDA tracks and stores automatically while managing the calculations. Such a protocol also enables reproducibility of scientific data in the field of computational materials science. As a proof of concept, the AiiDA-TCOD interface is used to deposit 170 theoretical structures together with their computed properties and their full provenance graphs, consisting of over 4600 AiiDA nodes.
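The core of the protocol, deriving deposition metadata a posteriori by walking a provenance graph backwards from a result node, can be sketched as follows. The node names, attribute keys, and graph structure are invented for illustration; AiiDA's real provenance API and the TCOD ontology are considerably richer.

```python
# Hedged sketch: harvest metadata for a computed result by merging
# attributes from the result node and all of its provenance ancestors.

def collect_provenance(graph, node, seen=None):
    """graph: {node: {"attrs": {...}, "inputs": [parent_nodes]}}
    Returns the merged attributes of `node` and its ancestors;
    attributes nearer the result take precedence on key clashes."""
    if seen is None:
        seen = set()
    if node in seen:          # guard against revisiting shared ancestors
        return {}
    seen.add(node)
    merged = dict(graph[node].get("attrs", {}))
    for parent in graph[node].get("inputs", []):
        for key, value in collect_provenance(graph, parent, seen).items():
            merged.setdefault(key, value)
    return merged

# A toy three-node provenance chain: structure -> calculation -> energy.
graph = {
    "structure": {"attrs": {"formula": "Si2"}, "inputs": []},
    "calc": {"attrs": {"code": "quantum-espresso"}, "inputs": ["structure"]},
    "energy": {"attrs": {"total_energy_eV": -310.2}, "inputs": ["calc"]},
}
record = collect_provenance(graph, "energy")
```

Because every attribute in the deposition record is read off the stored graph rather than entered by hand, the same walk can be re-run at any time, which is what makes the tagging both automatic and reproducible.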

    Bibliopedia

    Bibliopedia is a tool that will perform advanced data mining and cross-referencing among secondary literature, primary texts, and original documents. It will search repositories such as JSTOR, Google Scholar, and Project MUSE for full-text citations that mention an original document, analyze the articles and books found, and save the results in a publicly accessible database that will form the basis of an online research collaboratory. The platform will also allow for human-machine collaboration to correct errors in metadata. Bibliopedia will also allow users to create browsable, customizable bibliographies of all the works cited by each article and book. Most importantly, it will perform automated textual analysis, data extraction, cross-referencing, and visualizations of the relationships between texts and authors. Our aim is to serve the research and pedagogical needs of the broadest possible range of humanities scholars.