Automated annotation of chemical names in the literature with tunable accuracy

Author: A Copestake
AR Aronson
C Kolarik
CE Lipscomb
DL Banville
DM Jassop
E Bolton
Evan E Bolton
GG Chowdhury
JD Wren
Jun D Zhang
KM Hettne
Lewis Y Geer
P Corbett
PT Corbett
R Klinger
Stephen H Bryant
WJ Wilbur
YY Zhou
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract

Background: A significant portion of the biomedical and chemical literature refers to small molecules. The accurate identification and annotation of compound names that are relevant to the topic of the given literature can establish links between scientific publications and various chemical and life science databases. Manual annotation is the preferred method for this task because well-trained indexers can understand the paper topics as well as recognize key terms. However, considering the hundreds of thousands of new papers published annually, an automatic annotation system with high precision and relevance can be a useful complement to manual annotation.

Results: An automated chemical name annotation system, MeSH Automated Annotations (MAA), was developed to annotate small molecule names in scientific abstracts with tunable accuracy. This system aims to reproduce the MeSH term annotations on biomedical and chemical literature that would be created by indexers. When automated free-text matching was compared with the manual indexing of 26 thousand MEDLINE abstracts, more than 40% of the annotations were false-positive (FP) cases. To reduce the FP rate, MAA incorporated several filters to remove "incorrect" annotations caused by nonspecific, partial, and low-relevance chemical names. In part, relevance was measured by the position of the chemical name in the text. Tunable accuracy was obtained by adding or restricting the sections of the text scanned for chemical names. The best precision obtained was 96% with a 28% recall rate. The best performance of MAA, as measured with the F statistic, was 66%, which compares favorably to other chemical name annotation systems.

Conclusions: Accurate chemical name annotation can help researchers not only identify important chemical names in abstracts, but also match unindexed and unstructured abstracts to chemical records. The current work is tested against MEDLINE, but the algorithm is not specific to this corpus, and it may be applicable to papers from chemical physics, materials, polymer and environmental science, as well as patents, biological assay descriptions and other textual data.
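The abstract quotes precision, recall and an F statistic without saying how they relate. As a minimal sketch, assuming the F statistic is the usual balanced F-measure (the abstract does not state which variant was used), the relationship can be computed as follows. The function name `f_measure` is illustrative and not taken from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (balanced F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# The high-precision setting quoted in the abstract: 96% precision, 28% recall.
# Assumption: balanced F1; the paper may define its F statistic differently.
high_precision_f1 = f_measure(0.96, 0.28)
print(f"{high_precision_f1:.3f}")  # prints 0.434
```

Under this assumption the high-precision setting scores well below the reported best of 66%, so the best F value would belong to a different, more balanced precision/recall setting that the abstract does not spell out.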

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Engineering polymer informatics: Towards the computer-aided design of polymers

Author: Adams
Ai
Berners-Lee
Bicerano
Blower
Brooksbank
Carrell
Chowdhury
Corbett
Cuchelkar
Davies
Degtyarenko
Elias
Feldman
Fleischmann
Frenkel
Frey
Gkoutos
Gordon
Hamoudeh
Herz
Holliday
Hoogenboom
Jenkins
Kanehisa
Kang
Kataoka
Keener
Lamport
Ma
Malmsten
Meier
Metanomski
Murray-Rust
Putnam
Rieder
Sankar
Schmaljohann
Service
Studer
Taylor
van der Vet
van Krevelen
Wagner
Weininger
Wiesbrock
Wilks
Wu
Zamora
Zhang
Publication venue: MACROMOL RAPID COMM
Publication date: 27/03/2008

The computer-aided design of polymers is one of the holy grails of modern chemical informatics and of significant interest for a number of communities in polymer science. The paper outlines a vision for the in silico design of polymers and presents an information model for polymers based on modern semantic web technologies, thus laying the foundations for achieving the vision.

Crossref

Apollo (Cambridge)

Chemical bibliographic databases: the influence of term indexing policies on topic searches

Author: Boyrie Fabrice
Niel Gilles
Virieux David
Publication venue: Royal Society of Chemistry (RSC)
Publication date: 28/08/2015

A comparative study of the three main chemical information systems (SciFinder, Web of Science and Scopus) was performed by studying the indexing policies of titles, abstracts and keywords within selected literature articles. Various chemical expressions were introduced as topic searches to illustrate the different search tools related to term indexing. The resulting article lists were compared pairwise by means of a script designed to identify the references common to both lists and those specific to each platform. Analysis of the platform-specific lists reveals that only partial coverage of the references should be expected when querying a single platform. The discussion covers the term and keyword indexing policies and their influence on the retrievability of references in general and of highly cited papers in particular.
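The pairwise comparison described in this abstract can be sketched with set operations. This is only an illustration under stated assumptions: the authors' script is not reproduced in the listing, the function name `compare_pairwise` is hypothetical, and the reference identifiers below are invented placeholders rather than real search results.

```python
from itertools import combinations


def compare_pairwise(results: dict[str, set[str]]) -> dict[tuple[str, str], dict[str, set[str]]]:
    """For every pair of platforms, split references into common and platform-specific sets."""
    report = {}
    for first, second in combinations(sorted(results), 2):
        report[(first, second)] = {
            "common": results[first] & results[second],
            "only_first": results[first] - results[second],
            "only_second": results[second] - results[first],
        }
    return report


# Placeholder identifiers (made up for illustration, not real query output).
hits = {
    "SciFinder": {"ref-1", "ref-2", "ref-3"},
    "Scopus": {"ref-2", "ref-3", "ref-4"},
    "Web of Science": {"ref-3", "ref-5"},
}

for pair, parts in compare_pairwise(hits).items():
    print(pair, {name: sorted(refs) for name, refs in parts.items()})
```

A real comparison would first have to normalize the records from each platform to a shared identifier (for example a DOI) before the set operations become meaningful.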

HAL Descartes

Chemoinformatics Research at the University of Sheffield: A History and Citation Analysis

Author: Bishop N.
Gillet V.J.
Holliday J.D.
Willett P.
Publication venue: SAGE Publications
Publication date: 01/07/2003

This paper reviews the work of the Chemoinformatics Research Group in the Department of Information Studies at the University of Sheffield, focusing particularly on the work carried out in the period 1985-2002. Four major research areas are discussed, these involving the development of methods for: substructure searching in databases of three-dimensional structures, including both rigid and flexible molecules; the representation and searching of the Markush structures that occur in chemical patents; similarity searching in databases of both two-dimensional and three-dimensional structures; and compound selection and the design of combinatorial libraries. An analysis of citations to 321 publications from the Group shows that it attracted a total of 3725 residual citations during the period 1980-2002. These citations appeared in 411 different journals, and involved 910 different citing organizations from 54 different countries, thus demonstrating the widespread impact of the Group's work.

Crossref

White Rose Research Online

Dagstuhl Reports : Volume 1, Issue 2, February 2011

Author: Schloss Dagstuhl Leibniz-Zentrum für Informatik
Publication date: 09/09/2011

Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Youn

Hochschulschriftenserver - Universität Frankfurt am Main

Knowledge-based Biomedical Data Science 2019

Author: Callahan Tiffany J.
Hunter Lawrence E.
Pielke-Lombardo Harrison
Tripodi Ignacio J.
Publication date: 08/10/2019

Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.

arXiv.org e-Print Archive

Self-Evaluation Applied Mathematics 2003-2008 University of Twente

Author: Mouthaan A.J.
Vegt J.J.W. van der
Publication venue: Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente
Publication date: 01/01/2009

This report contains the self-study for the research assessment of the Department of Applied Mathematics (AM) of the Faculty of Electrical Engineering, Mathematics and Computer Science (EEMCS) at the University of Twente (UT). The report provides the information for the Research Assessment Committee for Applied Mathematics, dealing with mathematical sciences at the three universities of technology in the Netherlands. It describes the state of affairs pertaining to the period 1 January 2003 to 31 December 2008.

University of Twente Research Information

Data Base Mapping Model and Search Scheme to Facilitate Resource Sharing: Volume 1, Mapping of Chemical Data Bases and Mapping of Data Base Data Elements Using a Rational Data Base Structure

Author: MacLaury Keith
Preece Scott E.
Rouse Sandra H.
Williams Martha E.
Publication venue: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
Publication date: 01/12/1977

Coordinated Science Laboratory was formerly known as Control Systems Laboratory.
National Science Foundation / NSF SIS 74-1855

Illinois Digital Environment for Access to Learning and Scholarship Repository

A survey of chemical information systems

Author: Dominick Wayne D.
Shaikh Aneesa Bashir

A survey of the features, functions, and characteristics of a fairly wide variety of chemical information storage and retrieval systems currently in operation is given. The types of systems (together with an identification of the specific systems) addressed within this survey are as follows: patents and bibliographies (Derwent's Patent System; IFI Comprehensive Database; PULSAR); pharmacology and toxicology (Chemfile; PAGODE; CBF; HEEDA; NAPRALERT; MAACS); the chemical information system (CAS Chemical Registry System; SANSS; MSSS; CSEARCH; GINA; NMRLIT; CRYST; XTAL; PDSM; CAISF; RTECS Search System; AQUATOX; WDROP; OHMTADS; MLAB; Chemlab); spectra (OCETH; ASTM); crystals (CRYSRC); and physical properties (DETHERM). Summary characteristics and current trends in chemical information systems development are also examined.

NASA Technical Reports Server