29 research outputs found

OntoGene in BioCreative II

Author: Clematide S
Hess M
Kaljurand K
Kappeler T
Klenner M
Parisot P
Rinaldi Fabio
Romacker M
Schneider G
Vachon T
von Allmen J M
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2008

BACKGROUND: Research scientists and companies working in the domains of biomedicine and genomics are increasingly faced with the problem of efficiently locating, within the vast body of published scientific findings, the critical pieces of information that are needed to direct current and future research investment. RESULTS: In this report we describe approaches taken within the scope of the second BioCreative competition in order to solve two aspects of this problem: detection of novel protein interactions reported in scientific articles, and detection of the experimental method that was used to confirm the interaction. Our approach to the former problem is based on a high-recall protein annotation step, followed by two strict disambiguation steps. The remaining proteins are then combined according to a number of lexico-syntactic filters, which deliver high-precision results while maintaining reasonable recall. The detection of the experimental methods is tackled by a pattern matching approach, which has delivered the best results in the official BioCreative evaluation. CONCLUSION: Although the results of BioCreative clearly show that no tool is sufficiently reliable for fully automated annotations, a few of the proposed approaches (including our own) already perform at a competitive level. This makes them interesting either as standalone tools for preliminary document inspection, or as modules within an environment aimed at supporting the process of curation of biomedical literature.

ZORA

Matching disease and phenotype ontologies in the ontology alignment evaluation initiative

Author: Alam-Faruque Y.
Harrow I.
Jimenez-Ruiz E.
Koch M.
Malone J.
Markel S.
Romacker M.
Splendiani A.
Waaler A.
Woollard P.
Publication venue: BMC
Publication date: 01/01/2017

Background: The disease and phenotype track was designed to evaluate the relative performance of ontology matching systems that generate mappings between source ontologies. Disease and phenotype ontologies are important for applications such as data mining, data integration and knowledge management to support translational science in drug discovery and understanding the genetics of disease. Results: Eleven systems (out of 21 OAEI participating systems) were able to cope with at least one of the tasks in the Disease and Phenotype track. The AML, FCA-Map, LogMap(Bio) and PhenoMF systems produced the top results for ontology matching in comparison to consensus alignments. Matching against manually curated mappings proved more difficult, most likely because these mapping sets comprised mostly subsumption relationships rather than equivalence. Manual assessment of unique equivalence mappings showed that the AML, LogMap(Bio) and PhenoMF systems had the highest precision. Conclusions: Four systems gave the highest performance for matching disease and phenotype ontologies. These systems coped well with the detection of equivalence matches, but struggled to detect semantic similarity. This deserves more attention in the future development of ontology matching systems. The findings of this evaluation show that such systems could help to automate equivalence matching in the workflow of curators, who maintain ontology mapping services in numerous domains such as disease and phenotype.
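
As a rough illustration of what comparing a system's mappings with a consensus or curated alignment can involve, the sketch below computes precision, recall and F-measure over sets of mappings. It is a minimal sketch under stated assumptions, not the track's actual evaluation procedure: each mapping is simplified to a pair of entity identifiers, only exact pair matches count as correct, and the identifiers (`D:1`, `P:1`, ...) and the function name `evaluate_alignment` are hypothetical.

```python
def evaluate_alignment(system, reference):
    """Compare a system's mappings with a reference alignment.

    Both arguments are iterables of (source_id, target_id) pairs.
    Only exact pair matches count as correct (a simplification).
    """
    system, reference = set(system), set(reference)
    correct = system & reference
    precision = len(correct) / len(system) if system else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical toy data: identifiers are made up for illustration only.
reference = {("D:1", "P:1"), ("D:2", "P:2"), ("D:3", "P:3"), ("D:4", "P:4")}
system = {("D:1", "P:1"), ("D:2", "P:2"), ("D:5", "P:9")}

precision, recall, f_measure = evaluate_alignment(system, reference)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

In this toy case two of the three proposed mappings appear in the reference, so precision is higher than recall. Exact-match scoring of this kind gives no credit for a proposed equivalence when the reference records a subsumption for the same pair.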

City Research Online

Crossref

ZENODO

Directory of Open Access Journals

The Novartis Repository

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

NORA - Norwegian Open Research Archives

Ontology mapping for semantically enabled applications

Author: Balakrishnan R.
Harrow I.
Jimenez-Ruiz E.
Jupp S.
Lomax J.
Reed J.
Romacker M.
Senger C.
Splendiani A.
Wilson J.
Woollard P.
Publication venue: Elsevier BV
Publication date: 01/01/2019

In this review, we provide a summary of recent progress in ontology mapping (OM) at a crucial time when biomedical research faces a deluge of data of increasing volume and variety. This is particularly important for realising the full potential of semantically enabled or enriched applications and for gaining meaningful insights, for example in drug discovery, using machine-learning technologies. We discuss challenges and solutions for better ontology mappings, as well as how to select ontologies before their application. In addition, we describe tools and algorithms for ontology mapping, including evaluation of tool capability and quality of mappings. Finally, we outline the requirements for an ontology mapping service (OMS) and the progress being made towards implementation of such sustainable services.

City Research Online

The Novartis Repository

An environment for relation mining over richly annotated corpora: the case of GENIA

Author: A Koike
A Mikheev
A Ratnaparkhi
A Yakushiji
C Friedman
D Hindle
D Lin
DPA Corney
F Rinaldi
Fabio Rinaldi
G Leroy
G Minnen
G Schneider
Gerold Schneider
J Carroll
J Hakenberg
J Kim
J Preiss
J Saric
JC Reynar
K Kaljurand
Kaarel Kaljurand
LJ Jensen
M Collins
M Huang
M Marcus
M Romacker
Martin Romacker
Michael Hess
N Daraselia
S Novichkova
S Riedel
T Rindflesch
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: The biomedical domain is witnessing rapid growth in the amount of published scientific results, which makes it increasingly difficult to filter the core information. There is a real need for support tools that 'digest' the published results and extract the most important information. RESULTS: We describe and evaluate an environment supporting the extraction of domain-specific relations, such as protein-protein interactions, from a richly-annotated corpus. We use full, deep-linguistic parsing and manually created, versatile patterns, expressing a large set of syntactic alternations, plus semantic ontology information. CONCLUSION: The experiments show that the approach described is capable of delivering high-precision results, while maintaining sufficient levels of recall. The high level of abstraction of the rules used by the system, which are considerably more powerful and versatile than finite-state approaches, allows speedy interactive development and validation.

Crossref

Springer - Publisher Connector

PubMed Central

ZORA

Quantification of growth factors in allogenic bone grafts extracted with three different methods

Author: A Hansen
A Pruss
A. Kadow-Romacker
A. Pruss
B Blum
B. Wildemann
C Hofmann
CE Pepene
CH Lohmann
DL Skaggs
EM Pinholt
G. Schmidmaier
H Schliephake
H Taira
JB Stiehl
JS Silber
M Laitinen
M Lind
M Sprossig
ME Bolander
MF Moreau
MG Rock
N Kubler
N. P. Haas
P Aspenberg
P Wutzler
Q Zhang
R Kuhls
R Versen von
RC Sasso
RW Bright
S Honsawek
T Albrektsson
TK Sampath
U Ripamonti
U Thielicke
W Haynert
Publication venue: Springer Netherlands
Publication date: 01/01/2006

Crossref

Springer - Publisher Connector

PubMed Central

The gene normalization task in BioCreative III

Author: A McCallum
AA Morgan
AP Dawid
AS Schwartz
B Settles
B Turner
C Lindberg
Cheng-Ju Kuo
Chih-Hsuan Wei
Chun-Nan Hsu
CN Hsu
D Hong-Jie
D Rebholz-Schuhmann
David Campos
DD Lewis
Dina Vishnyakova
E Agirre
F Leitner
F Rinaldi
Fabio Rinaldi
Feifan Liu
H Liu
Han-Cheol Cho
HD Carroll
Hong-Jie Dai
Hongfang Liu
Hung-Yu Kao
Illes Solt
J Hakenberg
J Whitechill
Jingchen Liu
Karin Verspoor
Kevin M Livingston
KG Dowell
L Hirschman
L Smith
M Ashburner
M Gerner
M Hall
M Huang
Manabu Torii
Martin Gerner
Martin Romacker
ME Colosimo
Minlie Huang
Naoaki Okazaki
P Donmez
P Ruch
P Smyth
P Welinder
Padmini Srinivasan
Patrick Ruch
R Leaman
R Snow
Richard Tzong-Han Tsai
S Bhattacharya
S Brin
S Matos
S Sarntivijai
Sanmitra Bhattacharya
Sergio Matos
Shashank Agarwal
T Kappeler
T Zhang
TH Haveliwala
VC Raykar
VS Sheng
W John Wilbur
X Wang
Z Lu
Zhiyong Lu
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
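
The improvement figures quoted in the abstract can be checked with a short calculation. The sketch below assumes that "improvement" means relative gain, (composite − best team) / best team, which the abstract does not state explicitly. The scores are copied from the abstract and the variable names are illustrative.

```python
# TAP-k scores on the gold standard, copied from the abstract above.
best_team = {5: 0.3297, 10: 0.3538, 20: 0.3535}
composite = {5: 0.3707, 10: 0.4311, 20: 0.4477}


def relative_improvement(new, old):
    """Percentage gain of `new` over `old` (assumed definition)."""
    return (new - old) / old * 100.0


improvements = {
    k: round(relative_improvement(composite[k], best_team[k]), 1)
    for k in best_team
}

for k, gain in improvements.items():
    print(f"k={k}: {gain}%")
```

Under that assumption the computed gains round to 12.4%, 21.8% and 26.6%, matching the reported values.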

Crossref

Springer - Publisher Connector

PubMed Central

ZORA

OntoGene web services for biomedical text mining

Author: A Davis
AA Morgan
AJ Williams
AR Aronson
C Arighi
C Jonquet
C Stark
CN Arighi
D Campos
D Ferrucci
D Maglott
D Rebholz-Schuhmann
DC Comeau
F Leitner
F Rinaldi
Fabio Rinaldi
G Schneider
GK Savova
H Cunningham
H Hermjakob
Hernani Marques
I Androutsopoulos
I Segura-Bedmar
J Hakenberg
J Kim
JD Kim
K Dolinski
K Haverinen
K Kaljurand
K Sangkuhl
KB Cohen
L Richardson
L Tanabe
M Craven
M Krallinger
M Mintz
Martin Romacker
R Hoffmann
Raul Rodriguez-Esteban
S Clematide
S Federhen
S Gama-Castro
Simon Clematide
T Consortium
T Kappeler
Tilia Ellendorff
W Liu
W Sun
X Wang
Publication venue: Springer Science and Business Media LLC

Crossref

ODIN: an advanced interface for the curation of biomedical literature

Author: Clematide S
Rinaldi Fabio
Romacker M
Schneider G
Vachon Th
Publication venue: Springer Science and Business Media LLC
Publication date: 14/10/2010

We present ODIN (Ontogene Document INspector): a system for interactive curation of biomedical literature, developed within the scope of the SASEBio project (Semi-Automated Semantic Enrichment of the Biomedical Literature), as a collaboration between the OntoGene group at the University of Zurich and the NITAS/TMS group of Novartis Pharma AG. The purpose of the system is to allow a human annotator/curator to leverage the results of an advanced text mining system in order to enhance the speed and effectiveness of the annotation process. The OntoGene system takes as input a document (e.g. a full paper from PubMed Central) and processes it with a custom NLP pipeline, which includes named entity recognition and relation extraction. Entities which are currently supported include proteins, genes, experimental methods, cell lines, and species. Entities detected in the input document are disambiguated with respect to a reference database (UniProt, EntrezGene, NCBI taxonomy, PSI-MI ontology). The annotated documents are handed back to the ODIN interface, which allows multiple display modalities. The curator/annotator can view the whole document with in-line annotations highlighted, or can browse the extracted entities and be pointed back to the mentions of the entities within the original document. All entity mentions are entirely editable: the curator can easily add or delete any of them, and also change their extent (i.e. add/remove words to their right or left) with a simple click of the mouse. Different entity views are supported, with sorting capabilities according to different criteria (entity type, entity mention, confidence score, etc.). Selective highlighting of text units (e.g. sentences containing desired entities) is supported. Additionally, extensive logging functionalities are provided. All documents and entities are fully interlinked to reference databases, for the purpose of simplified inspection. Entities can be grouped in classes (e.g. by species) and actions can be applied to whole classes, for selective editing or removal.

Crossref

ZORA