1,288 research outputs found

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication date: 01/01/2007

Edinburgh Research Explorer

The Hessian Method (Highly Efficient State Smoothing, In a Nutshell)

Author: McCausland William J.
Publication venue: Université de Montréal. Département de sciences économiques.
Publication date: 01/03/2008

Dépôt Institutionnel Numérique

Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision

Author: Nadeau David
Publication date: 28/11/2007

Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts.

CogPrints Cognitive Sciences Eprint Archive

Entity recognition in the biomedical domain using a hybrid approach

Author: A Tharatipyakul
C Funk
CD Paice
CS Funk
D Campos
D Koning
D Maglott
D Szklarczyk
DM Jessop
E Pafilis
E Tseytlin
F Rinaldi
G Sheikhshab
K Degtyarenko
K Eilbeck
K Verspoor
M Ashburner
M Bada
M Basaldella
MF Porter
N Pudota
P Lopez
PD Turney
R Core Team
R Leaman
S Aubin
S Eltyeb
S Tulkens
SA Akhondi
T Groza
T Munkhdalai
U Leser
Y Sasaki
Publication venue: Springer Science and Business Media LLC

Crossref

Higher-order inference in conditional random fields using submodular functions

Author: Pansari Pankaj
Publication date: 08/12/2023

Higher-order and dense conditional random fields (CRFs) are expressive graphical models which have been very successful in low-level computer vision applications such as semantic segmentation and stereo matching. These models are able to capture long-range interactions and higher-order image statistics much better than pairwise CRFs. This expressive power comes at a price, though: inference problems in these models are computationally very demanding. This is a particular challenge in computer vision, where fast inference is important and the problem involves millions of pixels. In this thesis, we look at how submodular functions can help us design efficient inference methods for higher-order and dense CRFs. Submodular functions are special discrete functions that have important properties from an optimisation perspective, and are closely related to convex functions. We use submodularity in a two-fold manner: (a) to design an efficient MAP inference algorithm for a robust higher-order model that generalises the widely-used truncated convex models, and (b) to glean insights into a recently proposed variational inference algorithm, which gives us a principled approach for applying it efficiently to higher-order and dense CRFs.

Oxford University Research Archive

Named Entity Recognition for Bacterial Type IV Secretion Systems

Author: Ananiadou Sophia
Black William
Gillespie Joseph J.
Kolluru BalaKrishna
Levow Gina-Anne
Mao Chunhong
Pyysalo Sampo
Sobral Bruno
Sullivan Dan
Tsujii Junichi
Publication venue: Public Library of Science
Publication date: 01/01/2011

Research on specialized biological systems is often hampered by a lack of consistent terminology, especially across species. In bacterial Type IV secretion systems, genes within one set of orthologs may have over a dozen different names. Classifying research publications based on biological processes, cellular components, molecular functions, and microorganism species should improve the precision and recall of literature searches, allowing researchers to keep up with the exponentially growing literature through resources such as the Pathosystems Resource Integration Center (PATRIC, patricbrc.org). We developed named entity recognition (NER) tools for four entities related to Type IV secretion systems: 1) bacteria names, 2) biological processes, 3) molecular functions, and 4) cellular components. These four entities are important to pathogenesis and virulence research but have received less attention than other entities, e.g., genes and proteins. Based on an annotated corpus, large domain terminological resources, and machine learning techniques, we developed recognizers for these entities. High accuracy rates (>80%) are achieved for bacteria, biological processes, and molecular function. Contrastive experiments highlighted the effectiveness of alternate recognition strategies; results of term extraction on contrasting document sets demonstrated the utility of these classes for identifying T4SS-related documents.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

The University of Manchester - Institutional Repository