Search CORE

1,083 research outputs found

Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition

Author: Sang Erik F. Tjong Kim
Publication venue
Publication date: 01/01/2002
Field of study

We describe the CoNLL-2002 shared task: language-independent named entity recognition. We give background information on the data sets and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.Comment: 4 page

arXiv.org e-Print Archive

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition

Author: De Meulder Fien
Sang Erik F. Tjong Kim
Publication venue
Publication date: 01/01/2003
Field of study

We describe the CoNLL-2003 shared task: language-independent named entity recognition. We give background information on the data sets (English and German) and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance

arXiv.org e-Print Archive

CiteSeerX

Tilburg University Repository

Implicit reference to citations: a study of astronomy

Author: Kim Yunhyong
Webber Bonnie
Publication venue
Publication date: 05/10/2006
Field of study

The research in this paper presents results in the automatic classification of pronouns within articles into those which refer to cited research and those which do not. It also discusses the automatic linking of pronouns which do refer to citations to their corresponding citations. The current study focused on the pronoun they as used in papers in Astronomy journals. The paper describes a classifier trained on maximum entropy principles using features defined by the distance to preceding citations and the category of verbs associated to the pronoun under consideration

Enlighten

Exploring the boundaries: gene and protein identification in biomedical text

Author: Beatrice Alex
Bmc Bioinformatics
Christopher D Manning
Claire Grover
Jenny Finkel
Malvina Nissim
Shipra Dingare
Publication venue: BioMed Central
Publication date: 01/01/2004
Field of study

Background: Good automatic information extraction tools offer hope for automatic processing of the exploding biomedical literature, and successful named entity recognition is a key component for such tools. Methods: We present a maximum-entropy based system incorporating a diverse set of features for identifying gene and protein names in biomedical abstracts. Results: This system was entered in the BioCreative comparative evaluation and achieved a precision of 0.83 and recall of 0.84 in the “open ” evaluation and a precision of 0.78 and recall of 0.85 in the “closed ” evaluation. Conclusions: Central contributions are rich use of features derived from the training data at multiple levels of granularity, a focus on correctly identifying entity boundaries, and the innovative use of several external knowledge sources including full MEDLINE abstracts and web searches. Background The explosion of information in the biomedical domain and particularly in genetics has highlighted the need for automated text information extraction techniques. MEDLINE, the primary research database serving the biomedical community, currently contains over 14 million abstracts, with 60,000 new abstracts appearing each month. There is also an impressive number of molecular biological databases covering a

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication venue
Publication date: 01/01/2007
Field of study

Edinburgh Research Explorer