Adapting a relation extraction pipeline for the BioCreAtIvE II task

Author: Grover Claire
Haddow Barry
Klein Ewan
Matthews Michael
Nielsen Leif Arda
Tobin Richard
Wang Xinglong
Publication date: 01/01/2007

Dublin City University at QA@CLEF 2008

Author: Adafre Sisay Fissaha
van Genabith Josef
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

We describe our participation in Multilingual Question Answering at CLEF 2008, using German and English as our source and target languages, respectively. The system was built using UIMA (Unstructured Information Management Architecture) as the underlying framework.

Crossref

DCU Online Research Access Service

Learning Parse and Translation Decisions From Examples With Rich Context

Author: Hermjakob Ulf
Mooney Raymond J.
Publication date: 01/01/1997

We present a knowledge and context-based system for parsing and translating natural language and evaluate it on sentences from the Wall Street Journal. Applying machine learning techniques, the system uses parse action examples acquired under supervision to generate a deterministic shift-reduce parser in the form of a decision structure. It relies heavily on context, as encoded in features which describe the morphological, syntactic, semantic and other aspects of a given parse state. Comment: 8 pages, LaTeX, 3 postscript figures, uses aclap.st

arXiv.org e-Print Archive

CiteSeerX

Crossref

African language technology: the data-driven perspective

Author: De Pauw Guy
de Schryver Gilles-Maurice
Publication venue: European Academy of Applied and Social Sciences (EURAASS)
Publication date: 01/01/2009

Ghent University Academic Bibliography

Institutional Repository Universiteit Antwerpen

Towards a machine-learning architecture for lexical functional grammar parsing

Author: Chrupała Grzegorz
Publication venue: Dublin City University. School of Computing
Publication date: 01/11/2008

Data-driven grammar induction aims at producing wide-coverage grammars of human languages. Initial efforts in this field produced relatively shallow linguistic representations such as phrase-structure trees, which only encode constituent structure. Recent work on inducing deep grammars from treebanks addresses this shortcoming by also recovering non-local dependencies and grammatical relations. My aim is to investigate the issues arising when adapting an existing Lexical Functional Grammar (LFG) induction method to a new language and treebank, and find solutions which will generalize robustly across multiple languages. The research hypothesis is that by exploiting machine-learning algorithms to learn morphological features, lemmatization classes and grammatical functions from treebanks we can reduce the amount of manual specification and improve robustness, accuracy and domain- and language-independence for LFG parsing systems. Function labels can often be relatively straightforwardly mapped to LFG grammatical functions. Learning them reliably permits grammar induction to depend less on language-specific LFG annotation rules. I therefore propose ways to improve acquisition of function labels from treebanks and translate those improvements into better-quality f-structure parsing. In a lexicalized grammatical formalism such as LFG a large amount of syntactically relevant information comes from lexical entries. It is, therefore, important to be able to perform morphological analysis in an accurate and robust way for morphologically rich languages. I propose a fully data-driven supervised method to simultaneously lemmatize and morphologically analyze text and obtain competitive or improved results on a range of typologically diverse languages.

Irish Universities

DCU Online Research Access Service

FREQUENCY IN MORPHOLOGY

Author: Kornai András
Publication date: 01/01/1992

calls the traditional rule-based view of grammar into question. These authors emphasize that grammatical rule systems aiming at syntax-directed translation, and even rule systems aimed at the description of a single language, break down when faced with the actual complexit

CiteSeerX

SZTAKI Publication Repository

Morphological Analysis as Classification: an Inductive-Learning Approach

Author: Bosch Antal van den
Daelemans Walter
Weijters Ton
Publication date: 01/01/1996

Morphological analysis is an important subtask in text-to-speech conversion, hyphenation, and other language engineering tasks. The traditional approach to performing morphological analysis is to combine a morpheme lexicon, sets of (linguistic) rules, and heuristics to find a most probable analysis. In contrast we present an inductive learning approach in which morphological analysis is reformulated as a segmentation task. We report on a number of experiments in which five inductive learning algorithms are applied to three variations of the task of morphological analysis. Results show (i) that the generalisation performance of the algorithms is good, and (ii) that the lazy learning algorithm IB1-IG performs best on all three tasks. We conclude that lazy learning of morphological analysis as a classification task is indeed a viable approach; moreover, it has the strong advantages over the traditional approach of avoiding the knowledge-acquisition bottleneck, being fast and deterministic in learning and processing, and being language-independent. Comment: 11 pages, 5 encapsulated postscript figures, uses non-standard NeMLaP proceedings style nemlap.sty; inputs ipamacs (international phonetic alphabet) and epsf macro

arXiv.org e-Print Archive

CiteSeerX

Institutional Repository Universiteit Antwerpen

Tilburg University Repository

Towards a Protein-Protein Interaction information extraction system: recognizing named entities

Author: Danger Roxana
Pla Ferran
Molina Antonio
Rosso Paolo
Publication venue: Elsevier BV
Publication date: 01/02/2014

The majority of biological functions of any living being are related to Protein-Protein Interactions (PPI). PPI discoveries are reported in the form of research publications whose volume grows day after day. Consequently, automatic PPI information extraction systems are a pressing need for biologists. In this paper we are mainly concerned with the named entity detection module of PPIES (the PPI information extraction system we are implementing), which recognizes twelve entity types relevant in the PPI context. It is composed of two sub-modules: a dictionary look-up with extensive normalization and acronym detection, and a Conditional Random Field classifier. The dictionary look-up module has been tested on the Interaction Method Task (IMT), where it improves by approximately 10% on the current solutions that do not use Machine Learning (ML). The second module has been used to create a classifier using the Joint Workshop on Natural Language Processing in Biomedicine and its Applications (JNLPBA 04) data set. It does not use any external resources, or complex or ad hoc post-processing, and obtains 77.25%, 75.04% and 76.13% for precision, recall, and F1-measure, respectively, improving all previous results obtained for this data set.

This work has been funded by MICINN, Spain, as part of the "Juan de la Cierva" Program and the Project DIANA-Applications (TIN2012-38603-C02-01), as well as by the European Commission as part of the WIQ-EI IRSES Project (Grant No. 269180) within the FP 7 Marie Curie People Framework.

Danger Mercaderes, RM.; Pla Santamaría, F.; Molina Marco, A.; Rosso, P. (2014). Towards a Protein-Protein Interaction information extraction system: recognizing named entities. Knowledge-Based Systems. 57:104-118. https://doi.org/10.1016/j.knosys.2013.12.010

Crossref

RiuNet