Robust Subgraph Generation Improves Abstract Meaning Representation Parsing

Author: Angeli Gabor
Manning Christopher
Werling Keenon
Publication date: 09/06/2015

The Abstract Meaning Representation (AMR) is a representation for open-domain rich semantics, with potential use in fields like event extraction and machine translation. Node generation, typically done using a simple dictionary lookup, is currently an important limiting factor in AMR parsing. We propose a small set of actions that derive AMR subgraphs by transformations on spans of text, which allows for more robust learning of this stage. Our set of construction actions generalizes better than the previous approach, and can be learned with a simple classifier. We improve on the previous state-of-the-art result for AMR parsing, boosting end-to-end performance by 3 F1 on both the LDC2013E117 and LDC2014T12 datasets. Comment: To appear in ACL 201

arXiv.org e-Print Archive

CiteSeerX

Semi-automatic conversion of BioProp semantic annotation to PASBio annotation

Author: AL Berger
B Santorini
C Warner
Chi-Hsin Huang
D Dowty
E Charniak
H-J Dai
Hong-Jie Dai
KB Cohen
M Collins
M Palmer
O Babko-Malaya
PK Shah
R Hoernig
RA Hudson
Richard Tzong-Han Tsai
RT-H Tsai
S Pradhan
T Wattarujeekrit
V Punyakanok
W-C Chou
Wen-Lian Hsu
X Carreras
Y Kogan
Y Tateisi
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract

Background: Semantic role labeling (SRL) is an important text analysis technique. In SRL, sentences are represented by one or more predicate-argument structures (PAS). Each PAS is composed of a predicate (verb) and several arguments (noun phrases, adverbial phrases, etc.) with different semantic roles, including main arguments (agent or patient) as well as adjunct arguments (time, manner, or location). PropBank is the most widely used PAS corpus and annotation format in the newswire domain. In the biomedical field, however, more detailed and restrictive PAS annotation formats such as PASBio are popular. Unfortunately, due to the lack of an annotated PASBio corpus, no publicly available machine-learning (ML) based SRL systems based on PASBio have been developed. In previous work, we constructed a biomedical corpus based on the PropBank standard called BioProp, on which we developed an ML-based SRL system, BIOSMILE. In this paper, we aim to build a system to convert BIOSMILE's BioProp annotation output to PASBio annotation. Our system consists of BIOSMILE in combination with a BioProp-PASBio rule-based converter, and an additional semi-automatic rule generator.

Results: Our first experiment evaluated our rule-based converter's performance independently from BIOSMILE performance. The converter achieved an F-score of 85.29%. The second experiment evaluated the combined system (BIOSMILE + rule-based converter). The system achieved an F-score of 69.08% for PASBio's 29 verbs.

Conclusion: Our approach allows PAS conversion between BioProp and PASBio annotation using BIOSMILE alongside our newly developed semi-automatic rule generator and rule-based converter. Our system can match the performance of other state-of-the-art domain-specific ML-based SRL systems and can be easily customized for PASBio application development.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Hard constraints for grammatical function labelling

Author: Kuhn Jonas
Rehbein Ines
Seeker Wolfgang
van Genabith Josef
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2010

For languages with (semi-)free word order (such as German), labelling grammatical functions on top of phrase-structural constituent analyses is crucial for making them interpretable. Unfortunately, most statistical classifiers consider only local information for function labelling and fail to capture important restrictions on the distribution of core argument functions such as subject, object, etc., namely that there is at most one subject (etc.) per clause. We augment a statistical classifier with an integer linear program imposing hard linguistic constraints on the solution space output by the classifier, capturing global distributional restrictions. We show that this improves labelling quality, in particular for argument grammatical functions, in an intrinsic evaluation, and, importantly, grammar coverage for treebank-based (Lexical-Functional) grammar acquisition and parsing, in an extrinsic evaluation.
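The "at most one subject per clause" restriction can be illustrated with a small sketch. This is not the authors' integer linear program: it swaps in a brute-force search over label assignments, which is only practical for a handful of constituents, and the label names, the scores, and the function `best_labelling` are all hypothetical.

```python
from itertools import product

# Assumed set of functions that may occur at most once per clause.
UNIQUE_LABELS = {"SUBJ", "OBJ"}

def best_labelling(scores, unique_labels=UNIQUE_LABELS):
    """Return the highest-scoring label sequence that respects the constraint.

    scores: one dict per constituent, mapping label -> classifier score.
    Brute-force stand-in for an ILP solver (illustration only).
    """
    labels = sorted({label for s in scores for label in s})
    best, best_total = None, float("-inf")
    for assignment in product(labels, repeat=len(scores)):
        # Hard constraint: each unique label is used at most once.
        if any(assignment.count(u) > 1 for u in unique_labels):
            continue
        total = sum(s.get(l, float("-inf")) for s, l in zip(scores, assignment))
        if total > best_total:
            best, best_total = assignment, total
    return list(best) if best is not None else None

# Made-up scores for three constituents of one clause.
clause_scores = [
    {"SUBJ": 0.9, "OBJ": 0.1, "MOD": 0.0},
    {"SUBJ": 0.6, "OBJ": 0.5, "MOD": 0.1},
    {"SUBJ": 0.1, "OBJ": 0.2, "MOD": 0.7},
]

# Purely local decisions pick SUBJ twice, which violates the constraint.
local = [max(s, key=s.get) for s in clause_scores]
constrained = best_labelling(clause_scores)
```

With these made-up scores, the local decisions label the first two constituents as subjects, while the constrained search relabels the second one as an object. An ILP formulation expresses the same restriction as linear inequalities over binary indicator variables and scales to larger clauses.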

Irish Universities

DCU Online Research Access Service

Emotion Detection in Textual Information by Semantic Role Labeling and Web Mining Techniques

Author: Cruz-Lara Samuel
Hong Jen-Shin
Lu Cheng-Yu
Publication venue: HAL CCSD
Publication date: 28/03/2006

Automatic emotion detection in textual information is critical for the development of intelligent interfaces in many interactive multimedia applications. In the literature, existing approaches based on keyword spotting or statistical natural language processing techniques have a limited success rate in free-text emotion sensing applications. In this paper, we describe a system, developed in the framework of the National ChiNan University and LORIA collaboration, that combines semantic labeling and web mining techniques to detect several basic emotions. A common-sense knowledge base, ConceptNet, is also used to retrieve additional contextual information that can be used to select appropriate background images for the presentation. Our objective is to adapt a multimedia presentation by detecting emotions contained in the textual information.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees

Author: Barzilay Regina
Globerson Amir
Jaakkola Tommi S.
Lei Tao
Zhang Yuan
Publication venue: Association for Computational Linguistics (ACL)
Publication date: 01/01/2014

Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions. In contrast, we demonstrate that highly expressive scoring functions can be used with substantially simpler inference procedures. Specifically, we introduce a sampling-based parser that can easily handle arbitrary global features. Inspired by SampleRank, we learn to take guided stochastic steps towards a high-scoring parse. We introduce two samplers for traversing the space of trees, Gibbs and Metropolis-Hastings with Random Walk. The model outperforms state-of-the-art results when evaluated on 14 languages of non-projective CoNLL datasets. Our sampling-based approach naturally extends to joint prediction scenarios, such as joint parsing and POS correction. The resulting method outperforms the best reported results on the CATiB dataset, approaching performance of parsing with gold tags.

United States. Multidisciplinary University Research Initiative (W911NF-10-1-0533); United States. Defense Advanced Research Projects Agency. Broad Operational Language Translation; United States-Israel Binational Science Foundation (Grant 2012330

CiteSeerX

DSpace@MIT

Crossref