5,879 research outputs found

    Syntax-driven argument identification and multi-argument classification for semantic role labeling

    Get PDF
    Semantic role labeling is an important stage in systems for Natural Language Understanding. The basic problem is one of identifying who did what to whom for each predicate in a sentence. Thus labeling is a two-step process: identify constituent phrases that are arguments to a predicate, then label those arguments with appropriate thematic roles. Existing systems for semantic role labeling use machine learning methods to assign roles one-at-a-time to candidate arguments. There are several drawbacks to this general approach. First, more than one candidate can be assigned the same role, which is undesirable. Second, the search for each candidate argument is exponential with respect to the number of words in the sentence. Third, single-role assignment cannot take advantage of dependencies known to exist between semantic roles of predicate arguments, such as their relative juxtaposition. And fourth, execution times for existing algorithm are excessive, making them unsuitable for real-time use. This thesis seeks to obviate these problems by approaching semantic role labeling as a multi-argument classification process. It observes that the only valid arguments to a predicate are unembedded constituent phrases that do not overlap that predicate. Given that semantic role labeling occurs after parsing, this thesis proposes an algorithm that systematically traverses the parse tree when looking for arguments, thereby eliminating the vast majority of impossible candidates. Moreover, instead of assigning semantic roles one at a time, an algorithm is proposed to assign all labels simultaneously; leveraging dependencies between roles and eliminating the problem of duplicate assignment. Experimental results are provided as evidence to show that a combination of the proposed argument identification and multi-argument classification algorithms outperforms all existing systems that use the same syntactic information

    Multi-argument classification for semantic role labeling

    Get PDF
    This paper describes a Multi-Argument Classification (MAC) approach to Semantic Role Labeling. The goal is to exploit dependencies between semantic roles by simultaneously classifying all arguments as a pattern. Argument identification, as a pre-processing stage, is carried at using the improved Predicate-Argument Recognition Algorithm (PARA) developed by Lin and Smith (2006). Results using standard evaluation metrics show that multi-argument classification, archieving 76.60 in F₁ measurement on WSJ 23, outperforms existing systems that use a single parse tree for the CoNLL 2005 shared task data. This paper also describes ways to significantly increase the speed of multi-argument classification, making it suitable for real-time language processing tasks that require semantic role labelling

    An Argument-Marker Model for Syntax-Agnostic Proto-Role Labeling

    Full text link
    Semantic proto-role labeling (SPRL) is an alternative to semantic role labeling (SRL) that moves beyond a categorical definition of roles, following Dowty's feature-based view of proto-roles. This theory determines agenthood vs. patienthood based on a participant's instantiation of more or less typical agent vs. patient properties, such as, for example, volition in an event. To perform SPRL, we develop an ensemble of hierarchical models with self-attention and concurrently learned predicate-argument-markers. Our method is competitive with the state-of-the art, overall outperforming previous work in two formulations of the task (multi-label and multi-variate Likert scale prediction). In contrast to previous work, our results do not depend on gold argument heads derived from supplementary gold tree banks.Comment: accepted at *SEM 201

    Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus

    Full text link
    Many efforts of research are devoted to semantic role labeling (SRL) which is crucial for natural language understanding. Supervised approaches have achieved impressing performances when large-scale corpora are available for resource-rich languages such as English. While for the low-resource languages with no annotated SRL dataset, it is still challenging to obtain competitive performances. Cross-lingual SRL is one promising way to address the problem, which has achieved great advances with the help of model transferring and annotation projection. In this paper, we propose a novel alternative based on corpus translation, constructing high-quality training datasets for the target languages from the source gold-standard SRL annotations. Experimental results on Universal Proposition Bank show that the translation-based method is highly effective, and the automatic pseudo datasets can improve the target-language SRL performances significantly.Comment: Accepted at ACL 202
    • …
    corecore