The Scope and the Sources of Variation in Verbal Predicates in English and French

Author: Kashaeva Goljihan
Merlo Paola
Samardžić Tanja
van der Plas Lonneke
Publication date: 01/12/2010

Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 199-210. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

DSpace at Tartu University Library

Resource Interoperability for Sustainable Benchmarking: The Case of Events

Author: Aroyo L.M.
Inel O.A.
Morante Vallejo R.
van Son C.M.
Vossen P.T.J.M.
Publication venue: European Language Resources Association (ELRA)
Publication date: 01/01/2018

VU Research Portal

Nominalization and Alternations in Biomedical Language

Author: Adam Meyers
Barbara H. Partee
Ben Goertzel
Beth Levin
Carol Friedman
Charles J. Fillmore
Christiane Fellbaum
Deborah A. Dahl
Douglas Biber
George Dunham
George Hripcsak
Gondy Leroy
James Pustejovsky
Jin-Dong Kim
JM Ko
John Lehrberger
Jonathan Schuman
K. Bretonnel Cohen
Karin Verspoor
Laurie Bauer
Lawrence Hunter
Lynette Hirschman
M Narayanaswamy
Malka Rappaport-Hovav
Maria Koptjevskaja-Tamm
Martha Palmer
Martin F. Porter
Michael Johnston
Naomi Sager
Parantu K. Shah
Philip V. Ogren
Pierre Zweigenbaum
Ralph Grishman
Randolph Quirk
Richard Kittredge
Richard Tzong-Han Tsai
Robert P. Futrelle
Robert B. Lees
Ron Artstein
Sameer Pradhan
Seth Kulick
T Ono
Thomas Herbst
Thomas Roeper
Thomas C. Rindflesch
Timothy W. Finin
Tony McEnery
Tuangthong Wattarujeekrit
Wen-Chi Chou
X Yuan
Yacov Kogan
Yuka Tateisi
Zellig Harris
Zheng Ping Jiang
ZZ Hu
Publication venue: Public Library of Science
Publication date: 01/01/2008

Background: This paper presents data on alternations in the argument structure of common domain-specific verbs and their associated verbal nominalizations in the PennBioIE corpus. Alternation is the term in theoretical linguistics for variations in the surface syntactic form of verbs, e.g. the different forms of "stimulate" in "FSH stimulates follicular development" and "follicular development is stimulated by FSH". The data is used to assess the implications of alternations for biomedical text mining systems and to test the fit of the sublanguage model to biomedical texts. Methodology/Principal Findings: We examined 1,872 tokens of the ten most common domain-specific verbs or their zero-related nouns in the PennBioIE corpus and labelled them for the presence or absence of three alternations. We then annotated the arguments of 746 tokens of the nominalizations related to these verbs and counted alternations related to the presence or absence of arguments and to the syntactic position of non-absent arguments. We found that alternations are quite common both for verbs and for nominalizations. We also found a previously undescribed alternation involving an adjectival present participle. Conclusions/Significance: We found that even in this semantically restricted domain, alternations are quite common, and alternations involving nominalizations are exceptionally diverse. Nonetheless, the sublanguage model applies to biomedica

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

A Corpus of Preposition Supersenses

Author: Conger Kathryn
Green Meredith
Hwang Jena D.
O'Gorman Tim
Palmer Martha
Schneider Nathan
Srikumar Vivek
Suresh Abhijit
Publication date: 11/08/2016

Edinburgh Research Explorer

Proceedings

Author: Dickinson Markus
Müürisep Kaili
Passarotti Marco
Publication date: 01/12/2010

Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 268 pages. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891

DSpace at Tartu University Library

Semantic Role Labeling in Portuguese: Improving the State of the Art with Transfer Learning and BERT-based Models

Author: Ana Sofia Medeiros Oliveira
Publication date: 09/11/2020

Repositório Aberto da Universidade do Porto

Adapting Semantic Role Labeling to New Genres and Languages

Author: Myers Skatje Katharina
Publication venue: University of Colorado Boulder
Publication date: 01/08/2023

Semantic role labeling (SRL) is the identification of semantic predicates and their participants within a sentence, which is vital for deeper natural language understanding. State-of-the-art SRL models require annotated text for training, but those annotations don't exist for many languages and domains. The ability to annotate new corpora is hampered by limited time and budget. We explore two different ways of reducing the annotation required to produce SRL systems for new domains or languages: active learning and annotation projection. Active learning reduces annotation requirements by selecting just the most informative training instances through an iterative process of training and annotation. In this work, we investigate the use of Bayesian Active Learning by Disagreement, ways of tuning it for SRL, and its performance across multiple corpora. We study the choices being made by different selection methods over the course of iterations, examining vocabulary coverage, diversity, predicates selected, and the shifts in confidence. We also explore the impact of various strategies for selecting the initial training data. We investigate a number of potentially influential factors within batches of queries, such as diversity and disagreement scores. In order to reduce the overhead of training time, we additionally compare the effect of increasing the number of queries selected on each iteration. Abstract Meaning Representations (AMRs) are increasingly popular semantic representations of whole sentences. Based on our successful results using active learning to assess the informativeness of annotation instances for SRL, we look into whether the commonalities between these representations can be leveraged to supply targeted annotation for AMR parsing. Finally, we explore annotation projection of SRL. 
This approach attempts to create semantic annotations in a target language given parallel translations that have been given SRL annotations through manual or automatic means. We assess the recently developed Russian PropBank and the feasibility of generating the same semantic annotations by projecting from the English PropBank annotation. We use both our own system with English-Russian automatic word alignments and the recent Universal PropBanks 2.0. We examine the types of errors that arise from inconsistencies or gaps in annotations as well as systemic issues arising from the strong English-bias of the projections. This analysis leads us to the development of several filtering techniques that improve the precision of the projections.

CU Scholar Institutional Repository

Investigating the cross-lingual translatability of VerbNet-style classification.

Author: Huang Yan
Korhonen Anna
Laippala Veronika
Majewska Olga
McCarthy Diana
Murakami Akira
Vulić Ivan
Publication venue: Language Resources and Evaluation
Publication date: 20/10/2017

VerbNet, the most extensive online verb lexicon currently available for English, has proved useful in supporting a variety of NLP tasks. However, its exploitation in multilingual NLP has been limited by the fact that such classifications are available for few languages only. Since manual development of VerbNet is a major undertaking, researchers have recently translated VerbNet classes from English to other languages. However, no systematic investigation has been conducted into the applicability and accuracy of such a translation approach across different, typologically diverse languages. Our study is aimed at filling this gap. We develop a systematic method for translation of VerbNet classes from English to other languages which we first apply to Polish and subsequently to Croatian, Mandarin, Japanese, Italian, and Finnish. Our results on Polish demonstrate high translatability with all the classes (96% of English member verbs successfully translated into Polish) and strong inter-annotator agreement, revealing a promising degree of overlap in the resultant classifications. The results on other languages are equally promising. This demonstrates that VerbNet classes have strong cross-lingual potential and the proposed method could be applied to obtain gold standards for automatic verb classification in different languages. We make our annotation guidelines and the six language-specific verb classifications available with this paper.

Crossref

University of Birmingham Research Portal

Apollo (Cambridge)

An exploratory study using the predicate-argument structure to develop methodology for measuring semantic similarity of radiology sentences

Author: Newsom Eric Tyner
Publication date: 12/11/2013

Indiana University-Purdue University Indianapolis (IUPUI)

The amount of information produced in the form of electronic free text in healthcare is increasing to levels that humans cannot process for the advancement of their professional practice. Information extraction (IE) is a sub-field of natural language processing with the goal of data reduction of unstructured free text. Pertinent to IE is an annotated corpus that frames how IE methods should create a logical expression necessary for processing the meaning of text. Most annotation approaches seek to maximize meaning and knowledge by chunking sentences into phrases and mapping these phrases to a knowledge source to create a logical expression. However, these studies consistently have problems addressing semantics and none have addressed the issue of semantic similarity (or synonymy) to achieve data reduction. A successful methodology for data reduction depends on a framework that can represent currently popular phrasal methods of IE but also fully represent the sentence. This study explores and reports on the benefits, problems, and requirements of using the predicate-argument statement (PAS) as the framework. A convenient sample from a prior study with ten synsets of 100 unique sentences from radiology reports deemed by domain experts to mean the same thing will be the text from which PAS structures are formed

IUPUIScholarWorks