Search CORE

63 research outputs found

Learning Sentence-internal Temporal Relations

Author: Lapata M.
Lascarides A.
Publication venue: 'AI Access Foundation'
Publication date: 01/01/2006
Field of study

In this paper we propose a data intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer

Learning to simplify sentences with quasi-synchronous grammar and integer programming

Author: Lapata M.
Woodsend K.
Publication venue
Publication date: 01/01/2011
Field of study

Edinburgh Research Explorer

Learning Opinion Summarizers by Selecting Informative Reviews

Author: Bražinskas A.
Lapata M.
Titov I.
Publication venue
Publication date: 01/01/2021
Field of study

Opinion summarization has been traditionally approached with unsupervised, weakly-supervised and few-shot learning techniques. In this work, we collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training. However, the number of reviews per product is large (320 on average), making summarization - and especially training a summarizer - impractical. Moreover, the content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates. In order to deal with both of these challenges, we formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets. The choice of the review subset is treated as a latent variable, predicted by a small and simple selector. The subset is then fed into a more powerful summarizer. For joint training, we use amortized variational inference and policy gradient methods. Our experiments demonstrate the importance of selecting informative reviews resulting in improved quality of summaries and reduced hallucinations.</p

International Migration, Integration and Social Cohesion online publications

Universal Discourse Representation Structure Parsing

Author: Bos Johan
Cohen S.B.
Lapata M.
Liu J.
Publication venue: 'MIT Press - Journals'
Publication date: 01/01/2021
Field of study

We consider the task of crosslingual semantic parsing in the style of Discourse Representation Theory (DRT) where knowledge from annotated corpora in a resource-rich language is transferred via bitext to guide learning in other languages. We introduce Universal Discourse Representation Theory (UDRT), a variant of DRT that explicitly anchors semantic representations to tokens in the linguistic input. We develop a semantic parsing framework based on the Transformer architecture and utilize it to obtain semantic resources in multiple languages following two learning schemes. The many-to-one approach translates non-English text to English, and then runs a relatively accurate English parser on the translated text, while the one-to-many approach translates gold standard English to non-English text and trains multiple parsers (one per language) on the translations. Experimental results on the Parallel Meaning Bank show that our proposal outperforms strong baselines by a wide margin and can be used to construct (silver-standard) meaning banks for 99 languages

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Whodunnit? Crime Drama as a Case for Natural Language Understanding

Author: Cohen Shay
Frermann Lea
Lapata M.
Publication venue: 'MIT Press - Journals'
Publication date: 11/01/2018
Field of study

Edinburgh Research Explorer

Schema Normalization for Improving Schema Matching

Author: D. Beneventano
G.A. Miller
I. Plag
J. Euzenat
J.N. Levi
M. Lapata
R. Uthurusamy
S. Bergamaschi
X. Su
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the \hidden meaning" associated to schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a \u201cmeaning" to schema labels. However, accuracy of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and word abbreviations.In this work, we address this problem by proposing a method to perform schema labels normalization which increases the number of comparable labels. Unlike other solutions, the method semi-automatically expands abbreviations and annotates compound terms, without a minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching accuracy

Crossref

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia