The Parallel Meaning Bank: A Framework for Semantically Annotating Multiple Languages
This paper gives a general description of the ideas behind the Parallel
Meaning Bank, a framework that aims to provide an easy way to annotate
compositional semantics for texts written in languages other than English. The
annotation procedure is semi-automatic and comprises seven layers of
linguistic information: segmentation, symbolisation, semantic tagging, word
sense disambiguation, syntactic structure, thematic role labelling, and
co-reference. New languages can be added to the meaning bank as long as the
documents are based on translations from English, but they also introduce
interesting new challenges to the linguistic assumptions underlying the
Parallel Meaning Bank.
Comment: 13 pages, 5 figures, 1 table
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs
Discourse Representation Theory (DRT) is a formal account for representing
the meaning of natural language discourse. Meaning in DRT is modeled via a
Discourse Representation Structure (DRS), a meaning representation with a
model-theoretic interpretation, which is usually depicted as nested boxes. In
contrast, a directed labeled graph is a common data structure used to encode
semantics of natural language texts. The paper describes the procedure of
dressing up DRSs as directed labeled graphs to include DRT as a new framework
in the 2020 shared task on Cross-Framework and Cross-Lingual Meaning
Representation Parsing. Since one of the goals of the shared task is to
encourage unified models for several semantic graph frameworks, the conversion
procedure was biased towards making the DRT graph framework somewhat similar to
other graph-based meaning representation frameworks.
Comment: 10 pages, 4 figures, 4 tables, CoNLL 2020 Shared Task
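The box-to-graph idea can be illustrated with a toy sketch. The encoding below is invented for illustration and does not reproduce the shared task's actual node and edge inventory: a single DRS box for "A man walks", introducing referent x with conditions man(x) and walk(x), is turned into labeled nodes and directed labeled edges.

```python
# Toy sketch (not the shared task's actual conversion): encode one DRS box
# as a directed labeled graph of (node -> label) and (source, label, target).

def drs_to_graph(box_id, referents, conditions):
    """Turn a single DRS box into (nodes, labeled directed edges)."""
    nodes = {box_id: "BOX"}
    edges = []
    for ref in referents:
        nodes[ref] = "REF"
        edges.append((box_id, "introduces", ref))   # box introduces referent
    for pred, args in conditions:
        cond_id = f"c_{pred}"
        nodes[cond_id] = pred                        # condition node, labeled by predicate
        edges.append((box_id, "condition", cond_id))
        for i, arg in enumerate(args):
            edges.append((cond_id, f"ARG{i}", arg))  # argument edges
    return nodes, edges

# DRS for "A man walks": box b0, referent x, conditions man(x) and walk(x).
nodes, edges = drs_to_graph("b0", ["x"], [("man", ["x"]), ("walk", ["x"])])
print(nodes)  # {'b0': 'BOX', 'x': 'REF', 'c_man': 'man', 'c_walk': 'walk'}
print(edges)
```

The nested-box picture disappears, but the same information survives as edge labels, which is what makes the representation compatible with graph-based parsers.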
Global and Local Hierarchy-aware Contrastive Framework for Implicit Discourse Relation Recognition
Due to the absence of explicit connectives, implicit discourse relation
recognition (IDRR) remains a challenging task in discourse analysis. The
critical step for IDRR is to learn high-quality discourse relation
representations between two arguments. Recent methods tend to integrate the
whole hierarchical information of senses into discourse relation
representations for multi-level sense recognition. However, they make
insufficient use of the static hierarchical structure containing all senses
(defined as the global hierarchy) and ignore the hierarchical sense-label
sequence corresponding to each instance (defined as the local hierarchy). To
fully exploit the global and local hierarchies of senses and learn better
discourse relation representations, we propose a novel GLobal and LOcal
Hierarchy-aware Contrastive Framework (GLOF) that models both kinds of
hierarchies with the aid of contrastive learning. Experimental results on the
PDTB dataset demonstrate that our method remarkably outperforms the current
state-of-the-art model at all hierarchical levels.
Comment: 13 pages, 10 figures
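The contrastive idea the framework builds on can be sketched with a generic InfoNCE-style objective. This is a plain illustration of contrastive learning, not GLOF's actual hierarchy-aware loss; the vectors and temperature below are made up for illustration.

```python
import math

# Generic InfoNCE-style contrastive loss sketch (not GLOF's actual objective):
# pull an anchor representation toward a positive (same sense label) and push
# it away from negatives (different sense labels).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def info_nce(anchor, positive, negatives, tau=0.1):
    pos = math.exp(dot(anchor, positive) / tau)
    neg = sum(math.exp(dot(anchor, n) / tau) for n in negatives)
    return -math.log(pos / (pos + neg))

anchor    = [1.0, 0.0]
positive  = [0.9, 0.1]                 # similar: shares the anchor's sense label
negatives = [[-1.0, 0.0], [0.0, 1.0]]  # dissimilar: other sense labels

loss = info_nce(anchor, positive, negatives)
print(loss)  # small, since the anchor already sits close to its positive
```

In a hierarchy-aware setting the choice of positives and negatives would additionally depend on where labels sit in the sense hierarchy, which is the part this toy sketch leaves out.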
Character-based Neural Semantic Parsing
Humans and computers do not speak the same language. A lot of day-to-day tasks would be vastly more efficient if we could communicate with computers using natural language instead of relying on an interface. It is necessary, then, that the computer does not see a sentence as a collection of individual words, but instead can understand the deeper, compositional meaning of the sentence. A way to tackle this problem is to automatically assign a formal, structured meaning representation to each sentence, which is easy for computers to interpret. There have been quite a few attempts at this before, but these approaches were usually heavily reliant on predefined rules, word lists or representations of the syntax of the text, which made them complicated to use in general. In this thesis we employ an algorithm that can learn to automatically assign meaning representations to texts, without using any such external resource. Specifically, we use a type of artificial neural network called a sequence-to-sequence model, in a process that is often referred to as deep learning. The devil is in the details, but we find that this type of algorithm can produce high-quality meaning representations, with better performance than the more traditional methods. Moreover, a main finding of the thesis is that, counterintuitively, it is often better to represent the text as a sequence of individual characters, not words. This is likely because it helps the model in dealing with spelling errors, unknown words and inflections.
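The representational choice the thesis argues for can be illustrated with a minimal sketch. The vocabulary and sentences below are invented for illustration; a real sequence-to-sequence model would map these tokens to embedding indices.

```python
# Minimal sketch of word-level vs. character-level input representations.
# A misspelled word is out-of-vocabulary at the word level, but at the
# character level it still shares most of its input sequence with the
# correctly spelled form.

def word_tokens(sentence):
    return sentence.split()

def char_tokens(sentence):
    # Every character, including spaces, becomes its own token.
    return list(sentence)

vocab = {"the", "man", "walked"}   # hypothetical word vocabulary
typo = "the man walkked"

oov = [w for w in word_tokens(typo) if w not in vocab]
print(oov)                   # ['walkked'] -- unknown as a word token
print(char_tokens("man"))    # ['m', 'a', 'n'] -- every character is in-vocabulary
```

With a small, closed character inventory there are essentially no unknown input symbols, which is one plausible reason character-level input copes better with spelling errors and inflections.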
Discovering multiword expressions
In this paper, we provide an overview of research on multiword expressions (MWEs) from a natural language processing perspective. We examine methods developed for modelling MWEs that capture some of their linguistic properties, discussing their use for MWE discovery and for idiomaticity detection. We concentrate on their collocational and contextual preferences, along with their fixedness in terms of canonical forms and their lack of word-for-word translatability. We also discuss a sample of the MWE resources that have been used in intrinsic evaluation setups for these methods.
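One classic association measure used for the collocational preferences mentioned above is pointwise mutual information (PMI), which scores how much more often two words co-occur than chance would predict. The sketch below is a generic illustration, not a specific method from the paper, and the toy corpus is invented.

```python
import math
from collections import Counter

# PMI(w1, w2) = log2( P(w1 w2) / (P(w1) * P(w2)) ), estimated from counts.
# High PMI suggests the pair co-occurs more than chance, a collocation cue.

def pmi(bigram, unigram_counts, bigram_counts, n):
    w1, w2 = bigram
    p_xy = bigram_counts[bigram] / n
    p_x = unigram_counts[w1] / n
    p_y = unigram_counts[w2] / n
    return math.log2(p_xy / (p_x * p_y))

# Tiny invented corpus with a recurring candidate MWE.
corpus = "kick the bucket kick the bucket kick the habit the man saw the dog".split()
unigram_counts = Counter(corpus)
bigram_counts = Counter(zip(corpus, corpus[1:]))
n = len(corpus)

score = pmi(("kick", "the"), unigram_counts, bigram_counts, n)
print(score)  # positive: the pair co-occurs more often than chance
```

In practice PMI is known to over-reward rare pairs, so MWE discovery work typically combines it with frequency thresholds or other association measures.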
VALSE: A Task-Independent Benchmark for Vision and Language Models Centered on Linguistic Phenomena
We propose VALSE (Vision And Language Structured Evaluation), a novel benchmark designed for testing general-purpose pretrained vision and language (V&L) models for their visio-linguistic grounding capabilities on specific linguistic phenomena. VALSE offers a suite of six tests covering various linguistic constructs. Solving these requires models to ground linguistic phenomena in the visual modality, allowing more fine-grained evaluations than hitherto possible. We build VALSE using methods that support the construction of valid foils, and report results from evaluating five widely-used V&L models. Our experiments suggest that current models have considerable difficulty addressing most phenomena. Hence, we expect VALSE to serve as an important benchmark to measure future progress of pretrained V&L models from a linguistic perspective, complementing the canonical task-centred V&L evaluations.