
    The Parallel Meaning Bank: Towards a Multilingual Corpus of Translations Annotated with Compositional Meaning Representations

    The Parallel Meaning Bank is a corpus of translations annotated with shared, formal meaning representations, comprising over 11 million words divided over four languages (English, German, Italian, and Dutch). Our approach is based on cross-lingual projection: automatically produced (and manually corrected) semantic annotations for English sentences are mapped onto their word-aligned translations, under the assumption that the translations are meaning-preserving. The semantic annotation consists of five main steps: (i) segmentation of the text into sentences and lexical items; (ii) syntactic parsing with Combinatory Categorial Grammar; (iii) universal semantic tagging; (iv) symbolization; and (v) compositional semantic analysis based on Discourse Representation Theory. These steps are performed using statistical models trained in a semi-supervised manner. The annotation models employed are all language-neutral. Our first results are promising. Comment: To appear at EACL 201
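    The five annotation steps above can be sketched as a chain of functions. This is a purely illustrative toy, assuming hypothetical function names and trivial placeholder models; the PMB's actual components are statistical and far richer.

    ```python
    # Hypothetical sketch of the five-step PMB annotation pipeline.
    # All names and data shapes are illustrative, not the PMB's actual API.

    def segment(text):
        # Step (i): split raw text into sentences and lexical items (tokens).
        sentences = [s.strip() for s in text.split(".") if s.strip()]
        return [{"sentence": s, "tokens": s.split()} for s in sentences]

    def ccg_parse(tokens):
        # Step (ii): assign CCG categories; here a trivial placeholder lexicon.
        return [(t, "N" if t[0].isupper() else "S\\N") for t in tokens]

    def semtag(tokens):
        # Step (iii): universal semantic tagging, e.g. NAM (name), CON (concept).
        return [(t, "NAM" if t[0].isupper() else "CON") for t in tokens]

    def symbolize(tagged):
        # Step (iv): map word forms to non-logical symbols (lowercased here).
        return [(t, t.lower()) for t, _ in tagged]

    def drs_compose(symbols):
        # Step (v): compose a (toy) DRS: discourse referents plus conditions
        # built from the symbols. A real system would be guided by the parse.
        refs = [f"x{i}" for i in range(len(symbols))]
        conds = [f"{sym}({r})" for r, (_, sym) in zip(refs, symbols)]
        return {"referents": refs, "conditions": conds}

    def annotate(text):
        out = []
        for unit in segment(text):
            parse = ccg_parse(unit["tokens"])   # computed, ignored by the toy composer
            tagged = semtag(unit["tokens"])
            symbols = symbolize(tagged)
            out.append(drs_compose(symbols))
        return out
    ```

    For example, `annotate("Tom sleeps.")` yields one DRS with referents `x0, x1` and conditions `tom(x0)` and `sleeps(x1)`.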

    PersoNER: Persian named-entity recognition

    Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequently problematic training of an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by a population of over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach achieves interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.
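    A max-margin classifier over word embeddings, as used in the pipeline above, can be sketched with a simple hinge-loss update. This is a minimal sketch under assumed details (class names, tagset, learning rate are invented); PersoNER's actual features and training procedure differ.

    ```python
    # Illustrative max-margin (hinge-loss) token classifier over embedding
    # vectors; not PersoNER's actual implementation.
    import numpy as np

    class MaxMarginTagger:
        def __init__(self, dim, labels, lr=0.1):
            self.labels = labels
            self.W = np.zeros((len(labels), dim))  # one weight row per label
            self.lr = lr

        def scores(self, x):
            return self.W @ x

        def predict(self, x):
            return self.labels[int(np.argmax(self.scores(x)))]

        def update(self, x, gold):
            # Perceptron-style margin update: penalize the highest-scoring
            # wrong label unless the gold label wins by a margin of 1.
            g = self.labels.index(gold)
            s = self.scores(x)
            s_wrong = s.copy()
            s_wrong[g] = -np.inf
            j = int(np.argmax(s_wrong))
            if s[g] - s[j] < 1.0:       # hinge loss is active
                self.W[g] += self.lr * x
                self.W[j] -= self.lr * x
    ```

    Trained on a handful of (embedding, label) pairs, the tagger separates the classes once the margin condition is satisfied.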

    Learning Social Relation Traits from Face Images

    Social relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. Motivated by psychological studies, we investigate whether such fine-grained and high-level relation traits can be characterised and quantified from face images in the wild. To address this challenging problem, we propose a deep model that learns a rich face representation capturing gender, expression, head pose, and age-related attributes, and then performs pairwise-face reasoning for relation prediction. To learn from heterogeneous attribute sources, we formulate a new network architecture with a bridging layer that leverages the inherent correspondences among these datasets; it can also cope with missing target attribute labels. Extensive experiments show that our approach is effective for fine-grained social relation learning in images and videos. Comment: To appear in International Conference on Computer Vision (ICCV) 201
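    One common way to cope with missing attribute labels across heterogeneous datasets, as the abstract describes, is to mask the per-task loss so only labelled tasks contribute. The helper below is an assumed illustration of that general idea, not the paper's bridging-layer implementation.

    ```python
    # Illustrative masked multi-task loss (hypothetical helper, not the
    # paper's code): only tasks with labels contribute to the objective.
    import numpy as np

    def masked_multitask_loss(preds, targets, mask):
        """preds, targets: (n_tasks,) arrays; mask[i] = 1 if task i is
        labelled for this example. Squared error averaged over the
        labelled tasks only, so unlabelled attributes produce no gradient."""
        err = (preds - targets) ** 2 * mask
        return float(err.sum() / max(mask.sum(), 1))
    ```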

    How compatible are our discourse annotation frameworks? Insights from mapping RST-DT and PDTB annotations

    Discourse-annotated corpora are an important resource for the community, but they are often annotated according to different frameworks. This makes joint usage of the annotations difficult, preventing researchers from searching the corpora in a unified way, or from using all annotated data jointly to train computational systems. Several theoretical proposals have recently been made for mapping the relational labels of different frameworks to each other, but these proposals have so far not been validated against existing annotations. The two largest discourse-relation-annotated resources, the Penn Discourse Treebank and the Rhetorical Structure Theory Discourse Treebank, have however been annotated on the same texts, allowing for a direct comparison of the annotation layers. We propose a method for automatically aligning the discourse segments, and then evaluate existing mapping proposals by comparing the empirically observed mappings against the proposed ones. Our analysis highlights the influence of segmentation on subsequent discourse relation labelling, and shows that while agreement between frameworks is reasonable for explicit relations, agreement on implicit relations is low. We identify several sources of systematic discrepancies between the two annotation schemes and discuss the consequences for future annotation and for usage of the existing resources.
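    Aligning discourse segments from two frameworks annotated on the same text can be sketched as matching token spans by overlap. This is a hypothetical sketch (span representation, Jaccard criterion, and threshold are assumptions), not the paper's actual alignment method.

    ```python
    # Illustrative span-overlap alignment between two segmentations of the
    # same text. Each segment is a (start, end) token span, half-open.
    def align_segments(spans_a, spans_b, threshold=0.5):
        def jaccard(a, b):
            inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
            union = (a[1] - a[0]) + (b[1] - b[0]) - inter
            return inter / union if union else 0.0

        pairs = []
        for i, a in enumerate(spans_a):
            # Greedily match each segment in A to its best-overlapping
            # segment in B, keeping only sufficiently strong matches.
            best = max(range(len(spans_b)), key=lambda j: jaccard(a, spans_b[j]))
            if jaccard(a, spans_b[best]) >= threshold:
                pairs.append((i, best))
        return pairs
    ```

    Aligned segment pairs can then be used to compare the relation labels the two frameworks assign to the "same" segment.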

    Domain transfer for deep natural language generation from abstract meaning representations

    Stochastic natural language generation systems that are trained from labelled datasets are often domain-specific in their annotation and in their mapping from semantic input representations to lexical-syntactic outputs. As a result, learnt models fail to generalize across domains, heavily restricting their usability beyond single applications. In this article, we focus on the problem of domain adaptation for natural language generation. We show how linguistic knowledge from a source domain, for which labelled data is available, can be adapted to a target domain by reusing training data across domains. As the key to this, we propose to employ abstract meaning representations as a common semantic representation across domains. We model natural language generation as a long short-term memory recurrent neural network encoder-decoder, in which one recurrent neural network learns a latent representation of a semantic input, and a second recurrent neural network learns to decode it to a sequence of words. We show that the learnt representations can be transferred across domains and can be leveraged effectively to improve training on new unseen domains. Experiments in three different domains and with six datasets demonstrate that the lexical-syntactic constructions learnt in one domain can be transferred to new domains and achieve up to 75-100% of the performance of in-domain training, based on objective metrics such as BLEU and semantic error rate as well as a subjective human rating study. Training a policy from prior knowledge from a different domain is consistently better than pure in-domain training by up to 10%.
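    The data-reuse step described above, pooling AMR/text pairs from a labelled source domain with the smaller target-domain data before training the encoder-decoder, can be sketched as a small helper. The function name, pair format, and weighting scheme are assumptions for illustration, not the article's code.

    ```python
    # Hypothetical cross-domain data pooling for AMR-to-text generation.
    def build_training_pool(source_pairs, target_pairs, source_weight=1.0):
        """Each pair is (amr_graph, sentence). Source-domain examples may be
        down-weighted so the target domain dominates fine-tuning; the pooled
        triples are (amr, sentence, weight) for a weighted training loss."""
        pool = [(amr, sent, source_weight) for amr, sent in source_pairs]
        pool += [(amr, sent, 1.0) for amr, sent in target_pairs]
        return pool
    ```

    Because both domains share the same abstract meaning representation, the pooled examples are directly comparable inputs for a single encoder-decoder.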

    Multimodality and superdiversity: evidence for a research agenda

    In recent years, social science research in superdiversity has questioned notions such as multiculturalism and pluralism, which hinge on and de facto reproduce ideological constructs such as separate and clearly identifiable national cultures and ethnic identities; research in language and superdiversity, in translanguaging, polylanguaging and metrolingualism, has analogously questioned concepts such as multi- and bi-lingualism, which hinge on ideological constructs such as national languages, mother tongue and native speaker proficiency. Research in multimodality has questioned the centrality of language in everyday communication as well as its paradigmatic role in the understanding of communicative practices. While the multimodality of communication is generally acknowledged in work on language and superdiversity, the potential of a social semiotic multimodal approach for understanding communication in superdiversity has not yet been adequately explored and developed, and neither has the concept of superdiversity been addressed in multimodal research. The present paper aims to start filling this gap. By discussing sign-making practices in the superdiverse context of Leeds Kirkgate Market (UK), it maps the potential of an ethnographic social semiotics for the study of communication in superdiversity and sketches an agenda for research on multimodality and superdiversity, identifying a series of working hypotheses, research questions, areas of investigation, and domains and fields of enquiry.

    Faking and Conspiring about COVID-19: A Discursive Approach

    In the more general climate of post-truth - a social trend reflecting a disregard for reliable ways of knowing what is true, mostly enacted through the massive use of misinformation and rhetoric appealing to emotions - an alarming "infodemic" accompanied the COVID-19 pandemic, affecting healthy attitudes and behaviors and further lessening trust in science, institutions, and traditional media. Its two main representative items, fake and conspiracy news, have been widely analyzed in psycho-social research, even if scholars have mostly focused on the cognitive and social dimensions of those items and devoted less attention to their discursive construction. In addition, these works did not directly compare and differentiate fake and conspiracy pathways. In order to address this gap and promote a wider understanding of these matters, a qualitative investigation of an Italian sample of 112 fake and conspiracy news articles, mostly spread during the first two COVID-19 "waves" (from March 2020 to January 2021), was carried out. Our sample gathered news specifically coming from social media posts, which represent easy and fast channels for viral content diffusion. We analyzed the selected texts by means of the Diatextual Analysis and Discursive Action Model frameworks, aiming to (a) offer an in-depth, fine-grained analysis of the psycholinguistic and argumentative features of fake and conspiracy news, and (b) differentiate them in line with Aristotle's classical rhetorical stances of logos, ethos, and pathos, thus bridging traditional and current lines of thinking. Even though they may share common roots in the post-truth climate, fake and conspiracy news engage in different rhetorical patterns, since they present different stakes and construct specific epistemic pathways. Implications for health and digital literacy are discussed.