Search CORE

353 research outputs found

Neural models of language use:Studies of language comprehension and production in context

Author: Giulianelli M.
Publication venue
Publication date: 01/01/2023
Field of study

Artificial neural network models of language are mostly known and appreciated today for providing a backbone for formidable AI technologies. This thesis takes a different perspective. Through a series of studies on language comprehension and production, it investigates whether artificial neural networks—beyond being useful in countless AI applications—can serve as accurate computational simulations of human language use, and thus as a new core methodology for the language sciences

International Migration, Integration and Social Cohesion online publications

PersoNER: Persian named-entity recognition

Author: Abdous M
Borzeshi EZ
Piccardi M
Poostchi H
Publication venue
Publication date: 01/01/2016
Field of study

© 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To abridge this gap, in this paper we target the Persian language that is spoken by a population of over a hundred million people world-wide. We first present and provide ArmanPerosNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNNL scores while outperforming two alternatives based on a CRF and a recurrent neural network

OPUS - University of Technology Sydney

EVALITA Evaluation of NLP and Speech Tools for Italian Proceedings of the Final Workshop

Author: Basile Pierpaolo
Cutugno Franco
Nissim Malvina
Patti Viviana
Pierpaolo Basile Franco Cutugno, Malvina Nissim, Viviana Patti, Rachele Sprugnoli
Sprugnoli Rachele
Publication venue: place:Torino
Publication date: 01/01/2016
Field of study

Editor of the proceedings of EVALITA 2016

Archivio istituzionale della Ricerca - Università degli Studi di Parma

PubliCatt

The Best Explanation:Beyond Right and Wrong in Question Answering

Author: Johannsen Anders Trærup
Publication venue: Det Humanistiske Fakultet, Københavns Universitet
Publication date: 01/01/2013
Field of study

Copenhagen University Research Information System

Natural Language Processing in-and-for Design Research

Author: Blessing Lucienne T. M.
Luo Jianxi
Siddharth L
Publication venue
Publication date: 27/11/2021
Field of study

We review the scholarly contributions that utilise Natural Language Processing (NLP) methods to support the design process. Using a heuristic approach, we collected 223 articles published in 32 journals and within the period 1991-present. We present state-of-the-art NLP in-and-for design research by reviewing these articles according to the type of natural language text sources: internal reports, design concepts, discourse transcripts, technical publications, consumer opinions, and others. Upon summarizing and identifying the gaps in these contributions, we utilise an existing design innovation framework to identify the applications that are currently being supported by NLP. We then propose a few methodological and theoretical directions for future NLP in-and-for design research

arXiv.org e-Print Archive

User profiling with geo-located social media and demographic data

Author: Poulston Adam Reece Spencer
Publication venue
Publication date: 01/06/2021
Field of study

User profiling is the task of inferring attributes, such as gender or age, of social media users based on the content they produce or their behaviours on-line. Approaches for user profiling typically use machine learning techniques to train user profiling systems capable of inferring the attributes of unseen users, having been provided with a training set of users labelled with their attributes. Classic approaches to user attribute labelling for such a training set may be manual or automated, examples include: direct solicitation through surveys, manual assignment based on outward characteristics, and extraction of attribute key-phrases from user description fields. Social media platforms, such as Twitter, often provide users with the ability to attach their geographic location to their posts, known as geo-location. In addition, government organisations release demographic data aggregated at a variety of geographic scales. The combination of these two data sources is currently under-explored in the user profiling literature. To combine these sources, a method is proposed for geo-location-driven user attribute labelling in which a coordinate level prediction is made for a user's 'home location', which in turn is used to 'look up' corresponding demographic variables that are assigned to the user. Strong baseline components for user profiling systems are investigated and validated in experiments on existing user profiling datasets, and a corpus of geo-located Tweets is used to derive a complementary resource. An evaluation of current methods for assigning fine-grained home location to social media users is performed, and two improved methods are proposed based on clustering and majority voting across arbitrary geographic regions. The proposed geo-location-driven user attribute labelling approach is applied across three demographic variables within the UK: Output Area Classification (OAC), Local Authority Classification (LAC), and National Statistics Socio-economic Classification (NS-SEC). User profiling systems are trained and evaluated on each of the derived datasets, and NS-SEC is additionally validated against a dataset derived through a different method. Promising results are achieved for LAC and NS-SEC, however characteristics of the underlying geographic and demographic data can lead to poor quality datasets, as displayed for OAC

White Rose E-theses Online

Macro-micro approach for mining public sociopolitical opinion from social media

Author: Wang Bo
Publication venue
Publication date
Field of study

During the past decade, we have witnessed the emergence of social media, which has prominence as a means for the general public to exchange opinions towards a broad range of topics. Furthermore, its social and temporal dimensions make it a rich resource for policy makers and organisations to understand public opinion. In this thesis, we present our research in understanding public opinion on Twitter along three dimensions: sentiment, topics and summary. In the first line of our work, we study how to classify public sentiment on Twitter. We focus on the task of multi-target-specific sentiment recognition on Twitter, and propose an approach which utilises the syntactic information from parse-tree in conjunction with the left-right context of the target. We show the state-of-the-art performance on two datasets including a multi-target Twitter corpus on UK elections which we make public available for the research community. Additionally we also conduct two preliminary studies including cross-domain emotion classification on discourse around arts and cultural experiences, and social spam detection to improve the signal-to-noise ratio of our sentiment corpus. Our second line of work focuses on automatic topical clustering of tweets. Our aim is to group tweets into a number of clusters, with each cluster representing a meaningful topic, story, event or a reason behind a particular choice of sentiment. We explore various ways of tackling this challenge and propose a two-stage hierarchical topic modelling system that is efficient and effective in achieving our goal. Lastly, for our third line of work, we study the task of summarising tweets on common topics, with the goal to provide informative summaries for real-world events/stories or explanation underlying the sentiment expressed towards an issue/entity. As most existing tweet summarisation approaches rely on extractive methods, we propose to apply state-of-the-art neural abstractive summarisation model for tweets. We also tackle the challenge of cross-medium supervised summarisation with no target-medium training resources. To the best of our knowledge, there is no existing work on studying neural abstractive summarisation on tweets. In addition, we present a system for providing interactive visualisation of topic-entity sentiments and the corresponding summaries in chronological order. Throughout our work presented in this thesis, we conduct experiments to evaluate and verify the effectiveness of our proposed models, comparing to relevant baseline methods. Most of our evaluations are quantitative, however, we do perform qualitative analyses where it is appropriate. This thesis provides insights and findings that can be used for better understanding public opinion in social media

Warwick Research Archives Portal Repository

A Survey on Semantic Processing Techniques

Author: Cambria Erik
Chen Guanyi
He Kai
Mao Rui
Ni Jinjie
Yang Zonglin
Zhang Xulang
Publication venue
Publication date: 22/10/2023
Field of study

Semantic processing is a fundamental research domain in computational linguistics. In the era of powerful pre-trained language models and large language models, the advancement of research in this domain appears to be decelerating. However, the study of semantics is multi-dimensional in linguistics. The research depth and breadth of computational semantic processing can be largely improved with new technologies. In this survey, we analyzed five semantic processing tasks, e.g., word sense disambiguation, anaphora resolution, named entity recognition, concept extraction, and subjectivity detection. We study relevant theoretical research in these fields, advanced methods, and downstream applications. We connect the surveyed tasks with downstream applications because this may inspire future scholars to fuse these low-level semantic processing tasks with high-level natural language processing tasks. The review of theoretical research may also inspire new tasks and technologies in the semantic processing domain. Finally, we compare the different semantic processing techniques and summarize their technical trends, application trends, and future directions.Comment: Published at Information Fusion, Volume 101, 2024, 101988, ISSN 1566-2535. The equal contribution mark is missed in the published version due to the publication policies. Please contact Prof. Erik Cambria for detail

arXiv.org e-Print Archive