63 research outputs found

Self-Supervised and Controlled Multi-Document Opinion Summarization

Author: Coavoux Maximin
Elsahar Hady
Gallé Matthias
Rozen Jos
Publication date: 30/04/2020

We address the problem of unsupervised abstractive summarization of collections of user generated reviews with self-supervision and control. We propose a self-supervised setup that considers an individual document as a target summary for a set of similar documents. This setting makes training simpler than previous approaches by relying only on standard log-likelihood loss. We address the problem of hallucinations through the use of control codes, to steer the generation towards more coherent and relevant summaries. Finally, we extend the Transformer architecture to allow for multiple reviews as input. Our benchmarks on two datasets against graph-based and recent neural abstractive unsupervised models show that our proposed method generates summaries with superior quality and relevance. This is confirmed in our human evaluation, which focuses explicitly on the faithfulness of generated summaries. We also provide an ablation study, which shows the importance of the control setup in controlling hallucinations and achieving high sentiment and topic alignment of the summaries with the input reviews. Comment: 18 pages including 5 pages appendi
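The self-supervised setup described in this abstract (one review serves as the target summary for a set of similar reviews) can be illustrated with a minimal sketch. The abstract does not say how similarity is measured, so the token-level Jaccard similarity, the function names `jaccard` and `build_pairs`, and the parameter `k` are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set Jaccard similarity (an assumed stand-in similarity measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def build_pairs(reviews: List[str], k: int = 2) -> List[Tuple[List[str], str]]:
    """For each review, pair its k most similar reviews (inputs) with the
    review itself (pseudo-summary target)."""
    pairs = []
    for i, target in enumerate(reviews):
        scored = [(jaccard(target, r), j) for j, r in enumerate(reviews) if j != i]
        # Highest similarity first; ties broken by original position.
        scored.sort(key=lambda t: (-t[0], t[1]))
        sources = [reviews[j] for _, j in scored[:k]]
        pairs.append((sources, target))
    return pairs


reviews = [
    "great pizza and friendly staff",
    "friendly staff and great pasta",
    "slow service cold food",
    "cold food and slow delivery",
]
pairs = build_pairs(reviews, k=1)
```

Each resulting `(sources, target)` pair could then feed a sequence-to-sequence model trained with a standard log-likelihood loss, which is the only training signal the abstract mentions.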

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

An Approximate Sampler for Energy-based Models with Divergence Diagnostics

Author: Dance C.
Dymetman M.
Eikema B.
Elsahar H.
Kruszewski G.
Publication date: 01/11/2022

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Developing resources for sentiment analysis of informal Arabic text in social media

Author: Abdul-Mageed
Al-Sulaiti
Bollen
ElSahar
Hamouda
Lita
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017

Natural Language Processing (NLP) applications such as text categorization, machine translation, sentiment analysis, etc., need annotated corpora and lexicons to check quality and performance. This paper describes the development of resources for sentiment analysis specifically for Arabic text in social media. A distinctive feature of the corpora and lexicons developed is that they are determined from informal Arabic that does not conform to grammatical or spelling standards. We refer to Arabic social media content of this sort as Dialectal Arabic (DA) - informal Arabic originating from and potentially mixing a range of different individual dialects. The paper describes the process adopted for developing corpora and sentiment lexicons for sentiment analysis within different social media and their resulting characteristics. In addition to providing useful NLP data sets for Dialectal Arabic, the work also contributes to understanding the approach to developing corpora and lexicons

Crossref

Sheffield Hallam University Research Archive

Augmentative and alternative communication (AAC) advances: A review of configurations for individuals with a speech disability

Author: Annysa Mansor (6973527)
David Kerr (1249200)
Kaddour Bouazza-Marouf (1248348)
Sijung Hu (1250727)
Yasmin Elsahar (4352617)
Publication date: 22/04/2019

High-tech augmentative and alternative communication (AAC) methods are on a constant rise; however, the interaction between the user and the assistive technology is still challenged for an optimal user experience centered around the desired activity. This review presents a range of signal sensing and acquisition methods utilized in conjunction with the existing high-tech AAC platforms for individuals with a speech disability, including imaging methods, touch-enabled systems, mechanical and electro-mechanical access, breath-activated methods, and brain–computer interfaces (BCI). The listed AAC sensing modalities are compared in terms of ease of access, affordability, complexity, portability, and typical conversational speeds. A revelation of the associated AAC signal processing, encoding, and retrieval highlights the roles of machine learning (ML) and deep learning (DL) in the development of intelligent AAC solutions. The demands and the affordability of most systems hinder the scale of usage of high-tech AAC. Further research is indeed needed for the development of intelligent AAC applications reducing the associated costs and enhancing the portability of the solutions for a real user’s environment. The consolidation of natural language processing with current solutions also needs to be further explored for the amelioration of the conversational speeds. The recommendations for prospective advances in coming high-tech AAC are addressed in terms of developments to support mobile health communicative applications

Loughborough University Institutional Repository

Users' Traces for Enhancing Arabic Facebook Search

Author: Badache Ismail
Barhoumi Amira
Bayoudhi Amine
Dahou Abdelghani
ElSahar Hady
Greiffenstern Sandra
Mikolov Tomas
Mohamed
Westerveld Thijs
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 17/09/2019

This paper proposes an approach to Facebook search in Arabic, which exploits several users' traces (e.g. comment, share, reactions) left on Facebook posts to estimate their social importance. Our goal is to show how these social traces (signals) can play a vital role in improving Arabic Facebook search. Firstly, we identify polarities (positive or negative) carried by the textual signals (e.g. comments) and non-textual ones (e.g. the reactions love and sad) for a given Facebook post. Therefore, the polarity of each comment expressed on a given Facebook post is estimated on the basis of a neural sentiment model in the Arabic language. Secondly, we group signals according to their complementarity using feature selection algorithms. Thirdly, we apply learning to rank (LTR) algorithms to re-rank Facebook search results based on the selected groups of signals. Finally, experiments are carried out on 13,500 Facebook posts, collected from 45 topics in the Arabic language. Experimental results reveal that Random Forests combined with ReliefFAttributeEval (RLF) was the most effective LTR approach for this task

Crossref

HAL AMU

Breathing pattern interpretation as an alternative and effective voice communication solution

Author: Atul Gaur (4969087)
David Kerr (1249200)
Kaddour Bouazza-Marouf (1248348)
Sijung Hu (1250727)
Vipul Kaushik (7214381)
Yasmin Elsahar (4352617)
Publication date: 01/01/2018

Augmentative and alternative communication (AAC) systems tend to rely on the interpretation of purposeful gestures for interaction. Existing AAC methods could be cumbersome and limit the solutions in terms of versatility. The study aims to interpret breathing patterns (BPs) to converse with the outside world by means of a unidirectional microphone and researches breathing-pattern interpretation (BPI) to encode messages in an interactive manner with minimal training. We present BP processing work with (1) output synthesized machine-spoken words (SMSW) along with single-channel Wiener filtering (WF) for signal de-noising, and (2) k-nearest neighbor (k-NN) classification of BPs associated with embedded dynamic time warping (DTW). An approved protocol to collect analogue modulated BP sets belonging to 4 distinct classes with 10 training BPs per class and 5 live BPs per class was implemented with 23 healthy subjects. An 86% accuracy of k-NN classification was obtained with decreasing error rates of 17%, 14%, and 11% for the live classifications of classes 2, 3, and 4, respectively. The results express a systematic reliability of 89% with increased familiarity. The outcomes from the current AAC setup recommend a durable engineering solution directly beneficial to the sufferers
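The combination of k-NN classification with dynamic time warping named in this abstract can be sketched as follows. This is a generic textbook formulation under stated assumptions, not the study's implementation: the absolute-difference local cost, the function names `dtw_distance` and `knn_classify`, the value of `k`, and the toy sequences are all illustrative.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(n*m) dynamic time warping with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_classify(
    query: Sequence[float],
    training: List[Tuple[Sequence[float], int]],
    k: int = 3,
) -> int:
    """Label a query pattern by majority vote among its k DTW-nearest neighbours."""
    ranked = sorted(training, key=lambda item: dtw_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for labelled breathing-pattern envelopes (hypothetical data).
training = [
    ([0, 1, 0], 1),
    ([0, 1, 1, 0], 1),
    ([0, 2, 4, 2, 0], 2),
    ([0, 2, 4, 4, 2, 0], 2),
]
predicted = knn_classify([0, 1, 1, 1, 0], training, k=3)
```

DTW lets patterns of different durations be compared directly, which is one plausible reason to pair it with k-NN when only a handful of training patterns per class are available.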

Loughborough University Institutional Repository

Crossref

A review on corpus annotation for arabic sentiment analysis

Author: A alOwisheq
A Kaur
A Mountassir
AM Azmi
CC Aggarwal
G Leech
H ElSahar
H Ibrahim
J Carletta
J Cohen
M Saleh
NY Habash
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Mining publicly available data for meaning and value is an important research direction within social media analysis. To automatically analyze collected textual data, a manual effort is needed for a successful machine learning algorithm to effectively classify text. This pertains to annotating the text by adding labels to each data entry. Arabic is one of the languages that are growing rapidly in the research of sentiment analysis, despite limited resources and scarce annotated corpora. In this paper, we review the annotation process carried out by those papers. A total of 27 papers were reviewed between the years of 2010 and 2016

Crossref

Warwick Research Archives Portal Repository