
BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer

Author: Charpentier Lucas Georges Gabriel
Rønningstad Egil
Samuel David
Wold Sondre
Publication date: 19/04/2023

Retrieval-based language models are increasingly employed in question-answering tasks. These models search a corpus of documents for relevant information instead of storing all factual knowledge in their parameters, thereby enhancing efficiency, transparency, and adaptability. We develop the first Norwegian retrieval-based model by adapting the REALM framework and evaluating it on various tasks. After training, we also separate the language model, which we call the reader, from the retriever components, and show that the reader can be fine-tuned on a range of downstream tasks. Results show that retrieval-augmented language modeling improves the reader's performance on extractive question-answering, suggesting that this type of training improves language models' general ability to use context and that this does not happen at the expense of other abilities such as part-of-speech tagging, dependency parsing, named entity recognition, and lemmatization. Code, trained models, and data are made publicly available.
Comment: Accepted for NoDaLiDa 2023, main conference.

arXiv.org e-Print Archive

Fine-grained sentiment analysis for measuring customer satisfaction using an extended set of fuzzy linguistic hedges

Author: Asghar Muhammad Zubair
Hameed Ibrahim A.
Jillani Nosheen
Khattak Asad
Paracha Waqas Tariq
Saddozai Furqan Khan
Younis Umair
Publication venue: ZU Scholars
Publication date: 01/01/2020

© 2020 The Authors. Published by Atlantis Press SARL. In recent years, the boom in social media sites such as Facebook and Twitter has brought people together to share opinions, sentiments, emotions, and experiences about products, events, politics, and other topics. In particular, sentiment-based applications are growing in popularity among individuals and businesses for making purchase decisions. Fuzzy-based sentiment analysis aims at classifying customer sentiment at a fine-grained level. This study develops a fuzzy-based sentiment analysis that extends fuzzy hedges and rule-sets for a more efficient classification of customer sentiment and satisfaction. Prior studies have used a limited number of linguistic hedges and polarity classes in their rule-sets, which degraded the efficiency of their fuzzy-based sentiment analysis systems. The proposed analysis classifies customer reviews using fuzzy linguistic hedges and an extended rule-set with seven sentiment classes, namely extremely positive, very positive, positive, neutral, negative, very negative, and extremely negative. Then, a fuzzy logic system is applied to measure customer satisfaction at a fine-grained level. The experimental results demonstrate that the proposed analysis improves on the performance of the baseline works.

ZU Scholars (Zayed University)

NORA - Norwegian Open Research Archives

A Deep Network Model for Paraphrase Detection in Short Text Messages

Author: Agarwal Basant
Langseth Helge
Ramampiaro Heri
Ruocco Massimiliano
Publication venue: Elsevier BV
Publication date: 07/12/2017

This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication, and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is that existing paraphrase systems perform well when applied to clean texts, but they do not necessarily deliver good performance on noisy texts. Challenges with paraphrase detection on user-generated short texts, such as Twitter, include language irregularity and noise. To cope with these challenges, we propose a novel deep neural network-based approach that relies on coarse-grained sentence modeling using a convolutional neural network and a long short-term memory model, combined with a specific fine-grained word-level similarity matching model. Our experimental results show that the proposed approach outperforms existing state-of-the-art approaches on user-generated noisy social media data, such as Twitter texts, and achieves highly competitive performance on a cleaner corpus.

arXiv.org e-Print Archive

NORA - Norwegian Open Research Archives

XED: A Multilingual Dataset for Sentiment Analysis and Emotion Detection

Author: Kajava Kaisla
Pàmies Marc
Tiedemann Jörg
Öhman Emily
Publication venue: International Committee on Computational Linguistics
Publication date: 01/01/2020

We introduce XED, a multilingual fine-grained human-annotated emotion dataset. The dataset consists of human-annotated Finnish (25k) and English (30k) sentences, as well as projected annotations for 43 additional languages, providing new resources for many low-resource languages. We use Plutchik’s core emotions, with the addition of neutral, to annotate the dataset. The dataset is carefully evaluated using language-specific BERT models to show that XED performs on par with other similar datasets and is therefore a useful tool for sentiment analysis and emotion detection.
Peer reviewed.

arXiv.org e-Print Archive

Crossref

Helsingin yliopiston digitaalinen arkisto

24th Nordic Conference on Computational Linguistics (NoDaLiDa)

Publication venue: University of Tartu Library
Publication date: 01/05/2023

DSpace at Tartu University Library