3,605 research outputs found
Econometrics meets sentiment : an overview of methodology and applications
The advent of massive amounts of textual, audio, and visual data has spurred the development of econometric methodology to transform qualitative sentiment data into quantitative sentiment variables, and to use those variables in an econometric analysis of the relationships between sentiment and other variables. We survey this emerging research field and refer to it as sentometrics, which is a portmanteau of sentiment and econometrics. We provide a synthesis of the relevant methodological approaches, illustrate with empirical results, and discuss useful software
Text Classification via Large Language Models
Despite the remarkable success of large-scale Language Models (LLMs) such as
GPT-3, their performances still significantly underperform fine-tuned models in
the task of text classification. This is due to (1) the lack of reasoning
ability in addressing complex linguistic phenomena (e.g., intensification,
contrast, irony etc); (2) limited number of tokens allowed in in-context
learning.
In this paper, we introduce Clue And Reasoning Prompting (CARP). CARP adopts
a progressive reasoning strategy tailored to addressing the complex linguistic
phenomena involved in text classification: CARP first prompts LLMs to find
superficial clues (e.g., keywords, tones, semantic relations, references, etc),
based on which a diagnostic reasoning process is induced for final decisions.
To further address the limited-token issue, CARP uses a fine-tuned model on the
supervised dataset for NN demonstration search in the in-context learning,
allowing the model to take the advantage of both LLM's generalization ability
and the task-specific evidence provided by the full labeled dataset.
Remarkably, CARP yields new SOTA performances on 4 out of 5 widely-used
text-classification benchmarks, 97.39 (+1.24) on SST-2, 96.40 (+0.72) on
AGNews, 98.78 (+0.25) on R8 and 96.95 (+0.6) on R52, and a performance
comparable to SOTA on MR (92.39 v.s. 93.3). More importantly, we find that CARP
delivers impressive abilities on low-resource and domain-adaptation setups.
Specifically, using 16 examples per class, CARP achieves comparable
performances to supervised models with 1,024 examples per class.Comment: Pre-print Versio
A Hybrid Approach to Domain-Specific Entity Linking
The current state-of-the-art Entity Linking (EL) systems are geared towards
corpora that are as heterogeneous as the Web, and therefore perform
sub-optimally on domain-specific corpora. A key open problem is how to
construct effective EL systems for specific domains, as knowledge of the local
context should in principle increase, rather than decrease, effectiveness. In
this paper we propose the hybrid use of simple specialist linkers in
combination with an existing generalist system to address this problem. Our
main findings are the following. First, we construct a new reusable benchmark
for EL on a corpus of domain-specific conversations. Second, we test the
performance of a range of approaches under the same conditions, and show that
specialist linkers obtain high precision in isolation, and high recall when
combined with generalist linkers. Hence, we can effectively exploit local
context and get the best of both worlds.Comment: SEM'1
Multilingual sentiment analysis in social media.
252 p.This thesis addresses the task of analysing sentiment in messages coming from social media. The ultimate goal was to develop a Sentiment Analysis system for Basque. However, because of the socio-linguistic reality of the Basque language a tool providing only analysis for Basque would not be enough for a real world application. Thus, we set out to develop a multilingual system, including Basque, English, French and Spanish.The thesis addresses the following challenges to build such a system:- Analysing methods for creating Sentiment lexicons, suitable for less resourced languages.- Analysis of social media (specifically Twitter): Tweets pose several challenges in order to understand and extract opinions from such messages. Language identification and microtext normalization are addressed.- Research the state of the art in polarity classification, and develop a supervised classifier that is tested against well known social media benchmarks.- Develop a social media monitor capable of analysing sentiment with respect to specific events, products or organizations
- …