4,929 research outputs found
Multilingual twitter sentiment analysis using machine learning
Twitter sentiment analysis is one of the leading research fields. Most of the researchers were contributed to twitter sentiment analysis in English tweets, but few researchers focus on the multilingual twitter sentiment analysis. Some challenges are hoping for the research solutions in multilingual twitter sentiment analysis. This study presents the implementation of sentiment analysis in multilingual twitter data and improves the data classification up to the adequate level of accuracy. Twitter is the sixth leading social networking site in the world. Active users for twitter in a month are 330 million. People can tweet or re-tweet in their languages and allow users to use emoji’s, abbreviations, contraction words, miss spellings, and shortcut words. The best platform for sentiment analysis is twitter. Multilingual tweets and data sparsity are the two main challenges. In this paper, the MLTSA algorithm gives the solution for these two challenges. MLTSA algorithm divides into two parts. One is detecting and translating non-English tweets into English using natural language processing (NLP). And the second one is an appropriate pre-processing method with NLP support can reduce the data sparsity. The result of the MLTSA with SVM achieves good accuracy by up to 95%
Introduction to the special issue on cross-language algorithms and applications
With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and efective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of
Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special
issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment
analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.Postprint (published version
Multilingual Large Language Models Are Not (Yet) Code-Switchers
Multilingual Large Language Models (LLMs) have recently shown great
capabilities in a wide range of tasks, exhibiting state-of-the-art performance
through zero-shot or few-shot prompting methods. While there have been
extensive studies on their abilities in monolingual tasks, the investigation of
their potential in the context of code-switching (CSW), the practice of
alternating languages within an utterance, remains relatively uncharted. In
this paper, we provide a comprehensive empirical analysis of various
multilingual LLMs, benchmarking their performance across four tasks: sentiment
analysis, machine translation, summarization and word-level language
identification. Our results indicate that despite multilingual LLMs exhibiting
promising outcomes in certain tasks using zero or few-shot prompting, they
still underperform in comparison to fine-tuned models of much smaller scales.
We argue that current "multilingualism" in LLMs does not inherently imply
proficiency with code-switching texts, calling for future research to bridge
this discrepancy.Comment: Accepted at EMNLP 202
Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts
Sentiment analysis (SA) regards the classification of texts according to the polarity of the opinions they express. SA systems are highly relevant to many real-world applications (e.g. marketing, eGovernance, business intelligence, behavioral sciences) and also to many tasks in Natural Language Processing (NLP) – information extraction, question answering, textual entailment, to name just a few. The importance of this field has been proven by the high number of approaches proposed in research, as well as by the interest that it raised from other disciplines and the applications that were created using its technology.
In our case, the primary focus is to use sentiment analysis in the context of media monitoring, to enable tracking of global reactions to events. The main challenge that we face is that tweets are written in different languages and an unbiased system should be able to deal with all of them, in order to process all (possible) available data.
Unfortunately, although many linguistic resources exist for processing texts written in English, for many other languages data and tools are scarce. Following our initial efforts described in (Balahur and Turchi, 2013), in this article we extend our study on the possibility to implement a multilingual system that is able to a) classify sentiment expressed in tweets in various languages using training data obtained through machine translation; b) to verify the extent to which the quality of the translations influences the sentiment classification performance, in this case, of highly informal texts; and c) to improve multilingual sentiment classification using small amounts of data annotated in the target language. To this aim, varying sizes of target language data are tested. The languages we explore are: Arabic, Turkish, Russian, Italian, Spanish, German and French.JRC.G.2-Global security and crisis managemen
Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology
Every culture and language is unique. Our work expressly focuses on the
uniqueness of culture and language in relation to human affect, specifically
sentiment and emotion semantics, and how they manifest in social multimedia. We
develop sets of sentiment- and emotion-polarized visual concepts by adapting
semantic structures called adjective-noun pairs, originally introduced by Borth
et al. (2013), but in a multilingual context. We propose a new
language-dependent method for automatic discovery of these adjective-noun
constructs. We show how this pipeline can be applied on a social multimedia
platform for the creation of a large-scale multilingual visual sentiment
concept ontology (MVSO). Unlike the flat structure in Borth et al. (2013), our
unified ontology is organized hierarchically by multilingual clusters of
visually detectable nouns and subclusters of emotionally biased versions of
these nouns. In addition, we present an image-based prediction task to show how
generalizable language-specific models are in a multilingual context. A new,
publicly available dataset of >15.6K sentiment-biased visual concepts across 12
languages with language-specific detector banks, >7.36M images and their
metadata is also released.Comment: 11 pages, to appear at ACM MM'1
- …