139 research outputs found
M2SA: Multimodal and Multilingual Model for Sentiment Analysis of Tweets
In recent years, multimodal natural language processing, aimed at learning
from diverse data types, has garnered significant attention. However, there
needs to be more clarity when it comes to analysing multimodal tasks in
multi-lingual contexts. While prior studies on sentiment analysis of tweets
have predominantly focused on the English language, this paper addresses this
gap by transforming an existing textual Twitter sentiment dataset into a
multimodal format through a straightforward curation process. Our work opens up
new avenues for sentiment-related research within the research community.
Additionally, we conduct baseline experiments utilising this augmented dataset
and report the findings. Notably, our evaluations reveal that when comparing
unimodal and multimodal configurations, using a sentiment-tuned large language
model as a text encoder performs exceptionally well.
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation
Peer reviewed
Communication of COVID-19 consequences in the Baltic States inforsphere
This article seeks to describe the dynamics of COVID-19 in the Baltic States and to analyse the ways of communicating the threat and its consequences. Particular attention is paid to the media strategies pursued in the study area. The research is based on Russian and English texts from the Baltic media, WHO official documents and datasets, as well as initiatives of the Baltic Sea region organisations (2020) counteracting COVID-19. A combination of these sources builds up an objective view of the situation and demonstrates how the pandemic and its consequences are represented in public consciousness given a certain pragmatic goal. The pandemic is a new type of threat; its consequences demonstrate a tendency towards negative synergy and a category shift from soft threats to hard ones. The research shows that several key strategies (counter-active, projective, conservative, mobilising, resilient, and reflective) are used to communicate the threat and its consequences in the media.
Understanding the Vegetable Oil Debate and Its Implications for Sustainability through Social Media
The global production and consumption of vegetable oils have sparked several
discussions on sustainable development. This study analyzes over 20 million
tweets related to vegetable oils to explore the key factors shaping public
opinion. We found that coconut, olive, and palm oils dominate social media
discourse despite their lower contribution to overall global vegetable
production. The discussion about olive and palm oils remarkably correlates with
Twitter's growth, while coconut increases more significantly with bursts of
activity. Discussions around coconut and olive oils primarily focus on health,
beauty, and food, while palm draws attention to pressing environmental
concerns. Overall, virality is related to environmental issues and negative
connotations. In the context of Sustainable Development Goals, this study
highlights the multifaceted nature of the vegetable oil debate and its
disconnection from scientific discussions. Our research sheds light on the
power of social media in shaping public perception, providing insights into
sustainable development strategies.
14th Conference on DATA ANALYSIS METHODS for Software Systems
DAMSS-2023 is the 14th International Conference on Data Analysis Methods for Software Systems, held in Druskininkai, Lithuania. It takes place every year at the same venue and time. The exception was 2020, when the world was gripped by the Covid-19 pandemic and the movement of people was severely restricted. After a year’s break, the conference was back on track, and the next conference was successful in achieving its primary goal of lively scientific communication. The conference focuses on live interaction among participants. For better efficiency of communication among participants, most of the presentations are poster presentations.
This format has proven to be highly effective. However, we have several oral sessions, too. The history of the conference dates back to 2009, when 16 papers were presented. It began as a workshop and has evolved into a well-known conference. The idea of such a workshop originated at the Institute of Mathematics and Informatics, now the Institute of Data Science and Digital Technologies of Vilnius University. The Lithuanian Academy of Sciences and the Lithuanian Computer Society supported this idea, which gained enthusiastic acceptance from both the Lithuanian and international scientific communities. This year’s conference features 84 presentations, with 137 registered participants from 11 countries. The conference serves as a gathering point for researchers from six Lithuanian universities, making it the main annual meeting for Lithuanian computer scientists. The primary aim of the conference is to showcase research conducted at Lithuanian and foreign universities in the fields of data science and software engineering. The annual organization of the conference facilitates the rapid exchange of new ideas within the scientific community. Seven IT companies supported the conference this year, indicating the relevance of the conference topics to the business sector. In addition, the conference is supported by the Lithuanian Research Council and the National Science and Technology Council (Taiwan, R. O. C.). The conference covers a wide range of topics, including Applied Mathematics, Artificial Intelligence, Big Data, Bioinformatics, Blockchain Technologies, Business Rules, Software Engineering, Cybersecurity, Data Science, Deep Learning, High-Performance Computing, Data Visualization, Machine Learning, Medical Informatics, Modelling Educational Data, Ontological Engineering, Optimization, Quantum Computing, and Signal Processing. This book provides an overview of all presentations from the DAMSS-2023 conference.
What Does Twitter Say About Self-Regulated Learning? Mapping Tweets From 2011 to 2021
Social network services such as Twitter are important venues that can be used as rich data sources to mine public opinions about various topics. In this study, we used Twitter to collect data on one of the fastest-growing theories in education, namely Self-Regulated Learning (SRL), and carried out further analysis to investigate what Twitter says about SRL. This work uses three main analysis methods: descriptive analysis, topic modeling, and geocoding analysis. The collected dataset consists of 54,070 relevant SRL tweets posted between 2011 and 2021. The descriptive analysis uncovers a discussion of SRL on Twitter that grew from 2011 until 2018 and then markedly decreased until the collection day. For topic modeling, the text mining technique of Latent Dirichlet Allocation (LDA) was applied and revealed insights on computationally processed topics. Finally, the geocoding analysis uncovers a diverse community from all over the world, yet a higher-density representation of users from the Global North was identified. Further implications are discussed in the paper.
Proceedings of the 2023 CLASP Conference on Learning with Small Data
The purpose of our conference is to bring together researchers from several areas of NLP, addressing datasets, methods and limits of effective (machine) learning with small data containing natural language and associated multi-modal information. The conference covers areas such as machine learning, natural language processing, language technology, computational linguistics, theoretical linguistics, psycholinguistics, as well as artificial intelligence, cognitive science, ethics, and policy.
Centre for Linguistic Theory and Studies in Probability (CLASP)
