Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network
Language in social media is extremely dynamic: new words emerge, trend and
disappear, while the meaning of existing words can fluctuate over time. Such
dynamics are especially notable during a period of crisis. This work addresses
several important tasks of measuring, visualizing and predicting short-term
text representation shift, i.e. the change in a word's contextual semantics,
and contrasting such shift with surface-level word dynamics, or concept drift,
observed in social media streams. Unlike previous approaches to learning word
representations from text, we study the relationship between short-term concept
drift and representation shift on a large social media corpus: VKontakte posts
in Russian collected during the Russia-Ukraine crisis in 2014-2015. Our novel
contributions include quantitative and qualitative approaches to (1) measure
short-term representation shift and contrast it with surface-level concept
drift; (2) build predictive models to forecast short-term shifts in meaning
from previous meaning as well as from concept drift; and (3) visualize
short-term representation shift for example keywords to demonstrate the
practical use of our approach to discover and track meaning of newly emerging
terms in social media. We show that short-term representation shift can be
accurately predicted up to several weeks in advance. Our unique approach to
modeling and visualizing word representation shifts in social media can be used
to explore and characterize specific aspects of the streaming corpus during
crisis events and potentially improve other downstream classification tasks
including real-time event detection.
Capturing stance dynamics in social media: open challenges and research directions
Social media platforms provide a goldmine for mining public opinion on issues
of wide societal interest and impact. Opinion mining is a problem that can be
operationalised by capturing and aggregating the stance of individual social
media posts as supporting, opposing or being neutral towards the issue at hand.
While most prior work in stance detection has investigated datasets that cover
short periods of time, interest in investigating longitudinal datasets has
recently increased. Evolving dynamics in linguistic and behavioural patterns
observed in new data require adapting stance detection systems to deal with the
changes. In this survey paper, we investigate the intersection between
computational linguistics and the temporal evolution of human communication in
digital media. We perform a critical review of emerging research considering
dynamics, exploring different semantic and pragmatic factors that impact
linguistic data in general, and stance in particular. We further discuss
current directions in capturing stance dynamics in social media. We discuss the
challenges encountered when dealing with stance dynamics, identify open
challenges and discuss future directions in three key dimensions: utterance,
context and influence.
Dynamic Contextualized Word Embeddings
Static word embeddings that represent words by a single vector cannot capture the variability of word meaning in different linguistic and extralinguistic contexts. Building on prior work on contextualized and dynamic word embeddings, we introduce dynamic contextualized word embeddings that represent words as a function of both linguistic and extralinguistic context. Based on a pretrained language model (PLM), dynamic contextualized word embeddings model time and social space jointly, which makes them attractive for a range of NLP tasks involving semantic variability. We highlight potential application scenarios by means of qualitative and quantitative analyses on four English datasets.
Concept Drift Adaptation in Text Stream Mining Settings: A Comprehensive Review
Due to the advent and increase in the popularity of the Internet, people have
been producing and disseminating textual data in several ways, such as reviews,
social media posts, and news articles. As a result, numerous researchers have
been working on discovering patterns in textual data, especially because social
media posts function as social sensors, indicating people's opinions,
interests, etc. However, most tasks regarding natural language processing are
addressed using traditional machine learning methods and static datasets. This
setting can lead to several problems, such as an outdated dataset, which may
not correspond to reality, and an outdated model, which has its performance
degrading over time. Concept drift, which corresponds to changes in data
distribution and patterns, is another aspect that emphasizes these issues. In a text
stream scenario, it is even more challenging due to its characteristics, such
as the high speed and data arriving sequentially. In addition, models for this
type of scenario must adhere to the constraints mentioned above while learning
from the stream by storing texts for a limited time and consuming low memory.
In this study, we performed a systematic literature review regarding concept
drift adaptation in text stream scenarios. Considering well-defined criteria,
we selected 40 papers to unravel aspects such as text drift categories, types
of text drift detection, model update mechanism, the addressed stream mining
tasks, types of text representations, and text representation update mechanism.
In addition, we discussed drift visualization and simulation and listed
real-world datasets used in the selected papers. Therefore, this paper
comprehensively reviews concept drift adaptation in text stream mining
scenarios. Comment: 49 pages
The Palgrave Handbook of Digital Russia Studies
This open access handbook presents a multidisciplinary and multifaceted perspective on how the ‘digital’ is simultaneously changing Russia and the research methods scholars use to study Russia. It provides a critical update on how Russian society, politics, economy, and culture are reconfigured in the context of ubiquitous connectivity and accounts for the political and societal responses to digitalization. In addition, it answers practical and methodological questions in handling Russian data and a wide array of digital methods. The volume makes a timely intervention in our understanding of the changing field of Russian Studies and is an essential guide for scholars, advanced undergraduate and graduate students studying Russia today.
Migration Research in a Digitized World: Using Innovative Technology to Tackle Methodological Challenges
This open access book explores implications of the digital revolution for migration scholars’ methodological toolkit. New information and communication technologies hold considerable potential to improve the quality of migration research by originating previously non-viable solutions to a myriad of methodological challenges in this field of study. Combining cutting-edge migration scholarship and methodological expertise, the book addresses a range of crucial issues related to both researcher-designed data collections and the secondary use of “big data”, highlighting opportunities as well as challenges and limitations. A valuable source for students and scholars engaged in migration research, the book will also be of keen interest to policymakers.