Semantic Variation in Online Communities of Practice
We introduce a framework for quantifying semantic variation of common words
in Communities of Practice and in sets of topic-related communities. We show
that while some meaning shifts are shared across related communities, others
are community-specific, and therefore independent from the discussed topic. We
propose such findings as evidence in favour of sociolinguistic theories of
socially-driven semantic variation. Results are evaluated using an independent
language modelling task. Furthermore, we investigate extralinguistic features
and show that factors such as prominence and dissemination of words are related
to semantic variation.
Comment: 13 pages, Proceedings of the 12th International Conference on
Computational Semantics (IWCS 2017)
Analysing Lexical Semantic Change with Contextualised Word Representations
This paper presents the first unsupervised approach to lexical semantic
change that makes use of contextualised word representations. We propose a
novel method that exploits the BERT neural language model to obtain
representations of word usages, clusters these representations into usage
types, and measures change along time with three proposed metrics. We create a
new evaluation dataset and show that the model representations and the detected
semantic shifts are positively correlated with human judgements. Our extensive
qualitative analysis demonstrates that our method captures a variety of
synchronic and diachronic linguistic phenomena. We expect our work to inspire
further research in this direction.
Comment: To appear in Proceedings of the 58th Annual Meeting of the
Association for Computational Linguistics (ACL-2020)
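As a rough illustration of the last step described in the abstract above (measuring change over time once usages have been grouped into usage types), the following sketch compares the usage-type distributions of one word in two time periods. The abstract does not name its three metrics, so Jensen-Shannon divergence is used here only as one plausible choice. The function names and the toy labels are hypothetical, and the cluster labels are assumed to have been produced beforehand (for example by clustering BERT usage representations).

```python
import math
from collections import Counter


def usage_distribution(labels, num_types):
    """Relative frequency of each usage type among one period's usages."""
    counts = Counter(labels)
    total = len(labels)
    return [counts.get(k, 0) / total for k in range(num_types)]


def entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jensen_shannon_divergence(p, q):
    """JSD between two distributions over the same usage types (0 to 1 bits)."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


# Hypothetical usage-type labels for one word in two time periods.
early_labels = [0, 0, 0, 1]
late_labels = [0, 1, 1, 1]

p = usage_distribution(early_labels, num_types=2)
q = usage_distribution(late_labels, num_types=2)
shift = jensen_shannon_divergence(p, q)
print(f"usage-type shift (JSD): {shift:.4f}")
```

A value near 0 means the word's usage types are distributed similarly in both periods; a value near 1 means the periods share almost no usage types.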
Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network
Language in social media is extremely dynamic: new words emerge, trend and
disappear, while the meaning of existing words can fluctuate over time. Such
dynamics are especially notable during a period of crisis. This work addresses
several important tasks of measuring, visualizing and predicting short-term
text representation shift, i.e. the change in a word's contextual semantics,
and contrasting such shift with surface-level word dynamics, or concept drift,
observed in social media streams. Unlike previous approaches to learning word
representations from text, we study the relationship between short-term concept
drift and representation shift on a large social media corpus - VKontakte posts
in Russian collected during the Russia-Ukraine crisis in 2014-2015. Our novel
contributions include quantitative and qualitative approaches to (1) measure
short-term representation shift and contrast it with surface-level concept
drift; (2) build predictive models to forecast short-term shifts in meaning
from previous meaning as well as from concept drift; and (3) visualize
short-term representation shift for example keywords to demonstrate the
practical use of our approach to discover and track meaning of newly emerging
terms in social media. We show that short-term representation shift can be
accurately predicted up to several weeks in advance. Our unique approach to
modeling and visualizing word representation shifts in social media can be used
to explore and characterize specific aspects of the streaming corpus during
crisis events and potentially improve other downstream classification tasks
including real-time event detection.