A Semantics-Based Measure of Emoji Similarity
Emoji have grown to become one of the most important forms of communication
on the web. With their widespread use, measuring the similarity of emoji has
become an important problem for contemporary text processing since it lies at
the heart of sentiment analysis, search, and interface design tasks. This paper
presents a comprehensive analysis of the semantic similarity of emoji through
embedding models that are learned over machine-readable emoji meanings in the
EmojiNet knowledge base. Using emoji descriptions, emoji sense labels and emoji
sense definitions, and with different training corpora obtained from Twitter
and Google News, we develop and test multiple embedding models to measure emoji
similarity. To evaluate our work, we create a new dataset called EmoSim508,
which assigns human-annotated semantic similarity scores to a set of 508
carefully selected emoji pairs. After validation with EmoSim508, we present a
real-world use-case of our emoji embedding models using a sentiment analysis
task and show that our models outperform the previous best-performing emoji
embedding model on this task. The EmoSim508 dataset and our emoji embedding
models are publicly released with this paper and can be downloaded from
http://emojinet.knoesis.org/.
Comment: This paper is accepted as a full paper at Web Intelligence 2017, the
2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). Leipzig,
Germany: ACM, 2017.
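At query time, an embedding-based similarity measure like the one the abstract describes reduces to comparing learned vectors. A minimal sketch, assuming cosine similarity and toy 3-dimensional embeddings (the actual models are learned from EmojiNet descriptions, sense labels and sense definitions; the vectors below are illustrative, not from the released models):

```python
import numpy as np

def emoji_similarity(vec_a: np.ndarray, vec_b: np.ndarray) -> float:
    """Cosine similarity between two emoji embedding vectors."""
    return float(np.dot(vec_a, vec_b)
                 / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b)))

# Hypothetical embeddings; in practice each emoji's vector would be
# learned over a Twitter or Google News corpus using its EmojiNet senses.
fire = np.array([0.9, 0.1, 0.3])
flame = np.array([0.8, 0.2, 0.4])
snow = np.array([-0.7, 0.6, 0.1])

# Semantically close emoji should score higher than unrelated ones.
assert emoji_similarity(fire, flame) > emoji_similarity(fire, snow)
```

A human-annotated resource such as EmoSim508 can then validate such a measure by correlating model scores with the annotated similarity scores over the 508 emoji pairs.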
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Machine Learning has been a big success story during the AI resurgence. One
particular stand out success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there
is increasing recognition for utilizing knowledge whenever it is available or
can be created purposefully. In this paper, we discuss the indispensable role
of knowledge for deeper understanding of content where (i) large amounts of
training data are unavailable, (ii) the objects to be recognized are complex,
(e.g., implicit entities and highly subjective content), and (iii) applications
need to use complementary or related data in multiple modalities/media. What
brings us to the cusp of rapid progress is our ability to (a) create relevant
and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP
techniques. Using diverse examples, we seek to foretell unprecedented progress
in our ability for deeper understanding and exploitation of multimodal data and
continued incorporation of knowledge in learning techniques.
Comment: Pre-print of the paper accepted at the 2017 IEEE/WIC/ACM International
Conference on Web Intelligence (WI). arXiv admin note: substantial text
overlap with arXiv:1610.0770
Peaches and eggplants or... something else? The role of context in emoji interpretations
This paper presents the results of an experiment designed to measure interpretations of two emojis oft-discussed in popular culture, the eggplant and the peach. The experiment asked people to judge how sexual an emoji-containing text message was. The context surrounding these messages was manipulated across experimental conditions, altering both the preceding discourse and the presence of a sentence-final wink emoji. Unsurprisingly, the baseline interpretation of both the eggplant and peach emoji is euphemism. When one of these emojis is used in a context that strongly biases towards the non-euphemistic interpretation, ratings for sexualness decrease and variability increases. This suggests that participants are still able to access non-euphemistic interpretations of these emojis, but it must be under specific circumstances and will nonetheless come with a high degree of variability. Wink emojis added to messages containing non-euphemistic food emojis were also rated as more highly sexual (albeit still low on the rating scale), indicating an affective role for this emoji
Accommodated Emoji Usage: Influence of Hierarchy on the Adaption of Pictogram Usage in Instant Messaging
Communication Accommodation Theory predicts to what extent individuals accommodate their verbal and nonverbal behaviour, converging towards their conversation partner or diverging away from them, to gain social approval and to decrease social distance. In particular, individuals in lower hierarchy positions accommodate their communication behaviour towards individuals in higher hierarchy positions. Nowadays, computer- and smartphone-mediated communication are common ways to communicate, for example via instant messaging. However, instant messengers lack the ability to transmit nonverbal cues. To fill this gap, emoji are used increasingly. A study was conducted to examine how individuals in lower hierarchy positions converge their emoji usage towards individuals in higher hierarchy positions. The results support the assumption that the higher the hierarchy is perceived to be, the more emoji accommodation is shown
Multimodal Emotion Classification
Most NLP and Computer Vision tasks are limited by the scarcity of labelled data.
In social media emotion classification and other related tasks, hashtags have
been used as indicators to label data. With the rapid increase in emoji usage
on social media, emojis are used as an additional feature for major social NLP
tasks. However, this is less explored in the case of multimedia posts on social
media, where posts are composed of both image and text. At the same time, we
have seen a surge in the interest to incorporate domain knowledge to improve
machine understanding of text. In this paper, we investigate whether domain
knowledge for emoji can improve the accuracy of the emotion classification
task. We exploit the importance of different modalities in social media posts
for the emotion classification task using state-of-the-art deep learning
architectures. Our experiments demonstrate that the three modalities (text,
emoji and images) encode different information to express emotion and therefore
can complement each other. Our results also demonstrate that emoji sense
depends on the textual context, and emoji combined with text encode more
information than either does separately. The highest accuracy of 71.98% is
achieved with training data of 550k posts.
Comment: Accepted at the 2nd Emoji Workshop co-located with The Web Conference
201
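The abstract above argues that text, emoji and image modalities encode complementary information. One common way to combine per-modality representations (a sketch under that assumption, not necessarily the paper's exact architecture) is late fusion: concatenate the modality features and pass the joint vector to a classifier head. All dimensions and names below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the outputs of separate text, emoji and image encoders.
text_feat = rng.standard_normal(8)
emoji_feat = rng.standard_normal(8)
image_feat = rng.standard_normal(8)

# Late fusion by concatenation: the joint vector carries all three
# modalities, letting the classifier exploit their complementarity.
fused = np.concatenate([text_feat, emoji_feat, image_feat])

# Toy linear classifier head with a softmax over emotion labels
# (four classes here purely for illustration).
n_classes = 4
W = rng.standard_normal((n_classes, fused.size))
logits = W @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
predicted = int(np.argmax(probs))
```

In a real system the encoders would be deep networks trained jointly with the head, but the fusion step itself is exactly this concatenation.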
Repurposing emoji for personalised communication: Why [pizza slice] means “I love you”
The use of emoji in digital communication can convey a wealth of emotions and concepts that would otherwise take many words to express. Emoji have become a popular form of communication, with researchers claiming that emoji represent a type of “ubiquitous language” that can span different languages. In this paper, however, we explore how emoji are also used in highly personalised and purposefully secretive ways. We show that emoji are repurposed for something other than their “intended” use between close partners, family members and friends. We present the range of reasons why certain emoji get chosen, including the concept of “emoji affordance”, and explore why repurposing occurs. Whereas emoji are normally used for speed, some are instead used to convey intimate and personal sentiments that, for many reasons, their users cannot express in words. We discuss how this form of repurposing must be considered in tasks such as emoji-based sentiment analysis