38 research outputs found

Understanding Chat Messages for Sticker Recommendation in Messaging Apps

Authors: Hanoosh Mohamed, Laddha Abhishek, Mukherjee Debdoot, Narang Ankur, Patwa Parth
Publication date: 24/11/2019

Stickers are popularly used in messaging apps such as Hike to visually express a nuanced range of thoughts and utterances and to convey exaggerated emotions. However, discovering the right sticker from a large and ever-expanding pool of stickers while chatting can be cumbersome. In this paper, we describe a system for recommending stickers in real time, as the user is typing, based on the context of the conversation. We decompose the sticker recommendation (SR) problem into two steps. First, we predict the message that the user is likely to send in the chat. Second, we substitute the predicted message with an appropriate sticker. The majority of Hike's messages are text transliterated from users' native languages into the Roman script. This leads to numerous orthographic variations of the same message and makes accurate message prediction challenging. To address this issue, we learn dense representations of chat messages using a character-level convolutional network in an unsupervised manner. We use these representations to cluster messages that have the same meaning. In the subsequent steps, we predict the message cluster instead of the message. Our approach does not depend on human-labelled data (except for validation), which allows a fully automatic pipeline for updating and tuning the underlying models. We also propose a novel hybrid message prediction model, which can run with low latency on low-end phones that have severe computational limitations. The described system has been deployed for more than 6 months and is being used by millions of users along with hundreds of thousands of expressive stickers.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research

Authors: Antypas Dimosthenis, Barbieri Francesco, Camacho-Collados Jose, Espinosa-Anke Luis, Neves Leonardo, Pei Jiaxin, Rezaee Kiamehr, Ushio Asahi
Publication date: 23/10/2023

Despite its relevance, the maturity of NLP for social media pales in comparison with general-purpose models, metrics and benchmarks. This fragmented landscape makes it hard for the community to know, for instance, which model performs best on a given task and how it compares with others. To alleviate this issue, we introduce a unified benchmark for NLP evaluation in social media, SuperTweetEval, which includes a heterogeneous set of tasks and datasets combined, adapted and constructed from scratch. We benchmarked the performance of a wide range of models on SuperTweetEval, and our results suggest that, despite the recent advances in language modelling, social media remains challenging. Comment: EMNLP 2023 Finding

arXiv.org e-Print Archive