727 research outputs found
Interactive Search and Exploration in Online Discussion Forums Using Multimodal Embeddings
In this paper we present a novel interactive multimodal learning system,
which facilitates search and exploration in large networks of social multimedia
users. It allows the analyst to identify and select users of interest, and to
find similar users in an interactive learning setting. Our approach is based on
novel multimodal representations of users, words and concepts, which we
simultaneously learn by deploying a general-purpose neural embedding model. We
show these representations to be useful not only for categorizing users, but
also for automatically generating user and community profiles. Inspired by
traditional summarization approaches, we create the profiles by selecting
diverse and representative content from all available modalities, i.e. the
text, image and user modality. The usefulness of the approach is evaluated
using artificial actors, which simulate user behavior in a relevance feedback
scenario. Multiple experiments were conducted in order to evaluate the quality
of our multimodal representations, to compare different embedding strategies,
and to determine the importance of different modalities. We demonstrate the
capabilities of the proposed approach on two different multimedia collections
originating from the violent online extremism forum Stormfront and the
microblogging platform Twitter, which are particularly interesting due to the
high semantic level of the discussions they feature.
StarSpace: Embed All The Things!
We present StarSpace, a general-purpose neural embedding model that can solve
a wide variety of problems: labeling tasks such as text classification, ranking
tasks such as information retrieval/web search, collaborative filtering-based
or content-based recommendation, embedding of multi-relational graphs, and
learning word, sentence or document level embeddings. In each case the model
works by embedding those entities comprised of discrete features and comparing
them against each other -- learning similarities dependent on the task.
Empirical results on a number of tasks show that StarSpace is highly
competitive with existing methods, whilst also being generally applicable to
new cases where those methods are not.
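As a rough illustration of the idea of embedding entities made of discrete features and comparing them, the following minimal sketch represents each entity as the sum of its feature vectors and scores pairs by cosine similarity. It is not the StarSpace implementation: the names `make_feature_table`, `embed_entity`, and `cosine`, the dimension, and the toy vocabulary are all assumptions, and the vectors are random rather than learned. In a trained model they would be adjusted so that related pairs score higher than unrelated ones.

```python
import math
import random

DIM = 8  # embedding dimension (arbitrary choice for this sketch)


def make_feature_table(features, dim=DIM, seed=0):
    """Assign each discrete feature a random vector (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {f: [rng.gauss(0.0, 1.0) for _ in range(dim)] for f in features}


def embed_entity(entity_features, table):
    """Embed an entity as the sum of the vectors of its discrete features."""
    dim = len(next(iter(table.values())))
    vec = [0.0] * dim
    for f in entity_features:
        for i, x in enumerate(table[f]):
            vec[i] += x
    return vec


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


# Hypothetical toy data: a text entity (bag of words) and a label entity.
vocab = ["cheap", "flights", "paris", "hotel", "label:travel"]
table = make_feature_table(vocab)
query = embed_entity(["cheap", "flights", "paris"], table)
label = embed_entity(["label:travel"], table)
score = cosine(query, label)  # untrained, so the value carries no meaning yet
```

Because documents, labels, users, or items can all be written as bags of discrete features, the same comparison function can in principle serve classification, ranking, and recommendation; only the choice of entity pairs changes.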
Measuring Emotions in the COVID-19 Real World Worry Dataset
The COVID-19 pandemic is having a dramatic impact on societies and economies
around the world. With various measures of lockdowns and social distancing in
place, it becomes important to understand emotional responses on a large scale.
In this paper, we present the first ground truth dataset of emotional responses
to COVID-19. We asked participants to indicate their emotions and express these
in text. This resulted in the Real World Worry Dataset of 5,000 texts (2,500
short + 2,500 long texts). Our analyses suggest that emotional responses
correlated with linguistic measures. Topic modeling further revealed that
people in the UK worry about their family and the economic situation.
Tweet-sized texts functioned as a call for solidarity, while longer texts shed
light on worries and concerns. Using predictive modeling approaches, we were
able to approximate the emotional responses of participants from text within
14% of their actual value. We encourage others to use the dataset and improve
how we can use automated methods to learn about emotional responses and worries
about an urgent problem.
Comment: Accepted to ACL 2020 COVID-19 workshop
Classification Model for Bullying Posts Detection
Much current research focuses on social media, for example to analyze sentiments and opinions, political issues, and marketing strategies. Several text mining frameworks have been designed for different applications. Bullying is a form of socially disruptive behavior, taking various forms, that is directed at an individual or a group with the intent to harm. Studies have shown that 7 out of 10 young people become victims of cyberbullying. Around the world, there are many prominent cases arising from abusive communication on the Web. Suitable solutions to this problem are therefore needed, and the shortcomings of existing strategies for dealing with cyberbullying incidents must be addressed. A primary aim is to design a scheme that alerts people who use social networks and protects them from bullying environments. A tweet corpus carries the message text along with metadata such as ID and time. The messages are written informally, and the language varies widely. The raw tweets therefore need to pass through a series of filtering steps before feature extraction and frequency extraction. The idea is to treat each tweet as a finite mixture over an underlying set of topics, each of which is characterized by a distribution over words, and then to analyze tweets through these topic distributions. Naturally, bullying topics can be expected to assign higher probabilities to bullying words. A set of training tweets containing both bullying and non-bullying texts is required to learn a model that can infer topic distributions from tweets. Topic modeling is used to capture lexical collocation patterns in the irreverent content and to create meaningful topics for a model
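The filtering and frequency-extraction step mentioned in the abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the abstract does not specify the actual filters, so the regular expressions, the tiny `STOPWORDS` set, the function names `clean_tweet` and `term_frequencies`, and the sample tweets are all hypothetical. The resulting token counts are the kind of input a topic model would then consume.

```python
import re
from collections import Counter

# Hypothetical filters; the abstract does not list the actual ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")
STOPWORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "so"}


def clean_tweet(text):
    """Lowercase a raw tweet, drop URLs, mentions and punctuation, remove stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)       # remove links before stripping punctuation
    text = MENTION_RE.sub(" ", text)   # remove @user mentions
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    text = NON_LETTER_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOPWORDS]


def term_frequencies(tweets):
    """Count how often each cleaned token occurs across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet))
    return counts


# Made-up sample tweets, for illustration only.
sample = [
    "@user You are SO stupid!!! http://example.com #loser",
    "Lovely day at the park :)",
]
tokens = clean_tweet(sample[0])
freqs = term_frequencies(sample)
```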