Impact Of Content Features For Automatic Online Abuse Detection
Online communities have gained considerable importance in recent years due to
the increasing number of people connected to the Internet. Moderating user
content in online communities is mainly performed manually, and reducing the
workload through automatic methods is of great financial interest for community
maintainers. Often, the industry uses basic approaches such as bad-word
filtering and regular expression matching to assist the moderators. In this
article, we consider the task of automatically determining if a message is
abusive. This task is complex since messages are written in a non-standardized
way, including spelling errors, abbreviations, community-specific codes...
First, we evaluate the system that we propose using standard features of online
messages. Then, we evaluate the impact of the addition of pre-processing
strategies, as well as original specific features developed for the community
of an online in-browser strategy game. We finally propose to analyze the
usefulness of this wide range of features using feature selection. This work
can lead to two possible applications: 1) automatically flagging potentially
abusive messages to draw the moderator's attention to a narrow subset of
messages; and 2) fully automating the moderation process by deciding whether a
message is abusive without any human intervention.
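As a rough illustration of the basic industry baseline mentioned in this abstract (bad-word filtering with regular expression matching), here is a minimal sketch. The blocklist contents and the names BLOCKLIST and flag_message are hypothetical, chosen only for the example; this is not the system the authors propose.

```python
import re

# Hypothetical blocklist; a real deployment would maintain a far larger,
# community-specific list.
BLOCKLIST = ["idiot", "stupid"]

# One case-insensitive pattern that matches any blocklisted word as a whole word.
PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BLOCKLIST) + r")\b",
    re.IGNORECASE,
)


def flag_message(message: str) -> bool:
    """Return True if the message contains a blocklisted word."""
    return PATTERN.search(message) is not None
```

A filter of this kind catches "You are an IDIOT" but misses an obfuscated spelling such as "1d10t", which is the sort of non-standardized writing the abstract describes as making the task complex.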
Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews
This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
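The scoring rule described in this abstract can be sketched as follows, assuming the mutual information is estimated as pointwise mutual information from raw co-occurrence counts. The function names, the count-based estimator, and the smoothing constant are assumptions made for this sketch; the abstract does not specify how the counts are obtained.

```python
import math


def pmi(joint: float, count_a: float, count_b: float, total: float) -> float:
    """Pointwise mutual information (base 2) estimated from raw counts."""
    return math.log2((joint / total) / ((count_a / total) * (count_b / total)))


def semantic_orientation(
    phrase_count: float,
    near_excellent: float,
    near_poor: float,
    excellent_count: float,
    poor_count: float,
    total: float,
    smoothing: float = 0.01,  # assumed constant; avoids log(0) for unseen pairs
) -> float:
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor")."""
    return pmi(near_excellent + smoothing, phrase_count, excellent_count, total) - pmi(
        near_poor + smoothing, phrase_count, poor_count, total
    )


def classify_review(phrase_orientations: list[float]) -> str:
    """Recommend the review if the average orientation of its phrases is positive."""
    if not phrase_orientations:
        raise ValueError("review has no scored phrases")
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > 0 else "not recommended"
```

Note that the phrase count and the total cancel in the difference, so the score depends only on the ratio of the two co-occurrence counts and the ratio of the two reference-word counts.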
Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies
Stories can have tremendous power -- not only useful for entertainment, they
can activate our interests and mobilize our actions. The degree to which a
story resonates with its audience may be in part reflected in the emotional
journey it takes the audience on. In this paper, we use machine learning
methods to construct emotional arcs in movies, calculate families of arcs, and
demonstrate the ability of certain arcs to predict audience engagement. The
system is applied to Hollywood films and high quality shorts found on the web.
We begin by using deep convolutional neural networks for audio and visual
sentiment analysis. These models are trained on both new and existing
large-scale datasets, after which they can be used to compute separate audio
and visual emotional arcs. We then crowdsource annotations for 30-second video
clips extracted from highs and lows in the arcs in order to assess the
micro-level precision of the system, with precision measured in terms of
agreement in polarity between the system's predictions and annotators' ratings.
These annotations are also used to combine the audio and visual predictions.
Next, we look at macro-level characterizations of movies by investigating
whether there exist `universal shapes' of emotional arcs. In particular, we
develop a clustering approach to discover distinct classes of emotional arcs.
Finally, we show on a sample corpus of short web videos that certain emotional
arcs are statistically significant predictors of the number of comments a video
receives. These results suggest that the emotional arcs learned by our approach
successfully represent macroscopic aspects of a video story that drive audience
engagement. Such machine understanding could be used to predict audience
reactions to video stories, ultimately improving our ability as storytellers to
communicate with each other. Comment: Data Mining (ICDM), 2017 IEEE 17th International Conference o