Tweeting your Destiny: Profiling Users in the Twitter Landscape around an Online Game
Social media has become a major communication channel for communities
centered around video games. Consequently, social media offers a rich data
source to study online communities and the discussions evolving around games.
Towards this end, we explore a large-scale dataset consisting of over 1 million
tweets related to the online multiplayer shooter Destiny and spanning a time
period of about 14 months using unsupervised clustering and topic modelling.
Furthermore, we correlate Twitter activity of over 3,000 players with their
playtime. Our results contribute to the understanding of online player
communities by identifying distinct player groups with respect to their Twitter
characteristics, describing subgroups within the Destiny community, and
uncovering broad topics of community interest.
Comment: Accepted at IEEE Conference on Games 201
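The abstract above mentions correlating Twitter activity with playtime but does not say which statistic was used. As a minimal sketch, assuming a plain Pearson correlation over per-player tweet counts and playtime hours (the function name and the toy values are made up for illustration), the computation could look like this:

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data, not taken from the paper.
tweets_per_player = [3, 10, 25, 40]
playtime_hours = [5.0, 20.0, 45.0, 90.0]
r = pearson(tweets_per_player, playtime_hours)
```

A rank-based measure such as Spearman's coefficient may suit skewed activity counts better; the choice here is only an assumption.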
Indirect Match Highlights Detection with Deep Convolutional Neural Networks
Highlights in a sports video are usually defined as actions that stimulate
excitement or attract the attention of the audience. Considerable effort is spent on
designing techniques that find highlights automatically, in order to
automate the otherwise manual editing process. Most of the state-of-the-art
approaches try to solve the problem by training a classifier using the
information extracted on the tv-like framing of players playing on the game
pitch, learning to detect game actions which are labeled by human observers
according to their perception of a highlight. This is long and
expensive work. In this paper, we reverse the paradigm: instead of looking at
the gameplay, inferring what could be exciting for the audience, we directly
analyze the audience behavior, which we assume is triggered by events happening
during the game. We apply a deep 3D Convolutional Neural Network (3D-CNN) to
extract visual features from cropped video recordings of the supporters that
are attending the event. Outputs of the crops belonging to the same frame are
then accumulated to produce a value indicating the Highlight Likelihood (HL)
which is then used to discriminate between positive (i.e. when a highlight
occurs) and negative samples (i.e. standard play or time-outs). Experimental
results on a public dataset of ice-hockey matches demonstrate the effectiveness
of our method and promote further research in this new exciting direction.
Comment: "Social Signal Processing and Beyond" workshop, in conjunction with
ICIAP 201
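The abstract above says that the outputs of crops belonging to the same frame are accumulated into a Highlight Likelihood (HL), which is then used to separate positive from negative samples. It does not specify the accumulation rule or the decision rule, so the sketch below assumes a per-frame mean and a fixed threshold; both choices, the function names, and the toy scores are illustrative assumptions, and the per-crop scores stand in for the 3D-CNN outputs.

```python
from collections import defaultdict


def highlight_likelihood(crop_scores):
    """Accumulate per-crop scores into one value per frame.

    crop_scores: iterable of (frame_index, score) pairs.
    Returns {frame_index: mean score}. The mean is an assumption.
    """
    per_frame = defaultdict(list)
    for frame, score in crop_scores:
        per_frame[frame].append(score)
    return {frame: sum(s) / len(s) for frame, s in per_frame.items()}


def classify(hl_by_frame, threshold=0.5):
    """Label a frame as a highlight when its HL reaches the threshold."""
    return {frame: hl >= threshold for frame, hl in hl_by_frame.items()}


# Hypothetical crop scores: frame 0 has excited crops, frame 1 does not.
scores = [(0, 0.9), (0, 0.7), (1, 0.2), (1, 0.1), (1, 0.3)]
hl = highlight_likelihood(scores)
labels = classify(hl)
```

In practice the threshold would be tuned on held-out data rather than fixed at 0.5.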
Video Highlight Prediction Using Audience Chat Reactions
Sports channel video portals offer an exciting domain for research on
multimodal, multilingual analysis. We present methods addressing the problem of
automatic video highlight prediction based on joint visual features and textual
analysis of the real-world audience discourse with complex slang, in both
English and traditional Chinese. We present a novel dataset based on League of
Legends championships recorded from North American and Taiwanese Twitch.tv
channels (will be released for further research), and demonstrate strong
results on these using multimodal, character-level CNN-RNN model architectures.
Comment: EMNLP 201
Effectiveness of Data Enrichment on Categorization: Two Case Studies on Short Texts and User Movements
The widespread diffusion of mobile devices, e.g., smartphones and tablets, has made possible a huge increase in data generation by users. Nowadays, about a billion users interact daily on online social media, where they share information and discuss a wide variety of topics, sometimes including the places they visit. Furthermore, the use of mobile devices makes available a large amount of data tracked by integrated sensors, which monitor several user activities, again including position. The content produced by users is composed of few elements, such as only some words in a social post or a simple GPS position, and is therefore a poor source of information to analyze. On this basis, a data enrichment process may provide additional knowledge by exploiting other related sources to extract additional data.
The aim of this dissertation is to analyze the effectiveness of data enrichment for categorization, in particular on two domains: short texts and user movements. We describe the concept behind our experimental design, where users’ content is represented as abstract objects in a geometric space, with distances representing relatedness and similarity values, and contexts representing regions close to each object where it is possible to find other related objects, which are therefore suitable as a data enrichment source. Regarding short texts, our research involves a novel approach to short text enrichment and categorization, and an extensive study of the properties of data used as enrichment. We analyze the temporal context and a set of properties which characterize data from an external source in order to properly select and extract additional knowledge related to the textual content that users produce. We use Twitter as the source of short texts to build datasets for all experiments. Regarding user movements, we address the problem of place categorization, recognizing important locations that users visit frequently and intensively. We propose a novel approach to place categorization based on a feature space which models the users’ movement habits. We analyze both temporal and spatial context to find additional information to use as data enrichment and improve the importance recognition process. We use an in-house dataset of GPS logs and the GeoLife public dataset for our experiments. Experimental evaluations of both studies highlight how the enrichment phase has a considerable impact on each process, and the results demonstrate its effectiveness. In particular, the short text analysis shows that news articles are particularly suitable as an enrichment source, and that their freshness is an important property to consider. The user movement analysis demonstrates how the context with additional data helps, even with user trajectories that are difficult to analyze.
Finally, we provide an early-stage study on user modeling. We exploit the data extracted with enrichment on the short texts to build a richer user profile. The enrichment phase, combined with a network-based approach, improves the profiling process, providing higher scores in similarity computation where expected.
Co-supervisor: Ivan Scagnetto
Doctoral programme in Computer Science (Dottorato di ricerca in Informatica)
Can we predict a riot? Disruptive event detection using Twitter
In recent years, there has been increased interest in real-world event detection using publicly accessible data made available through Internet technology such as Twitter, Facebook, and YouTube. In these highly interactive systems, the general public are able to post real-time reactions to “real world” events, thereby acting as social sensors of terrestrial activity. Automatically detecting and categorizing events, particularly small-scale incidents, using streamed data is a non-trivial task but would be of high value to public safety organisations such as local police, who need to respond accordingly. To address this challenge, we present an end-to-end integrated event detection framework that comprises five main components: data collection, pre-processing, classification, online clustering, and summarization. The integration between classification and clustering enables events to be detected, as well as related smaller-scale “disruptive events,” smaller incidents that threaten social safety and security or could disrupt social order. We present an evaluation of the effectiveness of detecting events using a variety of features derived from Twitter posts, namely temporal, spatial, and textual content. We evaluate our framework on a large-scale, real-world dataset from Twitter. Furthermore, we apply our event detection system to a large corpus of tweets posted during the August 2011 riots in England. We use ground-truth data based on intelligence gathered by the London Metropolitan Police Service, which provides a record of actual terrestrial events and incidents during the riots, and show that our system can perform as well as terrestrial sources, and even better in some cases
Leveraging Contextual Cues for Generating Basketball Highlights
The massive growth of sports videos has resulted in a need for automatic
generation of sports highlights that are comparable in quality to the
hand-edited highlights produced by broadcasters such as ESPN. Unlike previous
works that mostly use audio-visual cues derived from the video, we propose an
approach that additionally leverages contextual cues derived from the
environment that the game is being played in. The contextual cues provide
information about the excitement levels in the game, which can be ranked and
selected to automatically produce high-quality basketball highlights. We
introduce a new dataset of 25 NCAA games along with their play-by-play stats
and the ground-truth excitement data for each basket. We explore the
informativeness of five different cues derived from the video and from the
environment through user studies. Our experiments show that for our study
participants, the highlights produced by our system are comparable to the ones
produced by ESPN for the same games.
Comment: Proceedings of ACM Multimedia 201
Spatial and Temporal Sentiment Analysis of Twitter data
The public uses Twitter worldwide to express opinions. This study focuses on the spatio-temporal variation of the sentiment polarity of georeferenced Tweets, with a view to understanding how opinions evolve on Twitter over space and time and across communities of users. More specifically, the question this study tests is whether sentiment polarity on Twitter exhibits specific time-location patterns. The aim of the study is to investigate the spatial and temporal distribution of georeferenced Twitter sentiment polarity within a 1 km buffer around the Curtin Bentley campus boundary in Perth, Western Australia. Tweets posted on campus were assigned to six spatial zones and four time zones. A sentiment analysis was then conducted for each zone using the sentiment analyser tool in the Starlight Visual Information System software. The Feature Manipulation Engine was employed to convert non-spatial files into spatial and temporal feature classes. The distribution of Twitter sentiment polarity over space and time was mapped using Geographic Information Systems (GIS). Some interesting results were identified. For example, the highest percentage of positive Tweets occurred in the social science area, while the science and engineering and dormitory areas had the highest percentage of negative postings. The number of negative Tweets increases in the library and science and engineering areas as the end of the semester approaches, reaching a peak around an exam period, while the percentage of negative Tweets drops at the end of the semester in the entertainment and sport and dormitory areas. This study provides some insights into the variation of students’ and staff’s sentiment on Twitter, which could be useful for university teaching and learning management.
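The per-zone percentages reported in the abstract above come down to counting polarity labels within each spatial zone. The study itself used the Starlight and FME tools; the sketch below is only a plain-Python illustration of that aggregation step, with a hypothetical function name, hypothetical zone labels, and made-up records.

```python
from collections import Counter


def negative_share_by_zone(tweets):
    """Percentage of negative tweets per zone.

    tweets: iterable of (zone, polarity) pairs, where polarity is one of
    "positive", "neutral", or "negative" (label set assumed for illustration).
    """
    totals = Counter()
    negatives = Counter()
    for zone, polarity in tweets:
        totals[zone] += 1
        if polarity == "negative":
            negatives[zone] += 1
    return {zone: 100.0 * negatives[zone] / totals[zone] for zone in totals}


# Hypothetical records, not data from the study.
sample = [
    ("library", "negative"),
    ("library", "positive"),
    ("dormitory", "negative"),
    ("dormitory", "negative"),
    ("library", "neutral"),
    ("library", "negative"),
]
shares = negative_share_by_zone(sample)
```

The same function can be applied to each of the four time periods separately to compare how the shares change towards the end of the semester.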
The Potential of Social Media Intelligence to Improve People's Lives: Social Media Data for Good
In this report, developed with support from Facebook, we focus on an approach to extract public value from social media data that we believe holds the greatest potential: data collaboratives. Data collaboratives are an emerging form of public-private partnership in which actors from different sectors exchange information to create new public value. Such collaborative arrangements, for example between social media companies and humanitarian organizations or civil society actors, can be seen as possible templates for leveraging privately held data towards the attainment of public goals
Spatio-temporal distribution analysis of brand interest in social networks
Social network applications such as Facebook and Twitter have become part of many people’s
lives and are used daily by millions of users. In such platforms, users share their emotions,
opinions, experiences, and thoughts. Twitter, in particular, is used to discuss diverse topics,
including brands, their products and services. In this thesis, we analyse how brand interest is
reflected on Twitter and how this platform can be used to monitor what people say about specific
brands, as an indicator of brand interest. Brand interest can be defined as the level of interest
one has in a brand, and the level of curiosity one has to learn more about a brand. For this work,
the volume of tweets is used as a measure of brand interest. Our methodology is based on time,
location, and the number of brand-related tweets to perform a spatio-temporal analysis.
Additionally, we propose a framework for discovering latent patterns (topics) from a large
dataset of grouped short messages to analyse brand interest, using Twitter as a data source. We
applied a well-known Text Mining technique called Topic Modelling, which is an unsupervised
learning technique used when dealing with text data, useful to uncover topics in a collection
of documents. This technique provides a convenient way to retrieve information from unstructured text. Topic Modelling tasks have been applied to track events/trends and uncover topics
in domains such as academic, public health, marketing, and so forth. The framework consists of training LDA (Latent Dirichlet Allocation) topic models on aggregated tweets, and then
applying the model to different documents, also composed of grouped Twitter posts. Furthermore, we describe a set of pre-processing tasks that helped to improve the performance of topic
models, enabling us to obtain a better output, thus performing a better analysis of it. The experiments demonstrated that Topic Modelling can successfully track people’s discussions on Social
Networks even in massive datasets such as the one used in the current work, and capture those
topics spiked by real-life events.
Nowadays, platforms such as Twitter and Facebook are part of many people's daily lives and are used by millions of users. On these platforms, known as Social Networks,
users share information including opinions, feelings, experiences, and thoughts. Twitter, in particular, is used to share diverse topics, which may
include discussions about brands and their products and/or services. This study analyses how
interest in a brand is reflected on the Twitter social network and presents a methodology that makes it possible to use Twitter as a source of information for monitoring what users say
about particular brands. Brand interest can be defined as the level of
interest an individual has in a brand, and the level of curiosity that
leads the individual to learn more about that brand. In this study, the number of tweets published
is used to measure interest in the chosen brands. The methodology is based on the date
on which the tweet was published, its location, and the number of posts, in order to perform a
spatio-temporal analysis.
Additionally, a framework is presented that enables the exploration of a vast
dataset, with the aim of revealing latent patterns and analysing interest
in the selected brands, using Twitter as a data source. To this end, Topic
Modelling was applied, a Text Mining technique widely used to discover topics in
unstructured text. Topic Modelling algorithms have been widely used to monitor
events and trends and to discover topics in areas such as education, marketing, and health, among others. The framework consists of training the LDA (Latent Dirichlet Allocation) topic model
on grouped tweets (according to a given criterion) and then applying the trained model to another set of grouped tweets (according to a different criterion). A
set of data pre-processing tasks is described that helped to improve the performance of the models, to obtain better results and, consequently, to perform a better analysis. The experiments show that Topic Modelling makes it possible to track the discussions of Social Network users
over a long period of time, and to capture changes related to real events.
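The framework described above trains LDA on tweets aggregated under one criterion and applies the model to tweets aggregated under another. The abstract does not name the criteria, so the sketch below assumes grouping by brand for training and by date for inference; the field names, the function name, and the toy tweets are all hypothetical. It covers only the grouping step that builds the documents, not LDA training itself.

```python
from collections import defaultdict


def group_tweets(tweets, key):
    """Concatenate tweet texts that share the same value of `key`.

    tweets: list of dicts with a "text" field plus grouping fields.
    Returns {group value: one aggregated document}.
    """
    grouped = defaultdict(list)
    for tweet in tweets:
        grouped[tweet[key]].append(tweet["text"])
    return {value: " ".join(texts) for value, texts in grouped.items()}


# Hypothetical toy tweets, not data from the thesis.
tweets = [
    {"text": "new phone launch", "brand": "A", "date": "2016-01-01"},
    {"text": "battery issue", "brand": "A", "date": "2016-01-02"},
    {"text": "great coffee", "brand": "B", "date": "2016-01-01"},
]

training_docs = group_tweets(tweets, "brand")   # documents for training
inference_docs = group_tweets(tweets, "date")   # documents to apply the model to
```

Aggregating short messages in this way gives each document enough word co-occurrence for a topic model to work with, which is the usual motivation for pooling tweets before LDA.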