Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
to the movie in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Breaking the News: First Impressions Matter on Online News
A growing number of people are changing the way they consume news, replacing
the traditional physical newspapers and magazines by their virtual online
versions and/or weblogs. The interactivity and immediacy of online news
are changing the way news is produced and presented by media corporations.
News websites have to create effective strategies to catch people's attention
and attract their clicks. In this paper we investigate possible strategies used
by online news corporations in the design of their news headlines. We analyze
the content of 69,907 headlines produced by four major global media
corporations during a minimum of eight consecutive months in 2014. In order to
discover strategies that could be used to attract clicks, we extracted features
from the text of the news headlines related to the sentiment polarity of the
headline. We discovered that the sentiment of the headline is strongly related
to the popularity of the news and also to the dynamics of the comments posted
on that particular news item.
Comment: The paper appears in ICWSM 201
CasGCN: Predicting future cascade growth based on information diffusion graph
Sudden bursts of information cascades can lead to unexpected consequences
such as extreme opinions, changes in fashion trends, and uncontrollable spread
of rumors. Effectively predicting the future size of a cascade has become an
important problem, especially for large-scale cascades on social
media platforms such as Twitter and Weibo. However, existing methods are
insufficient in dealing with this challenging prediction problem. Conventional
methods rely heavily on either hand-crafted features or unrealistic
assumptions. End-to-end deep learning models, such as recurrent neural
networks, are not suitable to work with graphical inputs directly and cannot
handle structural information that is embedded in the cascade graphs. In this
paper, we propose a novel deep learning architecture for cascade growth
prediction, called CasGCN, which employs the graph convolutional network to
extract structural features from a graphical input, followed by the application
of the attention mechanism on both the extracted features and the temporal
information before conducting cascade size prediction. We conduct experiments
on two real-world cascade growth prediction scenarios (i.e., retweet popularity
on Sina Weibo and academic paper citations on DBLP), with the experimental
results showing that CasGCN outperforms several baseline
methods, particularly when the cascades are large-scale.
Resource Letter: Dark Energy and the Accelerating Universe
This Resource Letter provides a guide to the literature on dark energy and
the accelerating universe. It is intended to be of use to researchers,
teachers, and students at several levels. Journal articles, books, and websites
are cited for the following topics: Einstein's cosmological constant,
quintessence or dynamical scalar fields, modified cosmic gravity, relations to
high energy physics, cosmological probes and observations, terrestrial probes,
calculational tools and parameter estimation, teaching strategies and
educational resources, and the fate of the universe.
Comment: Resource Letter for AAPT/AJP, 11 pages, 99 reference
MUFFLE: Multi-Modal Fake News Influence Estimator on Twitter
To alleviate the impact of fake news on our society, predicting the popularity of fake news posts on social media is a crucial problem worthy of study. However, most related studies on fake news emphasize detection only. In this paper, we focus on the issue of fake news influence prediction, i.e., inferring how popular a fake news post might become on social platforms. To achieve our goal, we propose a comprehensive framework, MUFFLE, which captures multi-modal dynamics by encoding the representation of news-related social networks, user characteristics, and content in text. The attention mechanism developed in the model can provide explainability for social or psychological analysis. To examine the effectiveness of MUFFLE, we conducted extensive experiments on real-world datasets. The experimental results show that our proposed method outperforms both state-of-the-art methods of popularity prediction and machine-based baselines in top-k NDCG and hit rate. Through the experiments, we also analyze the feature importance for predicting fake news influence via the explainability provided by MUFFLE.
Mining News Content for Popularity Prediction
The problem of popularity prediction has been studied extensively in previous research. The idea behind popularity prediction is that the attention users give to online items is unequally distributed, as only a small fraction of all the available content receives serious user attention. Researchers have been experimenting with different methods to find a way to predict that fraction. However, to the best of our knowledge, none of the previous work used the content for popularity prediction; instead, the research looked at other features, such as early user reactions (number of views/shares/comments) in the first hours or days, to predict future popularity. These models are built to be easily generalized to all data types, from videos (e.g., YouTube videos) and images to news stories. However, they are not considered very efficient for the news domain, as our research shows that most stories get 90% to 100% of the attention that they will ever get on the first day. Thus, it would be much more efficient to estimate the popularity even before an item is seen by the users. In this thesis, we plan to approach the problem in a way that accomplishes that goal. We will narrow our focus to the news domain and concentrate on the content of news stories. We would like to investigate the ability to predict the popularity of news articles by finding the topics that interest the users and the estimated audience of each topic. Then, given a new news story, we would infer the topics from the story’s content, and based on those topics we would predict how popular it may become, even before it’s released to the public.