Analysis and Forecasting of Trending Topics in Online Media Streams
Among the vast information available on the web, social media streams capture
what people currently pay attention to and how they feel about certain topics.
Awareness of such trending topics plays a crucial role in multimedia systems
such as trend aware recommendation and automatic vocabulary selection for video
concept detection systems.
Correctly utilizing trending topics requires a better understanding of their
various characteristics in different social media streams. To this end, we
present the first comprehensive study across three major online and social
media streams, Twitter, Google, and Wikipedia, covering thousands of trending
topics during an observation period of an entire year. Our results indicate
that depending on one's requirements one does not necessarily have to turn to
Twitter for information about current events and that some media streams
strongly emphasize content of specific categories. As our second key
contribution, we further present a novel approach for the challenging task of
forecasting the life cycle of trending topics in the very moment they emerge.
Our fully automated approach is based on a nearest neighbor forecasting
technique exploiting our assumption that semantically similar topics exhibit
similar behavior.
We demonstrate on a large-scale dataset of Wikipedia page view statistics
that forecasts by the proposed approach are about 9-48k views closer to the
actual viewing statistics compared to baseline methods and achieve a mean
average percentage error of 45-19% for time periods of up to 14 days.
Comment: ACM Multimedia 201
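The nearest neighbor idea described in this abstract can be illustrated with a
minimal sketch. It is not the authors' implementation: the tag sets, the
Jaccard similarity used as a stand-in for semantic similarity, the function
names, and the toy view counts are all assumptions made for illustration. The
error measure shown is the standard mean absolute percentage error, which is
assumed to be what the abstract's percentage error refers to.

```python
from statistics import mean


def jaccard(a, b):
    """Overlap of two tag sets (hypothetical proxy for semantic similarity)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def nn_forecast(new_tags, history, k=2):
    """Average the view series of the k past topics most similar to the new one.

    history is a list of (tags, daily_views) pairs whose series share one length.
    """
    ranked = sorted(history, key=lambda item: jaccard(new_tags, item[0]), reverse=True)
    neighbours = [series for _, series in ranked[:k]]
    # Average day by day across the selected neighbours.
    return [mean(day) for day in zip(*neighbours)]


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    return 100 * mean(abs(a - f) / abs(a) for a, f in zip(actual, forecast))


# Toy data, invented for this example only.
history = [
    ({"sports", "olympics"}, [100, 80, 60]),
    ({"sports", "football"}, [200, 100, 40]),
    ({"politics", "election"}, [500, 400, 300]),
]
forecast = nn_forecast({"sports", "tennis"}, history, k=2)
error = mape([150, 100, 50], forecast)
```

Here the two sports topics are selected as neighbours, and their daily view
counts are averaged to form the forecast for the new topic.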
Real-Time Classification of Twitter Trends
Social media users give rise to social trends as they post about common
interests, and these trends can be triggered by different causes. In this work, we
explore the types of triggers that spark trends on Twitter, introducing a
typology with the following four types: 'news', 'ongoing events', 'memes', and
'commemoratives'. While previous research has analyzed trending topics over the
long term, we look at the earliest tweets that produce a trend, with the aim of
categorizing trends early on. This would make it possible to provide a filtered subset of
trends to end users. We analyze and experiment with a set of straightforward
language-independent features based on the social spread of trends to
categorize them into the introduced typology. Our method provides an efficient
way to accurately categorize trending topics without the need for external data,
enabling news organizations to discover breaking news in real time, or to
quickly identify viral memes that might inform marketing decisions, among
other uses. The analysis of social features also reveals patterns associated with
each type of trend, such as tweets about ongoing events being shorter as many
were likely sent from mobile devices, or memes having more retweets originating
from a few trend-setters.
Comment: Pre-print of article accepted for publication in Journal of the
American Society for Information Science and Technology copyright @ 2013
(American Society for Information Science and Technology)
Search Engine Optimisation in UK news production
This is an Author's Accepted Manuscript of an article published in Journalism Practice, 5(4), 462-477, 2011, copyright Taylor & Francis, available online at: http://www.tandfonline.com/10.1080/17512786.2010.551020.
This paper presents an exploratory study of an emerging culture in UK online newsrooms, the practice of Search Engine Optimisation (SEO), and assesses its impact on news production. Drawing on a short-term participant observational case study at a national online news publisher, and a series of semi-structured, in-depth interviews with SEO professionals at three further UK media organisations, the author sets out to establish how SEO is operationalised in the newsroom, and what consequences these practices have for online news production. SEO practice is found to be varied, and its application is not universal. Not all UK news organisations are making the most of SEO, even though some publishers take a highly sophisticated approach. Efforts are constrained by time, resources and management support, as well as off-page technical issues. SEO policy is found, in some cases, to inform editorial policy, but there is resistance to the principle of SEO driving decision-making. Several themes are established which call for further research.
EveTAR: Building a Large-Scale Multi-Task Test Collection over Arabic Tweets
This article introduces a new language-independent approach for creating a
large-scale high-quality test collection of tweets that supports multiple
information retrieval (IR) tasks without running a shared-task campaign. The
adopted approach (demonstrated over Arabic tweets) designs the collection
around significant (i.e., popular) events, which enables the development of
topics that represent frequent information needs of Twitter users for which
rich content exists. That inherently facilitates the support of multiple tasks
that generally revolve around events, namely event detection, ad-hoc search,
timeline generation, and real-time summarization. The key highlights of the
approach include diversifying the judgment pool via interactive search and
multiple manually-crafted queries per topic, collecting high-quality
annotations via crowd-workers for relevancy and in-house annotators for
novelty, filtering out low-agreement topics and inaccessible tweets, and
providing multiple subsets of the collection for better availability. Applying
our methodology on Arabic tweets resulted in EveTAR, the first
freely-available tweet test collection for multiple IR tasks. EveTAR includes a
crawl of 355M Arabic tweets and covers 50 significant events for which about
62K tweets were judged with substantial average inter-annotator agreement
(Kappa value of 0.71). We demonstrate the usability of EveTAR by evaluating
existing algorithms in the respective tasks. Results indicate that the new
collection can support reliable ranking of IR systems that is comparable to
similar TREC collections, while providing strong baseline results for future
studies over Arabic tweets.
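The abstract reports inter-annotator agreement as a Kappa value but does not
say which kappa statistic was used. As a rough illustration of how such a
figure is obtained, the sketch below computes Cohen's kappa for two
annotators. The choice of Cohen's kappa, the function name, and the toy labels
are assumptions for illustration and are not taken from the EveTAR annotation
process.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both annotators must label the same non-empty item list")
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy relevance judgments, invented for this example only.
annotator_a = ["rel", "rel", "non", "non", "rel", "non", "rel", "non", "rel", "non"]
annotator_b = ["rel", "rel", "non", "non", "rel", "non", "rel", "rel", "rel", "non"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

With these toy labels the annotators agree on nine of ten items and chance
agreement is 0.5, so kappa comes out at about 0.8.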