Temporal Locality in Today's Content Caching: Why it Matters and How to Model it
The dimensioning of caching systems represents a difficult task in the design
of infrastructures for content distribution in the current Internet. This paper
addresses the problem of defining a realistic arrival process for the content
requests generated by users, due to its critical importance for both analytical
and simulative evaluations of the performance of caching systems. First, with
the aid of YouTube traces collected inside operational residential networks, we
identify the characteristics of real traffic that need to be considered or can
be safely neglected in order to accurately predict the performance of a cache.
Second, we propose a new parsimonious traffic model, named the Shot Noise Model
(SNM), that enables users to natively capture the dynamics of content
popularity, whilst still being sufficiently simple to be employed effectively
for both analytical and scalable simulative studies of caching systems.
Finally, our results show that the SNM presents a much better solution to
account for the temporal locality observed in real traffic compared to existing
approaches. Comment: 7 pages, 7 figures, Accepted for publication in ACM Computer
Communication Review
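To make the idea of a shot-noise arrival process concrete, here is a minimal sketch. It is not the authors' implementation. The abstract gives no parameters, so everything here is an assumption chosen for illustration: contents appear as a Poisson process, each content draws a Poisson number of requests, and those requests are spread after the content's birth with an exponential popularity profile. The names `generate_snm_requests`, `poisson_sample`, `lru_hit_ratio`, and all parameter names are hypothetical.

```python
import random
from collections import OrderedDict


def poisson_sample(rng, mean):
    """Draw a Poisson(mean) variate by counting unit-rate arrivals in [0, mean)."""
    count, total = 0, rng.expovariate(1.0)
    while total < mean:
        count += 1
        total += rng.expovariate(1.0)
    return count


def generate_snm_requests(horizon, content_rate, mean_requests, lifespan, seed=0):
    """Return a time-sorted list of (time, content_id) requests.

    Assumptions (not taken from the abstract): contents are born at rate
    `content_rate`, each attracts Poisson(`mean_requests`) requests, and
    request delays after birth are exponential with mean `lifespan`.
    """
    rng = random.Random(seed)
    requests = []
    birth, content_id = 0.0, 0
    while True:
        birth += rng.expovariate(content_rate)  # next content appears
        if birth >= horizon:
            break
        for _ in range(poisson_sample(rng, mean_requests)):
            t = birth + rng.expovariate(1.0 / lifespan)
            if t < horizon:  # drop requests past the observation window
                requests.append((t, content_id))
        content_id += 1
    requests.sort(key=lambda r: r[0])
    return requests


def lru_hit_ratio(requests, cache_size):
    """Replay the request stream through an LRU cache and return the hit ratio."""
    cache, hits = OrderedDict(), 0
    for _, cid in requests:
        if cid in cache:
            hits += 1
            cache.move_to_end(cid)  # mark as most recently used
        else:
            cache[cid] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(requests) if requests else 0.0
```

In this sketch a shorter `lifespan` packs each content's requests closer together, which is one simple way to vary temporal locality and see how the LRU hit ratio responds.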
Cost-effective online trending topic detection and popularity prediction in microblogging
Identifying topic trends on microblogging services such as Twitter and estimating those topics' future popularity have great academic and business value, especially when these operations can be done in real time. For any third party, however, capturing and processing such huge volumes of real-time microblog data is almost infeasible, because there are always API (Application Program Interface) request limits, monitoring and computing budgets, and timeliness requirements. To deal with these challenges, we propose a cost-effective system framework with algorithms that automatically select, offline, a subset of representative users in a microblogging network under given cost constraints. The proposed system then monitors online only these selected users' real-time microposts, and uses them to detect the overall trending topics and predict their future popularity across the whole microblogging network. Our proposed system framework is therefore practical for real-time usage: it avoids the high cost of capturing and processing the full real-time data, while not compromising detection and prediction performance under the given cost constraints. Experiments with a real microblog dataset show that, by tracking only 500 users out of 0.6 million and processing no more than 30,000 microposts daily, the proposed system could detect and predict about 92% of trending topics, on average more than 10 hours before they appear in official trends lists.
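The abstract does not describe its selection algorithms, so the following is only a sketch of one generic way to pick representative users under a cost budget. It uses a greedy budgeted-coverage heuristic and should not be read as the authors' method. The function `select_users`, its inputs, and the notion of "cost" (for example, a user's expected daily micropost volume) are all hypothetical.

```python
def select_users(user_topics, user_cost, budget):
    """Greedily pick users that cover the most new topics per unit cost.

    user_topics: dict mapping user -> set of historical topics (assumed input)
    user_cost:   dict mapping user -> positive monitoring cost (assumed input)
    budget:      total cost allowed for the selected users
    """
    selected, covered, spent = [], set(), 0.0
    remaining = set(user_topics)
    while remaining:
        best, best_gain = None, 0.0
        for user in sorted(remaining):  # sorted for deterministic tie-breaking
            cost = user_cost[user]
            if spent + cost > budget:
                continue  # this user no longer fits in the budget
            gain = len(user_topics[user] - covered) / cost
            if gain > best_gain:
                best, best_gain = user, gain
        if best is None:
            break  # nobody affordable adds a new topic
        selected.append(best)
        covered |= user_topics[best]
        spent += user_cost[best]
        remaining.remove(best)
    return selected, covered
```

A usage example with made-up data: user `d` covers every topic but exceeds the budget, so the heuristic assembles the same coverage from cheaper users.

```python
topics = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5}}
costs = {"a": 3, "b": 1, "c": 1, "d": 10}
chosen, covered = select_users(topics, costs, budget=5)
print(chosen, covered)
```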
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the worldwide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions. Comment: Accepted to TKDE. 30 pages, 1 figure
- …