Hourly Traffic Prediction of News Stories

By Luís Marujo, Miguel Bugalho, João P. Neto and Anatole Gershman

Abstract

Predicting the popularity of news stories from several news sources has become an important challenge for both news producers and readers. In this paper, we investigate methods for automatically predicting the number of clicks a news story receives during one hour. Our approach combines additive regression and bagging applied to an M5P regression tree using a logarithmic scale (log10). The features included are social-based (social network metadata from Facebook), content-based (automatically extracted keyphrases and stylometric statistics from news titles), and time-based. In the 1st Sapo Data Challenge, we obtained a mean relative error of 11.99%, which placed us 4th out of 26 participants.
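
The two numeric ideas in the abstract, the log10 scale and the mean relative error, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the +1 offset that keeps zero counts finite, the choice to transform the click count itself, and the exact error formula (mean of |predicted - actual| / actual) are all assumptions made for illustration, and the sample numbers are invented.

```python
import math


def to_log10(clicks):
    """Map a click count to a log10 scale.

    The +1 offset is an assumption so that zero clicks stay finite.
    """
    return math.log10(clicks + 1)


def from_log10(value):
    """Invert to_log10, returning a click count on the original scale."""
    return 10 ** value - 1


def mean_relative_error(actual, predicted):
    """Mean of |predicted - actual| / actual (assumed definition).

    Pairs whose actual value is zero are skipped to avoid division by zero.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a > 0]
    if not pairs:
        raise ValueError("need at least one positive actual value")
    return sum(abs(p - a) / a for a, p in pairs) / len(pairs)


# Invented hourly click counts, for illustration only.
actual = [100, 200, 50]
predicted = [110, 180, 50]
mre = mean_relative_error(actual, predicted)
```

A regressor would be trained on `to_log10` of the target and its outputs passed through `from_log10` before the error is computed on the original scale.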

Topics: General Terms: Algorithms, Measurement, Experimentation. Keywords: Prediction, News, Clicks, Sapo Challenge, Traffic
Year: 2014
OAI identifier: oai:CiteSeerX.psu:10.1.1.415.3133
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://ceur-ws.org/Vol-791/pap... (external link)