22 research outputs found

    Improving News Popularity Estimation via Weak Supervision and Meta-active Learning

    Social news has fundamentally changed the mechanisms of public perception, education, and even disinformation. Appraising the popularity of social news articles can have a significant impact through a diversity of information redistribution techniques. In this article, an improved prediction algorithm is proposed to predict the long-term popularity of social news articles without the need for ground-truth observations. The proposed framework applies a novel active learning selection policy to obtain the optimal volume of observations and achieve superior predictive performance. To assess the proposed framework, a large set of experiments is undertaken; these indicate that the new solution can improve prediction performance by 28% (precision) while reducing the volume of required ground truth by 32%.

    News Article Position Recommendation Based on The Analysis of Article's Content - Time Matters

    As more people prefer to read news online, newspapers are focusing on personalized news presentation. In this study, we investigate the prediction of an article's position based on the analysis of the article's content using different text analytics methods. The evaluation is performed in 4 main scenarios using articles from different time frames. The result of the analysis shows that the article's freshness plays an important role in the prediction of a new article's position. Also, the results from this work provide insight into how to find an optimised solution to automate the process of assigning a new article the right position. We believe that these insights may further be used in developing content-based news recommender algorithms.

    Popularity Prediction of Reddit Texts

    Popularity prediction is a useful technique for marketers to anticipate the success of marketing campaigns, to build recommendation systems that suggest new products to consumers, and to develop targeted advertising. Researchers likewise use popularity prediction to measure how popularity changes within a community or within a given timespan. In this paper, I explore ways to predict the popularity of posts on reddit.com, which is a blend of news aggregator and community forum. I frame popularity prediction as a text classification problem and attempt to solve it by first identifying topics in the text and then classifying whether the topics identified are more characteristic of popular or unpopular texts. This classifier is then used to label unseen texts as popular or not, depending on the topics found in these new posts. I explore the use of Latent Dirichlet Allocation and term frequency-inverse document frequency for topic identification, and naïve Bayes classifiers and support vector machines for classification. The relation between topics and popularity is dynamic: topics in Reddit communities can wax and wane in popularity. Despite the inherent variability, the methods explored in the paper are effective, showing prediction accuracy between 60% and 75%. The study contributes to the field in various ways. For example, it provides novel data for research and development, not only for text classification but also for the study of the relation between topics and popularity in general. The study also helps us better understand different topic identification and classification methods by illustrating their effectiveness on real-life data from a fast-changing and multi-purpose website.
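
To make the "popularity as text classification" framing concrete, here is a minimal sketch of a multinomial naïve Bayes classifier with Laplace smoothing. It is not the paper's implementation: raw word counts stand in for the LDA or tf-idf topic features the abstract mentions, and the function names (`tokenize`, `train_naive_bayes`, `predict`) and the four toy posts are hypothetical, made up purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(posts, labels):
    """Count words per class; returns (word_counts, class_counts, vocab)."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(posts, labels):
        word_counts[label].update(tokenize(text))
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: fraction of training posts carrying this label.
        score = math.log(class_counts[label] / total_docs)
        # Laplace (add-one) smoothing over the shared vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the paper's Reddit corpus.
posts = [
    "cute cat video goes viral",
    "funny cat meme compilation",
    "quarterly tax form question",
    "help with tax filing deadline",
]
labels = ["popular", "popular", "unpopular", "unpopular"]

model = train_naive_bayes(posts, labels)
print(predict(model, "another funny cat video"))
print(predict(model, "tax form deadline"))
```

In a fuller pipeline, the word counts would be replaced by topic proportions or tf-idf weights, and the model would be retrained periodically, since the abstract notes that the link between topics and popularity shifts over time.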