106 research outputs found

    The Untold Story of the Clones: Content-agnostic Factors that Impact YouTube Video Popularity

    Video dissemination through sites such as YouTube can have widespread impacts on opinions, thoughts, and cultures. Not all videos will reach the same popularity and have the same impact. Popularity differences arise not only because of differences in video content, but also because of other "content-agnostic" factors. The latter factors are of considerable interest but it has been difficult to accurately study them. For example, videos uploaded by users with large social networks may tend to be more popular because they tend to have more interesting content, not because social network size has a substantial direct impact on popularity. In this paper, we develop and apply a methodology that is able to accurately assess, both qualitatively and quantitatively, the impacts of various content-agnostic factors on video popularity. When controlling for video content, we observe a strong linear "rich-get-richer" behavior, with the total number of previous views as the most important factor except for very young videos. The second most important factor is found to be video age. We analyze a number of phenomena that may contribute to rich-get-richer, including the first-mover advantage, and search bias towards popular videos. For young videos we find that factors other than the total number of previous views, such as uploader characteristics and number of keywords, become relatively more important. Our findings also confirm that inaccurate conclusions can be reached when not controlling for content.
    Comment: Dataset available at: http://www.ida.liu.se/~nikca/papers/kdd12.htm

    Efficient Data Dissemination in the World Wide Web

    The World Wide Web is an exponentially growing distributed information dissemination system that has become very popular in recent times. This exponential growth requires Web servers and clients to use resources judiciously. Reducing network traffic and server load are the key objectives for any strategy that would improve the Web's performance. Also, to reduce access latencies, it is desirable to store copies of popular documents closer to the user, where access latencies are more acceptable. A variety of mechanisms have been proposed for choosing the best site/replica when satisfying a document request. This paper discusses replication and caching techniques, along with the mechanisms and issues involved in selecting server sites. Most major commercial Web sites (e.g., AltaVista, Microsoft, Netscape) are mirrored on a worldwide basis. Whether such mirror sites are being used effectively as a mechanism for efficient info..

    Ephemeral Content Popularity at the Edge and Implications for On-Demand Caching

    The ephemeral content popularity seen with many content delivery applications can make indiscriminate on-demand caching in edge networks highly inefficient, since many of the content items that are added to the cache will not be requested again from that network. In this paper, we address the problem of designing and evaluating more selective edge-network caching policies. The need for such policies is demonstrated through an analysis of a dataset recording YouTube video requests from users on an edge network over a 20-month period. We then develop a novel workload modelling approach for such applications and apply it to study the performance of alternative edge caching policies, including indiscriminate caching and cache on kth request for different k. The latter policies are found able to greatly reduce the fraction of the requested items that are inserted into the cache, at the cost of only modest increases in cache miss rate. Finally, we quantify and explore the potential room for improvement from use of other possible predictors of further requests. We find that although room for substantial improvement exists when comparing performance to that of a perfect "oracle" policy, such improvements are unlikely to be achievable in practice.
    Funding Agencies: Center for Industrial Information Technology (CEN-IIT); Natural Sciences and Engineering Research Council (NSERC) of Canada
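
    The "cache on kth request" idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the LRU replacement policy, the unbounded per-item request counter, and the function name simulate_cache_on_kth_request are all assumptions made for illustration. Setting k=1 corresponds to indiscriminate caching.

```python
from collections import Counter, OrderedDict


def simulate_cache_on_kth_request(requests, k, capacity):
    """Replay a request trace against a cache that admits an item
    only once it has been requested at least k times.

    Assumptions (not taken from the paper): LRU eviction and an
    unbounded counter of past requests per item.
    """
    cache = OrderedDict()   # keys ordered from least to most recently used
    seen = Counter()        # number of requests observed per item
    hits = 0
    insertions = 0

    for item in requests:
        seen[item] += 1
        if item in cache:
            hits += 1
            cache.move_to_end(item)          # refresh recency on a hit
            continue
        # Cache miss: admit the item only on its k-th (or later) request.
        if seen[item] >= k:
            cache[item] = True
            insertions += 1
            if len(cache) > capacity:
                cache.popitem(last=False)    # evict least recently used

    return {
        "hits": hits,
        "misses": len(requests) - hits,
        "insertions": insertions,
    }


if __name__ == "__main__":
    trace = ["a", "b", "a", "c", "a", "d", "a"]
    for k in (1, 2):
        print(k, simulate_cache_on_kth_request(trace, k=k, capacity=2))
```

    On this toy trace, moving from k=1 to k=2 trades one extra miss for far fewer cache insertions, which is the kind of trade-off the abstract describes; real workloads would need the dataset or workload model from the paper.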