2 research outputs found

    "Nobody comes here anymore, it's too crowded"; Predicting Image Popularity on Flickr

    Predicting popular content is a challenging problem for social media websites that want to encourage user interaction and activity. Existing works in this area, including the recommendation approach used by Flickr (called "interestingness"), consider only click-through data, tags, comments and explicit user feedback in this computation. On image sharing websites, however, many images have no tags, and initially an image has no interaction data. In this case, these existing approaches fail due to lack of evidence. In this paper, we therefore focus on image popularity prediction in a cold start scenario (i.e. where there exist no, or limited, textual/interaction data), by considering an image's context, visual appearance and user context. Specifically, we predict the number of comments and views an image has based on a number of new features for this purpose. Experimenting on the MIR-Flickr 1M collection, we are able to overcome the problems associated with popularity prediction in a cold start, achieving accuracy of up to 76%.

    Employing Topological Data Analysis On Social Networks Data To Improve Information Diffusion

    For the past decade, the number of users on social networks has grown tremendously, from thousands in 2004 to billions by the end of 2015. On social networks, users create and propagate billions of pieces of information every day. The data can take many forms (such as text, images, or videos). Due to the massive usage of social networks and the availability of data, the field of social network analysis and mining has attracted many researchers from academia and industry to analyze social network data and explore various research opportunities (including information diffusion and influence measurement). Information diffusion is defined as the way that information spreads on social networks; this can occur due to social influence. Influence is the ability to affect others without direct commands. Influence on social networks can be observed through social interactions between users (such as a retweet on Twitter, a like on Instagram, or a favorite on Flickr). In order to improve information diffusion, we measure the influence of users on social networks to predict influential users. The ability to predict the popularity of posts can improve information diffusion as well; posts become popular when they diffuse on social networks. However, measuring influence and predicting post popularity can be challenging due to unstructured, big, noisy data. Therefore, social network mining and analysis techniques are essential for extracting meaningful information about influential users and popular posts. For measuring the influence of users, we propose a novel influence measurement that integrates both users’ structural locations and characteristics on social networks, which can then be used to predict influential users. Centrality analysis techniques are adapted to identify the users’ structural locations. Centrality is used to identify the most important nodes within a graph; social networks can be represented as graphs (where nodes represent users and edges represent interactions between users), so centrality analysis can be adopted. The second part of the work focuses on predicting the popularity of images on social networks over time. The effects of social context, image content and early popularity on image popularity are analyzed using machine learning algorithms. A new approach to image content, called keyword vector, is developed to represent the semantics of an image using its captions. This approach is based on Word2vec (an unsupervised two-layer neural network that generates distributed numerical vectors to represent words in the vector space to detect similarity) and k-means (a popular clustering algorithm). However, machine learning algorithms do not address issues arising from the nature of social network data, namely noise and high dimensionality. Therefore, topological data analysis is adopted. It is a novel approach to extracting meaningful information from high-dimensional data and is robust to noise. It is based on topology, which aims to study the geometric shape of data. In this thesis, we explore the feasibility of topological data analysis for mining social network data by addressing the problem of image popularity. The proposed techniques are applied to datasets crawled from real-world social networks to examine the performance of each approach. The results for predicting influential users outperform existing measurements in terms of correlation. As for predicting the popularity of images on social networks, the results indicate that the proposed features provide a promising opportunity and exceed the related work in terms of accuracy. Further exploration of these research topics can be used for a variety of real-world applications (including improving viral marketing, public awareness, political standings and charity work).
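The graph view described in this abstract (users as nodes, interactions as edges) can be illustrated with a small sketch. The abstract does not say which centrality measure the thesis uses, so normalized degree centrality is assumed here purely as the simplest example; the function name, user names and interaction pairs are all hypothetical.

```python
from collections import defaultdict


def degree_centrality(interactions):
    """Return normalized degree centrality for each user.

    `interactions` is an iterable of (user_a, user_b) pairs, treated as
    undirected edges. Degree centrality is an assumed, illustrative choice;
    it is not stated to be the measure used in the thesis.
    """
    neighbors = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        neighbors[a].add(b)
        neighbors[b].add(a)

    n = len(neighbors)
    if n < 2:
        return {user: 0.0 for user in neighbors}
    # Divide by the maximum possible number of neighbors (n - 1).
    return {user: len(adj) / (n - 1) for user, adj in neighbors.items()}


# Hypothetical interaction data (e.g. one pair per favorite or retweet).
interactions = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy")]
scores = degree_centrality(interactions)
most_central = max(scores, key=scores.get)
```

In this toy graph the user who interacts with every other user gets the highest score. A structural score like this would be only one ingredient of the influence measurement the abstract describes, which also integrates user characteristics.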