3,265 research outputs found

    Employing Topological Data Analysis On Social Networks Data To Improve Information Diffusion

    Get PDF
    For the past decade, the number of users on social networks has grown tremendously from thousands in 2004 to billions by the end of 2015. On social networks, users create and propagate billions of pieces of information every day. The data can be in many forms (such as text, images, or videos). Due to the massive usage of social networks and availability of data, the field of social network analysis and mining has attracted many researchers from academia and industry to analyze social network data and explore various research opportunities (including information diffusion and influence measurement). Information diffusion is defined as the way that information is spread on social networks; this can occur due to social influence. Influence is the ability affect others without direct commands. Influence on social networks can be observed through social interactions between users (such as retweet on Twitter, like on Instagram, or favorite on Flickr). In order to improve information diffusion, we measure the influence of users on social networks to predict influential users. The ability to predict the popularity of posts can improve information diffusion as well; posts become popular when they diffuse on social networks. However, measuring influence and predicting posts popularity can be challenging due to unstructured, big, noisy data. Therefore, social network mining and analysis techniques are essential for extracting meaningful information about influential users and popular posts. For measuring the influence of users, we proposed a novel influence measurement that integrates both users’ structural locations and characteristics on social networks, which then can be used to predict influential users on social networks. centrality analysis techniques are adapted to identify the users’ structural locations. Centrality is used to identify the most important nodes within a graph; social networks can be represented as graphs (where nodes represent users and edges represent interactions between users), and centrality analysis can be adopted. The second part of the work focuses on predicting the popularity of images on social networks over time. The effect of social context, image content and early popularity on image popularity using machine learning algorithms are analyzed. A new approach for image content is developed to represent the semantics of an image using its captions, called keyword vector. This approach is based on Word2vec (an unsupervised two-layer neural network that generates distributed numerical vectors to represent words in the vector space to detect similarity) and k-means (a popular clustering algorithm). However, machine learning algorithms do not address issues arising from the nature of social network data, noise and high dimensionality in data. Therefore, topological data analysis is adopted. It is a noble approach to extract meaningful information from high-dimensional data and is robust to noise. It is based on topology, which aims to study the geometric shape of data. In this thesis, we explore the feasibility of topological data analysis for mining social network data by addressing the problem of image popularity. The proposed techniques are employed to datasets crawled from real-world social networks to examine the performance of each approach. The results for predicting the influential users outperforms existing measurements in terms of correlation. As for predicting the popularity of images on social networks, the results indicate that the proposed features provides a promising opportunity and exceeds the related work in terms of accuracy. Further exploration of these research topics can be used for a variety of real-world applications (including improving viral marketing, public awareness, political standings and charity work)

    Predictive Analysis on Twitter: Techniques and Applications

    Full text link
    Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories

    Understanding Image Virality

    Full text link
    Virality of online content on social networking websites is an important but esoteric phenomenon often studied in fields like marketing, psychology and data mining. In this paper we study viral images from a computer vision perspective. We introduce three new image datasets from Reddit, and define a virality score using Reddit metadata. We train classifiers with state-of-the-art image features to predict virality of individual images, relative virality in pairs of images, and the dominant topic of a viral image. We also compare machine performance to human performance on these tasks. We find that computers perform poorly with low level features, and high level information is critical for predicting virality. We encode semantic information through relative attributes. We identify the 5 key visual attributes that correlate with virality. We create an attribute-based characterization of images that can predict relative virality with 68.10% accuracy (SVM+Deep Relative Attributes) -- better than humans at 60.12%. Finally, we study how human prediction of image virality varies with different `contexts' in which the images are viewed, such as the influence of neighbouring images, images recently viewed, as well as the image title or caption. This work is a first step in understanding the complex but important phenomenon of image virality. Our datasets and annotations will be made publicly available.Comment: Pre-print, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 201
    • …
    corecore