Open Large-Scale Online Social Network Dyn
Online social networks have quickly become the most popular destination on the World Wide Web. These networks are still a fairly new form of online human interaction and have gained wide popularity only within the past three to four years. Few models or descriptions of the dynamics of these systems exist. This is largely due to the difficulty of gaining access to the data from these networks, which is often viewed as very valuable. In these networks, members maintain a list of friends with whom they share content by first uploading it to the social network service provider. The content is then distributed to members by the service provider, which generates a feed for each member containing the aggregated content shared by all of the member's friends. Direct access to dynamic linkage data for these large networks is especially difficult without a special relationship with the service provider. This makes it difficult for researchers to explore and better understand how humans interface with these systems. This dissertation examines an event-driven sampling approach to acquire both dynamic link event data and blog content from the site known as LiveJournal. LiveJournal is one of the oldest online social networking sites; its features are very similar to those of sites such as Facebook and Myspace, yet it is small enough in scale to be practical for a research setting. The event-driven sampling methodology and the analysis of the resulting network model provide insights for other researchers interested in acquiring social network dynamics from LiveJournal, or in what might be expected if an event-driven sampling approach were applied to other online social networks. A detailed analysis of both the static structure and the network dynamics of the resulting network model was performed. The analysis helped motivate work on a model of link prediction using both topological and content-based metrics. 
The relationship between topological and content-based metrics was explored. Factored into the link prediction analysis is the open nature of the social network data where new members are constantly joining and current members are leaving. The data used for the analysis spanned approximately two years.
Friendship prediction and homophily in social media
Social media have attracted considerable attention because their open-ended nature allows users to create lightweight semantic scaffolding to organize and share content. To date, the interplay of the social and topical components of social media has been only partially explored. Here, we study the presence of homophily in three systems that combine tagging social media with online social networks. We find a substantial level of topical similarity among users who are close to each other in the social network. We introduce a null model that preserves user activity while removing local correlations, allowing us to disentangle the actual local similarity between users from statistical effects due to the assortative mixing of user activity and centrality in the social network. This analysis suggests that users with similar interests are more likely to be friends, and therefore topical similarity measures among users based solely on their annotation metadata should be predictive of social links. We test this hypothesis on several datasets, confirming that social networks constructed from topical similarity capture actual friendship accurately. When combined with topological features, topical similarity achieves a link prediction accuracy of about 92%.
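As a rough illustration of the idea in the abstract above, that topical similarity computed from annotation metadata can rank candidate friendships, the following sketch scores user pairs by the cosine similarity of their tag-frequency vectors. The abstract does not say which similarity measure the authors use; cosine similarity, the function names, and the toy data are assumptions made here for illustration only.

```python
from collections import Counter
from math import sqrt


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity between two users' tag-frequency vectors.

    Assumption: each user is represented by the list of tags they applied.
    """
    counts_a, counts_b = Counter(tags_a), Counter(tags_b)
    # Only tags used by both users contribute to the dot product.
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidate_links(user_tags):
    """Return (score, user_u, user_v) for every user pair, most similar first."""
    users = sorted(user_tags)
    scored = [
        (cosine_similarity(user_tags[u], user_tags[v]), u, v)
        for i, u in enumerate(users)
        for v in users[i + 1:]
    ]
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda item: (-item[0], item[1], item[2]))


# Hypothetical toy data, not taken from the study.
example_tags = {
    "alice": ["jazz", "jazz", "vinyl"],
    "bob": ["jazz", "vinyl"],
    "carol": ["python"],
}
ranking = rank_candidate_links(example_tags)
```

In this toy example the pair (alice, bob) ranks first because they share tags, while pairs involving carol score zero. A real evaluation would compare such a ranking against observed friendship links and against a null model, as the abstract describes.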
Influence Analysis towards Big Social Data
Large-scale social data from online social networks, instant messaging applications, and wearable devices have recently seen exponential growth in the number of users and activities. The rapid proliferation of social data provides rich information and many possibilities for us to understand and analyze the complex inherent mechanism which governs the evolution of the new technology age. Influence, a natural product of information diffusion (or propagation), represents the change in an individual’s thoughts, attitudes, and behaviors resulting from interaction with others, and is one of the fundamental processes in social worlds. Therefore, influence analysis occupies a very prominent place in social data analysis, theory, models, and algorithms. In this dissertation, we study influence analysis under the scenario of big social data. Firstly, we investigate the uncertainty of influence relationships in the social network. A novel sampling scheme is proposed which enables the development of an efficient algorithm to measure uncertainty. Considering the practicality of neighborhood relationships in real social data, a framework is introduced to transform uncertain networks into deterministic weighted networks in which the weight on an edge can be measured by a Jaccard-like index. Secondly, focusing on the dynamics of social data, a practical framework is proposed that probes only a subset of communities to detect the real changes in social network data. Our probing framework minimizes the possible difference between the observed topology and the actual network through several representative communities. We also propose an algorithm that takes full advantage of our divide-and-conquer strategy, which reduces the computational overhead. 
Thirdly, if we let the number of influenced users be the depth of propagation and the area covered by influenced users be the breadth, most research results focus only on influence depth rather than influence breadth. Timeliness, acceptance ratio, and breadth are three important factors that significantly affect the result of influence maximization in reality, but researchers neglect them most of the time. To fill the gap, a novel algorithm that incorporates time delay for timeliness, opportunistic selection for acceptance ratio, and broad diffusion for influence breadth has been investigated. In our model, the breadth of influence is measured by the number of covered communities, and the tradeoff between depth and breadth of influence can be balanced by a specific parameter. Furthermore, the problem of privacy-preserving influence maximization in both physical location networks and online social networks is addressed. We merge the sensed location information collected from the cyber-physical world and the relationship information gathered from online social networks into a unified framework with a comprehensive model. We then propose a solution to the influence maximization problem with an efficient algorithm. At the same time, a privacy-preserving mechanism is proposed to protect the cyber-physical location and link information from the application aspect. Last but not least, to address the challenge of large-scale data, we take the lead in designing an efficient influence maximization framework based on two new models which incorporate the dynamism of networks and consider time constraints during the influence spreading process in practice. All proposed problems and models of influence analysis have been empirically studied and verified on different large-scale, real-world social data in this dissertation.
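The abstract above mentions weighting edges with a "Jaccard-like index" but does not define it. As a minimal sketch, assuming the plain Jaccard index over neighbor sets (which may differ from the dissertation's actual measure), edge weights for an undirected graph could be computed as follows. The function names and the toy graph are hypothetical.

```python
def jaccard_index(neighbors_u, neighbors_v):
    """Plain Jaccard index |A ∩ B| / |A ∪ B| of two neighbor sets."""
    union = neighbors_u | neighbors_v
    if not union:
        return 0.0
    return len(neighbors_u & neighbors_v) / len(union)


def weight_edges(adjacency):
    """Assign each undirected edge the Jaccard index of its endpoints' neighborhoods.

    Assumption: `adjacency` maps each node to the set of its neighbors and is
    symmetric, so each edge is visited once via the `u < v` check.
    """
    weights = {}
    for u, neighbors in adjacency.items():
        for v in neighbors:
            if u < v:
                weights[(u, v)] = jaccard_index(neighbors, adjacency.get(v, set()))
    return weights


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
edge_weights = weight_edges(toy_graph)
```

Edges inside the triangle receive positive weights because their endpoints share a neighbor, while the pendant edge (c, d) receives weight zero since c and d have no neighbor in common.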
Link formation in mobile and economic networks : model and empirical analysis
In this dissertation, we study three link formation problems in mobile and economic networks: (i) company matching for the mergers and acquisitions (M&A) network in the high-technology (high-tech) industry, (ii) mobile application (app) matching for the cross-promotion network in mobile app markets, and (iii) online friendship formation in mobile social networks. Each problem can be modeled as a link formation problem in a graph, where nodes represent independent entities (e.g., companies, apps, users) and edges represent interactions (e.g., transactions, promotions, friendships) among the nodes. First, we propose a new data-analytic approach to measure firms' dyadic business proximity in order to analyze the M&A network in the high-tech industry. Specifically, our method analyzes the unstructured texts that describe firms' businesses using latent Dirichlet allocation (LDA) topic modeling, and constructs a novel business proximity measure based on the output. Using CrunchBase data including 24,382 high-tech companies and 1,689 M&A transactions, we empirically validate our business proximity measure in the context of industry intelligence and show the measure's effectiveness in an application of M&A network analysis. Based on the research, we build a cloud-based information system to facilitate competitive intelligence on the high-tech industry. Second, we analyze mobile app matching for the cross-promotion network in mobile app markets. Cross promotion (CP) is a new app promotion framework in which a mobile app is promoted to the users of another app. Using IGAWorks data covering 1,011 CP campaigns, 325 apps, and 301,183 users, we evaluate the effectiveness of CP campaigns in comparison with existing ad channels such as mobile display ads. While CP campaigns, on average, are still suboptimal compared with display ads, we find evidence that careful matching of mobile apps can significantly improve the effectiveness of CP campaigns. 
Our empirical results show that app similarity, measured by LDA from apps' text descriptions, is a significant factor that increases user engagement in CP campaigns. With this observation, we propose an app matching mechanism for the CP network to improve ad effectiveness. Third, we study friendship network formation in a location-based social network. We build a structural model of social link creation that incorporates individual characteristics and pairwise user similarities. Specifically, we define four user proximity measures from biography, geography, mobility, and short messages (i.e., tweets). To construct proximity from unstructured text information, we build LDA topic models of user biography texts and tweets. Using Gowalla data with 385,306 users, three million locations, and 35 million check-in records, we empirically estimate the structural model to find evidence of the homophily effect in network formation.
Computer Science
PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION
Social media including blogs and microblogs provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale and inherent noise in such data streams. Monitoring and prediction can provide significant benefit for many applications including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth.
Links between different blog channels, and retweets and mentions between different microblog users, are a proxy of information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations.
In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems, including the future author prediction problem and the future link prediction problem in the blogosphere, as well as prediction in microblogs such as Twitter.
For future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features which have an impact on prediction accuracy. For future link prediction in the blogosphere, we compare multiple link prediction methods, and show that our proposed solution, which combines the network properties of the blog with content properties, does better than methods which examine network properties or content properties in isolation. Most previous work has looked at only one or the other. For prediction in microblogs, where there are follower, retweet, and mention networks, we propose a prediction model that utilizes the hybrid network. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future, and formulate an optimization problem by the principle of maximum likelihood to determine the parameters in the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform baseline methods which consider only one network or utilize hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist.
Emergence of scale-free close-knit friendship structure in online social networks
Although the structural properties of online social networks have attracted
much attention, the properties of the close-knit friendship structures remain
an important question. Here, we mainly focus on how these mesoscale structures
are affected by the local and global structural properties. Analyzing the data
of four large-scale online social networks reveals several common structural
properties. It is found that not only do the local structures given by the
indegree, outdegree, and reciprocal degree distributions follow a similar
scaling behavior, but the mesoscale structures represented by the distributions of
close-knit friendship structures also exhibit a similar scaling law. The degree
correlation is very weak over a wide range of the degrees. We propose a simple
directed network model that captures the observed properties. The model
incorporates two mechanisms: reciprocation and preferential attachment. Through
rate equation analysis of our model, the local-scale and mesoscale structural
properties are derived. At the local scale, the same scaling behavior of
indegree and outdegree distributions stems from indegree and outdegree of nodes
both growing as the same function of the introduction time, and the reciprocal
degree distribution also shows the same power-law due to the linear
relationship between the reciprocal degree and in/outdegree of nodes. At the
mesoscale, the distributions of four closed triples representing close-knit
friendship structures are found to exhibit identical power-laws, a behavior
attributed to the negligible degree correlations. Intriguingly, all the
power-law exponents of the distributions in the local-scale and mesoscale
depend only on one global parameter -- the mean in/outdegree, while both the
mean in/outdegree and the reciprocity together determine the ratio of the
reciprocal degree of a node to its in/outdegree.
Comment: 48 pages, 34 figures
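The abstract above names two mechanisms, preferential attachment and reciprocation, without giving the model's exact rules. The sketch below is one simple way to combine them in a growing directed network and is not the authors' model: the attachment weight (indegree plus one), the number of links per new node `m`, and the reciprocation probability `reciprocity` are all assumptions made for illustration.

```python
import random


def grow_network(n_nodes, m=2, reciprocity=0.3, seed=0):
    """Grow a directed network and return its set of (source, target) edges.

    Assumed rules (not taken from the paper): each new node links to up to
    `m` distinct existing nodes chosen with probability proportional to
    indegree + 1, and each chosen target links back with probability
    `reciprocity`.
    """
    rng = random.Random(seed)
    edges = set()
    indegree = [0] * n_nodes
    for new in range(1, n_nodes):
        existing = list(range(new))
        weights = [indegree[v] + 1 for v in existing]
        targets = set()
        # Draw until we have min(m, new) distinct targets.
        while len(targets) < min(m, new):
            targets.add(rng.choices(existing, weights=weights)[0])
        for target in sorted(targets):
            edges.add((new, target))
            indegree[target] += 1
            if rng.random() < reciprocity:
                # Reciprocation: the target follows the newcomer back.
                edges.add((target, new))
                indegree[new] += 1
    return edges


def reciprocal_degree(edges, node):
    """Number of neighbors connected to `node` in both directions."""
    return sum(1 for (u, v) in edges if u == node and (v, u) in edges)


network = grow_network(200, m=2, reciprocity=0.3, seed=42)
```

With `reciprocity=0.0` the sketch reduces to a pure preferential-attachment growth process, and with `reciprocity=1.0` every link is mutual, which makes it easy to explore how the reciprocal degree of a node relates to its in/outdegree under these assumed rules.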