DeepWalk: Online Learning of Social Representations
We present DeepWalk, a novel approach for learning latent representations of
vertices in a network. These latent representations encode social relations in
a continuous vector space, which is easily exploited by statistical models.
DeepWalk generalizes recent advancements in language modeling and unsupervised
feature learning (or deep learning) from sequences of words to graphs. DeepWalk
uses local information obtained from truncated random walks to learn latent
representations by treating walks as the equivalent of sentences. We
demonstrate DeepWalk's latent representations on several multi-label network
classification tasks for social networks such as BlogCatalog, Flickr, and
YouTube. Our results show that DeepWalk outperforms challenging baselines which
are allowed a global view of the network, especially in the presence of missing
information. DeepWalk's representations can provide scores up to 10%
higher than competing methods when labeled data is sparse. In some experiments,
DeepWalk's representations are able to outperform all baseline methods while
using 60% less training data. DeepWalk is also scalable. It is an online
learning algorithm which builds useful incremental results, and is trivially
parallelizable. These qualities make it suitable for a broad class of real
world applications such as network classification and anomaly detection. Comment: 10 pages, 5 figures, 4 tables
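The walk-as-sentence idea in the abstract above can be illustrated with a minimal sketch. It only generates the corpus of truncated random walks; feeding that corpus to a language model to learn the vertex representations is not shown. The function names, the toy graph, and the parameters `walks_per_vertex` and `walk_length` are assumptions made for illustration and are not taken from the paper.

```python
import random


def truncated_random_walk(adj, start, walk_length, rng):
    """Walk from `start`, choosing a uniformly random neighbor at each step,
    and stop once the walk holds `walk_length` vertices (or hits a dead end)."""
    walk = [start]
    while len(walk) < walk_length:
        neighbors = adj[walk[-1]]
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk


def build_corpus(adj, walks_per_vertex, walk_length, seed=0):
    """Start `walks_per_vertex` walks from every vertex. Each walk plays the
    role of a sentence whose words are vertex identifiers."""
    rng = random.Random(seed)
    vertices = list(adj)
    corpus = []
    for _ in range(walks_per_vertex):
        rng.shuffle(vertices)  # visit start vertices in a fresh random order
        for v in vertices:
            corpus.append(truncated_random_walk(adj, v, walk_length, rng))
    return corpus


# Hypothetical toy graph as an adjacency list (undirected, no isolated vertices).
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}

corpus = build_corpus(toy_graph, walks_per_vertex=2, walk_length=5)
for sentence in corpus:
    print(" ".join(sentence))
```

Because each walk depends only on the graph and its own start vertex, walks can be generated independently of one another, which is consistent with the abstract's remark that the method is trivially parallelizable.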
Extended Edgecluster based Technique for Social Networking Collective Behavior Learning System
Growing interest in, and continuous development of, social networking sites such as Facebook, Twitter, Flickr, and YouTube have drawn many researchers to study them. Accurately predicting people's behavior is the most important challenge for these online social networking websites. This research focuses on learning to predict collective behavior in social media networks. In particular, given information about some individuals, how can we infer the behavior of unobserved individuals in the same network? These rapidly growing social media networks are massive and involve large numbers of actors. Their computational scale makes scalable learning of models for collective behavior prediction necessary. The proposed approach addresses this scalability issue with a k-means clustering algorithm that partitions the edges into disjoint sets, with each set representing one separate affiliation. Under this edge-centric view, the extracted social dimensions are sparse. A model built on these sparse social dimensions shows more efficient prediction performance than earlier approaches. The proposed approach works effectively for sparse social networks of any growing size. Its main advantage is that it scales easily to networks with large numbers of actors, which existing methods were unable to handle. This scalable approach can be used effectively for predicting collective behavior in online networks on a large scale.
On multi-view learning with additive models
In many scientific settings data can be naturally partitioned into variable
groupings called views. Common examples include environmental (1st view) and
genetic information (2nd view) in ecological applications, chemical (1st view)
and biological (2nd view) data in drug discovery. Multi-view data also occur in
text analysis and proteomics applications where one view consists of a graph
with observations as the vertices and a weighted measure of pairwise similarity
between observations as the edges. Further, in several of these applications
the observations can be partitioned into two sets, one where the response is
observed (labeled) and the other where the response is not (unlabeled). The
problem of simultaneously addressing viewed data and incorporating unlabeled
observations in training is referred to as multi-view transductive learning. In
this work we introduce and study a comprehensive generalized fixed point
additive modeling framework for multi-view transductive learning, where any
view is represented by a linear smoother. The problem of view selection is
discussed using a generalized Akaike Information Criterion, which provides an
approach for testing the contribution of each view. An efficient implementation
is provided for fitting these models with both backfitting and local-scoring
type algorithms adjusted to semi-supervised graph-based learning. The proposed
technique is assessed on both synthetic and real data sets and is shown to be
competitive with state-of-the-art co-training and graph-based techniques. Comment: Published at http://dx.doi.org/10.1214/08-AOAS202 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Modeling Complex Networks For (Electronic) Commerce
NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research
A multi-resolution approach to learning with overlapping communities
Recent years have witnessed a rapid surge of the participatory web and social media, enabling a new laboratory for studying human relations and collective behavior on an unprecedented scale. In this work, we attempt to harness the predictive power of social connections to determine the preferences or behaviors of individuals, such as whether a user supports a certain political view, whether one likes a product, or whether he/she would like to vote for a presidential candidate. Since an actor is likely to participate in multiple different communities, with each regulating the actor’s behavior in varying degrees, and a natural hierarchy might exist between these communities, we propose to zoom into a network at multiple different resolutions and determine which communities are informative of a targeted behavior. We develop an efficient algorithm to extract a hierarchy of overlapping communities. Empirical results on several large-scale social media networks demonstrate the superiority of our proposed approach over existing ones that do not consider the multi-resolution or overlapping property, indicating its highly promising potential in real-world applications.
Leveraging Structure to Improve Classification Performance in Sparsely Labeled Networks
We address the problem of classification in a partially labeled network (a.k.a. within-network classification), with an emphasis on tasks in which we have very few labeled instances to start with. Recent work has demonstrated the utility of collective classification (i.e., simultaneous inferences over class labels of related instances) in this general problem setting. However, the performance of collective classification algorithms can be adversely affected by the sparseness of labels in real-world networks. We show that on several real-world data sets, collective classification appears to offer little advantage in general and hurts performance in the worst cases. In this paper, we explore a complementary approach to within-network classification that takes advantage of network structure. Our approach is motivated by the observation that real-world networks often provide a great deal more structural information than attribute information (e.g., class labels). Through experiments on supervised and semi-supervised classifiers of network data, we demonstrate that a small number of structural features can lead to consistent and sometimes dramatic improvements in classification performance. We also examine the relative utility of individual structural features and show that, in many cases, it is a combination of both local and global network structure that is most informative.
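As a rough illustration of what a structural feature can look like, the sketch below computes two standard local measures for each node: degree and local clustering coefficient. The abstract does not name the features the authors used, so these two measures, the helper names, and the toy graph are all assumptions chosen for illustration. No global feature is included.

```python
def degree(adj, v):
    """Number of neighbors of v."""
    return len(adj[v])


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves connected.
    Defined here as 0.0 when v has fewer than two neighbors."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if nbrs[j] in adj[nbrs[i]]
    )
    return 2.0 * links / (k * (k - 1))


# Hypothetical undirected toy graph stored as neighbor sets.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

# One feature vector per node; these could be passed to any classifier
# alongside whatever labels are available.
features = {
    v: (degree(toy_graph, v), local_clustering(toy_graph, v))
    for v in toy_graph
}
```

Features of this kind depend only on the network's edges, so they can be computed for every node regardless of how few labels are available.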