On Graph Stream Clustering with Side Information
Graph clustering has become an important problem due to emerging applications
involving the web, social networks and bio-informatics. Recently, many such
applications generate data in the form of streams. Clustering massive, dynamic
graph streams is significantly challenging because of the complex structures of
graphs and computational difficulties of continuous data. Meanwhile, a large
volume of side information is associated with graphs, which can be of various
types. Examples include the properties of users in social network
activities, the meta attributes associated with web click graph streams and the
location information in mobile communication networks. Such attributes contain
extremely useful information and have the potential to improve the clustering
process, but are neglected by most recent graph stream mining techniques. In
this paper, we define a unified distance measure on both link structures and
side attributes for clustering. In addition, we propose a novel optimization
framework DMO, which can dynamically optimize the distance metric and make it
adapt to the newly received stream data. We further introduce a carefully
designed statistic, SGS(C), which consumes constant storage space with the
progression of streams. We demonstrate that the statistics maintained are
sufficient for the clustering process as well as the distance optimization and
can scale to massive graphs with side attributes. We present
experimental results to show the advantages of the approach in graph stream
clustering with both links and side information over the baselines. Comment: Full version of SIAM SDM 2013 paper
The Block Point Process Model for Continuous-Time Event-Based Dynamic Networks
We consider the problem of analyzing timestamped relational events between a
set of entities, such as messages between users of an on-line social network.
Such data are often analyzed using static or discrete-time network models,
which discard a significant amount of information by aggregating events over
time to form network snapshots. In this paper, we introduce a block point
process model (BPPM) for continuous-time event-based dynamic networks. The BPPM
is inspired by the well-known stochastic block model (SBM) for static networks.
We show that networks generated by the BPPM follow an SBM in the limit of a
growing number of nodes. We use this property to develop principled and
efficient local search and variational inference procedures initialized by
regularized spectral clustering. We fit BPPMs with exponential Hawkes processes
to analyze several real network data sets, including a Facebook wall post
network with over 3,500 nodes and 130,000 events. Comment: To appear at The Web Conference 201
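The abstract above mentions exponential Hawkes processes without stating a formula. As a minimal sketch, assuming the common parameterization lambda(t) = mu + sum over past events of alpha * beta * exp(-beta * (t - t_i)), the conditional intensity can be evaluated as follows. The function name and parameter names are illustrative and are not taken from the paper, which may use a different parameterization.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a univariate exponential Hawkes process.

    Assumed form (not confirmed by the abstract):
        lambda(t) = mu + sum_{t_i < t} alpha * beta * exp(-beta * (t - t_i))
    mu    : baseline event rate
    alpha : jump size scale (expected number of events triggered per event)
    beta  : exponential decay rate of the excitation
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t  # only events strictly in the past excite the process
    )
    return mu + excitation


# Hypothetical timestamps of messages between one pair of node blocks.
events = [0.0, 0.4, 0.5]
print(hawkes_intensity(1.0, events, mu=0.5, alpha=0.8, beta=2.0))
```

Each past event raises the rate of future events, and that effect decays exponentially, which is why such a model can capture bursts of activity that snapshot-based models aggregate away.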
Finding knowledge in students social network
Social networking has been one of the most widely used approaches in communication technology. It has become a new way of connecting with other people, and it stores huge amounts of data, including user activities and shared materials. Many have seen the importance of collecting data for future benefit. In recent years, many companies have successfully analyzed their customers' behaviour using various data mining techniques. One of the latest applications of data mining is in social network sites or environments. The objective of this paper is to present an analysis of social network user behaviour, using a clustering technique and centrality coefficients, based on university students' involvement. The result of the analysis is then validated with a questionnaire-based personality test. The study finds that the patterns of students' participation in social networking can be related to their personal behaviour, as reflected by their characteristics and online activities. The analysis extends the research on promoting a dynamic study culture at higher learning institutions through online social network
Recherche et représentation de communautés dans des grands graphes
15 pages. National audience. This paper deals with the analysis and the visualization of large graphs. Our interest in this subject stems from the fact that graphs are convenient and widespread data structures. Indeed, this type of data is encountered in a growing number of concrete problems: the Web, information retrieval, social networks, biological interaction networks... Furthermore, these graphs grow ever larger as the means for gathering and storing data steadily improve. This calls for new methods in graph analysis and visualization, which are now important and dynamic research fields at the interface of many disciplines such as mathematics, statistics, computer science and sociology. In this paper, we propose a method for graph representation and visualization based on a prior clustering of the vertices. Newman and Girvan (2004) point out that "reducing [the] level of complexity [of a network] to one that can be interpreted readily by the human eye, will be invaluable in helping us to understand the large-scale structure of these new network data": we rely on this assumption to use a clustering of the vertices as a preliminary step for simplifying the representation of the graph as a whole. The clustering phase consists in optimizing a quality measure specifically suited to the search for dense groups in graphs. This quality measure is the modularity, which expresses the "distance" to a null model in which the graph edges do not depend on the clustering. The modularity has shown its relevance in solving the problem of uncovering dense groups in a graph. Optimization of the modularity is done through a stochastic simulated annealing algorithm. The visualization/representation phase is based on a force-directed algorithm described in Truong et al. (2007).
After giving a short introduction to the problem and detailing the vertex clustering and representation algorithms, the paper will introduce and discuss two applications from the social network field
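The modularity referred to above can be illustrated with a small sketch. It assumes the standard Newman-Girvan definition for an undirected, unweighted graph, Q = sum over communities c of [e_c / m - (d_c / 2m)^2], where m is the number of edges, e_c the number of edges inside c and d_c the total degree of the nodes in c. The function name and the toy graph are hypothetical, and the simulated annealing optimization mentioned in the abstract is not shown.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges     : iterable of (u, v) pairs, each undirected edge listed once
    community : dict mapping every node to its community label
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    intra = defaultdict(int)   # e_c: edges with both endpoints in community c
    degree = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Observed intra-community edge fraction minus the fraction expected
    # under a null model that preserves degrees but ignores the clustering.
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph (illustrative only): two triangles joined by a single edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity(toy_edges, two_groups))  # 5/14, about 0.357
```

Placing all six nodes in one community gives a modularity of 0 on this graph, so the two-triangle split scores higher, which matches the intuition that modularity rewards dense groups.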
The Dynamics of Multi-Modal Networks
The widespread study of networks in diverse domains, including social, technological, and scientific settings, has increased the interest in statistical and machine learning techniques for network analysis. Many of these networks are complex, involving more than one kind of entity, and multiple relationship types, both changing over time. While there have been many network analysis methods proposed for problems such as network evolution, community detection, information diffusion and opinion leader identification, the majority of these methods assume a single entity type, a single edge type and often no temporal dynamics. One of the main shortcomings of these traditional techniques is their inadequacy for capturing higher-order dependencies often present in real, complex networks.
To address these shortcomings, I focus on analysis and inference in dynamic, multi-modal, multi-relational networks, containing multiple entity types (such as people, social groups, organizations, locations, etc.), and different relationship types (such as friendship, membership, affiliation, etc.). An example from social network theory is a network describing users, organizations and interest groups, where users have different types of ties among each other, such as friendship, family ties, etc., as well as affiliation and membership links with organizations and interest groups. By considering the complex structure of these networks rather than limiting the analysis to a single entity or relationship type, I show how we can build richer predictive models that provide better understanding of the network dynamics, and thus result in better quality predictions.
In the first part of my dissertation, I address the problems of network evolution and clustering. For network evolution, I describe methods for modeling the interactions between different modalities, and propose a co-evolution model for social and affiliation networks. I then move to the problem of network clustering, where I propose a novel algorithm for clustering multi-modal, multi-relational data. The second part of my dissertation focuses on the temporal dynamics of interactions in complex networks, from both user-level and network-level perspectives. For the user-centric approach, I analyze the dynamics of user relationships with other entity types, proposing a measure of the "loyalty" a user shows for a given group or topic, based on her temporal interaction pattern. I then move to macroscopic-level approaches for analyzing the dynamic processes that occur on a network scale. I propose a new differential adaptive diffusion model for incorporating diversity and trust in the process of information diffusion on multi-modal, multi-relational networks. I also discuss the implications of the proposed diffusion model on designing new strategies for viral marketing and influential detection. I validate all the proposed methods on several real-world networks from multiple domains