
    On Graph Stream Clustering with Side Information

    Graph clustering has become an important problem due to emerging applications involving the web, social networks and bioinformatics. Recently, many such applications generate data in the form of streams. Clustering massive, dynamic graph streams is challenging because of the complex structure of graphs and the computational difficulty of processing continuous data. Meanwhile, a large volume of side information of various types is associated with graphs. Examples include the properties of users in social network activities, the meta attributes associated with web click graph streams and the location information in mobile communication networks. Such attributes contain extremely useful information and have the potential to improve the clustering process, but are neglected by most recent graph stream mining techniques. In this paper, we define a unified distance measure on both link structures and side attributes for clustering. In addition, we propose a novel optimization framework, DMO, which dynamically optimizes the distance metric and adapts it to newly received stream data. We further introduce a carefully designed statistic, SGS(C), which consumes constant storage space as the stream progresses. We demonstrate that the statistics maintained are sufficient for both the clustering process and the distance optimization, and that they scale to massive graphs with side attributes. We present experimental results showing the advantages of the approach over the baselines in graph stream clustering with both links and side information. Comment: Full version of SIAM SDM 2013 paper
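
    As a rough illustration of what a distance defined on both link structure and side attributes could look like, the sketch below mixes a Jaccard distance on neighbour sets with a Euclidean distance on attribute vectors. This is not the paper's measure and not its DMO framework: the function name `unified_distance`, the choice of the two component distances and the fixed `weight` parameter are all assumptions made for illustration.

```python
import math


def unified_distance(neighbors_a, neighbors_b, attrs_a, attrs_b, weight=0.5):
    """Hypothetical combined distance between two nodes.

    neighbors_a, neighbors_b: sets of neighbour ids (link structure).
    attrs_a, attrs_b: equal-length numeric sequences (side attributes).
    weight: assumed mixing weight in [0, 1] for the link component.
    """
    union = neighbors_a | neighbors_b
    # Jaccard distance on neighbour sets; two isolated nodes count as identical.
    if union:
        link_dist = 1.0 - len(neighbors_a & neighbors_b) / len(union)
    else:
        link_dist = 0.0
    # Euclidean distance on the side-attribute vectors.
    attr_dist = math.dist(attrs_a, attrs_b)
    # Convex combination of the two components.
    return weight * link_dist + (1.0 - weight) * attr_dist


d = unified_distance({1, 2}, {2, 3}, [0.0, 0.0], [3.0, 4.0], weight=0.5)
```

    In a streaming setting the weight would presumably be re-estimated as new data arrives, which is the role the abstract assigns to DMO; the fixed value here only keeps the sketch short.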

    The Block Point Process Model for Continuous-Time Event-Based Dynamic Networks

    We consider the problem of analyzing timestamped relational events between a set of entities, such as messages between users of an on-line social network. Such data are often analyzed using static or discrete-time network models, which discard a significant amount of information by aggregating events over time to form network snapshots. In this paper, we introduce a block point process model (BPPM) for continuous-time event-based dynamic networks. The BPPM is inspired by the well-known stochastic block model (SBM) for static networks. We show that networks generated by the BPPM follow an SBM in the limit of a growing number of nodes. We use this property to develop principled and efficient local search and variational inference procedures initialized by regularized spectral clustering. We fit BPPMs with exponential Hawkes processes to analyze several real network data sets, including a Facebook wall post network with over 3,500 nodes and 130,000 events. Comment: To appear at The Web Conference 201
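
    For readers unfamiliar with exponential Hawkes processes, the sketch below evaluates the conditional intensity of a univariate process with an exponential kernel, where each past event temporarily raises the rate of future events. The parameterization (baseline `mu`, jump size `alpha`, decay rate `beta`) is one common convention and is assumed here; the paper's own parameterization and its block structure are not reproduced.

```python
import math


def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Intensity at time t: mu + sum over past events of alpha*beta*exp(-beta*(t - s)).

    mu: baseline rate, alpha: assumed jump size, beta: assumed decay rate.
    Only events strictly before t contribute.
    """
    excitation = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in event_times
        if s < t
    )
    return mu + excitation


# Intensity one time unit after a single event at time 0.
rate = hawkes_intensity(1.0, [0.0], mu=0.5, alpha=0.8, beta=2.0)
```

    With no past events the intensity equals the baseline `mu`, and the effect of each event decays exponentially at rate `beta`.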

    Finding knowledge in students social network

    Social networking has become one of the most widely used forms of communication technology. It is a new way of connecting with other people, and it stores huge amounts of data, including user activities and shared materials. Many have seen the importance of collecting data for future benefit. In recent years, many companies have successfully analyzed their customers’ behaviour using various data mining techniques. One of the latest applications of data mining is in social network sites or environments. The objective of this paper is to present an analysis of social network user behaviour, using a clustering technique and a centrality coefficient, based on university students’ involvement. The result of the analysis is then validated with a questionnaire-based personality test. The study finds that the patterns of students’ participation in social networking can be related to their personal behaviour, as reflected by their characteristics and online activities. The analysis extends the research on promoting a dynamic study culture at higher learning institutions through online social network.

    Recherche et représentation de communautés dans des grands graphes (Finding and representing communities in large graphs)

    This paper deals with the analysis and visualization of large graphs. Our interest in this subject stems from the fact that graphs are convenient and widespread data structures. Indeed, this type of data is encountered in a growing number of concrete problems: the Web, information retrieval, social networks, biological interaction networks... Furthermore, these graphs grow ever larger as the means of gathering and storing data improve. This calls for new methods in graph analysis and visualization, which are now important and dynamic research fields at the interface of many disciplines such as mathematics, statistics, computer science and sociology. In this paper, we propose a method for graph representation and visualization based on a prior clustering of the vertices. Newman and Girvan (2004) point out that “reducing [the] level of complexity [of a network] to one that can be interpreted readily by the human eye, will be invaluable in helping us to understand the large-scale structure of these new network data”: we rely on this assumption to use a clustering of the vertices as a preliminary step for simplifying the representation of the graph as a whole. The clustering phase consists in optimizing a quality measure specifically suited to the search for dense groups in graphs. This quality measure is the modularity, which expresses the “distance” to a null model in which the graph edges do not depend on the clustering. The modularity has shown its relevance in solving the problem of uncovering dense groups in a graph. Optimization of the modularity is done through a stochastic simulated annealing algorithm. The visualization/representation phase is based on a force-directed algorithm described in Truong et al. (2007).
    After giving a short introduction to the problem and detailing the vertex clustering and representation algorithms, the paper introduces and discusses two applications from the social network field.
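
    The modularity mentioned in this abstract can be sketched in its standard Newman-Girvan form for an undirected, unweighted graph: for each community, the fraction of edges falling inside it minus the fraction expected under a degree-preserving null model. The function below only evaluates this objective for a given partition; it does not implement the simulated annealing search the abstract describes, and the function name and input format are assumptions.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity of a partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to a community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")
    internal = defaultdict(int)    # edges with both endpoints in the community
    degree_sum = defaultdict(int)  # total degree of the community's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Sum over communities of (observed fraction) - (expected fraction).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge, split into their natural groups.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, labels)
```

    Placing every node in a single community gives a modularity of 0, so a positive value such as the one for the two-triangle split indicates denser-than-expected groups.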

    The Dynamics of Multi-Modal Networks

    The widespread study of networks in diverse domains, including social, technological, and scientific settings, has increased the interest in statistical and machine learning techniques for network analysis. Many of these networks are complex, involving more than one kind of entity, and multiple relationship types, both changing over time. While there have been many network analysis methods proposed for problems such as network evolution, community detection, information diffusion and opinion leader identification, the majority of these methods assume a single entity type, a single edge type and often no temporal dynamics. One of the main shortcomings of these traditional techniques is their inadequacy for capturing higher-order dependencies often present in real, complex networks. To address these shortcomings, I focus on analysis and inference in dynamic, multi-modal, multi-relational networks, containing multiple entity types (such as people, social groups, organizations, locations, etc.), and different relationship types (such as friendship, membership, affiliation, etc.). An example from social network theory is a network describing users, organizations and interest groups, where users have different types of ties among each other, such as friendship, family ties, etc., as well as affiliation and membership links with organizations and interest groups. By considering the complex structure of these networks rather than limiting the analysis to a single entity or relationship type, I show how we can build richer predictive models that provide better understanding of the network dynamics, and thus result in better quality predictions. In the first part of my dissertation, I address the problems of network evolution and clustering. For network evolution, I describe methods for modeling the interactions between different modalities, and propose a co-evolution model for social and affiliation networks. 
    I then move to the problem of network clustering, where I propose a novel algorithm for clustering multi-modal, multi-relational data. The second part of my dissertation focuses on the temporal dynamics of interactions in complex networks, from both user-level and network-level perspectives. For the user-centric approach, I analyze the dynamics of user relationships with other entity types, proposing a measure of the "loyalty" a user shows for a given group or topic, based on her temporal interaction pattern. I then move to macroscopic-level approaches for analyzing the dynamic processes that occur on a network scale. I propose a new differential adaptive diffusion model for incorporating diversity and trust in the process of information diffusion on multi-modal, multi-relational networks. I also discuss the implications of the proposed diffusion model for designing new strategies for viral marketing and influential detection. I validate all the proposed methods on several real-world networks from multiple domains.