606 research outputs found

    Centrality measures and analyzing dot-product graphs

    Full text link
    In this thesis we investigate two topics in data mining on graphs; in the first part we investigate the notion of centrality in graphs, in the second part we look at reconstructing graphs from aggregate information. In many graph related problems the goal is to rank nodes based on an importance score. This score is in general referred to as node centrality. In Part I. we start by giving a novel and more efficient algorithm for computing betweenness centrality. In many applications not an individual node but rather a set of nodes is chosen to perform some task. We generalize the notion of centrality to groups of nodes. While group centrality was first formally defined by Everett and Borgatti (1999), we are the first to pose it as a combinatorial optimization problem; find a group of k nodes with largest centrality. We give an algorithm for solving this optimization problem for a general notion of centrality that subsumes various instantiations of centrality that find paths in the graph. We prove that this problem is NP-hard for specific centrality definitions and we provide a universal algorithm for this problem that can be modified to optimize the specific measures. We also investigate the problem of increasing node centrality by adding or deleting edges in the graph. We conclude this part by solving the optimization problem for two specific applications; one for minimizing redundancy in information propagation networks and one for optimizing the expected number of interceptions of a group in a random navigational network. In the second part of the thesis we investigate what we can infer about a bipartite graph if only some aggregate information -- the number of common neighbors among each pair of nodes -- is given. First, we observe that the given data is equivalent to the dot-product of the adjacency vectors of each node. Based on this knowledge we develop an algorithm that is based on SVD-decomposition, that is capable of almost perfectly reconstructing graphs from such neighborhood data. We investigate two versions of this problem, in the versions the dot-product of nodes with themselves, e.g. the node degrees, are either known or hidden

    Exact Distributed Load Centrality Computation: Algorithms, Convergence, and Applications to Distance Vector Routing

    Get PDF
    Many optimization techniques for networking protocols take advantage of topological information to improve performance. Often, the topological information at the core of these techniques is a centrality metric such as the Betweenness Centrality (BC) index. BC is, in fact, a centrality metric with many well-known successful applications documented in the literature, from resource allocation to routing. To compute BC, however, each node must run a centralized algorithm and needs to have the global topological knowledge; such requirements limit the feasibility of optimization procedures based on BC. To overcome restrictions of this kind, we present a novel distributed algorithm that requires only local information to compute an alternative similar metric, called Load Centrality (LC). We present the new algorithm together with a proof of its convergence and the analysis of its time complexity. The proposed algorithm is general enough to be integrated with any distance vector (DV) routing protocol. In support of this claim, we provide an implementation on top of Babel, a real-world DV protocol. We use this implementation in an emulation framework to show how LC can be exploited to reduce Babel's convergence time upon node failure, without increasing control overhead. As a key step towards the adoption of centrality-based optimization for routing, we study how the algorithm can be incrementally introduced in a network running a DV routing protocol. We show that even when only a small fraction of nodes participate in the protocol, the algorithm accurately ranks nodes according to their centrality

    A data analysis of the academic use of social media

    Get PDF

    Extraction and Analysis of Facebook Friendship Relations

    Get PDF
    Online Social Networks (OSNs) are a unique Web and social phenomenon, affecting tastes and behaviors of their users and helping them to maintain/create friendships. It is interesting to analyze the growth and evolution of Online Social Networks both from the point of view of marketing and other of new services and from a scientific viewpoint, since their structure and evolution may share similarities with real-life social networks. In social sciences, several techniques for analyzing (online) social networks have been developed, to evaluate quantitative properties (e.g., defining metrics and measures of structural characteristics of the networks) or qualitative aspects (e.g., studying the attachment model for the network evolution, the binary trust relationships, and the link prediction problem).\ud However, OSN analysis poses novel challenges both to Computer and Social scientists. We present our long-term research effort in analyzing Facebook, the largest and arguably most successful OSN today: it gathers more than 500 million users. Access to data about Facebook users and their friendship relations, is restricted; thus, we acquired the necessary information directly from the front-end of the Web site, in order to reconstruct a sub-graph representing anonymous interconnections among a significant subset of users. We describe our ad-hoc, privacy-compliant crawler for Facebook data extraction. To minimize bias, we adopt two different graph mining techniques: breadth-first search (BFS) and rejection sampling. To analyze the structural properties of samples consisting of millions of nodes, we developed a specific tool for analyzing quantitative and qualitative properties of social networks, adopting and improving existing Social Network Analysis (SNA) techniques and algorithms
    • …
    corecore