3,846 research outputs found

    Characterization of complex networks: A survey of measurements

    Each complex network (or class of networks) presents specific topological features which characterize its connectivity and highly influence the dynamics of processes executed on the network. The analysis, discrimination, and synthesis of complex networks therefore rely on the use of measurements capable of expressing the most relevant topological features. This article presents a survey of such measurements. It includes general considerations about complex network characterization, a brief review of the principal models, and the presentation of the main existing measurements. Important related issues covered in this work comprise the representation of the evolution of complex networks in terms of trajectories in several measurement spaces, the analysis of the correlations between some of the most traditional measurements, perturbation analysis, as well as the use of multivariate statistics for feature selection and network classification. Depending on the network and the analysis task one has in mind, a specific set of features may be chosen. It is hoped that the present survey will help the proper application and interpretation of measurements. Comment: A working manuscript with 78 pages, 32 figures. Suggestions of measurements for inclusion are welcomed by the author.

    Algorithms for the Identification of Central Nodes in Large Real-World Networks

    Kirchhoff Index As a Measure of Edge Centrality in Weighted Networks: Nearly Linear Time Algorithms

    Most previous work on centrality focuses on metrics of vertex importance and methods for identifying powerful vertices, while related work on edges is much scarcer, especially for weighted networks, owing to the computational challenge. In this paper, we propose to use the well-known Kirchhoff index as the measure of edge centrality in weighted networks, called $\theta$-Kirchhoff edge centrality. The Kirchhoff index of a network is defined as the sum of effective resistances over all vertex pairs. The centrality of an edge $e$ is reflected in the increase of the Kirchhoff index of the network when the edge $e$ is partially deactivated, characterized by a parameter $\theta$. We define two equivalent measures for $\theta$-Kirchhoff edge centrality. Both are global metrics and have better discriminating power than commonly used measures based on local or partial structural information of networks, e.g. edge betweenness and spanning edge centrality. Despite the strong advantages of the Kirchhoff index as a centrality measure and its wide applications, computing the exact value of Kirchhoff edge centrality for each edge in a graph is computationally demanding. To solve this problem, for each of the $\theta$-Kirchhoff edge centrality metrics, we present an efficient algorithm to compute its $\epsilon$-approximation for all the $m$ edges in nearly linear time in $m$. The proposed $\theta$-Kirchhoff edge centrality is the first global metric of edge importance that can be provably approximated in nearly linear time. Moreover, based on the $\theta$-Kirchhoff edge centrality, we present a $\theta$-Kirchhoff vertex centrality measure, as well as a fast algorithm that can compute $\epsilon$-approximate Kirchhoff vertex centrality for all the $n$ vertices in nearly linear time in $m$.
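
The definition of the Kirchhoff index quoted above can be made concrete with a small exact computation. The following is a minimal sketch, not the nearly linear time algorithm of the paper: it inverts a dense matrix and is only suitable for tiny connected graphs. The function names are hypothetical, and the rule used to "partially deactivate" an edge (keeping a (1 - theta) share of its weight) is an assumption made for illustration; the abstract does not spell out the exact rule.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Weighted Laplacian of an undirected graph given as (u, v, weight) triples."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L


def inverse(A):
    """Gauss-Jordan inverse of a nonsingular square matrix, in exact arithmetic."""
    n = len(A)
    # Augment A with the identity matrix.
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def kirchhoff_index(n, edges):
    """Sum of effective resistances over all vertex pairs of a connected graph."""
    L = laplacian(n, edges)
    # For a connected graph, pinv(L) = inv(L + J/n) - J/n, where J is the all-ones matrix,
    # and the Kirchhoff index equals n * trace(pinv(L)).
    shifted = [[L[i][j] + Fraction(1, n) for j in range(n)] for i in range(n)]
    inv = inverse(shifted)
    trace_pinv = sum(inv[i][i] for i in range(n)) - 1
    return n * trace_pinv


def theta_edge_centrality(n, edges, index, theta):
    """Increase in Kirchhoff index when edge `index` keeps a (1 - theta) share of its weight.

    ASSUMPTION: this weight-scaling rule is an illustrative stand-in, not the paper's definition.
    Use 0 < theta < 1 so that the graph stays connected.
    """
    u, v, w = edges[index]
    weakened = list(edges)
    weakened[index] = (u, v, (1 - Fraction(theta)) * Fraction(w))
    return kirchhoff_index(n, weakened) - kirchhoff_index(n, edges)
```

For the unit-weight path 0-1-2, the effective resistances are 1, 1 and 2, so `kirchhoff_index(3, [(0, 1, 1), (1, 2, 1)])` returns 4. Halving the weight of the first edge doubles its resistance and raises the index to 6.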

    Knowledge Base Population using Semantic Label Propagation

    A crucial aspect of a knowledge base population system that extracts new facts from text corpora is the generation of training data for its relation extractors. In this paper, we present a method that maximizes the effectiveness of newly trained relation extractors at a minimal annotation cost. Manual labeling can be significantly reduced by Distant Supervision, which is a method to construct training data automatically by aligning a large text corpus with an existing knowledge base of known facts. For example, all sentences mentioning both 'Barack Obama' and 'US' may serve as positive training instances for the relation born_in(subject,object). However, distant supervision typically results in a highly noisy training set: many training sentences do not really express the intended relation. We propose to combine distant supervision with minimal manual supervision in a technique called feature labeling, to eliminate noise from the large and noisy initial training set, resulting in a significant increase in precision. We further improve on this approach by introducing the Semantic Label Propagation method, which uses the similarity between low-dimensional representations of candidate training instances to extend the training set in order to increase recall while maintaining high precision. Our proposed strategy for generating training data is studied and evaluated on an established test collection designed for knowledge base population tasks. The experimental results show that the Semantic Label Propagation strategy leads to substantial performance gains when compared to existing approaches, while requiring an almost negligible manual annotation effort. Comment: Submitted to Knowledge Based Systems, special issue on Knowledge Bases for Natural Language Processing.
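
The distant supervision step described above, and the noise it produces, can be illustrated with a toy alignment. This is a minimal sketch under stated assumptions: the function name, the plain substring matching and the example sentences are all hypothetical, and the feature labeling and Semantic Label Propagation steps of the paper are not modelled.

```python
def distant_supervision(sentences, facts, relation):
    """Label every sentence that mentions both arguments of a known fact as a positive example.

    ASSUMPTION: entity mentions are detected by plain substring matching, which is far
    cruder than the entity linking a real system would use.
    """
    examples = []
    for subject, obj in facts[relation]:
        for sentence in sentences:
            if subject in sentence and obj in sentence:
                examples.append((sentence, subject, obj, relation))
    return examples


# Hypothetical knowledge base and corpus, built around the example in the abstract.
facts = {"born_in": [("Barack Obama", "US")]}
sentences = [
    "Barack Obama was born in Honolulu, in the US.",
    "Barack Obama left the US for a state visit to Germany.",
    "Angela Merkel welcomed the delegation.",
]

training = distant_supervision(sentences, facts, "born_in")
```

Two sentences are selected here. The first expresses the relation, while the second only mentions both entities, which is exactly the kind of noisy positive instance the abstract refers to.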

    Sampling-based Algorithms for Optimal Motion Planning

    During the last decade, sampling-based path planning algorithms, such as Probabilistic RoadMaps (PRM) and Rapidly-exploring Random Trees (RRT), have been shown to work well in practice and possess theoretical guarantees such as probabilistic completeness. However, little effort has been devoted to the formal analysis of the quality of the solution returned by such algorithms, e.g., as a function of the number of samples. The purpose of this paper is to fill this gap, by rigorously analyzing the asymptotic behavior of the cost of the solution returned by stochastic sampling-based algorithms as the number of samples increases. A number of negative results are provided, characterizing existing algorithms, e.g., showing that, under mild technical conditions, the cost of the solution returned by broadly used sampling-based algorithms converges almost surely to a non-optimal value. The main contribution of the paper is the introduction of new algorithms, namely, PRM* and RRT*, which are provably asymptotically optimal, i.e., such that the cost of the returned solution converges almost surely to the optimum. Moreover, it is shown that the computational complexity of the new algorithms is within a constant factor of that of their probabilistically complete (but not asymptotically optimal) counterparts. The analysis in this paper hinges on novel connections between stochastic sampling-based path planning algorithms and the theory of random geometric graphs. Comment: 76 pages, 26 figures, to appear in International Journal of Robotics Research.

    Functional programming and graph algorithms

    This thesis is an investigation of graph algorithms in the non-strict purely functional language Haskell. Emphasis is placed on the importance of achieving an asymptotic complexity as good as with conventional languages. This is achieved by using the monadic model for including actions on the state. Work on the monadic model was carried out at Glasgow University by Wadler, Peyton Jones, and Launchbury in the early nineties and has opened up many diverse application areas. One area is the ability to express data structures that require sharing. Although graphs are not presented in this style, data structures that graph algorithms use are expressed in this style. Several examples of stateful algorithms are given, including union/find for disjoint sets and the linear-time sort binsort. The graph algorithms presented are not new, but are traditional algorithms recast in a functional setting. Examples include strongly connected components, biconnected components, Kruskal's minimum cost spanning tree, and Dijkstra's shortest paths. The presentation is lucid, giving more insight than usual. The functional setting allows for complete calculational-style correctness proofs, which is demonstrated with many examples. The benefits of using a functional language for expressing graph algorithms are quantified by looking at the issues of execution times, asymptotic complexity, correctness, and clarity, in comparison with traditional approaches. The intention is to be as objective as possible, pointing out both the weaknesses and the strengths of using a functional language.

    Parameterized Algorithms for Graph Partitioning Problems

    In parameterized complexity, a problem instance (I, k) consists of an input I and an extra parameter k. The parameter k is usually a positive integer indicating the size of the solution or the structure of the input. A computational problem is called fixed-parameter tractable (FPT) if there is an algorithm for the problem with time complexity $O(f(k) \cdot n^c)$, where f(k) is a function dependent only on the input parameter k, n is the size of the input and c is a constant. The existence of such an algorithm means that the problem is tractable for fixed values of the parameter. In this thesis, we provide parameterized algorithms for the following NP-hard graph partitioning problems:
    (i) Matching Cut Problem: In an undirected graph, a matching cut is a partition of vertices into two non-empty sets such that the edges across the sets induce a matching. The Matching Cut problem is the problem of deciding whether a given graph has a matching cut. The Matching Cut problem is expressible in monadic second-order logic (MSOL). The MSOL formulation, together with Courcelle's theorem, implies linear time solvability on graphs with bounded tree-width. However, this approach leads to a running time of $f(\|\phi\|, t) \cdot n$, where $\|\phi\|$ is the length of the MSOL formula, t is the tree-width of the graph and n is the number of vertices of the graph. The dependency of $f(\|\phi\|, t)$ on $\|\phi\|$ can be as bad as a tower of exponentials. In this thesis we give a single exponential algorithm for the Matching Cut problem with tree-width alone as the parameter. The running time of the algorithm is $2^{O(t)} \cdot n$. This answers an open question posed by Kratsch and Le [Theoretical Computer Science, 2016]. We also show the fixed-parameter tractability of the Matching Cut problem when parameterized by neighborhood diversity or other structural parameters.
    (ii) H-Free Coloring Problems: Given an undirected graph G and a fixed graph H, the H-Free q-Coloring problem asks to color the vertices of G using at most q colors such that none of the color classes contains H as an induced subgraph. That is, every color class is H-free. This is a generalization of the classical q-Coloring problem, which is to color the vertices of the graph using at most q colors such that no pair of adjacent vertices has the same color. The H-Free Chromatic Number is the minimum number of colors required to H-free color the graph. For a fixed q, the H-Free q-Coloring problem is expressible in monadic second-order logic (MSOL). The MSOL formulation leads to an algorithm with time complexity $f(\|\phi\|, t) \cdot n$, where $\|\phi\|$ is the length of the MSOL formula, t is the tree-width of the graph and n is the number of vertices of the graph. In this thesis we present the following explicit combinatorial algorithms for H-Free Coloring problems:
    • An $O(q^{O(t^r)} \cdot n)$ time algorithm for the general H-Free q-Coloring problem, where $r = |V(H)|$.
    • An $O(2^{t + r \log t} \cdot n)$ time algorithm for the $K_r$-Free 2-Coloring problem, where $K_r$ is a complete graph on r vertices.
    The above implies an $O(t^{O(t^r)} \cdot n \log t)$ time algorithm to compute the H-Free Chromatic Number for graphs with tree-width at most t. Therefore H-Free Chromatic Number is FPT with respect to tree-width. We also address a variant of the H-Free q-Coloring problem which we call the H-(Subgraph)Free q-Coloring problem, which is to color the vertices of the graph such that none of the color classes contains H as a subgraph (not necessarily induced). We present the following algorithms for H-(Subgraph)Free q-Coloring problems:
    • An $O(q^{O(t^r)} \cdot n)$ time algorithm for the general H-(Subgraph)Free q-Coloring problem, which leads to an $O(t^{O(t^r)} \cdot n \log t)$ time algorithm to compute the H-(Subgraph)Free Chromatic Number for graphs with tree-width at most t.
    • An $O(2^{O(t^2)} \cdot n)$ time algorithm for $C_4$-(Subgraph)Free 2-Coloring, where $C_4$ is a cycle on 4 vertices.
    • An $O(2^{O(t^{r-2})} \cdot n)$ time algorithm for $\{K_r \setminus e\}$-(Subgraph)Free 2-Coloring, where $K_r \setminus e$ is a graph obtained by removing an edge from $K_r$.
    • An $O(2^{O((tr^2)^{r-2})} \cdot n)$ time algorithm for the $C_r$-(Subgraph)Free 2-Coloring problem, where $C_r$ is a cycle of length r.
    (iii) Happy Coloring Problems: In a vertex-colored graph, an edge is happy if its endpoints have the same color. Similarly, a vertex is happy if all its incident edges are happy. We consider the algorithmic aspects of the following Maximum Happy Edges (k-MHE) problem: given a partially k-colored graph G, find an extended full k-coloring of G such that the number of happy edges is maximized. When we want to maximize the number of happy vertices, the problem is known as Maximum Happy Vertices (k-MHV). We show that both k-MHE and k-MHV admit polynomial-time algorithms for trees. We show that k-MHE admits a kernel of size $k + \ell$, where $\ell$ is the natural parameter, the number of happy edges. We show the hardness of k-MHE and k-MHV for some special graphs such as split graphs and bipartite graphs. We show that both k-MHE and k-MHV are tractable for graphs with bounded tree-width and graphs with bounded neighborhood diversity.
    In the last part of the thesis we present an algorithm for the Replacement Paths Problem, which is defined as follows: Let G (with $|V(G)| = n$ and $|E(G)| = m$) be an undirected graph with positive edge weights. Let $P_G(s, t)$ be a shortest $s$-$t$ path in G. Let $l$ be the number of edges in $P_G(s, t)$. The Edge Replacement Path problem is to compute a shortest $s$-$t$ path in $G \setminus \{e\}$, for every edge e in $P_G(s, t)$. The Node Replacement Path problem is to compute a shortest $s$-$t$ path in $G \setminus \{v\}$, for every vertex v in $P_G(s, t)$. We present an $O(T_{SPT}(G) + m + l^2)$ time and $O(m + l^2)$ space algorithm for both the problems, where $T_{SPT}(G)$ is the asymptotic time to compute a single source shortest path tree in G. The proposed algorithm is simple and easy to implement.
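
The matching cut definition in part (i) can be checked directly from its wording. The sketch below is an illustration only: the function names are hypothetical, the graph is assumed to be simple (no repeated edges or self-loops), and the decision procedure is an exponential brute force over all partitions, not the $2^{O(t)} \cdot n$ tree-width algorithm of the thesis.

```python
from itertools import combinations


def is_matching_cut(vertices, edges, side_a):
    """Return True if (side_a, remaining vertices) is a matching cut of a simple undirected graph."""
    vertex_set = set(vertices)
    side_a = set(side_a)
    side_b = vertex_set - side_a
    # Both sides must be non-empty, and side_a must consist of graph vertices.
    if not side_a or not side_b or not side_a <= vertex_set:
        return False
    used = set()  # endpoints already covered by a crossing edge
    for u, v in edges:
        if (u in side_a) != (v in side_a):  # the edge crosses the partition
            if u in used or v in used:
                return False  # two crossing edges share an endpoint: not a matching
            used.update((u, v))
    return True


def has_matching_cut(vertices, edges):
    """Brute-force decision over all bipartitions; exponential, for tiny graphs only."""
    vertices = list(vertices)
    for size in range(1, len(vertices)):
        for side_a in combinations(vertices, size):
            if is_matching_cut(vertices, edges, side_a):
                return True
    return False
```

On the path 0-1-2-3, splitting into {0, 1} and {2, 3} leaves a single crossing edge, so the path has a matching cut. A triangle has none, because any split puts one vertex alone with two crossing edges that both touch it.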