
    Testing Cluster Structure of Graphs

    We study the problem of recognizing the cluster structure of a graph in the framework of property testing in the bounded degree model. Given a parameter $\varepsilon$, a $d$-bounded degree graph is defined to be $(k, \phi)$-clusterable if it can be partitioned into no more than $k$ parts, such that the (inner) conductance of the induced subgraph on each part is at least $\phi$ and the (outer) conductance of each part is at most $c_{d,k}\varepsilon^4\phi^2$, where $c_{d,k}$ depends only on $d, k$. Our main result is a sublinear algorithm with running time $\widetilde{O}(\sqrt{n}\cdot\mathrm{poly}(\phi,k,1/\varepsilon))$ that takes as input a graph with maximum degree bounded by $d$ and parameters $k$, $\phi$, $\varepsilon$, and with probability at least $\frac{2}{3}$ accepts the graph if it is $(k,\phi)$-clusterable and rejects the graph if it is $\varepsilon$-far from $(k, \phi^*)$-clusterable for $\phi^* = c'_{d,k}\frac{\phi^2 \varepsilon^4}{\log n}$, where $c'_{d,k}$ depends only on $d, k$. By the lower bound of $\Omega(\sqrt{n})$ on the number of queries needed for testing graph expansion, which corresponds to $k=1$ in our problem, our algorithm is asymptotically optimal up to polylogarithmic factors. Comment: Full version of STOC 201
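To make the two conductance conditions in the abstract above concrete, here is a minimal Python sketch. It assumes the normalization commonly used in the bounded degree model (cut edges divided by $d\,|S|$); the abstract itself does not spell out the normalization, and the function names and the toy graph are hypothetical. The inner conductance is computed by brute force, so this is only for very small graphs and is not the sublinear tester the abstract describes.

```python
from itertools import combinations

def outer_conductance(adj, part, d):
    """Edges leaving `part`, divided by d * |part| (assumed normalization)."""
    part = set(part)
    cut = sum(1 for u in part for v in adj[u] if v not in part)
    return cut / (d * len(part))

def inner_conductance(adj, part, d):
    """Minimum conductance, measured inside the subgraph induced by `part`,
    over nonempty subsets S with |S| <= |part| / 2.
    Brute force (exponential); returns inf for a single-vertex part."""
    part = set(part)
    nodes = sorted(part)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for subset in combinations(nodes, size):
            s = set(subset)
            # Count only edges that stay inside `part` but leave S.
            cut = sum(1 for u in s for v in adj[u] if v in part and v not in s)
            best = min(best, cut / (d * size))
    return best

# Toy graph (hypothetical): two triangles joined by the single edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
d = 3  # degree bound

outer = outer_conductance(adj, {0, 1, 2}, d)
inner = inner_conductance(adj, {0, 1, 2}, d)
```

With the partition into the two triangles, each part has a high inner conductance and a low outer conductance, which is the shape of the $(k,\phi)$-clusterable condition for $k = 2$.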

    Testing Higher-order Clusterability on graphs

    Analysis of higher-order organizations, usually small connected subgraphs called motifs, is a fundamental task on complex networks. This paper studies a new problem of testing higher-order clusterability: given query access to an undirected graph, can we judge whether this graph can be partitioned into a few clusters of highly-connected motifs? This problem extends the earlier work of Czumaj et al. (STOC '15), who recognized cluster structure on graphs using the framework of property testing. In this paper, a good graph cluster on high dimensions is first defined for higher-order clustering. Then, a query lower bound is given for testing whether this kind of good cluster exists. Finally, an optimal sublinear-time algorithm is developed for testing clusterability based on triangles.

    Robust clustering oracle and local reconstructor of cluster structure of graphs

    Due to the massive size of modern network data, local algorithms that run in sublinear time for analyzing the cluster structure of the graph are receiving growing interest. Two typical examples are local graph clustering algorithms that find a cluster from a seed node with running time proportional to the size of the output set, and clusterability testing algorithms that decide if a graph can be partitioned into a few clusters in the framework of property testing. In this work, we develop sublinear time algorithms for analyzing the cluster structure of graphs with noisy partial information. By using conductance based definitions for measuring the quality of clusters and the cluster structure, we formalize a definition of noisy clusterable graphs with bounded maximum degree. The algorithm is given query access to the adjacency list of such a graph. We then formalize the notion of a robust clustering oracle for a noisy clusterable graph, and give an algorithm that builds such an oracle in sublinear time, which can be further used to support typical queries (e.g., IsOutlier($s$), SameCluster($s,t$)) regarding the cluster structure of the graph in sublinear time. All the answers are consistent with a partition of $G$ in which all but a small fraction of vertices belong to some good cluster. We also give a local reconstructor for a noisy clusterable graph that provides query access to a reconstructed graph that is guaranteed to be clusterable in sublinear time. All the query answers are consistent with a clusterable graph which is guaranteed to be close to the original graph.

    Framework for Clique-based Fusion of Graph Streams in Multi-function System Testing

    The paper describes a framework for multi-function system testing. Multi-function system testing is considered as fusion (or revelation) of clique-like structures. The following sets are considered: (i) subsystems (system parts or units / components / modules), (ii) system functions and a subset of system components for each system function, and (iii) function clusters (some groups of system functions which are used jointly). Test procedures (as unit testing) are used for each subsystem. The procedures lead to an ordinal result (states, colors) for each component, e.g., [1,2,3,4] (where 1 corresponds to 'out of service', 2 corresponds to 'major faults', 3 corresponds to 'minor faults', 4 corresponds to 'trouble free service'). Thus, for each system function a graph over corresponding system components is examined while taking into account ordinal estimates/colors of the components. Further, an integrated graph (i.e., colored graph) for each function cluster is considered (this graph integrates the graphs for corresponding system functions). For the integrated graph (for each function cluster), structure revelation problems are examined (revelation of some subgraphs which can lead to system faults): (1) revelation of clique and quasi-clique (by vertices at level 1, 2, etc.; by edges/interconnection existence) and (2) dynamical problems (when vertex colors are functions of time): existence of a time interval when a clique or quasi-clique can exist. Numerical examples illustrate the approach and problems. Comment: 6 pages, 13 figures
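One possible reading of "revelation of clique by vertices at level 1, 2, etc." is sketched below in Python: restrict the colored graph to components whose ordinal state is at or below a chosen level, then look for a group of pairwise-connected components among them. This interpretation, the function name `find_fault_clique`, and the toy data are assumptions for illustration and are not taken from the paper; the search is brute force and only suited to small graphs.

```python
from itertools import combinations

def find_fault_clique(edges, color, level, size):
    """Return one set of `size` pairwise-adjacent components whose color
    is <= level, or None if no such set exists. Brute force."""
    edge_set = {frozenset(e) for e in edges}
    candidates = sorted(v for v, c in color.items() if c <= level)
    for group in combinations(candidates, size):
        if all(frozenset(pair) in edge_set for pair in combinations(group, 2)):
            return set(group)
    return None

# Hypothetical integrated graph: components a..d with ordinal states 1..4
# (1 = 'out of service', ..., 4 = 'trouble free service').
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
color = {"a": 1, "b": 2, "c": 2, "d": 4}

faulty_triangle = find_fault_clique(edges, color, level=2, size=3)
no_pair_at_level_1 = find_fault_clique(edges, color, level=1, size=2)
```

Here `faulty_triangle` is the set of components a, b, c (all at level 2 or worse and mutually connected), while `no_pair_at_level_1` is `None` because only one component is at level 1.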

    Model validation of simple-graph representations of metabolism

    The large-scale properties of chemical reaction systems, such as the metabolism, can be studied with graph-based methods. To do this, one needs to reduce the information -- lists of chemical reactions -- available in databases. Even for the simplest type of graph representation, this reduction can be done in several ways. We investigate different simple network representations by testing how well they encode information about one biologically important network structure -- network modularity (the propensity for edges to be clustered into dense groups that are sparsely connected to each other). To reach this goal, we design a model of reaction systems where network modularity can be controlled and measure how well the reduction to simple graphs captures the modular structure of the model reaction system. We find that the network types that best capture the modular structure of the reaction system are substrate-product networks (where substrates are linked to products of a reaction) and substance networks (with edges between all substances participating in a reaction). Furthermore, we argue that the proposed model for reaction systems with tunable clustering is a general framework for studies of how reaction systems are affected by modularity. To this end, we investigate statistical properties of the model and find, among other things, that it recreates correlations between degree and mass of the molecules. Comment: to appear in J. Roy. Soc. Interface
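The two reductions named in the abstract above can be sketched as follows in Python. The sketch assumes undirected simple graphs and a reaction format of `(substrates, products)` pairs; the function names and the toy reactions are hypothetical and not from the paper.

```python
from itertools import combinations, product

def substrate_product_network(reactions):
    """Link every substrate of a reaction to every product of that reaction."""
    edges = set()
    for substrates, products in reactions:
        for s, p in product(substrates, products):
            if s != p:  # skip self-loops if a substance is on both sides
                edges.add(frozenset((s, p)))
    return edges

def substance_network(reactions):
    """Link every pair of substances that take part in the same reaction."""
    edges = set()
    for substrates, products in reactions:
        participants = sorted(set(substrates) | set(products))
        for a, b in combinations(participants, 2):
            edges.add(frozenset((a, b)))
    return edges

# Toy reaction list (hypothetical): A + B -> C, then C -> D + E.
reactions = [(["A", "B"], ["C"]), (["C"], ["D", "E"])]

sp_edges = substrate_product_network(reactions)
sub_edges = substance_network(reactions)
```

On this toy input the substrate-product network has four edges (A-C, B-C, C-D, C-E), while the substance network also links co-substrates and co-products (A-B, D-E) for six edges in total.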

    A New Perspective on Clustered Planarity as a Combinatorial Embedding Problem

    The clustered planarity problem (c-planarity) asks whether a hierarchically clustered graph admits a planar drawing such that the clusters can be nicely represented by regions. We introduce the cd-tree data structure and give a new characterization of c-planarity. It leads to efficient algorithms for c-planarity testing in the following cases. (i) Every cluster and every co-cluster (complement of a cluster) has at most two connected components. (ii) Every cluster has at most five outgoing edges. Moreover, the cd-tree reveals interesting connections between c-planarity and planarity with constraints on the order of edges around vertices. On the one hand, this gives rise to a number of new open problems related to c-planarity; on the other hand, it provides a new perspective on previous results. Comment: 17 pages, 2 figures