Testing Cluster Structure of Graphs
We study the problem of recognizing the cluster structure of a graph in the
framework of property testing in the bounded degree model. Given a parameter
$\varepsilon$, a $d$-bounded degree graph is defined to be $(k, \phi)$-clusterable if it can be partitioned into no more than $k$ parts, such
that the (inner) conductance of the induced subgraph on each part is at least
$\phi$ and the (outer) conductance of each part is at most
$c_{d,k}\varepsilon^4\phi^2$, where $c_{d,k}$ depends only on $d, k$. Our main
result is a sublinear algorithm with the running time
$\widetilde{O}(\sqrt{n} \cdot \mathrm{poly}(\phi, k, 1/\varepsilon))$ that takes as
input a graph with maximum degree bounded by $d$, parameters $k$, $\phi$,
$\varepsilon$, and with probability at least $\frac{2}{3}$, accepts the graph if it
is $(k, \phi)$-clusterable and rejects the graph if it is $\varepsilon$-far from
$(k, \phi^*)$-clusterable for $\phi^* = c'_{d,k}\frac{\phi^2\varepsilon^4}{\log n}$, where $c'_{d,k}$ depends only on $d, k$. By the lower
bound of $\Omega(\sqrt{n})$ on the number of queries needed for testing graph
expansion, which corresponds to $k = 1$ in our problem, our algorithm is
asymptotically optimal up to polylogarithmic factors.
Comment: Full version of STOC 201
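The inner and outer conductance that this abstract builds on can be illustrated on a tiny graph. The sketch below is not the sublinear tester from the paper; it is a brute-force computation for intuition only. It assumes the bounded-degree convention of normalizing a cut by d times the size of the set, which the abstract does not spell out, and the function names and the example graph are hypothetical.

```python
from itertools import combinations

def cut_size(adj, s, universe):
    """Count edges with one endpoint in s and the other in universe minus s."""
    s = set(s)
    return sum(1 for u in s for v in adj[u] if v in universe and v not in s)

def outer_conductance(adj, part, d):
    """Edges leaving `part` in the whole graph, divided by d * |part| (assumed normalization)."""
    return cut_size(adj, part, set(adj)) / (d * len(part))

def inner_conductance(adj, part, d):
    """Minimum conductance over all subsets of at most half of `part`,
    measured inside the subgraph induced on `part`. Exponential time:
    suitable only for very small examples."""
    part = set(part)
    best = float("inf")
    for size in range(1, len(part) // 2 + 1):
        for s in combinations(sorted(part), size):
            best = min(best, cut_size(adj, s, part) / (d * size))
    return best

# Hypothetical example: two triangles joined by the single edge 2-3, degree bound d = 3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
d = 3
left = {0, 1, 2}
outer = outer_conductance(adj, left, d)   # 1 crossing edge / (3 * 3)
inner = inner_conductance(adj, left, d)   # 2 internal edges per vertex / (3 * 1)
print(outer, inner)
```

In this example each triangle is well connected internally (inner conductance 2/3) and weakly connected to the rest (outer conductance 1/9), which is the kind of part a clusterable graph is made of.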
Testing Higher-order Clusterability on graphs
Analysis of higher-order organizations, usually small connected subgraphs
called motifs, is a fundamental task on complex networks. This paper studies a
new problem of testing higher-order clusterability: given query access to an
undirected graph, can we judge whether this graph can be partitioned into a few
clusters of highly-connected motifs? This problem is an extension of earlier
work by Czumaj et al. (STOC '15), who recognized cluster structure on
graphs using the framework of property testing. In this paper, a good graph
cluster on high dimensions is first defined for higher-order clustering. Then,
a query lower bound is given for testing whether this kind of good cluster
exists. Finally, an optimal sublinear-time algorithm is developed for testing
clusterability based on triangles.
Robust clustering oracle and local reconstructor of cluster structure of graphs
Due to the massive size of modern network data, local algorithms that run in
sublinear time for analyzing the cluster structure of the graph are receiving
growing interest. Two typical examples are local graph clustering algorithms
that find a cluster from a seed node with running time proportional to the size
of the output set, and clusterability testing algorithms that decide if a graph
can be partitioned into a few clusters in the framework of property testing.
In this work, we develop sublinear time algorithms for analyzing the cluster
structure of graphs with noisy partial information. By using conductance based
definitions for measuring the quality of clusters and the cluster structure, we
formalize a definition of noisy clusterable graphs with bounded maximum degree.
The algorithm is given query access to the adjacency list of such a graph. We
then formalize the notion of robust clustering oracle for a noisy clusterable
graph, and give an algorithm that builds such an oracle in sublinear time,
which can be further used to support typical queries (e.g., IsOutlier(),
SameCluster()) regarding the cluster structure of the graph in sublinear
time. All the answers are consistent with a partition of the graph in which all but a
small fraction of vertices belong to some good cluster. We also give a local
reconstructor for a noisy clusterable graph that provides query access to a
reconstructed graph that is guaranteed to be clusterable in sublinear time. All
the query answers are consistent with a clusterable graph which is guaranteed
to be close to the original graph
Framework for Clique-based Fusion of Graph Streams in Multi-function System Testing
The paper describes a framework for multi-function system testing.
Multi-function system testing is considered as fusion (or revelation) of
clique-like structures. The following sets are considered: (i) subsystems
(system parts or units / components / modules), (ii) system functions and a
subset of system components for each system function, and (iii) function
clusters (some groups of system functions which are used jointly). Test
procedures (as unit testing) are used for each subsystem. The procedures lead
to an ordinal result (states, colors) for each component, e.g., [1,2,3,4]
(where 1 corresponds to 'out of service', 2 corresponds to 'major faults', 3
corresponds to 'minor faults', 4 corresponds to 'trouble free service'). Thus,
for each system function a graph over corresponding system components is
examined while taking into account ordinal estimates/colors of the components.
Further, an integrated graph (i.e., colored graph) for each function cluster is
considered (this graph integrates the graphs for corresponding system
functions). For the integrated graph (for each function cluster) structure
revelation problems are under examination (revelation of some subgraphs which
can lead to system faults): (1) revelation of cliques and quasi-cliques (by
vertices at level 1, 2, etc.; by the existence of edges/interconnections) and (2)
dynamical problems (when vertex colors are functions of time), i.e., the
existence of a time interval in which a clique or quasi-clique can exist.
Numerical examples illustrate the approach and problems.
Comment: 6 pages, 13 figures
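The clique revelation step described in this abstract can be sketched on a small colored graph. The code below is an illustrative brute-force search, not the framework's own procedure. It assumes that "vertices at level L" means components whose ordinal state is at most L; the function name, the component names, and the example data are hypothetical.

```python
from itertools import combinations

def cliques_at_level(edges, state, level, min_size=2):
    """Return the largest cliques formed only by components whose ordinal
    state is <= level (1 = out of service ... 4 = trouble-free service).
    Brute force over subsets, so intended for small integrated graphs."""
    faulty = sorted(v for v, s in state.items() if s <= level)
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(faulty), min_size - 1, -1):
        found = [
            group
            for group in combinations(faulty, size)
            if all(frozenset(pair) in edge_set for pair in combinations(group, 2))
        ]
        if found:
            return found  # stop at the largest clique size that occurs
    return []

# Hypothetical integrated graph for one function cluster.
state = {"a": 1, "b": 1, "c": 2, "d": 4}
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(cliques_at_level(edges, state, level=1))
print(cliques_at_level(edges, state, level=2))
```

Raising the level admits components with milder faults, so the revealed clique can only stay the same size or grow.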
Model validation of simple-graph representations of metabolism
The large-scale properties of chemical reaction systems, such as the
metabolism, can be studied with graph-based methods. To do this, one needs to
reduce the information -- lists of chemical reactions -- available in
databases. Even for the simplest type of graph representation, this reduction
can be done in several ways. We investigate different simple network
representations by testing how well they encode information about one
biologically important network structure -- network modularity (the propensity
for edges to be clustered into dense groups that are sparsely connected between
each other). To reach this goal, we design a model of reaction-systems where
network modularity can be controlled and measure how well the reduction to
simple graphs captures the modular structure of the model reaction system. We
find that the network types that best capture the modular structure of the
reaction system are substrate-product networks (where substrates are linked to
products of a reaction) and substance networks (with edges between all
substances participating in a reaction). Furthermore, we argue that the
proposed model for reaction systems with tunable clustering is a general
framework for studies of how reaction-systems are affected by modularity. To
this end, we investigate statistical properties of the model and find, among
other things, that it recreates correlations between degree and mass of the
molecules.
Comment: to appear in J. Roy. Soc. Interface
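The two graph reductions singled out in this abstract can be made concrete with a short sketch. It follows only the parenthetical definitions given in the abstract (substrates linked to products; edges between all substances participating in a reaction). The representation of a reaction as a pair of lists, the function names, and the example reaction are assumptions for illustration.

```python
from itertools import combinations, product

def substrate_product_network(reactions):
    """Undirected edges linking every substrate of a reaction to every product."""
    edges = set()
    for substrates, products in reactions:
        for s, p in product(substrates, products):
            if s != p:
                edges.add(frozenset((s, p)))
    return edges

def substance_network(reactions):
    """Undirected edges between all pairs of substances taking part in a reaction."""
    edges = set()
    for substrates, products in reactions:
        participants = sorted(set(substrates) | set(products))
        for a, b in combinations(participants, 2):
            edges.add(frozenset((a, b)))
    return edges

# Hypothetical reaction list: A + B -> C
reactions = [(["A", "B"], ["C"])]
sp_edges = substrate_product_network(reactions)
sub_edges = substance_network(reactions)
print(sorted(sorted(e) for e in sp_edges))
print(sorted(sorted(e) for e in sub_edges))
```

For the single reaction A + B -> C, the substance network adds the substrate-substrate edge A-B that the substrate-product network omits, which is the structural difference between the two representations.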
A New Perspective on Clustered Planarity as a Combinatorial Embedding Problem
The clustered planarity problem (c-planarity) asks whether a hierarchically
clustered graph admits a planar drawing such that the clusters can be nicely
represented by regions. We introduce the cd-tree data structure and give a new
characterization of c-planarity. It leads to efficient algorithms for
c-planarity testing in the following cases. (i) Every cluster and every
co-cluster (complement of a cluster) has at most two connected components. (ii)
Every cluster has at most five outgoing edges.
Moreover, the cd-tree reveals interesting connections between c-planarity and
planarity with constraints on the order of edges around vertices. On one hand,
this gives rise to a bunch of new open problems related to c-planarity, on the
other hand it provides a new perspective on previous results.
Comment: 17 pages, 2 figures