Shared-Memory Parallel Maximal Clique Enumeration
We present shared-memory parallel methods for Maximal Clique Enumeration
(MCE) from a graph. MCE is a fundamental and well-studied graph analytics task,
and is a widely used primitive for identifying dense structures in a graph. Due
to its computationally intensive nature, parallel methods are imperative for
dealing with large graphs. However, surprisingly, there do not yet exist
scalable and parallel methods for MCE on a shared-memory parallel machine. In
this work, we present efficient shared-memory parallel algorithms for MCE, with
the following properties: (1) the parallel algorithms are provably
work-efficient relative to a state-of-the-art sequential algorithm, (2) the
algorithms have a provably small parallel depth, showing that they can scale to
a large number of processors, and (3) our implementations on a multicore
machine show good speedup and scaling behavior with an increasing number of
cores, and are substantially faster than prior shared-memory parallel
algorithms for MCE.
Comment: 10 pages, 3 figures, proceedings of the 25th IEEE International
Conference on High Performance Computing, Data, and Analytics (HiPC), 201
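For orientation, the sequential task that the abstract parallelizes can be sketched as follows. This is a minimal, assumed illustration using the classic Bron-Kerbosch recursion with pivoting; it is not the authors' parallel algorithm, and the function name `maximal_cliques` and the adjacency-dict input format are hypothetical choices made for this sketch.

```python
def maximal_cliques(adj):
    """Yield every maximal clique of an undirected graph as a frozenset.

    adj maps each vertex to the set of its neighbours (no self-loops).
    Sequential Bron-Kerbosch with pivoting; an illustrative sketch only,
    not the parallel method described in the abstract.
    """
    def expand(r, p, x):
        # r: current clique, p: candidates that can extend r,
        # x: vertices already explored (used to reject non-maximal cliques)
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex of p | x with the most neighbours in p,
        # so that only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda u: len(p & adj[u]))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Triangle 1-2-3 with a pendant edge 3-4
example = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
cliques = set(maximal_cliques(example))
```

The recursive calls on different branch vertices are independent of each other once `p` and `x` are fixed, which is the kind of structure a shared-memory parallelization would presumably exploit.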
Cohesive subgraph identification in large graphs
Graph data is ubiquitous in real world applications, as the relationship among entities in the applications can be naturally captured by the graph model. Finding cohesive subgraphs is a fundamental problem in graph mining with diverse applications. Given the important roles of cohesive subgraphs, this thesis focuses on cohesive subgraph identification in large graphs.
Firstly, we study the size-bounded community search problem that aims to find a subgraph with the largest min-degree among all connected subgraphs that contain the query vertex q and have at least l and at most h vertices, where q, l, h are specified by the query. As the problem is NP-hard, we propose a branch-reduce-and-bound algorithm SC-BRB by developing nontrivial reducing techniques, upper bounding techniques, and branching techniques.
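The problem statement above can be made concrete with a brute-force reference sketch. It enumerates every vertex subset of admissible size, so it is exponential and only usable on tiny graphs; it is not SC-BRB, and the names `size_bounded_community` and `is_connected` are hypothetical.

```python
from itertools import combinations


def is_connected(vertices, adj):
    """Check that the subgraph induced by `vertices` is connected (DFS)."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & vertices:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == vertices


def size_bounded_community(adj, q, l, h):
    """Brute-force search: among connected induced subgraphs containing q
    with between l and h vertices, return (vertex set, min-degree) with the
    largest min-degree. Exponential time; illustration only, not SC-BRB.
    """
    best, best_deg = None, -1
    others = [v for v in adj if v != q]
    for size in range(max(l, 1), h + 1):
        for rest in combinations(others, size - 1):
            s = {q, *rest}
            if not is_connected(s, adj):
                continue
            # Degree is counted inside the induced subgraph only.
            min_deg = min(len(adj[v] & s) for v in s)
            if min_deg > best_deg:
                best, best_deg = s, min_deg
    return best, best_deg


# A 4-clique {a, b, c, d} with a pendant vertex e attached to a
graph = {
    "a": {"b", "c", "d", "e"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c"},
    "e": {"a"},
}
community, degree = size_bounded_community(graph, "a", 3, 4)
```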
Secondly, we formulate the notion of similar-biclique in bipartite graphs which is a special kind of biclique where all vertices from a designated side are similar to each other, and aim to enumerate all maximal similar-bicliques. We propose a backtracking algorithm MSBE to directly enumerate maximal similar-bicliques, and power it by vertex reduction and optimization techniques. In addition, we design a novel index structure to speed up a time-critical operation of MSBE, as well as to speed up vertex reduction. Efficient index construction algorithms are developed.
Thirdly, we consider balanced cliques in signed graphs --- a clique is balanced if its vertex set can be partitioned into CL and CR such that all negative edges are between CL and CR --- and study the problem of maximum balanced clique computation. We propose techniques to transform the maximum balanced clique problem over G to a series of maximum dichromatic clique problems over small subgraphs of G. The transformation not only removes edge signs but also sparsifies the edge set
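The balance condition can be checked directly on a small example. The sketch below assumes the usual structural-balance reading, in which edges inside CL and inside CR are positive and edges between CL and CR are negative; that stricter reading, the function name `balanced_partition`, and the edge-sign dictionary format are assumptions of this illustration, not details taken from the thesis.

```python
from itertools import combinations


def balanced_partition(vertices, sign):
    """Return (CL, CR) if `vertices` forms a balanced clique, else None.

    sign maps frozenset({u, v}) to +1 or -1; a missing pair means no edge.
    Assumed definition: positive edges stay within a side, negative edges
    cross between the two sides.
    """
    vertices = list(vertices)
    if not vertices:
        return set(), set()
    # Every pair must be adjacent, otherwise this is not a clique.
    for u, v in combinations(vertices, 2):
        if frozenset((u, v)) not in sign:
            return None
    # In a clique the split is forced by one anchor vertex: its positive
    # neighbours share its side, its negative neighbours go opposite.
    anchor = vertices[0]
    left, right = {anchor}, set()
    for v in vertices[1:]:
        (left if sign[frozenset((anchor, v))] > 0 else right).add(v)
    # Verify that every edge agrees with the forced split.
    for u, v in combinations(vertices, 2):
        same_side = (u in left) == (v in left)
        if same_side != (sign[frozenset((u, v))] > 0):
            return None
    return left, right


signs = {
    frozenset(("a", "b")): +1,
    frozenset(("a", "c")): -1,
    frozenset(("b", "c")): -1,
}
split = balanced_partition(["a", "b", "c"], signs)
```

A triangle whose three edges are all negative fails the check, since two of its vertices must share a side.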
CSS Minification via Constraint Solving
Minification is a widely-accepted technique which aims at reducing the size
of the code transmitted over the web. We study the problem of minifying
Cascading Style Sheets (CSS) --- the de facto language for styling web
documents. Traditionally, CSS minifiers focus on simple syntactic
transformations (e.g. shortening colour names). In this paper, we propose a new
minification method based on merging similar rules in a CSS file.
We consider safe transformations of CSS files, which preserve the semantics
of the CSS file. The semantics of CSS files are sensitive to the ordering of
rules in the file. To automatically identify a rule merging opportunity that
best minimises file size, we reduce the rule-merging problem to a problem on
CSS-graphs, i.e., node-weighted bipartite graphs with a dependency ordering on
the edges, where weights capture the number of characters (e.g. in a selector
or in a property declaration). Roughly speaking, the corresponding CSS-graph
problem concerns minimising the total weight of a sequence of bicliques
(complete bipartite subgraphs) that covers the CSS-graph and respects the edge
order.
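As a rough illustration of the CSS-graph view, the sketch below treats each rule as a set of selectors and a set of declarations, builds the selector-declaration edges, and tests whether a candidate merged rule is a biclique. It deliberately ignores the edge ordering that the paper requires, and it counts only raw characters as node weights; the helper names `css_graph`, `is_biclique`, and `weight` are hypothetical.

```python
def css_graph(rules):
    """Build the edge set {(selector, declaration)} from a list of rules,
    where each rule is a (selectors, declarations) pair of string lists.
    Rule order is NOT modelled here (a simplifying assumption).
    """
    return {(s, d) for selectors, decls in rules for s in selectors for d in decls}


def is_biclique(edges, selectors, decls):
    """True if every selector is connected to every declaration."""
    return all((s, d) in edges for s in selectors for d in decls)


def weight(selectors, decls):
    """Node weight of a biclique: total character count of its nodes
    (separators such as commas and braces are ignored in this sketch).
    """
    return sum(len(s) for s in selectors) + sum(len(d) for d in decls)


rules = [
    ([".a"], ["color:red", "margin:0"]),
    ([".b"], ["color:red"]),
]
edges = css_graph(rules)
can_merge = is_biclique(edges, [".a", ".b"], ["color:red"])
merged_weight = weight([".a", ".b"], ["color:red"])
```

Here `.a` and `.b` share `color:red`, so the pair of selectors with that single declaration is a candidate merged rule, while pairing both selectors with `margin:0` is not.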
We provide the first full formalisation of CSS3 selectors and reduce
dependency detection to satisfiability of quantifier-free integer linear
arithmetic, for which highly-optimised SMT-solvers are available. To solve the
above NP-hard graph optimisation problem, we show how Max-SAT solvers can be
effectively employed. We have implemented our algorithms using Max-SAT and
SMT-solvers as backends, and tested against approximately 70 real-world
examples (including the top 20 most popular websites). In our benchmarks, our
tool yields larger savings than six well-known minifiers (which do not perform
rule-merging, but support many other optimisations). Our experiments also
suggest that better savings can be achieved in combination with one of these
six minifiers.
Enumerating Maximal Induced Subgraphs
Given a graph G, the maximal induced subgraphs problem asks to enumerate all maximal induced subgraphs of G that belong to a certain hereditary graph class. While its optimization version, known as the minimum vertex deletion problem in the literature, has been intensively studied, enumeration algorithms were only known for a few simple graph classes, e.g., independent sets, cliques, and forests, until very recently [Conte and Uno, STOC 2019]. There is also a connected variation of this problem, where one is concerned with only those induced subgraphs that are connected. We introduce two new approaches, which enable us to develop algorithms that solve both variations for a number of important graph classes. A general technique that has proven very powerful in enumeration algorithms is to build a solution map, i.e., a multiple digraph on all the solutions of the problem, and the key to this approach is to make the solution map strongly connected, so that a simple traversal of the solution map solves the problem. First, we introduce retaliation-free paths to certify strong connectedness of the solution map we build. Second, generalizing the idea of Cohen, Kimelfeld, and Sagiv [JCSS 2008], we introduce an apparently very restricted version of the maximal (connected) induced subgraphs problem, and show that it is equivalent to the original problem in terms of solvability in incremental polynomial time. Moreover, we give reductions between the two variations, so that it suffices to solve one of the variations for each class we study. Our work also leads to direct and simpler proofs of several important known results.