457 research outputs found
Shared-Memory Parallel Maximal Clique Enumeration
We present shared-memory parallel methods for Maximal Clique Enumeration
(MCE) from a graph. MCE is a fundamental and well-studied graph analytics task,
and is a widely used primitive for identifying dense structures in a graph. Due
to its computationally intensive nature, parallel methods are imperative for
dealing with large graphs. However, surprisingly, there do not yet exist
scalable and parallel methods for MCE on a shared-memory parallel machine. In
this work, we present efficient shared-memory parallel algorithms for MCE, with
the following properties: (1) the parallel algorithms are provably
work-efficient relative to a state-of-the-art sequential algorithm, (2) the
algorithms have a provably small parallel depth, showing that they can scale to
a large number of processors, and (3) our implementations on a multicore
machine show good speedup and scaling behavior with an increasing number of
cores, and are substantially faster than prior shared-memory parallel
algorithms for MCE.
Comment: 10 pages, 3 figures, proceedings of the 25th IEEE International
Conference on High Performance Computing, Data, and Analytics (HiPC), 201
Efficient Algorithms for Finding Maximum and Maximal Cliques and Their Applications
The problem of finding a maximum clique or enumerating all maximal cliques is very important and has been explored in several excellent survey papers. Here, we focus on a step-by-step examination of a series of branch-and-bound depth-first search algorithms: Basics, MCQ, MCR, MCS, and MCT. Subsequently, building on the same depth-first search approach, we present our algorithm, CLIQUES, for enumerating all maximal cliques. Finally, we describe some of the applications of the algorithms and their variants in bioinformatics, data mining, and other fields.
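To make the depth-first enumeration of maximal cliques concrete, here is a minimal Python sketch of the classic Bron-Kerbosch recursion with pivoting. It is not taken from the paper and should not be read as the authors' CLIQUES implementation; the function name `maximal_cliques`, the adjacency-dictionary input, and the pivot rule (pick the vertex with the most neighbours among the remaining candidates) are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterator, Set

Graph = Dict[int, Set[int]]  # hypothetical input format: vertex -> set of neighbours


def maximal_cliques(adj: Graph) -> Iterator[FrozenSet[int]]:
    """Yield every maximal clique of an undirected graph (Bron-Kerbosch with pivoting)."""

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[FrozenSet[int]]:
        # r: current clique, p: candidates that can extend r,
        # x: vertices already explored (used to reject non-maximal cliques).
        if not p and not x:
            yield frozenset(r)
            return
        # Assumed pivot rule: the vertex covering the most candidates.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        # Only candidates outside the pivot's neighbourhood need to be branched on.
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())
```

For a triangle 0-1-2 with a pendant edge 2-3, the generator yields the two maximal cliques {0, 1, 2} and {2, 3}. Pivoting prunes branches that could only rediscover cliques containing the pivot or one of its non-neighbours.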
Robustness Verification of Tree-based Models
We study the robustness verification problem for tree-based models, including
decision trees, random forests (RFs) and gradient boosted decision trees
(GBDTs). Formal robustness verification of decision tree ensembles involves
finding the exact minimal adversarial perturbation or a guaranteed lower bound
of it. Existing approaches find the minimal adversarial perturbation by solving
a mixed integer linear programming (MILP) problem, which takes exponential time
and is therefore impractical for large ensembles. Although this verification problem is
NP-complete in general, we give a more precise complexity characterization. We
show that there is a simple linear time algorithm for verifying a single tree,
and for tree ensembles, the verification problem can be cast as a max-clique
problem on a multi-partite graph with bounded boxicity. For low dimensional
problems when boxicity can be viewed as constant, this reformulation leads to a
polynomial time algorithm. For general problems, by exploiting the boxicity of
the graph, we develop an efficient multi-level verification algorithm that can
give tight lower bounds on the robustness of decision tree ensembles, while
allowing iterative improvement and any-time termination. On RF/GBDT models
trained on 10 datasets, our algorithm is hundreds of times faster than the
previous approach that requires solving MILPs, and is able to give tight
robustness verification bounds on large GBDTs with hundreds of deep trees.
Comment: Hongge Chen and Huan Zhang contributed equally