Clear and Compress: Computing Persistent Homology in Chunks
We present a parallelizable algorithm for computing the persistent homology
of a filtered chain complex. Our approach differs from the commonly used
reduction algorithm by first computing persistence pairs within local chunks,
then simplifying the unpaired columns, and finally applying standard reduction
on the simplified matrix. The approach generalizes a technique by Günther et
al., which uses discrete Morse theory to compute persistence; we derive the
same worst-case complexity bound in a more general context. The algorithm
employs several practical optimization techniques which are of independent
interest. Our sequential implementation of the algorithm is competitive with
state-of-the-art methods, and we improve the performance through parallelized
computation. Comment: This result was presented at TopoInVis 2013
(http://www.sci.utah.edu/topoinvis13.html)
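For orientation, the "commonly used reduction algorithm" that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard left-to-right column reduction over Z/2, not the chunk algorithm of the paper; the function name `reduce_boundary_matrix` and the set-based column representation are assumptions made for this sketch.

```python
def reduce_boundary_matrix(columns):
    """Standard persistence reduction over Z/2 (illustrative sketch).

    columns[j] lists the row indices holding a 1 in column j of the
    boundary matrix, with cells indexed in filtration order.
    Returns the persistence pairs as (birth_index, death_index).
    """
    cols = [set(c) for c in columns]
    pivot_owner = {}  # lowest row index -> column whose pivot it is
    pairs = []
    for j in range(len(cols)):
        while cols[j]:
            low = max(cols[j])  # lowest nonzero entry of column j
            if low not in pivot_owner:
                # Pivot is unique so far: column j is reduced.
                pivot_owner[low] = j
                pairs.append((low, j))
                break
            # Add the earlier column with the same pivot (XOR over Z/2).
            cols[j] ^= cols[pivot_owner[low]]
    return pairs


# Filtered triangle: vertices 0-2, edges 3-5, the 2-cell is index 6.
triangle = [[], [], [], [0, 1], [1, 2], [0, 2], [3, 4, 5]]
pairs = reduce_boundary_matrix(triangle)
```

In this example the edge with index 5 reduces to an empty column (it creates a cycle), which the triangle with index 6 then kills; vertex 0 remains unpaired.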
A Hybrid Multi-GPU Implementation of Simplex Algorithm with CPU Collaboration
The simplex algorithm has been successfully used for many years in solving
linear programming (LP) problems. Due to the intensive computations required
(especially for the solution of large LP problems), parallel approaches have
also been studied extensively. The computational power provided by modern
GPUs, together with the rapid development of multicore CPU systems, has made
the OpenMP and CUDA programming models preferred choices in recent years.
However, the desired efficient collaboration between CPU and GPU through the
combined use of the above programming models is still considered a hard
research problem. In the above context, we demonstrate here a highly
efficient implementation of standard simplex, targeting the best possible
exploitation of the concurrent use of all the computing resources, on a
multicore platform with multiple CUDA-enabled GPUs. More concretely, we present
a novel hybrid collaboration scheme which is based on the concurrent execution
of suitably spread CPU-assigned (via multithreading) and GPU-offloaded
computations. The experimental results obtained through the cooperative use of
OpenMP and CUDA over a notably powerful modern hybrid platform (consisting of
32 cores and two high-spec GPUs, Titan RTX and RTX 2080 Ti) highlight that the
performance of the hybrid GPU/CPU collaboration scheme presented here is
clearly superior to the GPU-only implementation under almost all conditions.
The corresponding measurements validate the value of using all resources
concurrently, even in the case of a multi-GPU configuration platform.
Furthermore, the given implementations are comparable (and slightly
superior in most cases) to other related attempts in the literature, and
clearly superior to the native CPU implementation with 32 cores. Comment: 12 pages
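As a point of reference for the standard simplex method mentioned above, here is a minimal dense-tableau sketch. It is a textbook formulation and not the hybrid CPU/GPU scheme of the paper; the function name `simplex_max`, the use of Bland's rule, and exact `Fraction` arithmetic are choices made for this illustration. The row-update loop after each pivot is the kind of independent per-row work that parallel implementations typically distribute.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b and x >= 0, assuming b >= 0.

    Textbook tableau simplex with Bland's rule (illustrative sketch).
    Returns (optimal_value, x).
    """
    m, n = len(A), len(c)
    # Constraint rows are [A | I | b]; slack variables form the first basis.
    T = [
        [Fraction(v) for v in A[i]]
        + [Fraction(int(i == k)) for k in range(m)]
        + [Fraction(b[i])]
        for i in range(m)
    ]
    # Objective row holds -c; its last entry tracks the objective value.
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]

    while True:
        # Entering variable: lowest index with a negative reduced cost.
        col = next((j for j in range(n + m) if T[m][j] < 0), None)
        if col is None:
            break  # no improving direction: current basis is optimal
        candidates = [i for i in range(m) if T[i][col] > 0]
        if not candidates:
            raise ValueError("problem is unbounded")
        # Ratio test; ties are broken by the smallest basic variable index.
        row = min(candidates, key=lambda i: (T[i][-1] / T[i][col], basis[i]))
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        # Eliminate the entering column from every other row.
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [a - factor * p for a, p in zip(T[i], T[row])]
        basis[row] = col

    x = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:
            x[var] = T[i][-1]
    return T[m][-1], x


# Maximize 3x + 2y with x + y <= 4, x + 3y <= 9, x <= 3.
value, x = simplex_max([3, 2], [[1, 1], [1, 3], [1, 0]], [4, 9, 3])
```

For this small problem the optimum is attained at x = 3, y = 1 with objective value 11.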
JuliBootS: a hands-on guide to the conformal bootstrap
We introduce {\tt JuliBootS}, a package for numerical conformal bootstrap
computations coded in {\tt Julia}. The centre-piece of {\tt JuliBootS} is an
implementation of Dantzig's simplex method capable of handling arbitrary
precision linear programming problems with continuous search spaces. Currently
supported features include conformal dimension bounds, OPE bounds, and
bootstrap with or without global symmetries. The code is trivially
parallelizable on one or multiple machines. We exemplify usage extensively with
several real-world applications. In passing we give a pedagogical introduction
to the numerical bootstrap methods. Comment: 29 pages