1,596 research outputs found
Dynamic load balancing for the distributed mining of molecular structures
In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of
methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the
past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially
render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to
discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no
reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic
partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated
load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer
Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed
approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable
for large-scale, multi-domain, heterogeneous environments, such as computational grids
Parallelizing a SAT-Based Product Configurator
This paper presents how state-of-the-art parallel algorithms designed to solve the Satisfiability (SAT) problem can be applied in the domain of product configuration. During an interactive configuration process, a user selects features step-by-step to find a suitable configuration that fulfills his desires and the set of product constraints. A configuration system can be used to guide the user through the process by validating the selections and providing feedback. Each validation of a user selection is formulated as a SAT problem. Furthermore, an optimization problem is identified to find solutions with the minimum amount of changes compared to the previous configuration. Another additional constraint is deterministic computation, which is not trivial to achieve in well performing parallel SAT solvers. In the paper we propose five new deterministic parallel algorithms and experimentally compare them. Experiments show that reasonable speedups are achieved by using multiple threads over the sequential counterpart
Evaluation of a Simple, Scalable, Parallel Best-First Search Strategy
Large-scale, parallel clusters composed of commodity processors are
increasingly available, enabling the use of vast processing capabilities and
distributed RAM to solve hard search problems. We investigate Hash-Distributed
A* (HDA*), a simple approach to parallel best-first search that asynchronously
distributes and schedules work among processors based on a hash function of the
search state. We use this approach to parallelize the A* algorithm in an
optimal sequential version of the Fast Downward planner, as well as a 24-puzzle
solver. The scaling behavior of HDA* is evaluated experimentally on a shared
memory, multicore machine with 8 cores, a cluster of commodity machines using
up to 64 cores, and large-scale high-performance clusters, using up to 2400
processors. We show that this approach scales well, allowing the effective
utilization of large amounts of distributed memory to optimally solve problems
which require terabytes of RAM. We also compare HDA* to Transposition-table
Driven Scheduling (TDS), a hash-based parallelization of IDA*, and show that,
in planning, HDA* significantly outperforms TDS. A simple hybrid which combines
HDA* and TDS to exploit strengths of both algorithms is proposed and evaluated.Comment: in press, to appear in Artificial Intelligenc
A Divide-and-Conquer Algorithm for Betweenness Centrality
The problem of efficiently computing the betweenness centrality of nodes has
been researched extensively. To date, the best known exact and centralized
algorithm for this task is an algorithm proposed in 2001 by Brandes. The
contribution of our paper is Brandes++, an algorithm for exact efficient
computation of betweenness centrality. The crux of our algorithm is that we
create a sketch of the graph, that we call the skeleton, by replacing subgraphs
with simpler graph structures. Depending on the underlying graph structure,
using this skeleton and by keeping appropriate summaries Brandes++ we can
achieve significantly low running times in our computations. Extensive
experimental evaluation on real life datasets demonstrate the efficacy of our
algorithm for different types of graphs. We release our code for benefit of the
research community.Comment: Shorter version of this paper appeared in Siam Data Mining 201
A GPU-based Evolution Strategy for Optic Disk Detection in Retinal Images
La ejecución paralela de aplicaciones usando unidades de procesamiento gráfico (gpu) ha ganado gran interés en la comunidad académica en los años recientes. La computación paralela puede ser aplicada a las estrategias evolutivas para procesar individuos dentro de una población, sin embargo, las estrategias evolutivas se caracterizan por un significativo consumo de recursos computacionales al resolver problemas de gran tamaño o aquellos que se modelan mediante funciones de aptitud complejas. Este artículo describe la implementación de una estrategia evolutiva para la detección del disco óptico en imágenes de retina usando Compute Unified Device Architecture (cuda). Los resultados experimentales muestran que el tiempo de ejecución para la detección del disco óptico logra una aceleración de 5 a 7 veces, comparado con la ejecución secuencial en una cpu convencional.Parallel processing using graphic processing units (GPUs) has attracted much research interest in recent years. Parallel computation can be applied to evolution strategy (ES) for processing individuals in a population, but evolutionary strategies are time consuming to solve large computational problems or complex fitness functions. In this paper we describe the implementation of an improved ES for optic disk detection in retinal images using the Compute Unified Device Architecture (CUDA) environment. In the experimental results we show that the computational time for optic disk detection task has a speedup factor of 5x and 7x compared to an implementation on a mainstream CPU
- …