14,600 research outputs found
A GPU-based Evolution Strategy for Optic Disk Detection in Retinal Images
La ejecución paralela de aplicaciones usando unidades de procesamiento gráfico (gpu) ha ganado gran interés en la comunidad académica en los años recientes. La computación paralela puede ser aplicada a las estrategias evolutivas para procesar individuos dentro de una población, sin embargo, las estrategias evolutivas se caracterizan por un significativo consumo de recursos computacionales al resolver problemas de gran tamaño o aquellos que se modelan mediante funciones de aptitud complejas. Este artículo describe la implementación de una estrategia evolutiva para la detección del disco óptico en imágenes de retina usando Compute Unified Device Architecture (cuda). Los resultados experimentales muestran que el tiempo de ejecución para la detección del disco óptico logra una aceleración de 5 a 7 veces, comparado con la ejecución secuencial en una cpu convencional.Parallel processing using graphic processing units (GPUs) has attracted much research interest in recent years. Parallel computation can be applied to evolution strategy (ES) for processing individuals in a population, but evolutionary strategies are time consuming to solve large computational problems or complex fitness functions. In this paper we describe the implementation of an improved ES for optic disk detection in retinal images using the Compute Unified Device Architecture (CUDA) environment. In the experimental results we show that the computational time for optic disk detection task has a speedup factor of 5x and 7x compared to an implementation on a mainstream CPU
Parallel memetic algorithms for independent job scheduling in computational grids
In this chapter we present parallel implementations of Memetic Algorithms (MAs) for the problem of scheduling independent jobs in computational grids. The problem of scheduling in computational grids is known for its high demanding computational time. In this work we exploit the intrinsic parallel nature of MAs as well as the fact that computational grids offer large amount of resources, a part of which could be used to compute the efficient allocation of jobs to grid resources.
The parallel models exploited in this work for MAs include both fine-grained and coarse-grained parallelization and their hybridization. The resulting schedulers have been tested through different grid scenarios generated by a grid simulator to match different possible configurations of computational grids in terms of size (number of jobs and resources) and computational characteristics of resources. All in all, the result of this work showed that Parallel MAs are very good alternatives in order to match different performance requirement on fast scheduling of jobs to grid resources.Peer ReviewedPostprint (author's final draft
Distributed Correlation-Based Feature Selection in Spark
CFS (Correlation-Based Feature Selection) is an FS algorithm that has been
successfully applied to classification problems in many domains. We describe
Distributed CFS (DiCFS) as a completely redesigned, scalable, parallel and
distributed version of the CFS algorithm, capable of dealing with the large
volumes of data typical of big data applications. Two versions of the algorithm
were implemented and compared using the Apache Spark cluster computing model,
currently gaining popularity due to its much faster processing times than
Hadoop's MapReduce model. We tested our algorithms on four publicly available
datasets, each consisting of a large number of instances and two also
consisting of a large number of features. The results show that our algorithms
were superior in terms of both time-efficiency and scalability. In leveraging a
computer cluster, they were able to handle larger datasets than the
non-distributed WEKA version while maintaining the quality of the results,
i.e., exactly the same features were returned by our algorithms when compared
to the original algorithm available in WEKA.Comment: 25 pages, 5 figure
Petuum: A New Platform for Distributed Machine Learning on Big Data
What is a systematic way to efficiently apply a wide spectrum of advanced ML
programs to industrial scale problems, using Big Models (up to 100s of billions
of parameters) on Big Data (up to terabytes or petabytes)? Modern
parallelization strategies employ fine-grained operations and scheduling beyond
the classic bulk-synchronous processing paradigm popularized by MapReduce, or
even specialized graph-based execution that relies on graph representations of
ML programs. The variety of approaches tends to pull systems and algorithms
design in different directions, and it remains difficult to find a universal
platform applicable to a wide range of ML programs at scale. We propose a
general-purpose framework that systematically addresses data- and
model-parallel challenges in large-scale ML, by observing that many ML programs
are fundamentally optimization-centric and admit error-tolerant,
iterative-convergent algorithmic solutions. This presents unique opportunities
for an integrative system design, such as bounded-error network synchronization
and dynamic scheduling based on ML program structure. We demonstrate the
efficacy of these system designs versus well-known implementations of modern ML
algorithms, allowing ML programs to run in much less time and at considerably
larger model sizes, even on modestly-sized compute clusters.Comment: 15 pages, 10 figures, final version in KDD 2015 under the same titl
Requirements for implementing real-time control functional modules on a hierarchical parallel pipelined system
Analysis of a robot control system leads to a broad range of processing requirements. One fundamental requirement of a robot control system is the necessity of a microcomputer system in order to provide sufficient processing capability.The use of multiple processors in a parallel architecture is beneficial for a number of reasons, including better cost performance, modular growth, increased reliability through replication, and flexibility for testing alternate control strategies via different partitioning. A survey of the progression from low level control synchronizing primitives to higher level communication tools is presented. The system communication and control mechanisms of existing robot control systems are compared to the hierarchical control model. The impact of this design methodology on the current robot control systems is explored
- …