4 research outputs found

    Heterogeneous architecture to process swarm optimization algorithms

    In recent years, parallel processing has become part of personal computers through co-processing units such as graphics processing units (GPUs), resulting in a heterogeneous platform. This paper presents the implementation of swarm algorithms on this platform to solve function optimization problems, highlighting their inherently parallel structure and distributed control. In these algorithms, both the individuals of the population and the dimensions of the problem are parallelized, thanks to the granularity of the processing system, which also provides low communication latency between individuals because the processing is embedded. To evaluate the potential of swarm algorithms on the heterogeneous platform, two of them are implemented: the particle swarm optimization algorithm and the bacterial foraging optimization algorithm. Performance is measured as the acceleration of the heterogeneous platform, composed of an NVIDIA GeForce GTX480 GPU and a sequential processing unit, relative to a typical sequential processing platform; the particle swarm algorithm obtains up to 36.82x and the bacterial foraging algorithm up to 9.26x. Finally, the effect of increasing the population size is evaluated: acceleration remains high, but both the dispersion and the quality of the solutions decrease, since the initial distribution of the individuals can lead to convergence on a local optimum.
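    The particle-and-dimension parallelism described in this abstract can be illustrated with a minimal sequential sketch of particle swarm optimization. This is not the authors' implementation: the function names (`sphere`, `pso`, `speedup`), the parameter values, and the benchmark objective are assumptions chosen for illustration. The two inner loops mark the work that a GPU version could assign to independent threads, and `speedup` computes the acceleration metric as sequential time divided by parallel time.

```python
import random


def sphere(x):
    """Benchmark objective (assumed for illustration): minimum 0 at the origin."""
    return sum(v * v for v in x)


def pso(objective, dim=2, n_particles=20, iters=200,
        w=0.7, c1=1.5, c2=1.5, lo=-5.0, hi=5.0, seed=0):
    """Sequential PSO sketch; parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]

    for _ in range(iters):
        # Each (particle, dimension) update below reads only its own state and
        # the shared global best, so a GPU could map it to one thread.
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
        # Global-best update: on a GPU this would be a parallel reduction.
        g = min(range(n_particles), key=pbest_val.__getitem__)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
        history.append(gbest_val)
    return gbest, gbest_val, history


def speedup(t_sequential, t_parallel):
    """Acceleration metric: sequential run time over parallel run time."""
    return t_sequential / t_parallel
```

    For example, `pso(sphere)` returns the best position found, its objective value, and the per-iteration history of the global best, and `speedup(10.0, 2.0)` gives 5.0. The global best is updated once per iteration, after all particles have moved, which mirrors the synchronization point a parallel version would need.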

    GPU Integration into a Software Defined Radio Framework

    Software Defined Radio (SDR) was brought about by moving processing done on specific hardware components to reconfigurable software. Hardware components such as General Purpose Processors (GPPs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs) are used to make the software and hardware processing of the radio as portable and efficient as possible. Graphics Processing Units (GPUs), designed years ago for video rendering, are now finding new uses in research. The parallel architecture of the GPU gives developers the ability to speed up computationally intensive programs. Open Source Software Communications Architecture (SCA) Implementation: Embedded (OSSIE) is a free, open source waveform development environment for any developer who wants to experiment with SDR. In this work, OSSIE is integrated with a GPU computing framework to show how performance can be improved through GPU parallelization. Existing GPU research on SDR ranges from improving SDR simulations to implementing specific wireless protocols. This thesis aims to show performance improvement within an SCA-architected SDR implementation. The software components within OSSIE gained significant performance increases with few software changes due to the natural parallelism of the GPU, using Compute Unified Device Architecture (CUDA), Nvidia's GPU programming API. Using sample data sizes for the I and Q channel inputs, performance improvements were seen with as few as 512 samples when using the GPU-optimized version of OSSIE. As the sample size increased, the CUDA performance improved as well. Porting OSSIE components to the CUDA architecture showed that SDR-related software can gain performance through the use of GPU technology.

    Riešenie problému globálnej optimalizácie využitím GPU (Solving the Global Optimization Problem Using the GPU)

    The global optimization problem, that is, the problem of finding the global extreme points of a given function on a restricted domain of values, appears in many real-world applications. Solving this task more efficiently can reduce the latency of the application or provide a more precise result, since the task is usually solved by an approximation algorithm. This thesis focuses on the practical aspects of global optimization algorithms, especially in the domain of data analysis for algorithmic trading. Successful implementations of global optimization solvers already exist for CPUs, but they are quite time-consuming. The main objective of this thesis is therefore to design a global optimization solver that uses the raw computational power of GPU devices. Although GPUs have significantly more computational cores than CPUs, parallelizing a known serial algorithm is often quite challenging because of the specific execution model and memory architecture of existing GPUs. The secondary objective of the thesis is therefore to explore multiple approaches to the problem and compare their results experimentally.
    Department of Software Engineering (Katedra softwarového inženýrství), Faculty of Mathematics and Physics (Matematicko-fyzikální fakulta)

    PSO-Particle Swarm Optimization

    This work deals with particle swarm optimization (PSO). The theoretical part first briefly describes the problem of optimization. A considerable part is then devoted to an overall description of the PSO algorithm: its principle, behavior, parameters, structure and modifications. This is followed by a survey of PSO variants, including hybridizations of PSO. The practical part first analyzes dynamic problems in more detail. It then describes AHPSO, a newly designed algorithm for dynamic problems (what it is based on, what inspired it, and which elements it uses and why). The algorithm is run on a set of tasks (the Moving peaks benchmark) and compared with the best publicly available PSO variants for dynamic problems to date.