Search CORE

8 research outputs found

Optimización de algoritmos bioinspirados en sistemas heterogéneos CPU-GPU.

Author: Llanes Castro Antonio
Publication venue
Publication date: 01/01/2016
Field of study

Los retos científicos del siglo XXI precisan del tratamiento y análisis de una ingente cantidad de información en la conocida como la era del Big Data. Los futuros avances en distintos sectores de la sociedad como la medicina, la ingeniería o la producción eficiente de energía, por mencionar sólo unos ejemplos, están supeditados al crecimiento continuo en la potencia computacional de los computadores modernos. Sin embargo, la estela de este crecimiento computacional, guiado tradicionalmente por la conocida “Ley de Moore”, se ha visto comprometido en las últimas décadas debido, principalmente, a las limitaciones físicas del silicio. Los arquitectos de computadores han desarrollado numerosas contribuciones multicore, manycore, heterogeneidad, dark silicon, etc, para tratar de paliar esta ralentización computacional, dejando en segundo plano otros factores fundamentales en la resolución de problemas como la programabilidad, la fiabilidad, la precisión, etc. El desarrollo de software, sin embargo, ha seguido un camino totalmente opuesto, donde la facilidad de programación a través de modelos de abstracción, la depuración automática de código para evitar efectos no deseados y la puesta en producción son claves para una viabilidad económica y eficiencia del sector empresarial digital. Esta vía compromete, en muchas ocasiones, el rendimiento de las propias aplicaciones; consecuencia totalmente inadmisible en el contexto científico. En esta tesis doctoral tiene como hipótesis de partida reducir las distancias entre los campos hardware y software para contribuir a solucionar los retos científicos del siglo XXI. El desarrollo de hardware está marcado por la consolidación de los procesadores orientados al paralelismo masivo de datos, principalmente GPUs Graphic Processing Unit y procesadores vectoriales, que se combinan entre sí para construir procesadores o computadores heterogéneos HSA. En concreto, nos centramos en la utilización de GPUs para acelerar aplicaciones científicas. Las GPUs se han situado como una de las plataformas con mayor proyección para la implementación de algoritmos que simulan problemas científicos complejos. Desde su nacimiento, la trayectoria y la historia de las tarjetas gráficas ha estado marcada por el mundo de los videojuegos, alcanzando altísimas cotas de popularidad según se conseguía más realismo en este área. Un hito importante ocurrió en 2006, cuando NVIDIA (empresa líder en la fabricación de tarjetas gráficas) lograba hacerse con un hueco en el mundo de la computación de altas prestaciones y en el mundo de la investigación con el desarrollo de CUDA “Compute Unified Device Arquitecture. Esta arquitectura posibilita el uso de la GPU para el desarrollo de aplicaciones científicas de manera versátil. A pesar de la importancia de la GPU, es interesante la mejora que se puede producir mediante su utilización conjunta con la CPU, lo que nos lleva a introducir los sistemas heterogéneos tal y como detalla el título de este trabajo. Es en entornos heterogéneos CPU-GPU donde estos rendimientos alcanzan sus cotas máximas, ya que no sólo las GPUs soportan el cómputo científico de los investigadores, sino que es en un sistema heterogéneo combinando diferentes tipos de procesadores donde podemos alcanzar mayor rendimiento. En este entorno no se pretende competir entre procesadores, sino al contrario, cada arquitectura se especializa en aquella parte donde puede explotar mejor sus capacidades. Donde mayor rendimiento se alcanza es en estos clústeres heterogéneos, donde múltiples nodos son interconectados entre sí, pudiendo dichos nodos diferenciarse no sólo entre arquitecturas CPU-GPU, sino también en las capacidades computacionales dentro de estas arquitecturas. Con este tipo de escenarios en mente, se presentan nuevos retos en los que lograr que el software que hemos elegido como candidato se ejecuten de la manera más eficiente y obteniendo los mejores resultados posibles. Estas nuevas plataformas hacen necesario un rediseño del software para aprovechar al máximo los recursos computacionales disponibles. Se debe por tanto rediseñar y optimizar los algoritmos existentes para conseguir que las aportaciones en este campo sean relevantes, y encontrar algoritmos que, por su propia naturaleza sean candidatos para que su ejecución en dichas plataformas de alto rendimiento sea óptima. Encontramos en este punto una familia de algoritmos denominados bioinspirados, que utilizan la inteligencia colectiva como núcleo para la resolución de problemas. Precisamente esta inteligencia colectiva es la que les hace candidatos perfectos para su implementación en estas plataformas bajo el nuevo paradigma de computación paralela, puesto que las soluciones pueden ser construidas en base a individuos que mediante alguna forma de comunicación son capaces de construir conjuntamente una solución común. Esta tesis se centrará especialmente en uno de estos algoritmos bioinspirados que se engloba dentro del término metaheurísticas bajo el paradigma del Soft Computing, el Ant Colony Optimization “ACO”. Se realizará una contextualización, estudio y análisis del algoritmo. Se detectarán las partes más críticas y serán rediseñadas buscando su optimización y paralelización, manteniendo o mejorando la calidad de sus soluciones. Posteriormente se pasará a implementar y testear las posibles alternativas sobre diversas plataformas de alto rendimiento. Se utilizará el conocimiento adquirido en el estudio teórico-práctico anterior para su aplicación a casos reales, más en concreto se mostrará su aplicación sobre el plegado de proteínas. Todo este análisis es trasladado a su aplicación a un caso concreto. En este trabajo, aunamos las nuevas plataformas hardware de alto rendimiento junto al rediseño e implementación software de un algoritmo bioinspirado aplicado a un problema científico de gran complejidad como es el caso del plegado de proteínas. Es necesario cuando se implementa una solución a un problema real, realizar un estudio previo que permita la comprensión del problema en profundidad, ya que se encontrará nueva terminología y problemática para cualquier neófito en la materia, en este caso, se hablará de aminoácidos, moléculas o modelos de simulación que son desconocidos para los individuos que no sean de un perfil biomédico.Ingeniería, Industria y Construcció

Institutional Repository UCAM

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Reconfigurable Antenna Systems: Platform implementation and low-power matters

Author: Ciccia Simone
Publication venue: Politecnico di Torino
Publication date: 01/01/2018
Field of study

Antennas are a necessary and often critical component of all wireless systems, of which they share the ever-increasing complexity and the challenges of present and emerging trends. 5G, massive low-orbit satellite architectures (e.g. OneWeb), industry 4.0, Internet of Things (IoT), satcom on-the-move, Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles, all call for highly flexible systems, and antenna reconfigurability is an enabling part of these advances. The terminal segment is particularly crucial in this sense, encompassing both very compact antennas or low-profile antennas, all with various adaptability/reconfigurability requirements. This thesis work has dealt with hardware implementation issues of Radio Frequency (RF) antenna reconfigurability, and in particular with low-power General Purpose Platforms (GPP); the work has encompassed Software Defined Radio (SDR) implementation, as well as embedded low-power platforms (in particular on STM32 Nucleo family of micro-controller). The hardware-software platform work has been complemented with design and fabrication of reconfigurable antennas in standard technology, and the resulting systems tested. The selected antenna technology was antenna array with continuously steerable beam, controlled by voltage-driven phase shifting circuits. Applications included notably Wireless Sensor Network (WSN) deployed in the Italian scientific mission in Antarctica, in a traffic-monitoring case study (EU H2020 project), and into an innovative Global Navigation Satellite Systems (GNSS) antenna concept (patent application submitted). The SDR implementation focused on a low-cost and low-power Software-defined radio open-source platform with IEEE 802.11 a/g/p wireless communication capability. In a second embodiment, the flexibility of the SDR paradigm has been traded off to avoid the power consumption associated to the relevant operating system. Application field of reconfigurable antenna is, however, not limited to a better management of the energy consumption. The analysis has also been extended to satellites positioning application. A novel beamforming method has presented demonstrating improvements in the quality of signals received from satellites. Regarding those who deal with positioning algorithms, this advancement help improving precision on the estimated position

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Many-core Algorithms for Combinatorial Optimization

Author: Strappaveccia Francesco <1982>
Publication venue: Alma Mater Studiorum - Università di Bologna
Publication date: 04/06/2015
Field of study

Combinatorial Optimization is becoming ever more crucial, in these days. From natural sciences to economics, passing through urban centers administration and personnel management, methodologies and algorithms with a strong theoretical background and a consolidated real-word effectiveness is more and more requested, in order to find, quickly, good solutions to complex strategical problems. Resource optimization is, nowadays, a fundamental ground for building the basements of successful projects. From the theoretical point of view, Combinatorial Optimization rests on stable and strong foundations, that allow researchers to face ever more challenging problems. However, from the application point of view, it seems that the rate of theoretical developments cannot cope with that enjoyed by modern hardware technologies, especially with reference to the one of processors industry. In this work we propose new parallel algorithms, designed for exploiting the new parallel architectures available on the market. We found that, exposing the inherent parallelism of some resolution techniques (like Dynamic Programming), the computational benefits are remarkable, lowering the execution times by more than an order of magnitude, and allowing to address instances with dimensions not possible before. We approached four Combinatorial Optimization’s notable problems: Packing Problem, Vehicle Routing Problem, Single Source Shortest Path Problem and a Network Design problem. For each of these problems we propose a collection of effective parallel solution algorithms, either for solving the full problem (Guillotine Cuts and SSSPP) or for enhancing a fundamental part of the solution method (VRP and ND). We endorse our claim by presenting computational results for all problems, either on standard benchmarks from the literature or, when possible, on data from real-world applications, where speed-ups of one order of magnitude are usually attained, not uncommonly scaling up to 40 X factors

AMS Tesi di Dottorato

Implementation of a high performance embedded MPC on FPGA using high-level synthesis

Author: Araujo Barrientos Antonio
Publication venue: 'Baishideng Publishing Group Inc.'
Publication date: 19/06/2017
Field of study

Model predictive control (MPC) has been, since its introduction in the late 70’s, a well accepted control technique, especially for industrial processes, which are typically slow and allow for on-line calculation of the control inputs. Its greatest advantage is its ability to consider constraints, on both inputs and states, directly and naturally. More recently, the improvements in processor speed have allowed its use in a wider range of problems, many involving faster dynamics. Nevertheless, implementation of MPC algorithms on embedded systems with resources, size, power consumption and cost constraints remains a challenge. In this thesis, High-Level Synthesis (HLS) is used to implement implicit MPC algo- rithms for linear (LMPC) and nonlinear (NMPC) plant models, considering constraints on both control inputs and states of the system. The algorithms are implemented in the Zynq@ -7000 All Programmable System-on-a-Chip (AP SoC) ZC706 Evaluation Kit, targeting Xilinx’s Zynq@-7000 AP SoC which contains a general purpose Field Programmable Gate Array (FPGA). In order to solve the optimization problem at each sampling instant, an Interior-Point Method (IPM) is used. The main computation cost of this method is the solution of a system of linear equations. A minimum residual (MINRES) algorithm is used for the solution of this system of equations taking into consideration its special structure in order to make it computationally efficient. A library was created for the linear algebra operations required for the IPM and MINRES algorithms. The implementation is tested on trajectory tracking case studies. Results for the linear case show good performance and implementation metrics, as well as computation times within the considered sampling periods. For the nonlinear case, although a high computation time was needed, the algorithm performed well on the case study presented. Because of resources constraints, implementation of the nonlinear algorithm on higher order systems was precluded.Modellprädiktive Regelung (engl: Model Predictive Control (MPC) ist, seit der Einfüh- rung in den späten 70er Jahren, eine gut angenommene Regelungstechnik, insbesondere für industrielle Prozesse, die typischerweise langsam sind und die online Steuergröße Berechnung ermöglichen. Ihr größter Vorteil ist die Fähigkeit, Beschränkungen bezüg- lich der Steuergrößen und der Regelgrößen zu berücksichtigen. In letzter Zeit hat die Verbesserung der Geschwindigkeit der Prozessoren den Einsatz in einer breitere Pro- blemreichweite mit einer schnelleren Dynamik ermöglicht. Allerdings bleibt die MPC Algorithmus-Implementierung in eingebetteten Systeme mit beschränkte Ressourcen, Größe, Energieverbrauch und Kosten eine Herausforderung. In dieser Arbeit wird die High-Level Synthesis (HLS) benutzt, um implizit MPC Algorithmen für lineare (LMPC) und nichtlineare (NMPC) Regelstrecken zu implemen- tieren, wobei Steuergröße- und Regelgrößenbeschränkungen berücksichtigt werden. Die Algorithmen sind im Zynq@-7000 AP SoC ZC706 Auswertungskit implementiert, wobei auf der Xilinxs Zynq@-7000 AP SoC, der ein allgemeiner Zweck FPGA enthält, abgezielt wird. Ein innere-Punkte Verfahren (engl: Interior-Point Method (IPM)) wird für die Lösung des Optimierungsproblems in jedem Sampling benutzt. Die größte Berechnungs- komplexität bei dem IPM ist die Lösung eines linearen Gleichungssystems. Ein minimaler Residuum-Algorithmus (MINRES) wird für die Lösung dieses Gleichungssystem benutzt, wobei die spezielle Struktur berücksichtigt wird, um das Verfahren recheneffizient zu machen. Es wurde eine Bibliothek mit Funktionen für die benötigten linearen Algebra Operationen in den IPM und MINRES Verfahren entwickelt. Die Implementierung wird in Trajektorieverfolgung Fallstudien getestet. Die Ergeb- nisse für den linearen Fall zeigen gute Leistungen und Metriken, sowie Rechenzeiten innerhalb des berücksichtigten Taktzeiten. Für den nichtlinearen Fall wurde eine ho- he Rechenzeit benötigt. Trotzdem hat der Algorithmus für die vorgestellte Fallstudie gut funktioniert. Infolge der Ressourcenbeschränkungen war die Implementierung des nichtlinearen Algorithmus für Systeme höherer Ordnung verhindert.Tesi

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Registro Nacional de Trabajos de Investigación y Proyectos

Repositorio Digital de Tesis PUCP

A Practical Hardware Implementation of Systemic Computation

Author: Sakellariou C
Publication venue: UCL (University College London)
Publication date: 28/12/2013
Field of study

It is widely accepted that natural computation, such as brain computation, is far superior to typical computational approaches addressing tasks such as learning and parallel processing. As conventional silicon-based technologies are about to reach their physical limits, researchers have drawn inspiration from nature to found new computational paradigms. Such a newly-conceived paradigm is Systemic Computation (SC). SC is a bio-inspired model of computation. It incorporates natural characteristics and defines a massively parallel non-von Neumann computer architecture that can model natural systems efficiently. This thesis investigates the viability and utility of a Systemic Computation hardware implementation, since prior software-based approaches have proved inadequate in terms of performance and flexibility. This is achieved by addressing three main research challenges regarding the level of support for the natural properties of SC, the design of its implied architecture and methods to make the implementation practical and efficient. Various hardware-based approaches to Natural Computation are reviewed and their compatibility and suitability, with respect to the SC paradigm, is investigated. FPGAs are identified as the most appropriate implementation platform through critical evaluation and the first prototype Hardware Architecture of Systemic computation (HAoS) is presented. HAoS is a novel custom digital design, which takes advantage of the inbuilt parallelism of an FPGA and the highly efficient matching capability of a Ternary Content Addressable Memory. It provides basic processing capabilities in order to minimize time-demanding data transfers, while the optional use of a CPU provides high-level processing support. It is optimized and extended to a practical hardware platform accompanied by a software framework to provide an efficient SC programming solution. The suggested platform is evaluated using three bio-inspired models and analysis shows that it satisfies the research challenges and provides an effective solution in terms of efficiency versus flexibility trade-off

UCL Discovery

XXIII Edición del Workshop de Investigadores en Ciencias de la Computación : Libro de actas

Author: Carmona Fernanda Beatriz
Frati Fernando Emmanuel
Publication venue: Universidad Nacional de Chilecito (UNdeC)
Publication date: 31/05/2021
Field of study

Compilación de las ponencias presentadas en el XXIII Workshop de Investigadores en Ciencias de la Computación (WICC), llevado a cabo en Chilecito (La Rioja) en abril de 2021.Red de Universidades con Carreras en Informátic

Servicio de Difusión de la Creación Intelectual

XX Workshop de Investigadores en Ciencias de la Computación - WICC 2018 : Libro de actas

Author: Dapozo Gladys N.
Irrazábal Emanuel
Red de Universidades con Carreras en Informática (RedUNCI)
Publication venue: Facultad de Ciencias Exactas, Naturales y Agrimensura (UNNE)
Publication date: 28/05/2018
Field of study

Actas del XX Workshop de Investigadores en Ciencias de la Computación (WICC 2018), realizado en Facultad de Ciencias Exactas y Naturales y Agrimensura de la Universidad Nacional del Nordeste, los dìas 26 y 27 de abril de 2018.Red de Universidades con Carreras en Informática (RedUNCI

Servicio de Difusión de la Creación Intelectual

XX Workshop de Investigadores en Ciencias de la Computación - WICC 2018 : Libro de actas

Author: Dapozo Gladys N.
Irrazábal Emanuel
Red de Universidades con Carreras en Informática (RedUNCI)
Publication venue: Facultad de Ciencias Exactas, Naturales y Agrimensura (UNNE)
Publication date: 28/05/2018
Field of study