3,117 research outputs found
Evaluation of the parallel computational capabilities of embedded platforms for critical systems
Modern critical systems need higher performance, which cannot be delivered by the simple architectures used so far. The latest embedded architectures feature multi-cores and GPUs, which can be used to satisfy this need. In this thesis we parallelise relevant applications from multiple critical domains represented in the GPU4S benchmark suite, and compare the parallel capabilities of candidate platforms for use in critical systems. In particular, we port the open source GPU4S Bench benchmarking suite to the OpenMP programming model, and we benchmark the candidate embedded heterogeneous multi-core platforms of the H2020 UP2DATE project (NVIDIA TX2, NVIDIA Xavier and Xilinx Zynq Ultrascale+) in order to drive the selection of the research platform which will be used in the next phases of the project. Our results indicate that, in terms of CPU and GPU performance, the NVIDIA Xavier is the highest performing platform.
Developing and applying heterogeneous phylogenetic models with XRate
Modeling sequence evolution on phylogenetic trees is a useful technique in
computational biology. Especially powerful are models which take account of the
heterogeneous nature of sequence evolution according to the "grammar" of the
encoded gene features. However, beyond a modest level of model complexity,
manual coding of models becomes prohibitively labor-intensive. We demonstrate,
via a set of case studies, the new built-in model-prototyping capabilities of
XRate (macros and Scheme extensions). These features allow rapid implementation
of phylogenetic models which would have previously been far more
labor-intensive. XRate's new capabilities for lineage-specific models,
ancestral sequence reconstruction, and improved annotation output are also
discussed. XRate's flexible model-specification capabilities and computational
efficiency make it well-suited to developing and prototyping phylogenetic
grammar models. XRate is available as part of the DART software package:
http://biowiki.org/DART.
Comment: 34 pages, 3 figures, glossary of XRate model terminology
Adaptive optimization for OpenCL programs on embedded heterogeneous systems
Heterogeneous multi-core architectures consisting of CPUs and GPUs are commonplace in today’s embedded systems. These architectures offer potential for energy efficient computing if the application task is mapped to the right core. Realizing such potential is challenging due to the complex and evolving nature of hardware and applications. This paper presents an automatic approach to map OpenCL kernels onto heterogeneous multi-cores for a given optimization criterion, whether it is faster runtime, lower energy consumption or a trade-off between them. This is achieved by developing a machine learning based approach to predict which processor to use to run the OpenCL kernel and the host program, and at what frequency the processor should operate. Instead of hand-tuning a model for each optimization metric, we use machine learning to develop a unified framework that first automatically learns the optimization heuristic for each metric off-line, then uses the learned knowledge to schedule OpenCL kernels at runtime based on code and runtime information of the program. We apply our approach to a set of representative OpenCL benchmarks and evaluate it on an ARM big.LITTLE mobile platform. Our approach achieves over 93% of the performance delivered by a perfect predictor. We obtain, on average, 1.2x, 1.6x, and 1.8x improvements in runtime, energy consumption and the energy delay product, respectively, relative to a comparative heterogeneous-aware OpenCL task mapping scheme.
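The final selection step described in this abstract can be illustrated with a minimal sketch. It assumes that some predictor (not shown, and not the paper's learned model) has already produced a runtime and an energy estimate for each candidate processor and frequency. The names `Candidate`, `cost` and `pick`, the device labels, and all numbers are hypothetical; the energy delay product is taken here as energy multiplied by runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical mapping option: a processor at a clock frequency."""
    device: str        # illustrative label, e.g. "big", "LITTLE", "GPU"
    freq_mhz: int
    runtime_s: float   # predicted runtime, assumed to come from some model
    energy_j: float    # predicted energy, assumed to come from some model


def cost(c: Candidate, metric: str) -> float:
    """Return the quantity to minimise for the chosen optimisation metric."""
    if metric == "runtime":
        return c.runtime_s
    if metric == "energy":
        return c.energy_j
    if metric == "edp":
        # Energy delay product: energy multiplied by runtime.
        return c.energy_j * c.runtime_s
    raise ValueError(f"unknown metric: {metric}")


def pick(candidates: list[Candidate], metric: str) -> Candidate:
    """Choose the candidate with the lowest cost under the given metric."""
    return min(candidates, key=lambda c: cost(c, metric))


# Made-up predictions, for illustration only.
CANDIDATES = [
    Candidate("big", 1800, runtime_s=1.0, energy_j=4.0),
    Candidate("LITTLE", 1200, runtime_s=3.0, energy_j=1.5),
    Candidate("GPU", 600, runtime_s=1.5, energy_j=2.0),
]
```

With these made-up numbers each metric selects a different candidate, which is why a single hand-tuned heuristic would not serve all three criteria.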
Information-Theoretic Control of Multiple Sensor Platforms
This thesis is concerned with the development of a consistent, information-theoretic basis for understanding coordination and cooperation in decentralised multi-sensor multi-platform systems. Autonomous systems composed of multiple sensors and multiple platforms potentially have significant importance in applications such as defence, search and rescue, mining or intelligent manufacturing. However, the effective use of multiple autonomous systems requires that an understanding be developed of the mechanisms of coordination and cooperation between component systems in pursuit of a common goal. A fundamental, quantitative understanding of coordination and cooperation between decentralised autonomous systems is the main goal of this thesis. This thesis focuses on the problem of coordination and cooperation for teams of autonomous systems engaged in information gathering and data fusion tasks. While this is a subset of the general cooperative autonomous systems problem, it still encompasses a range of possible applications in picture compilation, navigation, searching and map building problems. The great advantage of restricting the domain of interest in this way is that an underlying mathematical model for coordination and cooperation can be based on the use of information-theoretic models of platform and sensor abilities. The information-theoretic approach builds on the established principles and architecture previously developed for decentralised data fusion systems. In the decentralised control problem addressed in this thesis, each platform and sensor system is considered to be a distinct decision maker with an individual information-theoretic utility measure capturing both local objectives and the inter-dependencies among the decisions made by other members of the team. Together these information-theoretic utilities constitute the team objective.
The key contributions of this thesis lie in the quantification and study of cooperative control between sensors and platforms using information as a common utility measure. In particular,
* The problem of information gathering is formulated as an optimal control problem by identifying formal measures of information with utility or pay-off.
* An information-theoretic utility model of coupling and coordination between decentralised decision makers is elucidated. This is used to describe how the information gathering strategies of a team of autonomous systems are coupled.
* Static and dynamic information structures for team members are defined. It is shown that the use of static information structures can lead to efficient, although sub-optimal, decentralised control strategies for the team.
* Significant examples in decentralised control of a team of sensors are developed. These include the multi-vehicle multi-target bearings-only tracking problem, and the area coverage or exploration problem for multiple vehicles. These examples demonstrate the range of non-trivial problems to which the theory in this thesis can be applied.
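One common way to make "information as utility" concrete is to measure the entropy reduction of a Gaussian estimate. This is offered as an illustrative assumption, not as the formulation used in the thesis. The symbols are all introduced here: $n$ is the state dimension, $P^{-}$ the prior covariance, $P^{+}(a)$ the posterior covariance after a sensing action $a$, and $I(a)$ the resulting information gain.

```latex
\begin{align}
  h(P)  &= \tfrac{1}{2}\,\log\!\bigl((2\pi e)^{n}\,\lvert P\rvert\bigr), \\
  I(a)  &= h\bigl(P^{-}\bigr) - h\bigl(P^{+}(a)\bigr)
         = \tfrac{1}{2}\,\log\frac{\lvert P^{-}\rvert}{\lvert P^{+}(a)\rvert}.
\end{align}
```

Under this reading, an action that shrinks the posterior covariance more yields a larger $I(a)$, so choosing actions to maximise $I(a)$ turns information gathering into a control problem with a scalar pay-off.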
An analysis of key generation efficiency of RSA cryptosystem in distributed environments
Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2005
Includes bibliographical references (leaves: 68)
Text in English; abstract in Turkish and English
ix, 74 leaves
As the volume of communication over networks, and especially over the Internet, grew, a pressing need arose to secure these connections. Symmetric and asymmetric cryptosystems formed a good complementary approach for providing this security. While asymmetric cryptosystems were a perfect solution for distributing the keys used by the communicating parties, they were very slow for the actual encryption and decryption of the data flowing between them. Symmetric cryptosystems therefore filled this gap and were used for encryption and decryption once the session keys had been exchanged securely. Parallelism is a hot research topic in many different fields and is used to deal with problems whose solutions take a considerable amount of time. Cryptography is no exception: computer scientists have found that parallelism can be used to make the algorithms of asymmetric cryptosystems faster, and the experimental results so far are promising. This thesis is based on the parallelization of a famous public-key algorithm, namely RSA.
Parallel source code transformation techniques using design patterns
International Mention in the doctoral degree
In recent years, the traditional approaches for improving performance, such as increasing
the clock frequency, have come to a dead end. To tackle this issue, parallel architectures,
such as multi-/many-core processors, have been envisioned to increase
the performance by providing greater processing capabilities. However, programming
efficiently for these architectures demands considerable effort to transform sequential
applications into parallel ones and to optimize them. Compared to
sequential programming, designing and implementing parallel applications for operating
on modern hardware poses a number of new challenges to developers such
as data races, deadlocks, load imbalance, etc.
To pave the way, parallel design patterns provide a way to encapsulate algorithmic
aspects, allowing users to implement robust, readable and portable solutions
with such high-level abstractions. Basically, these patterns instantiate parallelism
while hiding away the complexity of concurrency mechanisms, such as thread management,
synchronizations or data sharing. Nonetheless, frameworks following this
philosophy do not share the same interface, and users need to understand different
libraries and their capabilities, not only to decide which best fits their
purposes but also to leverage them properly. Furthermore, in order to parallelize
these applications, it is necessary to analyze the sequential code to detect the
regions of code that can be parallelized, which is a time-consuming and complex task.
Additionally, different libraries targeted to specific devices provide some algorithm
implementations that are already parallel and highly tuned. In these situations, it is
also necessary to analyze and determine which routine implementation is the most
suitable for a given problem.
To tackle these issues, this thesis aims at simplifying and minimizing the necessary
efforts to transform sequential applications into parallel ones. This way, the resulting
codes will improve their performance by fully exploiting the available resources
while the development efforts will be considerably reduced. In this thesis,
we make the following contributions. First, we propose a technique to detect potential
parallel patterns in sequential code. Second, we provide a novel generic C++ interface
for parallel patterns which acts as a switch among existing frameworks. Third,
we implement a framework that is able to transform sequential code into parallel
using the proposed pattern discovery technique and pattern interface. Finally, we
propose mechanisms that are able to select the most suitable device and routine implementation
to solve a given problem based on previous performance information.
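The idea of one pattern interface that "acts as a switch among existing frameworks" can be sketched as follows. The thesis describes a C++ interface; this is only a loose Python analogue under stated assumptions, and `SequentialBackend`, `ThreadBackend`, `parallel_map` and `square` are hypothetical names that do not come from the thesis.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class SequentialBackend:
    """Hypothetical backend that runs the map pattern in the calling thread."""

    def map(self, fn: Callable[[T], R], items: List[T]) -> List[R]:
        return [fn(x) for x in items]


class ThreadBackend:
    """Hypothetical backend that runs the map pattern on a thread pool."""

    def __init__(self, workers: int = 4) -> None:
        self.workers = workers

    def map(self, fn: Callable[[T], R], items: List[T]) -> List[R]:
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # Executor.map yields results in input order.
            return list(pool.map(fn, items))


def parallel_map(backend, fn: Callable[[T], R], items: Iterable[T]) -> List[R]:
    """Pattern entry point: the calling code is the same for every backend."""
    return backend.map(fn, list(items))


def square(x: int) -> int:
    return x * x
```

Swapping `SequentialBackend()` for `ThreadBackend(2)` in a call such as `parallel_map(backend, square, range(5))` changes how the work is executed without touching the calling code, which is the property the abstract attributes to a common interface.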
The evaluation demonstrates that using the proposed techniques can minimize the
refactoring and optimization time while improving the performance of the resulting
applications with respect to the original code.
In recent years, traditional techniques for improving performance, such as
increasing the clock frequency, have reached their limits. To keep
improving performance, parallel architectures have been developed,
which deliver higher performance by providing
greater processing capabilities. However, programming efficiently
for these architectures demands great effort from developers.
Compared with sequential programming, designing and implementing parallel
applications targeted at these architectures poses many
difficulties, such as race conditions, deadlocks and load
imbalance.
In this regard, parallel patterns are a way to encapsulate algorithmic aspects
of applications, enabling the development of robust,
portable and readable solutions thanks to high-level abstractions. In general, these patterns
provide parallelism while hiding the complexity
of the required concurrency-control mechanisms, such as
thread management, synchronization and data sharing. Nevertheless,
the frameworks that follow this philosophy do not share a single interface,
so users must know multiple libraries and their capabilities
in order to decide which one is best for a specific situation and
how to use it efficiently. Moreover, to parallelize existing applications,
it is necessary to analyze and identify the regions of code that can be parallelized,
which is an arduous and complex task. In addition, some algorithms are already
implemented in parallel and optimized for specific architectures
in various libraries. It therefore becomes necessary to analyze and determine which
specific implementation is the most suitable for solving a given problem.
To alleviate these situations, this thesis seeks to simplify and minimize the effort
required to transform sequential applications into parallel ones. In this way,
the resulting code will be able to exploit the available resources while
the necessary development effort is considerably reduced. In general,
this thesis makes the following contributions. First, a technique for
detecting parallel patterns in sequential code is proposed. Second,
a generic parallel-pattern interface for C++ is presented that allows selecting
the implementation of those patterns provided by existing frameworks.
Third, a framework for transforming sequential code into parallel code
is introduced, which makes use of the pattern detection techniques and the interface
presented. Finally, mechanisms are proposed that can select the
most suitable implementation for solving a specific problem based on the
performance obtained in previous runs. The evaluation carried out has
shown that using the presented techniques can minimize the time
needed to transform and optimize the code while improving the performance
of the transformed applications.
Official Doctoral Programme in Computer Science and Technology
Chair: David Expósito Singh. Secretary: Rafael Asenjo Plaza. Member: Marco Aldinucc