Easing parallel programming on heterogeneous systems
The most common way to run HPC (High Performance Computing) applications in reasonable execution times and in a scalable manner is to use parallel computing systems. The current trend in HPC systems is to include several computing devices of different types and architectures in the same machine.
However, using them poses specific challenges for the programmer. To create hybrid programs that efficiently exploit all the capabilities of the machine, a programmer must be an expert in the existing tools and abstractions for distributed memory, the programming models for shared-memory systems, and the programming models specific to each type of co-processor.
Currently, all of these problems must be solved by the programmer, which makes programming a heterogeneous machine a real challenge.
This Thesis addresses several of the main problems related to parallel programming of highly heterogeneous and distributed systems. Its proposals solve problems ranging from the creation of codes that are portable across different types of devices, accelerators, and architectures while also achieving maximum efficiency, to the problems that arise in distributed-memory systems related to communications and the partitioning of data structures.
Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Doctorado en Informátic
The automatic implementation of a dynamic load balancing strategy within structured mesh codes generated using a parallelisation tool
This research demonstrates that the automatic implementation of a dynamic load balancing (DLB) strategy within a parallel SPMD (single program multiple data) structured mesh application code is possible. It details how DLB can be effectively employed to reduce the level of load imbalance in a parallel system without expert knowledge of the application. Furnishing CAPTools (the Computer Aided Parallelisation Tools) with the additional functionality of DLB, a DLB parallel version of the serial Fortran 77 application code can be generated quickly and easily with the press of a few buttons, allowing the user to obtain results on various platforms rather than concentrate on implementing a DLB strategy within their code. Results show that the devised DLB strategy has successfully decreased idle time by locally increasing/decreasing processor workloads as and when required to suit the parallel application, utilising the available resources efficiently.
Several possible DLB strategies are examined with the understanding that the chosen strategy needs to be generic if it is to be automatically implemented within CAPTools and applied to a wide range of application codes. This research investigates the issues surrounding load imbalance, distinguishing between processor and physical imbalance in terms of the load redistribution of a parallel application executed on a homogeneous or heterogeneous system. Issues such as where to redistribute the workload, how often to redistribute, and how to calculate and implement the new distribution (deciding what data arrays to redistribute in the latter case) are all covered in detail, with many of these issues common to the automatic implementation of DLB for unstructured mesh application codes.
The devised DLB Staggered Limit Strategy discussed in this thesis offers flexibility as well as ease of implementation whilst minimising changes to the user's code. The generic utilities developed for this research are discussed along with their manual implementation, upon which the automation algorithms are based; these utilities are interchangeable with alternative methods if desired. This thesis aims to encourage the use of the DLB Staggered Limit Strategy since its benefits are evidently significant and are now easily achievable with its automatic implementation using CAPTools.
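As a rough illustration of the questions this abstract raises (how to measure imbalance and how to calculate a new distribution), the following minimal Python sketch may help. It is not the DLB Staggered Limit Strategy and not CAPTools output; the function names `load_imbalance` and `rebalance_1d`, and the assumption of a one-dimensional partition with measured per-processor times, are hypothetical choices made only for this example.

```python
from typing import List


def load_imbalance(times: List[float]) -> float:
    """Ratio of the slowest processor's time to the mean time.

    A value of 1.0 means perfectly balanced; larger values mean more idle time.
    (Hypothetical metric chosen for illustration.)
    """
    mean = sum(times) / len(times)
    return max(times) / mean


def rebalance_1d(cells: List[int], times: List[float]) -> List[int]:
    """Reassign mesh cells in proportion to each processor's measured speed.

    cells[i] is the number of cells currently owned by processor i and
    times[i] is the time it took to process them in the last iteration.
    """
    total = sum(cells)
    # Measured speed in cells per unit time.
    speeds = [c / t for c, t in zip(cells, times)]
    total_speed = sum(speeds)
    # Proportional share, rounded down.
    new = [int(total * s / total_speed) for s in speeds]
    # Hand out the cells lost to rounding, fastest processors first.
    leftover = total - sum(new)
    order = sorted(range(len(new)), key=lambda i: speeds[i], reverse=True)
    for i in order[:leftover]:
        new[i] += 1
    return new


if __name__ == "__main__":
    cells = [100, 100, 100, 100]
    times = [1.0, 1.0, 2.0, 1.0]  # processor 2 is half as fast
    print(load_imbalance(times))
    print(rebalance_1d(cells, times))
```

In this sketch the slower processor simply receives fewer cells; a real strategy would also have to weigh the cost of moving the data against the expected saving, which is the "how often to redistribute" question the abstract mentions.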
Compiler Techniques for Optimizing Communication and Data Distribution for Distributed-Memory Computers
Advanced Research Projects Agency (ARPA); National Aeronautics and Space Administration; Ope
Virtual SATCOM, Long Range Broadband Digital Communications
The current naval strategy is based on a distributed force, networked together with high-speed communications that enable operations as an intelligent, fast maneuvering force. Satellites, the existing network connector, are weak and vulnerable to attack. HF is an alternative, but it does not have the information throughput to meet the distributed warfighting need. The US Navy does not have a solution to reduce dependency on space-based communication systems while providing the warfighter with the required information speed.
Virtual SATCOM is a solution that can match satellite communications (SATCOM) data speed without the vulnerable satellite. It is wireless communication on a High Frequency (HF) channel at SATCOM speed. We have developed an innovative design using high-power, high-gain, ground-based relay systems. We transmit extremely wideband HF channels from ground stations using large directional antennas. Our system starts with a highly directional antenna with a narrow beam that enables increased bandwidth without interfering with other spectrum users. The beam focus and power provide a high SNR across a wideband channel with data rates of 10 Mbps, a 1000-fold increase in HF data speed.
Our modeling of the ionosphere shows that the ionosphere has more than adequate bandwidth to communicate at 3000 km and high speeds while avoiding detection. We designed a flexible structure adjustable to the dynamic ionosphere. Our design provides a high-speed communications path without the geo-location vulnerability of legacy HF methods.
Our invention will benefit mobile users using steerable beamforming apertures with wide-bandwidth signals. This dissertation will focus on three areas: an examination of the ionosphere’s ability to support the channel, design of a phased array antenna that can produce the narrow beam, and design of signal processing that can accommodate the wideband HF frequency range.
Virtual SATCOM is exciting research that can reduce cost and increase access to long-range, high data rate wireless communications.
Optimization within a Unified Transformation Framework
Programmers typically want to write scientific programs in a high level
language with semantics based on a sequential execution model. To execute
efficiently on a parallel machine, however, a program typically needs to
contain explicit parallelism and possibly explicit communication and
synchronization. So, we need compilers to convert programs from the first
of these forms to the second. There are two basic choices to be made when
parallelizing a program. First, the computations of the program need to be
distributed amongst the set of available processors. Second, the computations
on each processor need to be ordered. My contribution has been the development
of simple mathematical abstractions for representing these choices and the
development of new algorithms for making these choices. I have developed a new
framework that achieves good performance by minimizing communication between
processors, minimizing the time processors spend waiting for messages from
other processors, and ordering data accesses so as to exploit the memory
hierarchy. This framework can be used by optimizing compilers, as well as by
interactive transformation tools. The state of the art for vectorizing
compilers is already quite good, but much work remains to bring parallelizing
compilers up to the same standard. The main contribution of my work can be
summarized as improving this situation by replacing existing ad hoc
parallelization techniques with a sound underlying foundation on which future
work can be built.
(Also cross-referenced as UMIACS-TR-96-93)
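The two choices named in this abstract (which processor runs each computation, and in what order each processor runs its share) can be illustrated with a minimal Python sketch. It is not the framework developed in the thesis; the block distribution, the row-major ordering, and the names `block_range` and `schedule` are assumptions made only for this example.

```python
from typing import Iterator, Tuple


def block_range(n: int, p: int, rank: int) -> range:
    """Choice 1 (distribution): contiguous block of the n iterations owned by `rank`.

    The first n % p processors receive one extra iteration each.
    """
    base, extra = divmod(n, p)
    lo = rank * base + min(rank, extra)
    hi = lo + base + (1 if rank < extra else 0)
    return range(lo, hi)


def schedule(n_rows: int, n_cols: int, p: int, rank: int) -> Iterator[Tuple[int, int]]:
    """Choice 2 (ordering): visit the owned rows in row-major order.

    Row-major order touches consecutive elements of a row-major array,
    which is one simple way to favour the memory hierarchy.
    """
    for i in block_range(n_rows, p, rank):
        for j in range(n_cols):
            yield (i, j)


if __name__ == "__main__":
    for rank in range(4):
        print(rank, list(block_range(10, 4, rank)))
    print(list(schedule(4, 2, 2, 1)))
```

Other mappings (cyclic, block-cyclic) and other orderings (tiled, column-major) fit the same two-function shape, which is why separating the two decisions is convenient when comparing alternatives.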