    Enabling GPU Support for the COMPSs-Mobile Framework

    Using the GPUs embedded in mobile devices allows for increasing the performance of the applications running on them while reducing the energy consumption of their execution. This article presents a task-based solution for adaptive, collaborative heterogeneous computing on mobile cloud environments. To implement our proposal, we extend the COMPSs-Mobile framework (an implementation of the COMPSs programming model for building mobile applications that offload part of the computation to the Cloud) to support offloading computation to GPUs through OpenCL. To evaluate our solution, we subject the prototype to three benchmark applications representing different application patterns. This work is partially supported by the Joint-Laboratory on Extreme Scale Computing (JLESC), by the European Union through the Horizon 2020 research and innovation programme under contract 687584 (TANGO Project), by the Spanish Government (TIN2015-65316-P, BES-2013-067167, EEBB-2016-11272, SEV-2011-00067) and the Generalitat de Catalunya (2014-SGR-1051).

    Programming models for mobile environments

    Extraordinary doctoral award, UPC, academic year 2017-2018, field of ICT Engineering. Over the last decade, mobile devices have grown in popularity and become the best-selling computing devices. Despite their high capabilities for user interaction and network connectivity, the computing power of mobile devices is low and the lifetime of the applications running on them is limited by the battery. Mobile Cloud Computing (MCC) is a technology that tackles the limitations of mobile devices by combining their mobility with the vast computing power of the Cloud. Programming applications for MCC environments is not as straightforward as coding monolithic applications. Developers have to deal with the issues of parallel programming for distributed infrastructures while considering battery lifetime and the network variability caused by the high mobility of these devices. As in any other distributed environment, developers turn to programming models to improve their productivity, delegating the management of these concerns to the model instead of handling them manually. This thesis contributes to the state of the art with an adaptation of the COMPSs programming model for MCC environments. COMPSs allows application programmers to code their applications in a sequential, infrastructure-agnostic fashion, without calls to any COMPSs-specific API, using the native language of the target platform as if the applications were to run on the mobile device. At execution time, a runtime system automatically partitions the application into tasks and orchestrates their execution on top of the available resources. This thesis extends the programming model to allow task polymorphism and let the runtime exploit the computational resources of the mobile device other than its CPU. 
Besides, the runtime architecture has been redesigned with the characteristics of MCC in mind: it runs as a common service that all the applications running simultaneously on the mobile device contact to submit their tasks for execution. To collaboratively exploit both local and remote resources, the runtime clusters the computational devices into Computing Platforms according to the mechanisms required to provide the processing elements with the necessary input values, launch the task execution while avoiding resource oversubscription, and fetch the results back. The CPU Platform runs tasks on the cores of the CPU. The GPU Platform leverages OpenCL to run tasks as kernels on GPUs or other accelerators embedded in the mobile device. Finally, the Cloud Platform offloads the execution of tasks onto remote resources. To decide holistically whether it is worth running a task on embedded or on remote resources, the runtime considers the costs (time, energy and money) of running the computation on each of the platforms and picks the best. Each platform internally manages its resources and orchestrates the execution of tasks on them using different scheduling policies. Using local and remote computing devices forces the runtime to share data values among the nodes of the infrastructure. This data is potentially privacy-sensitive, and the runtime exposes it to possible attackers when transferring it through the network. To protect the application user from data leaks, the runtime has to provide communications with secrecy, integrity and authenticity. In the extreme case of a network breakdown that isolates the mobile device from the remote nodes, the runtime has to ensure that the execution continues and provides the application user with the expected result even if the connection is never re-established. 
The mobile device has to respond using only the resources embedded in it, which could require re-executing computations already run on the remote resources. Remote workers have to continue with the execution so that, in case of reconnection, both parts synchronize their progress to reduce the impact of the disruption.
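
The cost-based choice among the CPU, GPU and Cloud Platforms described in this abstract can be illustrated with a minimal sketch. It is not the COMPSs-Mobile implementation: the `CostEstimate` record, the `pick_platform` function, the weights and all numeric values are assumptions made up for illustration, and a weighted sum is only one possible way to combine time, energy and money.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """Hypothetical forecast of running one task on one platform."""
    platform: str
    time_s: float      # expected execution time, in seconds
    energy_j: float    # expected energy drawn from the battery, in joules
    money_eur: float   # expected monetary cost, in euros


def pick_platform(estimates, w_time=1.0, w_energy=1.0, w_money=1.0):
    """Return the estimate with the lowest weighted cost (assumed policy)."""
    if not estimates:
        raise ValueError("no platform can run this task")

    def score(e):
        return w_time * e.time_s + w_energy * e.energy_j + w_money * e.money_eur

    return min(estimates, key=score)


# Illustrative numbers only; they are not measurements from the thesis.
estimates = [
    CostEstimate("CPU", time_s=4.0, energy_j=6.0, money_eur=0.0),
    CostEstimate("GPU", time_s=1.5, energy_j=2.5, money_eur=0.0),
    CostEstimate("Cloud", time_s=1.0, energy_j=1.0, money_eur=3.0),
]

print(pick_platform(estimates).platform)               # equal weights
print(pick_platform(estimates, w_money=0.0).platform)  # money ignored
```

With these made-up numbers the embedded GPU wins under equal weights, while ignoring the monetary cost shifts the choice to the Cloud, which shows how the weighting of the three costs drives the decision.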

    Technologies and Applications for Big Data Value

    This open access book explores cutting-edge solutions and best practices for big data and data-driven AI applications for the data-driven economy. It provides the reader with a basis for understanding how technical issues can be overcome to offer real-world solutions to major industrial areas. The book starts with an introductory chapter that provides an overview of the book by positioning the following chapters in terms of their contributions to technology frameworks which are key elements of the Big Data Value Public-Private Partnership and the upcoming Partnership on AI, Data and Robotics. The remainder of the book is then arranged in two parts. The first part, "Technologies and Methods", contains horizontal contributions of technologies and methods that enable data value chains to be applied in any sector. The second part, "Processes and Applications", details experience reports and lessons from using big data and data-driven approaches in processes and applications. Its chapters are co-authored with industry experts and cover domains including health, law, finance, retail, manufacturing, mobility, and smart cities. Contributions emanate from the Big Data Value Public-Private Partnership and the Big Data Value Association, which have acted as the European data community's nucleus to bring together businesses with leading researchers to harness the value of data to benefit society, business, science, and industry. The book is of interest to two primary audiences: first, undergraduate and postgraduate students and researchers in various fields, including big data, data science, data engineering, and machine learning and AI; second, practitioners and industry experts engaged in data-driven systems, software design and deployment projects who are interested in employing these advanced methods to address real-world problems.

    TANGO: Transparent heterogeneous hardware Architecture deployment for eNergy Gain in Operation

    The paper is concerned with the issue of how software systems actually use Heterogeneous Parallel Architectures (HPAs), with the goal of optimizing power consumption on these resources. It argues the need for novel methods and tools to support software developers aiming to optimise power consumption resulting from designing, developing, deploying and running software on HPAs, while maintaining other quality aspects of software to adequate and agreed levels. To do so, a reference architecture to support energy efficiency at application construction, deployment, and operation is discussed, as well as its implementation and evaluation plans.

    Enabling Analytic and HPC Workflows with COMPSs

    In the recent joint venture between the High-Performance Computing (HPC) and Big-Data (BD) Ecosystems towards Exascale Computing, the scientific community has realized that powerful programming models and high-level abstraction tools are a must. Within this context, the Barcelona Supercomputing Center (BSC) is developing the COMP Superscalar (COMPSs) programming model, whose main objective is to let programmers develop applications in a sequential way, while the Runtime System handles the inherent parallelism of the application and abstracts the programmer from the different underlying infrastructures. The parallelism is achieved by defining an application Interface that allows COMPSs to detect methods that operate on a set of parameters (called tasks) and to execute them transparently in a distributed manner. This Master Thesis aims to enhance COMPSs, adapting it to the needs of the Big-Data Ecosystems, by supporting Analytic and HPC workflows. To this end, we propose a straightforward integration with the execution of binaries, and of MPI and OmpSs applications. Although the COMPSs programming model is kept untouched, we extend the COMPSs Annotations and some of the COMPSs internals, such as the task schedulers and the worker executors. To support our contribution, we have ported two real use cases to COMPSs: on the one hand, NMMB BSC-Dust, a workflow to predict the atmospheric life cycle of desert dust and, on the other hand, Guidance, an integrated solution for Genome and Phenome association analysis.
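
The idea of deriving parallelism from a sequential program whose methods are declared as tasks can be sketched as follows. This is a toy illustration and not COMPSs code: the `build_dependencies` function, the tuple layout and the task names are hypothetical, and the sketch tracks only read-after-write dependencies between task invocations.

```python
def build_dependencies(invocations):
    """Derive read-after-write dependencies from task invocations.

    `invocations` is a list of (task_name, reads, writes) tuples in
    program order. The result maps each invocation index to the sorted
    list of earlier invocation indices whose output it reads.
    """
    last_writer = {}  # data name -> index of the invocation that last wrote it
    deps = {}
    for idx, (_name, reads, writes) in enumerate(invocations):
        deps[idx] = sorted({last_writer[d] for d in reads if d in last_writer})
        for d in writes:
            last_writer[d] = idx
    return deps


# A sequential program expressed as task invocations (hypothetical names).
program = [
    ("load", [], ["a"]),
    ("load", [], ["b"]),
    ("multiply", ["a", "b"], ["c"]),
    ("store", ["c"], []),
]

deps = build_dependencies(program)
for idx, (name, _reads, _writes) in enumerate(program):
    print(idx, name, "depends on", deps[idx])
```

Invocations with an empty dependency list, here the two `load` calls, could run concurrently on different resources, while `multiply` has to wait for both of them.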

    Novel high performance techniques for high definition computer aided tomography

    International Mention in the doctoral degree. Medical image processing is an interdisciplinary field in which multiple research areas are involved: image acquisition, scanner design, image reconstruction algorithms, visualization, etc. X-Ray Computed Tomography (CT) is a medical imaging modality based on the attenuation suffered by the X-rays as they pass through the body. Intrinsic differences in the attenuation properties of bone, air, and soft tissue result in high-contrast images of anatomical structures. The main objective of CT is to obtain tomographic images from radiographs acquired using X-Ray scanners. The process of building a 3D image or volume from the 2D radiographs is known as reconstruction. One of the latest trends in CT is the reduction of the radiation dose delivered to patients by decreasing the amount of acquired data. This reduction results in artefacts in the final images if conventional reconstruction methods are used, making it advisable to employ iterative reconstruction algorithms. There are numerous reconstruction algorithms available, among which we can highlight two specific types: traditional algorithms, which are fast but do not yield high-quality images in situations of limited data; and iterative algorithms, which are slower but more reliable when traditional methods do not reach the required quality standard. One of the priorities of reconstruction is obtaining the final images in near real time, in order to reduce the time spent in diagnosis. To accomplish this objective, new high performance techniques and methods for accelerating these types of algorithms are needed. This thesis addresses the challenges of both traditional and iterative reconstruction algorithms regarding acceleration and image quality. One common approach for accelerating these algorithms is the use of shared-memory and heterogeneous architectures. 
In this thesis, we propose a novel simulation/reconstruction framework, namely FUX-Sim. This framework follows the hypothesis that the development of new flexible X-ray systems can benefit from computer simulations, which may also enable performance to be checked before expensive real systems are implemented. Its modular design abstracts the complexities of programming for accelerated devices to facilitate the development and evaluation of the different configurations and geometries available. In order to obtain near-real-time execution, low-level optimizations for the main components of the framework are provided for Graphics Processing Unit (GPU) architectures. Another alternative addressed in this thesis is the acceleration of iterative reconstruction algorithms by using distributed memory architectures. We present a novel architecture that unifies the two most important computing paradigms for scientific computing nowadays: High Performance Computing (HPC) and Big Data. The proposed architecture combines Big Data frameworks with the advantages of accelerated computing. The methods presented in this thesis enable more flexible scanner configurations and offer an accelerated solution. Regarding performance, our approach is as competitive as the solutions found in the literature. Additionally, we demonstrate that our solution scales with the size of the problem, enabling the reconstruction of high resolution images. This work has been mainly funded thanks to an FPU fellowship (FPU14/03875) from the Spanish Ministry of Education. It has also been partially supported by other grants:
• DPI2016-79075-R, "Nuevos escenarios de tomografía por rayos X", from the Spanish Ministry of Economy and Competitiveness.
• TIN2016-79637-P, "Towards unification of HPC and Big Data Paradigms", from the Spanish Ministry of Economy and Competitiveness.
• Short-term scientific missions (STSM) grant from NESUS COST Action IC1305.
• TIN2013-41350-P, "Scalable Data Management Techniques for High-End Computing Systems", from the Spanish Ministry of Economy and Competitiveness.
• RTC-2014-3028-1, "NECRA Nuevos escenarios clinicos con radiología avanzada", from the Spanish Ministry of Economy and Competitiveness.

    Serverless Strategies and Tools in the Cloud Computing Continuum

    Thesis by compendium of publications. In recent years, the popularity of Cloud computing has allowed users to access unprecedented compute, network, and storage resources under a pay-per-use model. This popularity led to new services to solve specific large-scale computing challenges and simplify the development and deployment of applications. Among the most prominent services in recent years are FaaS (Function as a Service) platforms, whose primary appeal is the ease of deploying small pieces of code in certain programming languages to perform specific tasks on an event-driven basis. These functions are executed on the Cloud provider's servers without users worrying about their maintenance or elasticity management, always keeping a fine-grained pay-per-use model. FaaS platforms belong to the computing paradigm known as Serverless, which aims to abstract the management of servers from the users, allowing them to focus their efforts solely on the development of applications. 
The problem with FaaS is that it focuses on microservices and tends to have limitations regarding the execution time and the computing capabilities (e.g. lack of support for acceleration hardware such as GPUs). However, it has been demonstrated that the self-provisioning capability and high degree of parallelism of these services can be well suited to broader applications. In addition, their inherent event-driven triggering makes functions perfectly suitable to be defined as steps in file processing workflows (e.g. scientific computing workflows). Furthermore, the rise of smart and embedded devices (IoT), innovations in communication networks and the need to reduce latency in challenging use cases have led to the concept of Edge computing. Edge computing consists of conducting the processing on devices close to the data sources to improve response times. The coupling of this paradigm together with Cloud computing, involving architectures with devices at different levels depending on their proximity to the source and their compute capability, has been coined as Cloud Computing Continuum (or Computing Continuum). Therefore, this PhD thesis aims to apply different Serverless strategies to enable the deployment of generalist applications, packaged in software containers, across the different tiers of the Cloud Computing Continuum. To this end, multiple tools have been developed in order to: i) adapt FaaS services from public Cloud providers; ii) integrate different software components to define a Serverless platform on on-premises and Edge infrastructures; iii) leverage acceleration devices on Serverless platforms; and iv) facilitate the deployment of applications and workflows through user interfaces. Additionally, several use cases have been created and adapted to assess the developments achieved.

    Energy-Aware Self-Adaptation for Application Execution on Heterogeneous Parallel Architectures

    Hardware in High Performance Computing environments has become increasingly heterogeneous in recent years in order to improve computational performance. An additional aspect of such systems is the management of power and energy consumption. The increase in heterogeneity requires middleware and programming model abstractions to eliminate the additional complexities that it brings, while also offering opportunities such as improved power management. In this paper, we explore application-level self-adaptation, including aspects such as the automated configuration, deployment and redeployment of applications on different heterogeneous infrastructures. This not only mitigates the complexities associated with heterogeneous devices but also aims to take advantage of the heterogeneity. The overall result of this paper is a self-adaptive framework that manages application Quality of Service (QoS) at runtime, which includes the automatic migration of applications between different accelerated infrastructures. The discussion covers when this migration is appropriate and quantifies the likely benefits.