180 research outputs found

    LEGaTO: first steps towards energy-efficient toolset for heterogeneous computing

    Get PDF
    LEGaTO is a three-year EU H2020 project which started in December 2017. The LEGaTO project will leverage task-based programming models to provide a software ecosystem for Made-in-Europe heterogeneous hardware composed of CPUs, GPUs, FPGAs and dataflow engines. The aim is to attain one order of magnitude energy savings from the edge to the converged cloud/HPC.Peer ReviewedPostprint (author's final draft

    Building Computing-As-A-Service Mobile Cloud System

    Get PDF
    The last five years have witnessed the proliferation of smart mobile devices, the explosion of various mobile applications and the rapid adoption of cloud computing in business, governmental and educational IT deployment. There is also a growing trends of combining mobile computing and cloud computing as a new popular computing paradigm nowadays. This thesis envisions the future of mobile computing which is primarily affected by following three trends: First, servers in cloud equipped with high speed multi-core technology have been the main stream today. Meanwhile, ARM processor powered servers is growingly became popular recently and the virtualization on ARM systems is also gaining wide ranges of attentions recently. Second, high-speed internet has been pervasive and highly available. Mobile devices are able to connect to cloud anytime and anywhere. Third, cloud computing is reshaping the way of using computing resources. The classic pay/scale-as-you-go model allows hardware resources to be optimally allocated and well-managed. These three trends lend credence to a new mobile computing model with the combination of resource-rich cloud and less powerful mobile devices. In this model, mobile devices run the core virtualization hypervisor with virtualized phone instances, allowing for pervasive access to more powerful, highly-available virtual phone clones in the cloud. The centralized cloud, powered by rich computing and memory recourses, hosts virtual phone clones and repeatedly synchronize the data changes with virtual phone instances running on mobile devices. Users can flexibly isolate different computing environments. In this dissertation, we explored the opportunity of leveraging cloud resources for mobile computing for the purpose of energy saving, performance augmentation as well as secure computing enviroment isolation. We proposed a framework that allows mo- bile users to seamlessly leverage cloud to augment the computing capability of mobile devices and also makes it simpler for application developers to run their smartphone applications in the cloud without tedious application partitioning. This framework was built with virtualization on both server side and mobile devices. It has three building blocks including agile virtual machine deployment, efficient virtual resource management, and seamless mobile augmentation. We presented the design, imple- mentation and evaluation of these three components and demonstrated the feasibility of the proposed mobile cloud model

    Supporting soft real-time tasks in the xen hypervisor

    Full text link
    Virtualization technology enables server consolidation and has given an impetus to low-cost green data centers. However, current hypervisors do not provide adequate support for real-time applications, and this has limited the adoption of virtualization in some domains. Soft real-time applications, such as media-based ones, are impeded by components of virtualization including low-performance virtualization I/O, increased scheduling latency, and shared-cache contention. The virtual machine scheduler is central to all these issues. The goal in this paper is to adapt the virtual machine scheduler to be more soft-real-time friendly. We improve two aspects of the VMM scheduler – managing scheduling latency as a first-class resource and managing shared caches. We use enterprise IP telephony as an illustrative soft real-time workload and design a scheduler S that incorporates th

    The AXIOM platform for next-generation cyber physical systems

    Get PDF
    Cyber-Physical Systems (CPSs) are widely used in many applications that require interactions between humans and their physical environment. These systems usually integrate a set of hardware-software components for optimal application execution in terms of performance and energy consumption. The AXIOM project (Agile, eXtensible, fast I/O Module), presented in this paper, proposes a hardware-software platform for CPS coupled with an easy parallel programming model and sufficient connectivity so that the performance can scale-up by adding multiple boards. AXIOM supports a task-based programming model based on OmpSs and leverages a high-speed, inexpensive communication interface called AXIOM-Link. The board also tightly couples the CPU with reconfigurable resources to accelerate portions of the applications. As case studies, AXIOM uses smart video surveillance, and smart home living applicationsThis work is partially supported by the European Union H2020 program through the AXIOM project (grant ICT-01-2014 GA 645496) and HiPEAC (GA 687698), by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology through TIN2015-65316-P project, and by the Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272). We also thank the Xilinx University Program for its hardware and software donations.Peer ReviewedPostprint (author's final draft

    Virtual distributed environments for systems with time requirements

    Get PDF
    Virtualization is widely propagating technology that is used to run multiple virtual machines on the same computational unit by means of a piece of firmware, hardware or software called a hypervisor. Despite having been used since the 60as, the current indisputable need for fast reliable communication may put this technology to question. This project analyzes the amount of impact the virtualization has on the transmission times. In the first part, the Xen hypervisor, configured with different virtual environments, simulating complex scenarios, will be evaluated to determine the size of the impact. As a bridge between the multiple virtual machines, middleware Ice, will be used. Furthermore lower in the scale, for embedded systems, the XtratuM hypervisor was designed to support real-time systems. The second part is dedicated to evaluating whether the communication maintains the real time property of these systems. Bare boned virtualization will be implemented in this second part of the project.Ingeniería en Tecnologías de Telecomunicació

    Multicore Scheduling of Real-Time Irregular Parallel Algorithms in Linux

    Get PDF
    Face à estagnação da tecnologia uniprocessador registada na passada década, aos principais fabricantes de microprocessadores encontraram na tecnologia multi-core a resposta `as crescentes necessidades de processamento do mercado. Durante anos, os desenvolvedores de software viram as suas aplicações acompanhar os ganhos de performance conferidos por cada nova geração de processadores sequenciais, mas `a medida que a capacidade de processamento escala em função do número de processadores, a computação sequencial tem de ser decomposta em várias partes concorrentes que possam executar em paralelo, para que possam utilizar as unidades de processamento adicionais e completar mais rapidamente. A programação paralela implica um paradigma completamente distinto da programação sequencial. Ao contrário dos computadores sequenciais tipificados no modelo de Von Neumann, a heterogeneidade de arquiteturas paralelas requer modelos de programação paralela que abstraiam os programadores dos detalhes da arquitectura e simplifiquem o desenvolvimento de aplicações concorrentes. Os modelos de programação paralela mais populares incitam os programadores a identificar instruções concorrentes na sua lógica de programação, e a especificá-las sob a forma de tarefas que possam ser atribuídas a processadores distintos para executarem em simultâneo. Estas tarefas são tipicamente lançadas durante a execução, e atribuídas aos processadores pelo motor de execução subjacente. Como os requisitos de processamento costumam ser variáveis, e não são conhecidos a priori, o mapeamento de tarefas para processadores tem de ser determinado dinamicamente, em resposta a alterações imprevisíveis dos requisitos de execução. `A medida que o volume da computação cresce, torna-se cada vez menos viável garantir as suas restrições temporais em plataformas uniprocessador. Enquanto os sistemas de tempo real se começam a adaptar ao paradigma de computação paralela, há uma crescente aposta em integrar execuções de tempo real com aplicações interativas no mesmo hardware, num mundo em que a tecnologia se torna cada vez mais pequena, leve, ubíqua, e portável. Esta integração requer soluções de escalonamento que simultaneamente garantam os requisitos temporais das tarefas de tempo real e mantenham um nível aceitável de QoS para as restantes execuções. Para tal, torna-se imperativo que as aplicações de tempo real paralelizem, de forma a minimizar os seus tempos de resposta e maximizar a utilização dos recursos de processamento. Isto introduz uma nova dimensão ao problema do escalonamento, que tem de responder de forma correcta a novos requisitos de execução imprevisíveis e rapidamente conjeturar o mapeamento de tarefas que melhor beneficie os critérios de performance do sistema. A técnica de escalonamento baseado em servidores permite reservar uma fração da capacidade de processamento para a execução de tarefas de tempo real, e assegurar que os efeitos de latência na sua execução não afectam as reservas estipuladas para outras execuções. No caso de tarefas escalonadas pelo tempo de execução máximo, ou tarefas com tempos de execução variáveis, torna-se provável que a largura de banda estipulada não seja consumida por completo. Para melhorar a utilização do sistema, os algoritmos de partilha de largura de banda (capacity-sharing) doam a capacidade não utilizada para a execução de outras tarefas, mantendo as garantias de isolamento entre servidores. Com eficiência comprovada em termos de espaço, tempo, e comunicação, o mecanismo de work-stealing tem vindo a ganhar popularidade como metodologia para o escalonamento de tarefas com paralelismo dinâmico e irregular. O algoritmo p-CSWS combina escalonamento baseado em servidores com capacity-sharing e work-stealing para cobrir as necessidades de escalonamento dos sistemas abertos de tempo real. Enquanto o escalonamento em servidores permite partilhar os recursos de processamento sem interferências a nível dos atrasos, uma nova política de work-stealing que opera sobre o mecanismo de capacity-sharing aplica uma exploração de paralelismo que melhora os tempos de resposta das aplicações e melhora a utilização do sistema. Esta tese propõe uma implementação do algoritmo p-CSWS para o Linux. Em concordância com a estrutura modular do escalonador do Linux, ´e definida uma nova classe de escalonamento que visa avaliar a aplicabilidade da heurística p-CSWS em circunstâncias reais. Ultrapassados os obstáculos intrínsecos `a programação da kernel do Linux, os extensos testes experimentais provam que o p-CSWS ´e mais do que um conceito teórico atrativo, e que a exploração heurística de paralelismo proposta pelo algoritmo beneficia os tempos de resposta das aplicações de tempo real, bem como a performance e eficiência da plataforma multiprocessador.With sequential machines approaching their physical bounds, parallel computers are rapidly becoming pervasive in most areas of modern technology. To realize the full potential of parallel platforms, applications must split onto concurrent parts that can be assigned to different processors and execute in parallel. Parallel programming models abstract the myriad of parallel computer specifications to simplify the development of concurrent applications, allowing programmers to decompose their code onto concurrent tasks, and leaving it to the runtime system to schedule these tasks for parallel execution. The resulting parallelism is often input-dependent and irregular, requiring that the mapping of tasks to processors be performed at runtime in response to dynamic changes of the workload. Motivated by the promises of performance scalability and cost effectiveness, real-time researchers are now beginning to exploit the benefits of parallel processing, with ground-breaking scheduling heuristics to improve the efficiency of time-sensitive concurrent applications. Realtime developments are switching to open scenarios, where real-time tasks of variable and unpredictable size share the available processing resources with other applications, making it essential to utilize as much of the available processing capacity as possible. The p-CSWS algorithm employs bandwidth isolation, capacity-sharing and work-stealing to exploit the intra-task parallelism of hard and soft real-time executions on parallel platforms. This thesis proposes an implementation of the p-CSWS scheduler for the Linux kernel, to evaluate its applicability to real scenarios and bring Linux one step closer to becoming a viable open real-time platform. To the best of our knowledge we are the first to employ scheduling heuristics to exploit dynamic parallelism of real-time tasks on the Linux kernel. Through extensive tests, we show that...

    Applying Hypervisor-Based Fault Tolerance Techniques to Safety-Critical Embedded Systems

    Get PDF
    This document details the work conducted through the development of this thesis, and it is structured as follows: • Chapter 1, Introduction, has briefly presented the motivation, objectives, and contributions of this thesis. • Chapter 2, Fundamentals, exposes a series of concepts that are necessary to correctly understand the information presented in the rest of the thesis, such as the concepts of virtualization, hypervisors, or software-based fault tolerance. In addition, this chapter includes an exhaustive review and comparison between the different hypervisors used in scientific studies dealing with safety-critical systems, and a brief review of some works that try to improve fault tolerance in the hypervisor itself, an area of research that is outside the scope of this work, but that complements the mechanism presented and could be established as a line of future work. • Chapter 3, Problem Statement and Related Work, explains the main reasons why the concept of Hypervisor-Based Fault Tolerance was born and reviews the main articles and research papers on the subject. This review includes both papers related to safety-critical embedded systems (such as the research carried out in this thesis) and papers related to cloud servers and cluster computing that, although not directly applicable to embedded systems, may raise useful concepts that make our solution more complete or allow us to establish future lines of work. • Chapter 4, Proposed Solution, begins with a brief comparison of the work presented in Chapter 3 to establish the requirements that our solution must meet in order to be as complete and innovative as possible. It then sets out the architecture of the proposed solution and explains in detail the two main elements of the solution: the Voter and the Health Monitoring partition. • Chapter 5, Prototype, explains in detail the prototyping of the proposed solution, including the choice of the hypervisor, the processing board, and the critical functionality to be redundant. With respect to the voter, it includes prototypes for both the software version (the voter is implemented in a virtual machine) and the hardware version (the voter is implemented as IP cores on the FPGA). • Chapter 6, Evaluation, includes the evaluation of the prototype developed in Chapter 5. As a preliminary step and given that there is no evidence in this regard, an exercise is carried out to measure the overhead involved in using the XtratuM hypervisor versus not using it. Subsequently, qualitative tests are carried out to check that Health Monitoring is working as expected and a fault injection campaign is carried out to check the error detection and correction rate of our solution. Finally, a comparison is made between the performance of the hardware and software versions of Voter. • Chapter 7, Conclusions and Future Work, is dedicated to collect the conclusions obtained and the contributions made during the research (in the form of articles in journals, conferences and contributions to projects and proposals in the industry). In addition, it establishes some lines of future work that could complete and extend the research carried out during this doctoral thesis.Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de MadridPresidente: Katzalin Olcoz Herrero.- Secretario: Félix García Carballeira.- Vocal: Santiago Rodríguez de la Fuent

    Characterizing and Mitigating Virtual Machine Interference in Public Clouds.

    Full text link
    This dissertation studies the mitigation of the performance and security interference between guest virtual machines (VMs) in public clouds. The goals are to characterize the impact of VM interference, uncover the root cause of the negative impact, and design novel techniques to mitigate such impact. The central premise of this dissertation is that by identifying the shared resources that cause the VM interference and by exploiting the properties of the workloads that share these resources with adapted scheduling policies, public cloud services can reduce conflicts of resource usage between guests and hence mitigate their interference. Current techniques for conflict reduction and interference mitigation overlook the virtualization semantic gap between the cloud host infrastructure and guest virtual ma- chines and the unique challenges posed by the multi-tenancy service model necessary to support public cloud services. This dissertation deals with both performance and security interference problems. It characterizes the impact of VM interference on inter-VM network latency using live measurements in a real public cloud and studies the root cause of the negative impact with controlled experiments on a local testbed. Two methods of improving the inter-VM net- work latency are explored. The first approach is a guest-centric solution that exploits the properties of application workloads to avoid interference without any support from the underlying host infrastructure. The second approach is a host-centric solution that adapts the scheduling policies for the contented resources that cause the interference without guest cooperation. Similarly, the characteristics of cache-based cross-VM attacks are studied in detail using both live cloud measurements and testbed experiments. To mitigate this security interference, a partition-based VM scheduling system is designed to reduce the effectiveness of these cache-based attacks.PhDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/107111/1/yunjing_1.pd
    corecore