526 research outputs found

    Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing

    The availability of many-core computing platforms enables a wide variety of technical solutions for systems across the embedded, high-performance and cloud computing domains. However, large-scale manycore systems are notoriously hard to optimise. Choices regarding resource allocation alone can account for wide variability in timeliness and energy dissipation (up to several orders of magnitude). Dynamic Resource Allocation in Embedded, High-Performance and Cloud Computing covers dynamic resource allocation heuristics for manycore systems, aiming to provide appropriate guarantees on performance and energy efficiency. It addresses different types of systems, aiming to harmonise the approaches to dynamic allocation across the complete spectrum, from systems with little flexibility and strict real-time guarantees all the way to highly dynamic systems with soft performance requirements. Technical topics presented in the book include: Load and Resource Models; Admission Control; Feedback-based Allocation and Optimisation; Search-based Allocation Heuristics; Distributed Allocation based on Swarm Intelligence; Value-Based Allocation. Each of the topics is illustrated with examples based on realistic computational platforms such as Network-on-Chip manycore processors, grids and private cloud environments. Note: EUR 6,000 BPC fee funded by the EC FP7 Post-Grant Open Access Pilo

    Self-adaptivity of applications on network on chip multiprocessors: the case of fault-tolerant Kahn process networks

    Technology scaling accompanied by higher operating frequencies and the ability to integrate more functionality in the same chip has been the driving force behind delivering higher performance computing systems at lower costs. Embedded computing systems, which have been riding the same wave of success, have evolved into complex architectures encompassing a high number of cores interconnected by an on-chip network (usually identified as Multiprocessor System-on-Chip). However, these trends are hindered by issues that arise as technology scaling continues towards deep submicron scales. Firstly, the growing complexity of these systems and the variability introduced by process technologies make it ever harder to perform a thorough optimization of the system at design time. Secondly, designers are faced with a reliability wall that emerges as age-related degradation reduces the lifetime of transistors, and as the probability of defects escaping post-manufacturing testing increases. In this thesis, we take on these challenges within the context of streaming applications running in network-on-chip based parallel (not necessarily homogeneous) systems-on-chip that adopt the no-remote memory access model. In particular, this thesis tackles two main problems: (1) fault-aware online task remapping, (2) application-level self-adaptation for quality management. For the former, by viewing fault tolerance as a self-adaptation aspect, we adopt a cross-layer approach that aims at graceful performance degradation by addressing permanent faults in processing elements mostly at system-level, in particular by exploiting redundancy available in multi-core platforms. We propose an optimal solution based on an integer linear programming formulation (suitable for design time adoption) as well as heuristic-based solutions to be used at run-time. We assess the impact of our approach on the lifetime reliability. 
We propose two recovery schemes based on a checkpoint-and-rollback and a rollforward technique. For the latter, we propose two variants of a monitor-controller-adapter loop that adapts application-level parameters to meet performance goals. We demonstrate not only that fault tolerance and self-adaptivity can be achieved in embedded platforms, but also that it can be done without incurring large overheads. In addressing these problems, we present techniques which have been realized (depending on their characteristics) in the form of a design tool, a run-time library or a hardware core to be added to the basic architecture

    Geo-distributed Multi-tier Workload Migration Over Multi-timescale Electricity Markets

    Virtual machine (VM) migration enables cloud service providers (CSPs) to balance workload, perform zero-downtime maintenance, and reduce applications' power consumption and response time. Migrating a VM consumes energy at the source, destination, and backbone networks, i.e., intermediate routers and switches, especially in a Geo-distributed setting. In this context, we propose a VM migration model called Low Energy Application Workload Migration (LEAWM) aimed at reducing the per-bit migration cost in migrating VMs over Geo-distributed clouds. With a Geo-distributed cloud connected through multiple Internet Service Providers (ISPs), we develop an approach to find the migration path across ISPs leading to the most feasible destination. For this, we use the variation in the electricity price at the ISPs to decide the migration paths. However, reduced power consumption at the expense of higher migration time is intolerable for real-time applications. As finding an optimal relocation is NP-hard, we propose an Ant Colony Optimization (ACO) based bi-objective optimization technique to strike a balance between migration delay and migration power. A thorough simulation analysis of the proposed approach shows that the proposed model can reduce the migration time by 25%–30% and electricity cost by approximately 25% compared to the baseline
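
The trade-off between migration delay and migration power described in this abstract can be illustrated with a generic non-dominated (Pareto) filter over candidate migration paths. This is only a sketch of the bi-objective idea, not the ACO-based LEAWM method itself; the `Candidate` type, the path labels and all numbers are hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    path: str           # hypothetical label for a migration path across ISPs
    delay_s: float      # assumed estimate of migration delay, in seconds
    energy_cost: float  # assumed estimate of electricity cost, arbitrary units


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and better on one."""
    no_worse = a.delay_s <= b.delay_s and a.energy_cost <= b.energy_cost
    better = a.delay_s < b.delay_s or a.energy_cost < b.energy_cost
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """Keep only candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up candidate paths, for illustration only.
candidates = [
    Candidate("via-ISP-A", 12.0, 5.0),
    Candidate("via-ISP-B", 9.0, 7.5),
    Candidate("via-ISP-C", 14.0, 6.0),  # slower and costlier than via-ISP-A
    Candidate("via-ISP-D", 9.0, 8.0),   # same delay as via-ISP-B, costlier
]
front = pareto_front(candidates)
print([c.path for c in front])
```

A search heuristic such as ACO would explore a far larger path space; the filter only shows which outcomes count as acceptable trade-offs once two objectives are in play.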

    Practical GPGPU Application Resilience Estimation and Fortification

    Graphics Processing Units (GPUs) are becoming a de facto solution for accelerating a wide range of applications but remain susceptible to transient hardware faults (soft errors) that can easily compromise application output. One of the major challenges in the domain of GPU reliability is to accurately measure general purpose GPU (GPGPU) application resilience to transient faults. This challenge stems from the fact that a typical GPGPU application spawns a huge number of threads and then utilizes a large amount of potentially unreliable compute and memory resources available on the GPUs. As the number of possible fault locations can be in the billions, evaluating every fault and examining its effect on the application error resilience is impractical. Alternatively, fault site selection techniques have been proposed to approach high accuracy with fewer fault injection experiments. However, most of the existing methods in the literature focus only on the single-bit fault model and on a single input. In this dissertation, we offer solutions to the two problems above. We extend a progressive fault site pruning technique for two multi-bit fault models: (a) multi-bit faults in the same word; (b) multiple single-bit faults in different words accessed by the same thread. We devise a methodology, SUGAR (Speeding Up GPGPU Application Resilience Estimation with input sizing), that dramatically speeds up the evaluation of application error resilience. Key to the SUGAR estimation methodology is the identification of repeating thread patterns that develop as a function of the size of the input. These patterns allow for accurate prediction of application error resilience for arbitrarily large inputs. With the presence of input-aware estimation strategies, we are able to pinpoint the vulnerabilities in a GPGPU application and propose low overhead protection techniques accordingly. 
Based on the variety of thread resilience in GPGPU applications, we propose a methodology that identifies the resilience of threads and aims to map threads with the same resilience characteristics to the same warp. Our technique allows engaging partial protection mechanisms at the warp level. We illustrate that threads can be remapped into reliable or unreliable warps with only minimal introduced overhead, and then selective protection via replication is applied in unreliable warps. We show how this remapping facilitates warp replication for error detection and correction and achieves a significant reduction of execution cycles compared to standard techniques. In addition to input-aware estimation and fortification, we present a detailed characterization comparing microarchitecture-level and software-level fault injection and show the gap in resilience estimation introduced by injecting faults into different layers in the system execution stack. We also implement a software-level redundancy protection mechanism and measure its effectiveness using microarchitecture-level and software-level fault injection
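
The warp-level remapping idea in this abstract (group threads with the same resilience characteristics so that only some warps need replication) can be sketched as follows. This is a host-side illustration under stated assumptions, not the dissertation's implementation: the per-thread resilience labels are taken as given, the warp size of 32 is assumed, and `remap_threads` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

WARP_SIZE = 32  # assumed warp width


def remap_threads(
    resilient: Dict[int, bool], warp_size: int = WARP_SIZE
) -> List[Tuple[bool, List[int]]]:
    """Group thread ids into warps so each warp holds one resilience class.

    `resilient` maps thread id -> True if the thread is assumed resilient.
    Returns (is_reliable_warp, thread_ids) pairs, reliable warps first.
    """
    groups: Dict[bool, List[int]] = defaultdict(list)
    for tid in sorted(resilient):
        groups[resilient[tid]].append(tid)

    warps: List[Tuple[bool, List[int]]] = []
    for label in (True, False):
        tids = groups[label]
        for start in range(0, len(tids), warp_size):
            warps.append((label, tids[start:start + warp_size]))
    return warps


# Hypothetical labelling: even thread ids are resilient, odd ones are not.
labels = {tid: tid % 2 == 0 for tid in range(64)}
warps = remap_threads(labels)

# Only warps made of non-resilient threads would be candidates for replication.
to_replicate = [tids for reliable, tids in warps if not reliable]
print(len(warps), len(to_replicate))
```

With the original interleaved ordering, both 32-thread warps would contain non-resilient threads; after grouping, only one of the two warps is a candidate for replication.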

    Dynamic Resource Management of Network-on-Chip Platforms for Multi-stream Video Processing

    This thesis considers resource management in the context of parallel multiple video stream decoding on multicore/many-core platforms. Such platforms have tens or hundreds of on-chip processing elements which are connected via a Network-on-Chip (NoC). Inefficient task allocation configurations can negatively affect the communication cost and resource contention in the platform, leading to predictability and performance issues. Efficient resource management for large-scale complex workloads is considered a challenging research problem, especially when applications such as video streaming and decoding have dynamic and unpredictable workload characteristics. For these types of applications, runtime heuristic-based task mapping techniques are required. As the application and platform size increase, decentralised resource management techniques are more desirable to overcome the reliability and performance bottlenecks in centralised management. In this work, several heuristic-based runtime resource management techniques targeting real-time video decoding workloads are proposed. Firstly, two admission control approaches are proposed: one is fully deterministic and highly predictable; the other is heuristic-based and balances predictability and performance. Secondly, a pair of runtime task mapping schemes is presented, which make use of limited known application properties, communication cost and blocking-aware heuristics. Combined with the proposed deterministic admission controller, these techniques can provide strict timing guarantees for hard real-time streams whilst improving resource usage. The third contribution in this thesis is a distributed, bio-inspired, low-overhead task re-allocation technique, which is used to further improve the timeliness and workload distribution of admitted soft real-time streams. 
Finally, this thesis explores parallelisation and resource management issues, surrounding soft real-time video streams that have been encoded using complex encoding tools and modern codecs such as High Efficiency Video Coding (HEVC). Properties of real streams and decoding trace data are analysed, to statistically model and generate synthetic HEVC video decoding workloads. These workloads are shown to have complex and varying task dependency structures and resource requirements. To address these challenges, two novel runtime task clustering and mapping techniques for Tile-parallel HEVC decoding are proposed. These strategies consider the workload communication to computation ratio and stream-specific characteristics to balance predictability improvement and communication energy reduction. Lastly, several task to memory controller port assignment schemes are explored to alleviate performance bottlenecks, resulting from memory traffic contention

    Thin Hypervisor-Based Security Architectures for Embedded Platforms

    Virtualization has grown increasingly popular, thanks to its benefits of isolation, management, and utilization, supported by hardware advances. It is also receiving attention for its potential to support security, through hypervisor-based services and advanced protections supplied to guests. Today, virtualization is even making inroads in the embedded space, and embedded systems, with their security needs, have already started to benefit from virtualization’s security potential. In this thesis, we investigate the possibilities for thin hypervisor-based security on embedded platforms. In addition to significant background study, we present an implementation of a low-footprint, thin hypervisor capable of providing security protections to a single FreeRTOS guest kernel on ARM. Backed by performance test results, our hypervisor provides security to a formerly unsecured kernel with minimal performance overhead, and represents a first step in a greater research effort into the security advantages and possibilities of embedded thin hypervisors. Our results show that thin hypervisors are both possible and beneficial even on limited embedded systems, and set the stage for more advanced investigations, implementations, and security applications in the future


    The Research Institute for Advanced Computer Science (RIACS) was established by the Universities Space Research Association (USRA) at the NASA Ames Research Center (ARC) on June 6, 1983. RIACS is privately operated by USRA, a consortium of universities that serves as a bridge between NASA and the academic community. Under a five-year co-operative agreement with NASA, research at RIACS is focused on areas that are strategically enabling to the Ames Research Center's role as NASA's Center of Excellence for Information Technology. The primary mission of RIACS is charted to carry out research and development in computer science. This work is devoted in the main to tasks that are strategically enabling with respect to NASA's bold mission in space exploration and aeronautics. There are three foci for this work: (1) Automated Reasoning, (2) Human-Centered Computing, and (3) High Performance Computing and Networking. RIACS has the additional goal of broadening the base of researchers in these areas of importance to the nation's space and aeronautics enterprises. Through its visiting scientist program, RIACS facilitates the participation of university-based researchers, including both faculty and students, in the research activities of NASA and RIACS. RIACS researchers work in close collaboration with NASA computer scientists on projects such as the Remote Agent Experiment on the Deep Space One mission, and Super-Resolution Surface Modeling

    CloudMon: a resource-efficient IaaS cloud monitoring system based on networked intrusion detection system virtual appliances

    The networked intrusion detection system virtual appliance (NIDS-VA), also known as virtualized NIDS, plays an important role in the protection and safeguarding of IaaS cloud environments. However, it is nontrivial to guarantee both the performance of NIDS-VAs and the resource efficiency of cloud applications, because both share computing resources in the same cloud environment. To overcome this challenge and trade-off, we propose a novel system, named CloudMon, which enables dynamic resource provision and live placement for NIDS-VAs in IaaS cloud environments. CloudMon provides two techniques to maintain high resource efficiency of IaaS cloud environments without degrading the performance of NIDS-VAs and other virtual machines (VMs). The first technique is a virtual machine monitor based resource provision mechanism, which can minimize the resource usage of a NIDS-VA with a given performance guarantee. It uses a fuzzy model to characterize the complex relationship between performance and resource demands of a NIDS-VA and develops an online fuzzy controller to adaptively control the resource allocation for NIDS-VAs under varying network traffic. The second one is a global resource scheduling approach for optimizing the resource efficiency of the entire cloud environment. It leverages VM migration to dynamically place NIDS-VAs and VMs. An online VM mapping algorithm is designed to maximize the resource utilization of the entire cloud environment. Our virtual machine monitor based resource provision mechanism has been evaluated by conducting comprehensive experiments based on the Xen hypervisor and Snort NIDS in a real cloud environment. The results show that the proposed mechanism can allocate resources for a NIDS-VA on demand while still satisfying its performance requirements. 
We also verify the effectiveness of our global resource scheduling approach by comparing it with two classic vector packing algorithms, and the results show that our approach improved the resource utilization of cloud environments and reduced the number of in-use NIDS-VAs and physical hosts. The authors gratefully acknowledge the anonymous reviewers for their helpful suggestions and insightful comments to improve the quality of the paper. The work reported in this paper has been partially supported by National Nature Science Foundation of China (No. 61202424, 61272165, 91118008), China 863 program (No. 2011AA01A202), Natural Science Foundation of Jiangsu Province of China (BK20130528) and China 973 Fundamental R&D Program (2011CB302600)
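
The CloudMon abstract compares its scheduler against "classic vector packing algorithms" without naming them. As an illustration of that family only, here is a minimal first-fit-decreasing sketch for placing VMs with two-dimensional (CPU, memory) demands on identical hosts. The choice of first-fit decreasing, the normalised-sum ordering, and all names and numbers are assumptions, not details from the paper.

```python
from typing import Dict, List, Tuple

Vector = Tuple[int, ...]


def first_fit_decreasing(vms: Dict[str, Vector], capacity: Vector) -> List[List[str]]:
    """Place each VM on the first host with room, largest VMs first.

    `vms` maps a VM name to its demand vector; `capacity` is the per-host
    capacity vector. VM size is the sum of its normalised demands (one
    common ordering heuristic, assumed here).
    """
    def size(name: str) -> float:
        return sum(d / c for d, c in zip(vms[name], capacity))

    hosts: List[dict] = []  # each host tracks remaining capacity and its VMs
    for name in sorted(vms, key=size, reverse=True):
        demand = vms[name]
        if any(d > c for d, c in zip(demand, capacity)):
            raise ValueError(f"{name} does not fit on an empty host")
        for host in hosts:
            if all(d <= f for d, f in zip(demand, host["free"])):
                host["free"] = [f - d for f, d in zip(host["free"], demand)]
                host["vms"].append(name)
                break
        else:
            # No existing host has room in every dimension: open a new one.
            hosts.append({
                "free": [c - d for c, d in zip(capacity, demand)],
                "vms": [name],
            })
    return [host["vms"] for host in hosts]


# Hypothetical demands as (CPU cores, memory in GiB); each host offers (8, 16).
demands = {"a": (4, 8), "b": (4, 4), "c": (2, 8), "d": (2, 4)}
placement = first_fit_decreasing(demands, capacity=(8, 16))
print(placement)
```

A static heuristic like this packs a fixed set of demands once; the abstract's approach differs in that it re-places NIDS-VAs and VMs online via migration as demands change.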