
    Swap Fairness for Thrashing Mitigation

    The swap mechanism allows an operating system to work with more memory than the available RAM space, by temporarily flushing some data to disk. However, the system sometimes ends up spending more time swapping data in and out of disk than performing actual computation; this state is called thrashing. Classical strategies against thrashing rely on reducing system load, so as to decrease memory pressure and increase global throughput. Those approaches may however be counterproductive when tricked into advantaging malicious or long-standing processes. This is particularly true in the context of shared hosting or virtualization, where multiple users run uncoordinated and selfish workloads. To address this challenge, we propose an accounting layer that enforces swap fairness among processes competing for main memory. It ensures that a process cannot monopolize the swap subsystem by delaying the swap operations of abusive processes, reducing the number of system-wide page faults while maximizing memory utilization.
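
    A minimal sketch of the accounting idea described above, not the paper's actual mechanism: the class name, the equal-split fair share, and the delay_per_excess_mb parameter are all invented for illustration. Per-process swap traffic is tracked, and a process whose cumulative traffic exceeds its fair share has its swap operations delayed in proportion to the excess.

```python
# Hypothetical sketch of a swap-fairness accounting layer (names invented);
# the real system operates inside the kernel's swap path, not in user space.
import time
from collections import defaultdict

class SwapAccountant:
    """Tracks swapped bytes per process and derives a throttling delay."""

    def __init__(self, delay_per_excess_mb=0.001):
        self.bytes_swapped = defaultdict(int)   # pid -> cumulative swap I/O
        self.delay_per_excess_mb = delay_per_excess_mb

    def fair_share(self):
        # Equal split of total observed swap traffic across known processes.
        if not self.bytes_swapped:
            return 0
        return sum(self.bytes_swapped.values()) / len(self.bytes_swapped)

    def record_and_delay(self, pid, nbytes):
        """Account one swap operation and sleep if pid is over its share."""
        self.bytes_swapped[pid] += nbytes
        excess = self.bytes_swapped[pid] - self.fair_share()
        if excess > 0:
            # Delay grows with how far the process is beyond its fair share,
            # so an abusive process cannot monopolize the swap subsystem.
            time.sleep(excess / (1024 * 1024) * self.delay_per_excess_mb)

# Example: pid 1 swaps heavily and accumulates delay; pid 2 stays unthrottled.
acct = SwapAccountant()
acct.record_and_delay(2, 4096)
acct.record_and_delay(1, 64 * 1024 * 1024)
```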

    Using TCP/IP traffic shaping to achieve iSCSI service predictability

    This thesis reproduces the load-interference properties common in many storage devices that use resource sharing for flexibility and maximum hardware utilization. The nature of resource sharing and load is studied and compared to assumptions and models used in previous work. The results are used to design a method for throttling iSCSI initiators, attached to an iSCSI target server, using a packet delay module in Linux Traffic Control. The packet delay throttle enables close-to-linear rate reduction for both read and write operations. Iptables and Ipset are used to add the dynamic packet matching needed for rapidly changing throttling values. All throttling is achieved without triggering TCP retransmit timeout and the subsequent slow start caused by packet loss. A control mechanism for dynamically adapting throttling values to rapidly changing workloads is implemented using a modified proportional integral derivative (PID) controller. Using experiments, control-engineering filtering techniques, and results from previous research, a suitable per-resource saturation indicator was found. The indicator is an exponential moving average of the wait time of active resource consumers. It is used as the input value to the PID controller managing the packet rates of resource consumers, creating a closed control loop managed by the PID controller. Finally, a prototype of an autonomic resource prioritization framework is designed. The framework identifies and maintains information about resources, their consumers, their average wait time for active consumers, and their set of throttleable consumers. The information is kept in shared memory and a PID controller is spawned for each resource, thus safeguarding read response times by throttling writers on a per-resource basis. The framework is exposed to extreme workload changes and demonstrates a high ability to keep read response times below a predefined threshold. With moderate tuning effort, the framework exhibits low overhead and resource consumption, promising suitability for large-scale operation in production environments.
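
    A self-contained sketch of the control loop described above, under stated assumptions: the gains, setpoint, and sample values are invented, and the actual packet delay would be enforced by Linux Traffic Control rather than by this code. An exponential moving average of consumer wait time is the saturation indicator fed to a textbook discrete PID controller, whose output adjusts the packet-delay throttle.

```python
# Illustrative closed loop: EMA saturation indicator -> PID -> packet delay.
class EMA:
    """Exponential moving average of observed consumer wait times."""
    def __init__(self, alpha=0.2):
        self.alpha, self.value = alpha, None
    def update(self, sample):
        self.value = sample if self.value is None else (
            self.alpha * sample + (1 - self.alpha) * self.value)
        return self.value

class PID:
    """Textbook discrete PID controller producing a throttle adjustment."""
    def __init__(self, kp, ki, kd, setpoint):
        self.kp, self.ki, self.kd, self.setpoint = kp, ki, kd, setpoint
        self.integral, self.prev_error = 0.0, 0.0
    def step(self, measurement, dt):
        error = measurement - self.setpoint      # positive when over target
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Keep the smoothed wait time near 5 ms by raising or lowering the delay
# applied to an initiator's packets (invented gains and samples).
ema, pid = EMA(), PID(kp=0.8, ki=0.1, kd=0.05, setpoint=5.0)
packet_delay_ms = 0.0
for wait_ms in [2, 4, 9, 12, 8, 6, 5]:           # observed wait-time samples
    indicator = ema.update(wait_ms)
    packet_delay_ms = max(0.0, packet_delay_ms + pid.step(indicator, dt=1.0))
    print(f"indicator={indicator:.2f} ms -> packet delay {packet_delay_ms:.2f} ms")
```

    When the smoothed wait time rises above the setpoint, the controller increases the packet delay (throttling writers harder); when it falls below, the delay is relaxed, closing the loop.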

    Energy Management for Hypervisor-Based Virtual Machines

    Current approaches to power management are based on operating systems with full knowledge of and full control over the underlying hardware; the distributed nature of multi-layered virtual machine environments renders such approaches insufficient. In this paper, we present a novel framework for energy management in modular, multi-layered operating system structures. The framework provides a unified model to partition and distribute energy, and mechanisms for energy-aware resource accounting and allocation. As a key property, the framework explicitly takes the recursive energy consumption into account, which is spent, e.g., in the virtualization layer or subsequent driver components. Our prototypical implementation targets hypervisor-based virtual machine systems and comprises two components: a host-level subsystem, which controls machine-wide energy constraints and enforces them among all guest OSes and service components, and, complementarily, an energy-aware guest operating system, capable of fine-grained, application-specific energy management. Guest-level energy management thereby relies on effective virtualization of physical energy effects provided by the virtual machine monitor. Experiments with CPU and disk devices and an external data acquisition system demonstrate that our framework accurately controls and stipulates the power consumption of individual hardware devices, both for energy-aware and energy-unaware guest operating systems.
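
    A toy sketch of the recursive energy-accounting idea, not the paper's implementation: the class, guest names, joule figures, and activity shares are all invented. The point it illustrates is that energy spent in a lower layer (e.g., a driver domain) on behalf of a guest is charged back to that guest rather than to the host.

```python
# Hypothetical recursive energy accounting across virtualization layers.
from collections import defaultdict

class EnergyAccountant:
    def __init__(self):
        self.charged = defaultdict(float)   # guest -> joules

    def charge_direct(self, guest, joules):
        """Energy a device spent directly executing the guest's own work."""
        self.charged[guest] += joules

    def charge_recursive(self, guest, layer_joules, share):
        """Energy spent in a lower layer (e.g. a disk driver component),
        apportioned to the guest by its share of that layer's activity."""
        self.charged[guest] += layer_joules * share

acct = EnergyAccountant()
acct.charge_direct("guest0", 12.0)            # CPU energy of guest0's vCPUs
acct.charge_recursive("guest0", 5.0, 0.6)     # 60% of driver-layer disk energy
acct.charge_recursive("guest1", 5.0, 0.4)     # remaining 40% caused by guest1
print(dict(acct.charged))   # each guest bears its own plus its induced energy
```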

    Many-Task Computing and Blue Waters

    This report discusses many-task computing (MTC) generically and in the context of the proposed Blue Waters system, which is planned to be the largest NSF-funded supercomputer when it begins production use in 2012. The aim of this report is to inform the BW project about MTC, including understanding aspects of MTC applications that can be used to characterize the domain and understanding the implications of these aspects for middleware and policies. Many MTC applications do not neatly fit the stereotypes of high-performance computing (HPC) or high-throughput computing (HTC) applications. Like HTC applications, by definition MTC applications are structured as graphs of discrete tasks, with explicit input and output dependencies forming the graph edges. However, MTC applications have significant features that distinguish them from typical HTC applications. In particular, different engineering constraints for hardware and software must be met in order to support these applications. HTC applications have traditionally run on platforms such as grids and clusters, through either workflow systems or parallel programming systems. MTC applications, in contrast, will often demand a short time to solution, may be communication intensive or data intensive, and may comprise very short tasks. Therefore, hardware and software for MTC must be engineered to support the additional communication and I/O and must minimize task dispatch overheads. The hardware of large-scale HPC systems, with its high degree of parallelism and support for intensive communication, is well suited for MTC applications. However, HPC systems often lack a dynamic resource-provisioning feature, are not ideal for task communication via the file system, and have an I/O system that is not optimized for MTC-style applications. Hence, additional software support is likely to be required to gain full benefit from the HPC hardware.
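
    A minimal sketch of the MTC application structure the report describes: a graph of discrete tasks whose explicit input/output dependencies form the edges, dispatched as soon as their inputs are ready. The task names and the sequential dispatcher are illustrative only; a real MTC runtime would dispatch ready tasks in parallel with minimal per-task overhead.

```python
# Toy dependency-driven dispatcher for a graph of short, discrete tasks.
from collections import defaultdict, deque

def run_mtc_graph(tasks, deps):
    """tasks: name -> callable; deps: name -> list of prerequisite names."""
    indegree = {t: len(deps.get(t, [])) for t in tasks}
    dependents = defaultdict(list)
    for t, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(t)
    ready = deque(t for t, d in indegree.items() if d == 0)
    while ready:                  # a real dispatcher would run these in parallel
        t = ready.popleft()
        tasks[t]()                # very short task; dispatch overhead must stay low
        for succ in dependents[t]:
            indegree[succ] -= 1
            if indegree[succ] == 0:
                ready.append(succ)

run_mtc_graph(
    tasks={n: (lambda n=n: print("ran", n)) for n in "abcd"},
    deps={"c": ["a", "b"], "d": ["c"]},
)
```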

    RADIO: managing the performance of large, distributed storage systems

    High-performance computing systems continue to grow in size and complexity, and often have to manage many different tasks simultaneously. The input/output subsystem is frequently a bottleneck for the overall performance of the system, and interference between applications can lead to disproportionate performance degradation, unpredictable execution times, and inefficient use of resources. This talk presents our ongoing research on how large distributed storage systems should be managed and their performance guaranteed. We discuss our general model for performance management, survey our solutions for the CPU, disk, network, storage, and cache server, and discuss our research aimed at applying these solutions to the control and management of distributed systems.

    Analysis of Performance and Power Aspects of Hypervisors in Soft Real-Time Embedded Systems

    The exponential growth of malware designed to attack soft real-time embedded systems has necessitated solutions to secure these systems. Hypervisors are one such solution, but the overhead they impose needs to be quantitatively understood. Experiments were conducted to quantify the overhead hypervisors impose on soft real-time embedded systems. A soft real-time computer vision algorithm was executed, and its average and worst-case execution times were measured along with the average power consumption. These experiments were conducted with two hypervisors and a control configuration. The experiments showed that each hypervisor imposed a differing amount of overhead, with one achieving near-native performance and the other noticeably impacting the performance of the system.
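
    A simple sketch of how such metrics can be gathered, under stated assumptions: the workload below is a stand-in loop, not the vision algorithm used in the paper, and the run count is arbitrary. Running the same harness natively and under each hypervisor exposes the overhead each one imposes on average and observed worst-case execution times.

```python
# Toy measurement harness: average and observed worst-case execution time.
import time
import statistics

def workload():
    sum(i * i for i in range(100_000))   # placeholder for a vision-frame job

samples = []
for _ in range(50):
    t0 = time.perf_counter()
    workload()
    samples.append(time.perf_counter() - t0)

print(f"average: {statistics.mean(samples) * 1e3:.2f} ms")
print(f"observed worst case: {max(samples) * 1e3:.2f} ms")
```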

    Memory-Aware Scheduling for Fixed Priority Hard Real-Time Computing Systems

    As a major component of a computing system, memory has been a key performance and power consumption bottleneck in computer system design. While processor speeds have kept rising dramatically, the overall computing performance improvement of the entire system is limited by how fast the memory can feed instructions/data to the processing units (the so-called memory wall problem). The increasing transistor density and surging access demands from a rapidly growing number of processing cores have also significantly elevated the power consumption of the memory system. In addition, the interference of memory accesses from different applications and processing cores significantly degrades computation predictability, which is essential to ensuring timing specifications in real-time system design. Recent IC technologies (such as 3D-IC technology) and emerging data-intensive real-time applications (such as Virtual Reality/Augmented Reality, Artificial Intelligence, and the Internet of Things) further amplify these challenges. We believe that it is not simply desirable but necessary to adopt a joint CPU/Memory resource management framework to deal with these grave challenges. In this dissertation, we focus on studying how to schedule fixed-priority hard real-time tasks with memory impacts taken into consideration. We target the fixed-priority real-time scheduling scheme since it is one of the most commonly used strategies for practical real-time applications. Specifically, we first develop an approach that takes into consideration not only the execution time variations with cache allocations but also the task period relationship, showing a significant improvement in the feasibility of the system. We further study the problem of how to guarantee timing constraints for hard real-time systems under CPU and memory thermal constraints. We first study the problem under an architecture model with a single core and its main memory individually packaged. We develop a thermal model that can capture the thermal interaction between the processor and memory, and incorporate the periodic resource server model into our scheduling framework to guarantee both the timing and thermal constraints. We further extend our research to multi-core architectures with processing cores and memory devices integrated into a single 3D platform. To the best of our knowledge, this is the first work that can guarantee hard deadline constraints for real-time tasks under temperature constraints for both processing cores and memory devices. Extensive simulation results demonstrate that our proposed scheduling can significantly improve the feasibility of hard real-time systems under thermal constraints.
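
    To make the fixed-priority setting concrete, here is the standard response-time analysis for fixed-priority tasks with implicit deadlines (the classic iterative test, not the dissertation's own analysis); in the dissertation's setting, memory and thermal effects would enter through the execution-time terms. The task set below is invented.

```python
# Classic fixed-priority response-time iteration:
#   R_i = C_i + sum over higher-priority j of ceil(R_i / T_j) * C_j
import math

def response_times(tasks):
    """tasks: list of (C, T) pairs sorted by descending priority. Returns the
    worst-case response time per task, or None where R would exceed T."""
    results = []
    for i, (C, T) in enumerate(tasks):
        R = C
        while True:
            interference = sum(math.ceil(R / Tj) * Cj for Cj, Tj in tasks[:i])
            R_next = C + interference
            if R_next > T:
                results.append(None)        # deadline T missed: unschedulable
                break
            if R_next == R:
                results.append(R)           # fixed point reached
                break
            R = R_next
    return results

print(response_times([(1, 4), (2, 6), (3, 12)]))   # -> [1, 3, 10]
```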

    User-aware performance evaluation and optimization of parallel job schedulers

    The dissertation "User-Aware Performance Evaluation and Optimization of Parallel Job Schedulers" deals with the realistic, dynamic simulation and optimization of load situations in parallel computing systems, taking into account feedback effects between performance and user behavior. What distinguishes such systems is their shared use by multiple users, which restricts the availability of resources. If not all computing requests, the so-called jobs, can be executed simultaneously, they are buffered in queues. Since user behavior is not precisely known, there is great uncertainty about future load situations. The goal is to find methods that produce a resource allocation matching the users' expectations as closely as possible, and to evaluate these methods realistically. It must also be taken into account that user behavior and user expectations vary depending on the load situation and the resource allocation. A three-part research approach is taken. (1) Analysis of user behavior under resource constraints: traces of parallel computing systems show that the waiting time for computation results correlates with future user behavior, i.e., on average it takes longer for a user who had to wait a long time to use the system again. Within the doctoral project, this analysis was continued and additional correlations between further system parameters (besides waiting time) and user behavior were uncovered. Furthermore, functions describing user satisfaction and user reaction to varying response times of computing systems were developed. These results were obtained through a survey among users of parallel computers at TU Dortmund, for which a dedicated questionnaire was developed. (2) Modeling of user behavior and feedback: because of the dynamic relationship between system performance and user behavior, allocation strategies must be evaluated in dynamic, feedback-driven simulations. To this end, a multi-stage user model was developed that incorporates the current assumptions about user behavior and allows additional behavioral components to be added in the future. Its core elements so far comprise models for the day-and-night rhythm, the working rhythm, and the properties of the submitted jobs. The dynamic feedback is designed such that only the completion of certain jobs triggers future job submissions. (3) Optimization of allocation strategies to increase user satisfaction: the users' waiting-time acceptance derived from the questionnaire was optimized using a MILP. The MILP searches for solutions that start as many jobs as possible within an accepted waiting-time window, or that minimize the sum of delays. Owing to the complexity of this optimization algorithm, the evaluation so far covers only fixed, static scenarios that represent snapshots of particular system and queue states. It is therefore further planned to evaluate scheduling methods for increasing the number of submitted jobs and the waiting-time satisfaction using the dynamic model.
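
    A simplified stand-in for the MILP objective described in part (3): the real formulation optimizes exactly over system and queue states, whereas this greedy single-machine sketch only illustrates the objective of starting jobs within each user's accepted waiting-time window and otherwise accumulating lateness. The job data and the earliest-acceptable-deadline ordering rule are invented for illustration.

```python
# Greedy illustration of the "accepted waiting-time window" objective.
def schedule_by_acceptance(jobs):
    """jobs: list of (name, runtime, accepted_wait). Jobs are started in
    order of their accepted waiting time (an EDD-like rule) on one machine."""
    t, lateness, on_time = 0, 0, []
    for name, runtime, accepted_wait in sorted(jobs, key=lambda j: j[2]):
        if t <= accepted_wait:
            on_time.append(name)            # started within the accepted window
        else:
            lateness += t - accepted_wait   # started after the accepted window
        t += runtime
    return on_time, lateness

on_time, lateness = schedule_by_acceptance(
    [("a", 3, 2), ("b", 4, 1), ("c", 2, 3)])
print("within accepted wait:", on_time, "total lateness:", lateness)
```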