
    Energy efficient HPC network topologies with on/off links

    Energy efficiency is a must in today's HPC systems. To achieve this goal, a holistic design based on power-aware components should be performed. One of the key components of an HPC system is the high-speed interconnect. In this paper, we compare and evaluate several design options for the interconnection network of an HPC system, including tori, fat-trees and dragonflies. State-of-the-art low-power modes are also used in the interconnection networks. The paper considers energy efficiency not only at the interconnection network level but also at the level of the system as a whole. The analysis is performed using a simple yet realistic power model of the system, adjusted with actual power consumption values measured on a real system. Using this model, realistic multi-job trace-based workloads have been run, obtaining the execution time and energy consumed. The results are presented to ease choosing a system, depending on whether performance or energy consumption receives the most importance. Funded by Ministerio de Economía, Industria y Competitividad (projects PID2019-105903RB-100 and PID2021-123627OB) and Junta de Comunidades de Castilla-La Mancha (project SBPLY/21/180501/000248).
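    The paper's calibrated power model is not reproduced here; the following is a minimal sketch of the kind of model the abstract describes, assuming a per-node idle/active power split and links that can be switched to a low-power state. All names and coefficients are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of a system power model with on/off links: per-node compute
# power plus per-link power, where links support a low-power ("off") state.
# All coefficients are illustrative placeholders, not the calibrated values.

def system_energy(runtime_s, nodes, links,
                  p_node_active=300.0, p_node_idle=150.0,
                  p_link_on=5.0, p_link_off=0.5):
    """Estimate energy (joules) for a run of `runtime_s` seconds.

    `nodes` is a list of utilization fractions in [0, 1];
    `links` is a list of booleans (True = link kept powered on)."""
    node_power = sum(p_node_idle + u * (p_node_active - p_node_idle) for u in nodes)
    link_power = sum(p_link_on if on else p_link_off for on in links)
    return (node_power + link_power) * runtime_s

# Example: 64 nodes at 70% utilization, a network with half of its links
# switched to the low-power state during a communication-sparse phase.
energy = system_energy(3600, [0.7] * 64, [True] * 512 + [False] * 512)
print(f"{energy / 3.6e6:.1f} kWh")
```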

    Deviation-tolerant computation in concurrent failure-prone hardware


    The Application of Hybridized Genetic Algorithms to the Protein Folding Problem

    The protein folding problem consists of attempting to determine the native conformation of a protein given its primary structure. This study examines various methods of hybridizing a genetic algorithm implementation in order to minimize an energy function and predict the conformation (structure) of Met-enkephalin. Genetic algorithms are semi-optimal algorithms designed to explore and exploit a search space; they use selection, recombination, and mutation operators on populations of strings that represent possible solutions to the given problem. One step in solving the protein folding problem is the design of efficient energy minimization techniques. A conjugate gradient minimization technique is described and tested with different replacement frequencies. Baldwinian, Lamarckian, and probabilistic Lamarckian evolution are all tested. Another extension of simple genetic algorithms can be accomplished with niching, which works by de-emphasizing solutions based on their proximity to other solutions in the space. Several variations of niching are tested. Experiments are conducted to determine the benefits of each hybridization technique versus each other and versus the genetic algorithm by itself, geared toward finding the lowest possible energy and hence the minimum-energy conformation of Met-enkephalin. In the experiments, probabilistic Lamarckian strategies were successful in achieving energies below that of the published minimum in QUANTA.
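    To make the hybridization idea concrete, here is a minimal sketch (not the study's code) of a genetic algorithm with probabilistic Lamarckian local refinement: with some probability an offspring is locally minimized and the refined genes are written back into the population. The energy function and the torsion-angle encoding are placeholders, and a crude coordinate-descent minimizer stands in for the conjugate-gradient step used in the study.

```python
# Sketch of a GA hybridized with probabilistic Lamarckian local minimization.
# Placeholder energy function and encoding; illustrative only.
import random

def energy(angles):                      # placeholder conformational energy
    return sum(a * a for a in angles)

def local_minimize(angles, step=0.05, iters=50):   # stand-in for conjugate gradient
    x = list(angles)
    for _ in range(iters):
        for i in range(len(x)):
            for d in (-step, step):
                trial = x[:]; trial[i] += d
                if energy(trial) < energy(x):
                    x = trial
    return x

def ga(pop_size=40, genes=24, gens=100, p_lamarck=0.2):
    pop = [[random.uniform(-3.14, 3.14) for _ in range(genes)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=energy)
        parents = pop[:pop_size // 2]                 # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genes)
            child = a[:cut] + b[cut:]                 # one-point crossover
            child[random.randrange(genes)] += random.gauss(0, 0.1)   # mutation
            if random.random() < p_lamarck:           # Lamarckian write-back
                child = local_minimize(child)
            children.append(child)
        pop = parents + children
    return min(pop, key=energy)

print(energy(ga()))
```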

    Integrated shared-memory and message-passing communication in the Alewife multiprocessor

    Thesis (Ph. D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. Includes bibliographical references (p. 237-246) and index. By John David Kubiatowicz.

    Multilevel distributed diagnosis and the design of a distributed network fault detection system based on the SNMP protocol.

    In this thesis, we propose a new distributed diagnosis algorithm using the multilevel paradigm. This algorithm is a generalization of both the ADSD and Hi-ADSD algorithms. We present all details of the design and implementation of this multilevel adaptive distributed diagnosis algorithm, called the ML-ADSD algorithm, together with extensive simulation results comparing the performance of the three algorithms. In 1967, Preparata, Metze and Chien proposed a model and a framework for diagnosing faulty processors in a multiprocessor system. To exploit the inherent parallelism available in a multiprocessor system and thereby improve fault tolerance, Kuhl and Reddy, in 1980, pioneered a new area of research known as distributed system-level diagnosis. Following this pioneering work, in 1991 Bianchini and Buskens proposed an adaptive distributed algorithm, called the ADSD algorithm, to diagnose fully connected networks; it has a diagnosis latency of O(N) testing rounds for a network with N nodes. With a view to improving this latency, in 1998 Duarte and Nanya proposed a hierarchical distributed diagnosis algorithm for fully connected networks, called the Hi-ADSD algorithm, with a diagnosis latency of O(log² N) testing rounds; it can be viewed as a generalization of the ADSD algorithm. In all cases, the time required by the ML-ADSD algorithm is better than or equal to that of the Hi-ADSD algorithm, and its performance can be improved further by an appropriate choice of the number of clusters and the number of levels. The ML-ADSD algorithm is also scalable, in the sense that only minor modifications are required to adapt it to networks of varying sizes; this property is not shared by the Hi-ADSD algorithm. The primary application of our research is to develop and implement a prototype network fault detection/monitoring system by integrating the ML-ADSD algorithm into an SNMP-based (Simple Network Management Protocol) fault management system. We report the details of the design and implementation of such a distributed network fault detection system.
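    For intuition about the testing rounds discussed above, here is a minimal sketch in the spirit of ADSD-style adaptive distributed diagnosis, simulated in a single process (it is not the ML-ADSD implementation): in each round, every fault-free node tests successive nodes in a fixed ordering until it finds a fault-free one, then adopts that node's diagnostic view. The fault set and node count are assumptions for the example.

```python
# ADSD-style adaptive distributed diagnosis, simulated in one process.
N = 8
faulty = {2, 5}                      # simulated fault set (assumption)
views = {i: {i: "fault-free"} for i in range(N) if i not in faulty}

def testing_round():
    for i in list(views):
        j = (i + 1) % N
        while True:                  # test nodes until a fault-free one is found
            if j in faulty:
                views[i][j] = "faulty"
                j = (j + 1) % N
            else:
                views[i][j] = "fault-free"
                views[i].update(views[j])   # adopt the tested node's view
                break

for _ in range(N):                   # O(N) rounds suffice for full diagnosis here
    testing_round()
print(views[0])                      # node 0's complete diagnosis of the system
```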

    New contributions for modeling and simulating high performance computing applications on parallel and distributed architectures

    In this thesis we propose a new simulation platform specifically designed for modeling parallel and distributed architectures, which integrates models of the four basic subsystems (storage, memory, processing and network) into a single platform. The main characteristics of this platform are flexibility, to embrace the widest range of possible designs; scalability, to test the limits when scaling up architecture designs; and the necessary trade-off between execution time and the accuracy obtained. The platform is aimed at modeling both existing and new designs of HPC architectures and applications. Depending on the user's requirements, the model can focus on a subset of the basic systems or on the complete system. A complete distributed system can therefore be modeled by integrating those basic systems, each one with the corresponding level of detail, which provides a high degree of flexibility. Moreover, the platform offers a good compromise between accuracy and performance, together with the flexibility to build a wide range of architectures with different configurations. A validation process of the proposed simulation platform has been carried out by comparing results obtained on real architectures with those obtained in the analogous simulated environments. Furthermore, a set of experiments has been conducted in order to evaluate and analyze how scalability and bottlenecks evolve on a typical multi-core HPC architecture under different configurations; these experiments consist of executing two application models (HPC and checkpointing applications) on several HPC architectures. Finally, performance results of the simulator itself have been obtained for these experiments, the purpose being to calculate both the time and the memory needed to execute a specific simulation, depending on the size of the environment to be modeled and on the hardware resources available for running each simulation.
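    The integration idea can be illustrated with a minimal sketch: four subsystem models composed into one simulator, each swappable for a coarser or finer-grained variant. The class names, interfaces and figures below are assumptions for illustration, not the platform's actual API.

```python
# Sketch of composing four subsystem models into a single simulator.
from dataclasses import dataclass

@dataclass
class CPUModel:
    flops: float                          # sustained FLOP/s
    def compute_time(self, flop): return flop / self.flops

@dataclass
class MemoryModel:
    bandwidth: float                      # bytes/s
    def access_time(self, nbytes): return nbytes / self.bandwidth

@dataclass
class NetworkModel:
    latency: float                        # seconds per message
    bandwidth: float                      # bytes/s
    def transfer_time(self, nbytes): return self.latency + nbytes / self.bandwidth

@dataclass
class StorageModel:
    bandwidth: float                      # bytes/s
    def io_time(self, nbytes): return nbytes / self.bandwidth

class Simulator:
    """Integrates the four subsystem models; the level of detail depends on
    which model variants the user plugs in."""
    def __init__(self, cpu, mem, net, sto):
        self.cpu, self.mem, self.net, self.sto = cpu, mem, net, sto
    def run(self, flop, mem_bytes, msg_bytes, io_bytes):
        return (self.cpu.compute_time(flop) + self.mem.access_time(mem_bytes)
                + self.net.transfer_time(msg_bytes) + self.sto.io_time(io_bytes))

sim = Simulator(CPUModel(2e9), MemoryModel(20e9), NetworkModel(5e-6, 10e9),
                StorageModel(500e6))
print(f"{sim.run(1e10, 1e9, 1e7, 1e8):.3f} s")   # one simulated phase
```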

    An efficient parallelization of a real scientific application

    Bibliography: leaves 137-145. In the past decade the cost of computing has come down considerably, making high-powered computing more easily affordable. As a result, many institutions and organisations now have networks of high-powered workstations. Such networks provide a large, generally untapped source of computing power which can be used for running large scientific applications that previously could only be run on supercomputers. This dissertation shows that a substantial improvement in performance can be achieved by parallelizing a real scientific application for a heterogeneous network of Sun and Silicon Graphics workstations connected by an Ethernet network, but that this is affected by a number of factors, including communication delays, load balancing, and the number of slaves used. This dissertation shows that performance can be improved by sending a larger number of shorter messages and by overlapping communication with computation. Part of this thesis concerns the difficulties involved in evaluating parallel performance on a heterogeneous network. This dissertation shows that conventional methods such as speedup and efficiency are not appropriate for evaluating the performance of a heterogeneous system, and that linear speed gives a much more representative indication of the actual performance achieved. We also propose the new concepts of perfect linear speed and linear efficiency, which help to evaluate the improvement in parallel performance on a heterogeneous system.
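    One plausible formalization of the heterogeneous-performance metrics named above, offered only as an illustration and not necessarily the thesis's exact definitions: measure linear speed as work done per unit time, take perfect linear speed as the sum of each workstation's standalone speed, and define linear efficiency as their ratio.

```python
# Illustrative definitions of linear speed, perfect linear speed and linear
# efficiency for a heterogeneous network (assumed formalization).

def linear_speed(total_work, parallel_time):
    return total_work / parallel_time            # work units per second

def perfect_linear_speed(standalone_speeds):
    return sum(standalone_speeds)                # best case: no parallel overhead

def linear_efficiency(total_work, parallel_time, standalone_speeds):
    return linear_speed(total_work, parallel_time) / perfect_linear_speed(standalone_speeds)

# Example: two workstations solving 1000 work units together in 60 s, where
# alone they achieve 12 and 8 work units per second respectively.
print(linear_efficiency(1000, 60, [12, 8]))      # ~0.83
```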

    Metacomputing on clusters augmented with reconfigurable hardware


    Energy efficient heterogeneous virtualized data centers

    My thesis is about increasing the energy efficiency of data centers through management software.
It was estimated that data centers world-wide already consume 1-2% of the globally provided electrical energy. Furthermore, a typical server incurs higher electricity costs over a 3-year lifespan than its purchase cost. Hence, increasing the energy efficiency of all components found in a data center is of high ecological as well as economic importance. The focus of my thesis is to increase the efficiency of the servers in a data center. The vast majority of servers in data centers are underutilized for a significant amount of time; operating regions of 10-20% utilization are common. Still, these servers consume large amounts of energy. A lot of effort has gone into the area of Green Data Centers during the last years, e.g., regarding cooling efficiency. Nevertheless, there are still many open issues, e.g., operating a virtualized, heterogeneous business infrastructure with the minimum possible power consumption, under the constraint that Quality of Service, and in consequence revenue, is not severely decreased. The majority of existing work deals with homogeneous cluster infrastructures, where strong simplifying assumptions can be made. In particular, an automatic trade-off between competing cost categories, with energy costs being just one of them, is insufficiently studied. In my thesis, I investigate and evaluate mathematical models and algorithms in the context of increasing the energy efficiency of servers in a data center. The amount of online, power-consuming resources should at all times be close to the amount of actually required resources. If the workload intensity is decreasing, the infrastructure is consolidated by shutting down servers; if the intensity is rising, the infrastructure is scaled up by waking servers. Ideally, this happens pro-actively by making forecasts about the workload development. Workload is encapsulated in VMs and is live migrated to other servers in both cases. The problem of mapping VMs to physical servers in a way that minimizes power consumption, but does not lead to severe Quality of Service violations, is a multi-objective combinatorial optimization problem. It has to be solved frequently, as the VMs' resource demands are usually dynamic. Further, servers are not homogeneous regarding their performance and power consumption. Due to the computational complexity, exact solutions are practically intractable. A greedy heuristic stemming from the related problem of vector packing and a meta-heuristic genetic algorithm are adapted and evaluated. A configurable cost model is created in order to trade off energy cost savings against QoS violations. The baseline for comparison is load balancing. Additionally, the forecasting methods SARIMA and Holt-Winters are evaluated. Further, models able to predict the negative impact of live migration on QoS are developed, and approaches to decrease this impact are investigated. Finally, an examination is carried out regarding the possible consequences of collecting and storing energy consumption data of servers on security and privacy.
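    As an illustration of the vector-packing style of consolidation heuristic mentioned above, here is a minimal first-fit-decreasing sketch (an instance of the general technique, not the thesis's algorithm): VMs are sorted by total demand and placed on the first already active server with enough spare CPU and RAM, so that unused servers can stay powered off. The resource dimensions and figures are placeholders.

```python
# First-fit-decreasing vector packing for VM consolidation (illustrative).

def consolidate(vms, capacity):
    """vms: list of (cpu, ram) demands; capacity: (cpu, ram) per server.
    Returns a list of servers, each a list of VM indices."""
    order = sorted(range(len(vms)), key=lambda i: sum(vms[i]), reverse=True)
    servers, loads = [], []
    for i in order:
        cpu, ram = vms[i]
        for s, (c, r) in enumerate(loads):
            if c + cpu <= capacity[0] and r + ram <= capacity[1]:
                servers[s].append(i)
                loads[s] = (c + cpu, r + ram)
                break
        else:                                   # no active server fits: wake one up
            servers.append([i])
            loads.append((cpu, ram))
    return servers

# Example: eight VMs packed onto servers with 16 cores and 64 GB RAM each.
vms = [(4, 16), (2, 8), (8, 32), (1, 4), (6, 24), (2, 8), (4, 16), (3, 12)]
placement = consolidate(vms, (16, 64))
print(len(placement), "servers active:", placement)
```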