Adaptive runtime techniques for power and resource management on multi-core systems
Energy-related costs are among the major contributors to the total cost of ownership of data centers and high-performance computing (HPC) clusters. As a result, future data centers must be energy-efficient to meet the continuously increasing computational demand. Constraining the power consumption of the servers is a widely used approach for managing energy costs and complying with power delivery limitations. In tandem, virtualization has become a common practice, as it reduces hardware and power requirements by enabling consolidation of multiple applications onto a smaller set of physical resources. However, administration and management of data center resources have become more complex due to the growing number of virtualized servers installed in data centers. Therefore, designing autonomous and adaptive energy efficiency approaches is crucial to achieve sustainable and cost-efficient operation in data centers.
Many modern data centers running enterprise workloads successfully implement energy efficiency approaches today. However, the nature of multi-threaded applications, which are becoming more common in all computing domains, brings additional design and management challenges. Tackling these challenges requires a deeper understanding of the interactions between the applications and the underlying hardware nodes. Although cluster-level management techniques bring significant benefits, node-level techniques provide more visibility into application characteristics, which can then be used to further improve the overall energy efficiency of the data centers.
This thesis proposes adaptive runtime power and resource management techniques on multi-core systems. It demonstrates that taking the multi-threaded workload characteristics into account during management significantly improves the energy efficiency of the server nodes, which are the basic building blocks of data centers. The key distinguishing features of this work are as follows:
We implement the proposed runtime techniques on state-of-the-art commodity multi-core servers and show that their energy efficiency can be significantly improved by (1) taking multi-threaded application-specific characteristics into account while making resource allocation decisions, (2) accurately tracking dynamically changing power constraints by using low-overhead application-aware runtime techniques, and (3) coordinating dynamic adaptive decisions at various layers of the computing stack, specifically at system and application levels. Our results show that efficient resource distribution under power constraints yields energy savings of up to 24% compared to existing approaches, along with the ability to meet power constraints 98% of the time for a diverse set of multi-threaded applications.
Models of Architecture
The current trend in high performance and embedded computing consists of designing increasingly complex heterogeneous hardware architectures with non-uniform communication resources. In order to make hardware and software design decisions, early evaluations of the system's non-functional properties are needed. These evaluations of system efficiency require high-level information on both the algorithms and the architecture. In state-of-the-art Model Driven Engineering (MDE) methods, different communities have developed custom architecture models associated with languages of substantial complexity. This fact contrasts with Models of Computation (MoCs), which provide abstract representations of an algorithm's behavior as well as tool interoperability. In this report, we define the notion of Model of Architecture (MoA) and study the combination of a MoC and an MoA to provide a design space exploration environment for the study of algorithmic and architectural choices. An MoA provides reproducible cost computation for evaluating the efficiency of a system. A new MoA called Linear System-Level Architecture Model (LSLA) is introduced and compared to state-of-the-art models. LSLA aims at representing hardware efficiency with a linear model. The computed cost results from the mapping of an application, represented by a model conforming to a MoC, onto an architecture represented by a model conforming to an MoA. The cost is composed of a processing-related part and a communication-related part. It is an abstract scalar value to be minimized and can represent any non-functional requirement of a system such as memory, energy, throughput or latency.
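As a rough illustration of the kind of cost the abstract describes (a scalar made of a processing-related part and a communication-related part, each computed linearly), the following Python sketch may help. It is not the LSLA definition from the report: the class `LinearCost`, the function `mapping_cost`, the `comm_weight` factor, and all numeric values are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearCost:
    """Hypothetical linear cost of one activity on a resource: per_unit * size + fixed."""
    per_unit: float
    fixed: float

    def cost(self, size: float) -> float:
        return self.per_unit * size + self.fixed


def mapping_cost(processing, communication, comm_weight=1.0):
    """Abstract scalar cost of one mapping (illustrative, not the report's formula).

    processing:    list of (LinearCost of a processing element, work size)
    communication: list of (LinearCost of a communication node, data size)
    comm_weight:   assumed factor weighting communication against processing
    """
    proc = sum(res.cost(size) for res, size in processing)
    comm = sum(res.cost(size) for res, size in communication)
    return proc + comm_weight * comm


# Two tasks mapped on the same core, one transfer over a bus (made-up numbers).
cpu = LinearCost(per_unit=2.0, fixed=1.0)
bus = LinearCost(per_unit=0.5, fixed=0.0)
total = mapping_cost(
    processing=[(cpu, 10), (cpu, 4)],
    communication=[(bus, 8)],
    comm_weight=0.5,
)
print(total)
```

Because every term is linear and the inputs are fixed, two tools evaluating the same mapping would obtain the same number, which is the reproducibility property the abstract emphasizes.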
Generalized strictly periodic scheduling analysis, resource optimization, and implementation of adaptive streaming applications
This thesis focuses on addressing four research problems in designing embedded streaming systems. Embedded streaming systems are those systems that process a stream of input data coming from the environment and generate a stream of output data going into the environment. For many embedded streaming systems, timing is a critical design requirement: correct behavior depends on both the correctness of the output data and the time at which the data is produced. An embedded streaming system subjected to such a timing requirement is called a real-time system. Examples of real-time embedded streaming systems can be found in various autonomous mobile systems, such as planes, self-driving cars, and drones. To handle the tight timing requirements of such real-time embedded streaming systems, modern embedded systems have been equipped with hardware platforms, the so-called Multi-Processor Systems-on-Chip (MPSoC), that contain multiple processors, memories, interconnections, and other hardware peripherals on a single chip, to benefit from parallel execution. To efficiently exploit the computational capacity of an MPSoC platform, a streaming application to be executed on the platform must be expressed primarily in a parallel fashion, i.e., the application is represented as a set of parallel executing and communicating tasks. The main challenge is then how to schedule the tasks spatially, i.e., task mapping, and temporally, i.e., task scheduling, on the MPSoC platform such that all timing requirements are satisfied while making efficient use of the available resources (e.g., processors, memory, energy, etc.) on the platform. Another challenge is how to implement and run the mapped and scheduled application tasks on the MPSoC platform. This thesis proposes several techniques to address these two challenges.
NWO Computer Systems, Imagery and Medi
DVFS power management in HPC systems
The recent increase in the performance of High Performance Computing (HPC) systems has been followed by an even higher increase in power consumption. The power draw of modern supercomputers leads to very high operating costs and reliability concerns. Furthermore, it has negative consequences for the environment. Accordingly, over the last decade many works have dealt with power/energy management in HPC systems.
Since CPUs account for a large portion of the total system power consumption, our work aims at CPU power reduction. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique for CPU power management. Running an application at a lower frequency/voltage reduces its power consumption. However, frequency scaling should be used carefully since it has negative effects on application performance.
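The trade-off described above can be made concrete with a small sketch. The models below are common textbook simplifications and are assumptions of this example, not the thesis's own models: run time is split into a frequency-dependent fraction `beta` and a frequency-independent remainder, and dynamic CPU power is taken to scale with the cube of frequency (voltage assumed roughly proportional to frequency) on top of a constant static part. All function names and numbers are hypothetical.

```python
def run_time(f, f_max, t_max, beta):
    """Run time at frequency f; beta in [0, 1] is the assumed CPU-bound fraction."""
    return t_max * (beta * (f_max / f - 1.0) + 1.0)


def cpu_power(f, f_max, p_dynamic_max, p_static):
    """Assumed power model: constant static power plus dynamic power scaling as f**3."""
    return p_static + p_dynamic_max * (f / f_max) ** 3


def energy(f, f_max, t_max, beta, p_dynamic_max, p_static):
    """Energy is power multiplied by run time under the two assumed models."""
    return cpu_power(f, f_max, p_dynamic_max, p_static) * run_time(f, f_max, t_max, beta)


# Made-up job: 100 s at 2.0 GHz, half of its run time scales with frequency.
params = dict(f_max=2.0, t_max=100.0, beta=0.5, p_dynamic_max=80.0, p_static=20.0)
for f in (2.0, 1.0):
    t = run_time(f, params["f_max"], params["t_max"], params["beta"])
    e = energy(f, **params)
    print(f"f={f} GHz: time={t} s, energy={e} J")
```

With these made-up parameters, halving the frequency lowers energy but lengthens the run by half, which is the performance penalty the abstract warns about. Raising `p_static` or `beta` shrinks or removes the energy benefit.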
We argue that the job scheduler level is a good place for power management in an HPC center, given that a parallel job scheduler has a global overview of the entire system. In this thesis we propose power-aware parallel job scheduling policies in which the scheduler determines the job CPU frequency in addition to the job execution order. Based on their goal, the proposed policies can be classified into two groups: energy saving and power budgeting policies. The energy saving policies aim to reduce CPU energy consumption with a minimal job performance penalty. The first of the energy saving policies assigns the job frequency based on system utilization, while the other makes job performance predictions. While these policies achieve energy savings for lightly loaded workloads, highly loaded workloads suffer substantial performance degradation: longer job run times increase the load, which in turn increases job wait times. Our results show higher potential of the DVFS technique when applied for power budgeting.
The second group of policies targets power constrained systems. In contrast to systems without a power limitation, under a given power budget the DVFS technique even improves overall job performance by reducing the average job wait time. This comes from lower job power consumption, which allows more jobs to run simultaneously. The first proposed policy from this group assigns CPU frequency using the predicted job performance and the current power draw of already running jobs. The other power budgeting policy is based on an optimization problem whose solution determines the job execution order as well as the power distribution among the jobs selected for execution. This policy fully exploits available power and leads to further performance improvements.
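To illustrate why a power budget can make lower frequencies beneficial, here is a deliberately simplified sketch of a frequency choice under a budget. It is not the policy proposed in the thesis: it ignores predicted job performance and job ordering, and simply picks the highest frequency whose predicted job power fits in the remaining headroom. The function `pick_frequency`, the power table, and the budget values are all hypothetical.

```python
def pick_frequency(frequencies, job_power_at, current_draw, budget):
    """Return the highest frequency whose predicted job power fits the headroom.

    frequencies:  candidate CPU frequencies (any unit, e.g. GHz)
    job_power_at: callable mapping a frequency to the job's predicted power
    current_draw: power already consumed by running jobs
    budget:       system-wide power limit
    Returns None when the job does not fit at any frequency and must wait.
    """
    headroom = budget - current_draw
    for f in sorted(frequencies, reverse=True):
        if job_power_at(f) <= headroom:
            return f
    return None


# Made-up predicted power (W) of one job at each available frequency (GHz).
power_table = {1.2: 60.0, 1.6: 80.0, 2.0: 100.0}

print(pick_frequency(power_table, power_table.__getitem__, current_draw=910.0, budget=1000.0))
print(pick_frequency(power_table, power_table.__getitem__, current_draw=990.0, budget=1000.0))
```

In the first call the job cannot start at the top frequency but can start immediately at a reduced one, which is how frequency scaling can shorten wait times when power, rather than processors, is the limiting resource.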
The last contribution of the thesis is an analysis of the potential of the DVFS technique for the energy-performance trade-off in current and future HPC systems. Ongoing changes in technology decrease the applicability of DVFS for energy savings, but the technique still reduces power consumption, making it useful for power constrained systems. In order to analyze the potential of DVFS, a model of the impact of frequency scaling on MPI application execution time has been proposed and validated against measurements on a large-scale system. This parametric analysis showed for which application/platform characteristics frequency scaling leads to energy savings.
The performance increase experienced by high-performance systems has been accompanied by an even greater increase in energy consumption. The consumption of current supercomputers implies very high operating costs. These costs have not only economic implications but also implications for the environment. Given the importance of the problem, significant research efforts have recently been made to tackle the efficient management of the energy consumed by supercomputing systems.
Since the CPU accounts for a high percentage of a system's total consumption, our work focuses on reducing and efficiently managing the energy consumed by the CPU. Specifically, this thesis focuses on the feasibility of performing this management with Dynamic Voltage Frequency Scaling (DVFS), a technique widely used to reduce CPU energy consumption. However, this technique can reduce the performance of the running applications, since it involves lowering the frequency. Given that the context of this thesis is high-performance systems, minimizing the performance loss is one of our objectives. In our context, however, the performance of a job is determined by two factors, execution time and wait time, so both components must be considered.
Supercomputing systems are usually managed by queuing systems. Depending on the policy applied and the state of the system, jobs must wait a shorter or longer time before being executed. Given the characteristics of the target system of this thesis, we consider the job scheduler to be the best system component in which to include energy management, since it is the only point that has a global view of the entire system.
In this thesis we propose a set of scheduling policies that treat energy consumption as one more resource. These policies decide not only which job to run, the number of CPUs assigned, and the list of CPUs (and nodes), but also the frequency at which those CPUs will run. The policies target two objectives: reducing the total energy consumed by a set of jobs, and controlling the instantaneous consumption of a given set of jobs to avoid saturating the system in centers that may have limited capacity (permanently or temporarily).
The first group of policies tries to reduce total consumption while minimizing the impact on performance. In this group there is a first policy that assigns the CPU frequency based on system utilization, and a second that estimates the penalty the job about to start would suffer in order to decide whether or not to reduce the frequency. These policies showed acceptable results on lightly loaded systems, but significant performance losses when the system is heavily loaded. These losses did not take the form of a significant increase in job execution time, but appeared in the performance metrics that include job wait time (which are common in this context).
The second group of policies, aimed at systems with limits on the power they can consume, showed great potential using DVFS as the management mechanism. In this case, compared with a system without such management, they demonstrated performance improvements because they allow more jobs to run simultaneously, significantly reducing job wait time. In this second group we propose one policy based on the performance of the job about to run and a second that treats the allocation of all resources as a linear optimization problem. The latter policy is the most important contribution of the thesis, since it behaves well in all evaluated cases.
The last contribution of the thesis is a study of the potential of DVFS as an energy management technique in the near future, based on a study of application characteristics, of the reduction in CPU consumption that DVFS achieves, and of the weight of the CPU within the whole system. This study indicates that the ability of DVFS to save energy will be limited, but that it still shows great potential for controlling energy consumption.
Optimization and Communication in UAV Networks
UAVs are becoming a reality and are attracting increasing attention. They can be remotely controlled or completely autonomous, can be used alone or as a fleet, and serve a large set of applications. They are constrained by hardware, since they cannot be too heavy and rely on batteries. Their use still raises a large set of exciting new challenges, for example trajectory optimization and positioning when they are used alone or in cooperation, and communication when they operate as a swarm. This book presents original contributions on optimization and communication for UAVs and UAV swarms.
- …