389 research outputs found

    Power efficient job scheduling by predicting the impact of processor manufacturing variability

    Modern CPUs suffer from performance and power consumption variability due to the manufacturing process. As a result, systems that do not account for this variability suffer performance degradation and waste power. To avoid this negative impact, users and system administrators must actively counteract manufacturing variability. In this work we show that parallel systems benefit from taking the consequences of manufacturing variability into account when making scheduling decisions at the job scheduler level. We also show that it is possible to predict the impact of this variability on specific applications by using variability-aware power prediction models. Based on these power models, we propose two job scheduling policies that consider the effects of manufacturing variability for each application and that ensure that power consumption stays under a system-wide power budget. We evaluate our policies under different power budgets and traffic scenarios, consisting of both single- and multi-node parallel applications, utilizing up to 4096 cores in total. We demonstrate that they decrease job turnaround time by up to 31% compared to contemporary scheduling policies used on production clusters, while saving up to 5.5% energy. Postprint (author's final draft)
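The idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' policies: it is a simple greedy scheduler that starts queued jobs in order while the total predicted power stays under the budget, and places each job on the free nodes where it is predicted to draw the least power. The `predicted_power` table is a hypothetical stand-in for the variability-aware power models, and all names and numbers are invented for illustration.

```python
from typing import Dict, List, Tuple


def schedule_under_budget(
    queue: List[Tuple[str, int]],
    predicted_power: Dict[str, Dict[str, float]],
    free_nodes: List[str],
    budget_watts: float,
) -> Tuple[Dict[str, List[str]], float]:
    """Greedily start jobs in queue order without exceeding the power budget.

    queue: (job name, number of nodes requested), in arrival order.
    predicted_power[job][node]: predicted watts if `job` runs on `node`
        (hypothetical stand-in for a variability-aware power model).
    Returns the placement of the started jobs and their total predicted power.
    """
    placement: Dict[str, List[str]] = {}
    available = list(free_nodes)
    used_watts = 0.0
    for job, nodes_needed in queue:
        if nodes_needed > len(available):
            continue  # not enough free nodes; try the next queued job
        # Prefer the nodes on which this job is predicted to draw the least power.
        ranked = sorted(available, key=lambda node: predicted_power[job][node])
        chosen = ranked[:nodes_needed]
        job_watts = sum(predicted_power[job][node] for node in chosen)
        if used_watts + job_watts > budget_watts:
            continue  # would exceed the budget; leave the job queued
        placement[job] = chosen
        used_watts += job_watts
        available = [node for node in available if node not in chosen]
    return placement, used_watts


# Hypothetical per-node predictions: the same job draws different power
# on nominally identical nodes because of manufacturing variability.
example_power = {
    "A": {"n0": 100.0, "n1": 120.0, "n2": 110.0},
    "B": {"n0": 90.0, "n1": 95.0, "n2": 105.0},
}
example_queue = [("A", 2), ("B", 1)]
print(schedule_under_budget(example_queue, example_power, ["n0", "n1", "n2"], 310.0))
```

Skipping a job that does not fit and trying the next one is a simplification; a production scheduler would typically also protect the skipped job from starvation.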

    "Virtual malleability" applied to MPI jobs to improve their execution in a multiprogrammed environment

    This work focuses on the scheduling of MPI jobs executing on shared-memory multiprocessors (SMPs). The objective was to obtain the best response time in multiprogrammed multiprocessor systems using batch systems, assuming all jobs have the same priority. To that end, we studied the benefits of supporting malleability in MPI jobs to reduce fragmentation and consequently improve system performance. The contributions of this work can be summarized as follows:
    · Virtual malleability: a mechanism where a job is assigned a dynamic processor partition in which the number of processes is greater than the number of processors. The partition size is modified at runtime according to external requirements such as the system load, by varying the multiprogramming level, so that the job contends for resources with itself. In addition, a mechanism decides at runtime whether to apply local or global process queues to an application, depending on the load balance among its processes.
    · A job scheduling policy that makes decisions such as how many processes to start with and the maximum multiprogramming degree, based on the type and number of applications running and queued. Moreover, as soon as a job finishes execution and there are queued jobs, the algorithm analyzes whether it is better to start another job immediately or to wait until more resources are available.
    · A new alternative to backfilling strategies for the problem of the execution time window expiring. Virtual malleability is applied to the backfilled job, reducing its partition size without aborting or suspending it as in traditional backfilling.
    The evaluation of this thesis followed a practical approach. All the proposals were implemented, modifying the three scheduling levels: queuing system, processor scheduler and runtime library.
    The impact of the contributions was studied under several types of workloads, varying machine utilization, communication and balance degree of the applications, multiprogramming level, and job size. Results showed that it is possible to offer malleability for MPI jobs. An application obtained better performance when contending for resources with itself than with other applications, especially in workloads with high machine utilization. Load imbalance was taken into account, and performance improved when the right queue type was applied to each application independently. The proposed job scheduling policy exploited virtual malleability by choosing, at the beginning of execution, parameters such as the number of processes and the maximum multiprogramming level. It performed well under bursty workloads with low to medium machine utilization. However, as the load increased, virtual malleability was not enough: when the machine is heavily loaded, jobs that have been shrunk cannot expand again, so they must run the whole time on a partition smaller than the job size, which degrades performance. At that point, the job scheduling policy relied on moldability alone. Fragmentation was also alleviated by applying backfilling techniques to the job scheduling algorithm. Virtual malleability proved to be an interesting improvement for the window expiration problem. Backfilled jobs, even on a smaller partition, can continue execution, reducing the memory swapping generated by aborts and suspensions. In this way the queueing system does not have to reinsert the backfilled job in the queue and re-execute it in the future. Postprint (published version)
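To make the mechanism concrete, here is a small sketch with hypothetical helper names; it is not the thesis implementation. Under virtual malleability the process count stays fixed while the processor partition changes, so the multiprogramming level is the number of processes per processor, rounded up. The sketch assumes a job may release processors only while its level stays at or below a per-job maximum chosen by the scheduling policy.

```python
import math


def multiprogramming_level(processes: int, processors: int) -> int:
    """Number of processes that share each processor, rounded up."""
    if processes < 1 or processors < 1:
        raise ValueError("processes and processors must be positive")
    return math.ceil(processes / processors)


def shrink_partition(processes: int, partition: int, requested: int, max_level: int) -> int:
    """Return the partition size after releasing up to `requested` processors.

    The job keeps all of its processes. It releases processors only while its
    multiprogramming level stays at or below `max_level` (an assumed per-job
    limit), and the partition never grows as a result of this call.
    """
    if max_level < 1:
        raise ValueError("max_level must be positive")
    smallest_allowed = math.ceil(processes / max_level)
    return min(partition, max(smallest_allowed, partition - requested))


# A 16-process job on 16 processors is asked to give up 12 of them,
# but may run at most 2 processes per processor.
new_size = shrink_partition(processes=16, partition=16, requested=12, max_level=2)
print(new_size, multiprogramming_level(16, new_size))
```

The backfilling alternative described in the abstract corresponds to calling such a shrink step on the backfilled job when its window expires, instead of aborting or suspending it.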

    Grid-job scheduling with reservations and preemption

    Computational grids make it possible to exploit grid resources across multiple clusters when grid jobs are decomposed into tasks and allocated across clusters. Grid-job tasks are often scheduled in the form of workflows which require synchronization, and advance reservation makes it easy to guarantee predictable resource provisioning for these jobs. However, advance reservation for grid jobs creates roadblocks and fragmentation, which adversely affect system utilization and the response times of local jobs. We provide a solution which incorporates relaxed reservations and uses a modified version of the standard grid-scheduling algorithm, HEFT, to obtain flexibility in placing reservations for workflow grid jobs. Furthermore, we deploy the relaxed reservation with modified HEFT as an extension of the preemption-based job scheduling framework SCOJO-PECT. In SCOJO-PECT, relaxed reservations serve the additional purpose of permitting scheduler optimizations which shift the overall schedule forward. In addition, a propagation heuristic is used to alleviate the extension of the workflow job makespan caused by the slack of relaxed reservations. Our solution aims at decreasing the fragmentation caused by grid jobs, so that local jobs and system utilization are not compromised, while grid jobs still have reasonable response times.
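For readers unfamiliar with HEFT, the sketch below shows only the task-prioritization phase of the standard algorithm: each task gets an upward rank equal to its average computation cost plus the largest value, over its successors, of average communication cost plus the successor's rank, and tasks are then considered in decreasing rank order. The modifications and relaxed reservations described in the abstract are not reproduced here; the function name, argument layout and example workflow are assumptions for illustration.

```python
from functools import lru_cache
from typing import Dict, List, Tuple


def heft_order(
    avg_cost: Dict[str, float],
    succ: Dict[str, List[str]],
    avg_comm: Dict[Tuple[str, str], float],
) -> List[str]:
    """Order workflow tasks by decreasing HEFT upward rank.

    avg_cost[t]: mean computation cost of task t over all resources.
    succ[t]: tasks that depend directly on t (the workflow must be acyclic).
    avg_comm[(t, s)]: mean communication cost on the edge t -> s.
    """

    @lru_cache(maxsize=None)
    def rank_u(task: str) -> float:
        # Exit tasks have no successors, so their rank is just their cost.
        tail = max(
            (avg_comm.get((task, s), 0.0) + rank_u(s) for s in succ.get(task, [])),
            default=0.0,
        )
        return avg_cost[task] + tail

    return sorted(avg_cost, key=rank_u, reverse=True)


# Hypothetical diamond-shaped workflow: a -> {b, c} -> d.
costs = {"a": 10.0, "b": 20.0, "c": 5.0, "d": 10.0}
edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
comm = {("a", "b"): 5.0, ("a", "c"): 2.0, ("b", "d"): 3.0, ("c", "d"): 4.0}
print(heft_order(costs, edges, comm))
```

In full HEFT, each task in this order is then placed on the resource that gives it the earliest finish time; a reservation-based variant would place reservations at that step instead.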

    Including accurate user estimates in HPC schedulers: an empirical analysis

    This article focuses on the problem of dealing with the low accuracy of job runtime estimates provided by users of high performance computing systems. The main goal of the study is to evaluate the benefits for system utilization of providing accurate estimates, in order to motivate users to make an effort to provide better ones. We propose the Penalty Scheduling Policy for including information about user estimates. The experimental evaluation is performed over realistic workloads and scenarios, and validated using a job scheduler simulator. We simulated different static and dynamic scenarios, which emulate diverse user behavior regarding the estimation of job runtimes. Results demonstrate that the accuracy of users' runtime estimates influences the waiting time of jobs. Under our proposed policy, in a scenario where users improve their estimates, the waiting time of users with high accuracy can be up to 2.43 times lower than that of users with the lowest accuracy. XV Workshop de Procesamiento Distribuido y Paralelo (WPDP). Red de Universidades con Carreras en Informática (RedUNCI)

    The Resource Usage Aware Backfilling

    Abstract. Job scheduling policies for HPC centers have been extensively studied in the last few years, especially backfilling-based policies. Almost all of these studies have been done using simulation tools. All existing simulators use the runtime (either estimated or real) provided in the workload as the basis of their simulations. In our previous work we analyzed the impact on system performance of considering the resource sharing (memory bandwidth) of running jobs, by including a new resource model in the Alvio simulator. Based on these studies we proposed the LessConsume and LessConsume Threshold resource selection policies. Both aim to reduce the saturation of shared resources, thus increasing system performance. The results showed that both resource allocation policies can improve system performance by considering where the jobs are finally allocated. Using the LessConsume Threshold resource selection policy, we propose a new backfilling strategy: the Resource Usage Aware Backfilling job scheduling policy. This is a backfilling-based scheduling policy in which the algorithms that decide which job to execute and how jobs are backfilled are based on different Threshold configurations. This backfilling variant considers how the shared resources are used by the scheduled jobs. Rather than backfilling the first job that can be moved to the run queue based on job arrival time or job size, it looks ahead to the next queued jobs and tries to allocate jobs that would experience a lower penalized runtime caused by resource sharing saturation. In the paper we demonstrate how the exchange of scheduling information between the local resource manager and the scheduler can substantially improve system performance when resource sharing is considered. We show that it achieves response times close to those of Shortest Job First Backfilling with First Fit (oriented to improving the start time of the allocated jobs), while providing a qualitative improvement in the number of killed jobs and in the percentage of penalized runtime.
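The look-ahead step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's policy: the slowdown model (every job slows in proportion to memory-bandwidth oversubscription), the `lookahead` window and the `threshold` value are all hypothetical, and the usual backfilling check that a candidate must not delay the reservation of the job at the head of the queue is omitted for brevity.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    name: str
    cpus: int
    runtime: float    # user-estimated runtime in seconds
    bandwidth: float  # memory bandwidth demand in GB/s (assumed known)


def slowdown(job: Job, used_bandwidth: float, capacity: float) -> float:
    """Fractional slowdown if `job` starts now (0.0 means no penalty).

    Assumed model: once total demand exceeds capacity, jobs slow down in
    proportion to the oversubscription.
    """
    demand = used_bandwidth + job.bandwidth
    return max(0.0, demand / capacity - 1.0)


def pick_backfill(
    queue: List[Job],
    free_cpus: int,
    used_bandwidth: float,
    capacity: float,
    lookahead: int = 4,
    threshold: float = 0.25,
) -> Optional[Job]:
    """Among the first `lookahead` queued jobs that fit, pick the one with the
    least penalized runtime, ignoring jobs whose slowdown exceeds `threshold`."""
    best = None
    best_key = None
    for position, job in enumerate(queue[:lookahead]):
        if job.cpus > free_cpus:
            continue  # does not fit in the free processors
        penalty = slowdown(job, used_bandwidth, capacity)
        if penalty > threshold:
            continue  # would saturate the shared resource too much
        key = (job.runtime * penalty, position)  # extra seconds, then queue order
        if best_key is None or key < best_key:
            best, best_key = job, key
    return best


waiting = [
    Job("J1", cpus=8, runtime=3600.0, bandwidth=40.0),
    Job("J2", cpus=4, runtime=600.0, bandwidth=10.0),
    Job("J3", cpus=64, runtime=7200.0, bandwidth=5.0),
]
print(pick_backfill(waiting, free_cpus=16, used_bandwidth=80.0, capacity=100.0))
```

A plain first-fit backfill would have started J1 here; the look-ahead prefers J2 because it does not push the shared memory bandwidth past capacity.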

    DVFS power management in HPC systems

    The recent increase in the performance of High Performance Computing (HPC) systems has been followed by an even higher increase in power consumption. The power draw of modern supercomputers leads to very high operating costs and reliability concerns. Furthermore, it has negative consequences for the environment. Accordingly, over the last decade many works have dealt with power/energy management in HPC systems. Since CPUs account for a high portion of the total system power consumption, our work aims at CPU power reduction. Dynamic Voltage Frequency Scaling (DVFS) is a widely used technique for CPU power management. Running an application at lower frequency/voltage reduces its power consumption. However, frequency scaling should be used carefully since it has negative effects on application performance. We argue that the job scheduler level is a good place for power management in an HPC center, given that a parallel job scheduler has a global overview of the entire system. In this thesis we propose power-aware parallel job scheduling policies where the scheduler determines the job CPU frequency in addition to the job execution order. Based on their goal, the proposed policies can be classified into two groups: energy saving and power budgeting policies. The energy saving policies aim to reduce CPU energy consumption with a minimal job performance penalty. The first of the energy saving policies assigns the job frequency based on system utilization, while the other makes job performance predictions. While these policies achieve energy savings for lightly loaded workloads, highly loaded workloads suffer substantial performance degradation because of higher job wait times due to the increase in load caused by longer job run times. Our results show a higher potential for the DVFS technique when applied to power budgeting. The second group of policies targets power constrained systems. 
In contrast to systems without a power limitation, under a given power budget the DVFS technique even improves overall job performance by reducing the average job wait time. This comes from lower job power consumption, which allows more jobs to run simultaneously. The first proposed policy from this group assigns CPU frequency using the job's predicted performance and the current power draw of already running jobs. The other power budgeting policy is based on an optimization problem whose solution determines the job execution order as well as the power distribution among the jobs selected for execution. This policy fully exploits available power and leads to further performance improvements. The last contribution of the thesis is an analysis of the potential of the DVFS technique for the energy-performance trade-off in current and future HPC systems. Ongoing changes in technology decrease the applicability of DVFS for energy savings, but the technique still reduces power consumption, making it useful for power constrained systems. In order to analyze DVFS potential, a model of the impact of frequency scaling on MPI application execution time has been proposed and validated against measurements on a large-scale system. This parametric analysis showed for which application/platform characteristics frequency scaling leads to energy savings.
The performance increase experienced by high-performance systems has been accompanied by an even greater increase in energy consumption. The consumption of current supercomputers implies very high operating costs. These costs have not only economic implications but also implications for the environment. Given the importance of the problem, significant research efforts have recently been made to tackle the efficient management of the energy consumed by supercomputing systems. Since the CPU accounts for a high percentage of a system's total consumption, our work focuses on the reduction and efficient management of the energy consumed by the CPU. Specifically, this thesis focuses on the feasibility of performing this management with Dynamic Voltage Frequency Scaling (DVFS), a technique widely used to reduce CPU energy consumption. However, this technique can reduce the performance of running applications, since it involves lowering the frequency. Given that the context of this thesis is high-performance systems, minimizing the performance loss is one of our objectives. In our context, however, the performance of a job is determined by two factors, execution time and wait time, so both components must be considered. Supercomputing systems are usually managed by queuing systems. Depending on the policy applied and the state of the system, jobs must wait more or less time before being executed. Given the characteristics of the target system of this thesis, we consider the job scheduler to be the best system component in which to include energy management, since it is the only point with a global view of the entire system. In this thesis we propose a set of scheduling policies that treat energy consumption as one more resource. These policies decide not only which job to execute, the number of CPUs assigned and the list of CPUs (and nodes), but also the frequency at which these CPUs will run.
These policies are oriented toward two objectives: reducing the total energy consumed by a set of jobs, and controlling instantaneous consumption to avoid saturating the system in centers that may have limited capacity (permanently or temporarily). The first group of policies tries to reduce total consumption while minimizing the impact on performance. In this group, a first policy assigns the CPU frequency based on system utilization, and a second computes an estimate of the penalty that the job about to start will suffer in order to decide whether or not to reduce the frequency. These policies showed acceptable results on lightly loaded systems but significant performance losses when the system is heavily loaded. These losses did not appear as a significant increase in job execution time, but in the performance metrics that include job wait time (common in this context). The second group of policies, aimed at systems with limits on the power they can draw, showed great potential using DVFS as a management mechanism. In this case, compared with a system without this management, they demonstrated performance improvements, since they allow more jobs to run simultaneously, significantly reducing job wait time. In this second group we propose one policy based on the performance of the job about to run and a second that treats the allocation of all resources as a linear optimization problem. This last policy is the most important contribution of the thesis, since it behaves well in all the cases evaluated.
The last contribution of the thesis is a study of the potential of DVFS as an energy management technique in the near future, based on a study of application characteristics, of the reduction in CPU consumption achieved by DVFS, and of the weight of the CPU within the whole system. This study indicates that the ability of DVFS to save energy will be limited, but that it still shows great potential for controlling power consumption.
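The trade-off analyzed in this abstract can be illustrated with a small numerical sketch. The models below are assumptions chosen for illustration, not the thesis's validated model: execution time follows a simple sensitivity model in which a fraction `beta` of the run scales with frequency, and dynamic power is taken to scale with the cube of frequency (a common approximation when voltage scales roughly linearly with frequency). All parameter values are hypothetical.

```python
def run_time(t_max: float, f: float, f_max: float, beta: float) -> float:
    """Execution time at frequency f.

    t_max is the time at f_max. beta = 1.0 means fully CPU-bound
    (time scales with 1/f); beta = 0.0 means insensitive to frequency.
    """
    return t_max * (beta * (f_max / f - 1.0) + 1.0)


def power(f: float, f_max: float, p_static: float, p_dynamic_max: float) -> float:
    """Power at frequency f, assuming dynamic power scales with (f / f_max) ** 3."""
    return p_static + p_dynamic_max * (f / f_max) ** 3


def energy(t_max: float, f: float, f_max: float, beta: float,
           p_static: float, p_dynamic_max: float) -> float:
    """Energy in joules: power at f multiplied by execution time at f."""
    return power(f, f_max, p_static, p_dynamic_max) * run_time(t_max, f, f_max, beta)


# Platform dominated by dynamic power: halving the frequency saves energy.
print(energy(100.0, 2.0, 2.0, 0.5, p_static=20.0, p_dynamic_max=80.0))
print(energy(100.0, 1.0, 2.0, 0.5, p_static=20.0, p_dynamic_max=80.0))

# Platform dominated by static power: the longer run costs more energy,
# even though the instantaneous power draw is still lower.
print(energy(100.0, 2.0, 2.0, 1.0, p_static=90.0, p_dynamic_max=10.0))
print(energy(100.0, 1.0, 2.0, 1.0, p_static=90.0, p_dynamic_max=10.0))
print(power(1.0, 2.0, p_static=90.0, p_dynamic_max=10.0))
```

Under these assumptions the second case mirrors the abstract's conclusion: as the static share of power grows, DVFS stops saving energy, yet it still lowers power draw, which is what matters under a power budget.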

    Improved self-management of datacenter systems applying machine learning

    Autonomic Computing is a research area in Computer Science and Technologies that originated in the mid-2000s. It focuses on the optimization and improvement of complex distributed computing systems through self-control and self-management. As distributed computing systems grow in complexity, as with multi-datacenter systems in cloud computing, system operators and architects need more help to understand, design and optimize these systems manually, even more so when the systems are distributed around the world and belong to different entities and authorities. Self-management lets these distributed computing systems improve their resource and energy management, a very important issue when resources have a cost to obtain, run or maintain. Here we propose to improve Autonomic Computing techniques for resource management by applying modeling and prediction methods from Machine Learning and Artificial Intelligence. Machine Learning methods can find accurate models of system behavior, often with intelligible explanations, and can also predict and infer system states and values. Models obtained through automatic learning have the advantage of being easily updated after workload or configuration changes by collecting new examples and re-training the predictors. By employing automatic modeling and predictive abilities, we can find new methods for making "intelligent" decisions and for discovering new information and knowledge from systems. This thesis moves from the state of the art, where management is based on administrators' expertise, well-known data, ad hoc algorithms and models, and elements studied from the point of view of a single computing machine, to a novel state of the art where management is driven by models learned from the system itself, providing useful feedback and making up for incomplete, missing or uncertain data, from the point of view of a global network of datacenters.
    - First of all, we cover the scenario where the decision maker knows all pieces of information about the system: how much each job will consume, what the desired quality of service is and will be, what the deadlines for the workload are, etc. All of this focuses on each component and policy of each element involved in executing these jobs.
    - Then we focus on the scenario where, instead of fixed oracles that provide information from an expert formula or set of conditions, machine learning is used to create these oracles. Here we look at components and specific details while some of the information is not known and must be learned and predicted.
    - We reduce the problem of optimizing resource allocations and requirements for virtualized web services to a mathematical problem, indicating each factor, variable and element involved, as well as all the constraints the scheduling process must satisfy. The scheduling problem can be modeled as a Mixed Integer Linear Program. Here we address the scenario of a full datacenter, and we further introduce some information prediction.
    - We complement the model by expanding the predicted elements, studying the main resources (that is, CPU, memory and I/O) that can suffer from noise, inaccuracy or unavailability. Once learned predictors for certain components improve the decision making, the system can become more "expert-knowledge independent", and research can focus on a scenario where all the elements provide noisy, uncertain or private information. We also introduce new factors into the management optimization, since context and costs may change for each datacenter, turning the model "multi-datacenter".
    - Finally, we review the cost of placing datacenters depending on green energy sources, and distribute the load according to green energy availability.