22 research outputs found

    Efficient Algorithms for Scheduling Moldable Tasks

    We study the problem of scheduling $n$ independent moldable tasks on $m$ processors that arises in large-scale parallel computations. When tasks are monotonic, the best known result is a $(\frac{3}{2}+\epsilon)$-approximation algorithm for makespan minimization, with a complexity linear in $n$ and polynomial in $\log m$ and $\frac{1}{\epsilon}$, where $\epsilon$ is arbitrarily small. We propose a new perspective on the existing speedup models: the speedup of a task $T_j$ is linear when the number $p$ of assigned processors is small (up to a threshold $\delta_j$), while it presents monotonicity when $p$ ranges in $[\delta_j, k_j]$; the bound $k_j$ indicates an unacceptable overhead when parallelizing on too many processors. For a given integer $\delta \geq 5$, let $u = \lceil \sqrt{\delta} \rceil - 1$. In this paper, we propose a $\frac{1}{\theta(\delta)}(1+\epsilon)$-approximation algorithm for makespan minimization with a complexity $\mathcal{O}(n \log\frac{n}{\epsilon} \log m)$, where $\theta(\delta) = \frac{u+1}{u+2}\left(1 - \frac{k}{m}\right)$ ($m \gg k$). As a by-product, we also propose a $\theta(\delta)$-approximation algorithm for throughput maximization with a common deadline, with a complexity $\mathcal{O}(n^2 \log m)$.
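    The factor $\theta(\delta)$ above is simple enough to compute directly. The following Python sketch evaluates it from the abstract's formula and illustrates one speedup curve consistent with the two-regime model; the square-root continuation beyond $\delta_j$ is an illustrative assumption, not the paper's model.

        import math

        def theta(delta: int, k: int, m: int) -> float:
            """theta(delta) = (u+1)/(u+2) * (1 - k/m), with
            u = ceil(sqrt(delta)) - 1. The abstract assumes
            delta >= 5 and m >> k."""
            assert delta >= 5
            u = math.ceil(math.sqrt(delta)) - 1
            return (u + 1) / (u + 2) * (1 - k / m)

        def speedup(p: int, delta_j: int, k_j: int) -> float:
            """Two-regime speedup of task T_j: linear up to delta_j,
            then some monotone growth on [delta_j, k_j] (square-root
            here, purely an illustrative choice)."""
            if p <= delta_j:
                return float(p)                          # linear regime
            if p <= k_j:
                return delta_j * math.sqrt(p / delta_j)  # monotone regime
            raise ValueError("more than k_j processors is disallowed")

        print(theta(delta=9, k=8, m=1024))  # 0.744140625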

    Theory and Engineering of Scheduling Parallel Jobs

    Scheduling is essential for the efficient utilization of modern parallel computing systems. In this thesis, four main research areas of scheduling are investigated: the interplay and distribution of decision makers, efficient schedule computation, efficient scheduling for the memory hierarchy, and energy efficiency. The main result is a provably fast and efficient scheduling algorithm for malleable jobs. Experiments show the importance and the possibilities of scheduling that takes the memory hierarchy into account.

    Approximation Schemes for Machine Scheduling

    In the classical problem of makespan minimization on identical parallel machines, or machine scheduling for short, a set of jobs has to be assigned to a set of machines. Each job has a processing time, and the goal is to minimize the latest finishing time of the jobs. Machine scheduling is well known to be NP-hard, and thus there is no polynomial-time algorithm for this problem that is guaranteed to find an optimal solution unless P=NP. There is, however, a polynomial time approximation scheme (PTAS) for machine scheduling, that is, a family of approximation algorithms with ratios arbitrarily close to one. Whether a problem admits an approximation scheme or not is a fundamental question in approximation theory. In the present work, we consider this question for several variants of machine scheduling. We study the problem where the machines are partitioned into a constant number of types and the processing time of a job also depends on the machine type. We present so-called efficient PTAS (EPTAS) results for this problem and variants thereof. We show that certain cases of machine scheduling with assignment restrictions do not admit a PTAS unless P=NP. Moreover, we introduce a graph framework based on the restrictions of the jobs and use it in the design of approximation schemes for other variants. We introduce an enhanced integer programming formulation for assignment problems, show that it can be solved efficiently, and use it in the EPTAS design for variants of machine scheduling with setup times. For one of the problems, we show that there is also a PTAS in the case with uniform machines, where machines have speeds that influence the processing times of the jobs. We consider cases in which each job requires a certain amount of a shared renewable resource and the processing time depends on the amount of the resource it receives. We present so-called asymptotic fully polynomial time approximation schemes (AFPTAS) for these problems.
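    To make the objective concrete, here is a minimal Python sketch of the classic Longest Processing Time (LPT) heuristic for makespan minimization on identical machines. This is the well-known (4/3 - 1/(3m))-approximation baseline, not one of the approximation schemes developed in the thesis.

        import heapq

        def lpt_makespan(processing_times, m):
            """LPT: sort jobs by decreasing processing time and always
            assign the next job to the currently least-loaded machine.
            Classic (4/3 - 1/(3m))-approximation for the makespan."""
            loads = [0.0] * m            # min-heap of machine loads
            heapq.heapify(loads)
            for p in sorted(processing_times, reverse=True):
                least = heapq.heappop(loads)
                heapq.heappush(loads, least + p)
            return max(loads)

        # 7+5 = 4+3+3+2 = 12, so LPT happens to be optimal here
        print(lpt_makespan([7, 5, 4, 3, 3, 2], m=2))  # 12.0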

    Optimization techniques for adaptability in MPI applications

    The first version of MPI (Message Passing Interface) was released in 1994. At that time, scientific applications for HPC (High Performance Computing) were characterized by a static execution environment. These applications usually had regular computation and communication patterns, operated on dense data structures accessed with good data locality, and ran on homogeneous computing platforms. For these reasons, MPI has become the de facto standard for developing scientific parallel applications for HPC over the last decades. In recent years, scientific applications have evolved to cope with challenges posed by fields such as engineering, economics, and medicine. These challenges include large amounts of data stored in irregular and sparse data structures with poor data locality that must be processed in parallel (big data), algorithms with irregular computation and communication patterns, and heterogeneous computing platforms (grid, cloud, and heterogeneous clusters). MPI, in turn, has introduced relevant improvements and new features over the years to meet the requirements of dynamic execution environments, including asynchronous non-blocking communications, collective I/O routines, and the dynamic process management interface introduced in MPI 2.0. The dynamic process management interface allows an application to spawn new processes at runtime and communicate with them. However, this feature has technical limitations that still make the implementation of malleable MPI applications a challenge.

    This thesis proposes FLEX-MPI, a runtime system that extends the functionality of the standard MPI library with optimization techniques for adapting MPI applications to dynamic execution environments. These techniques can significantly improve the performance and scalability of scientific applications and the overall efficiency of the HPC system on which they run. Specifically, FLEX-MPI focuses on dynamic load balancing and performance-aware malleability for parallel applications. The main goal of the design and implementation of the adaptability techniques is to execute MPI applications efficiently on a wide range of HPC platforms, from small- to large-scale systems. Dynamic load balancing allows FLEX-MPI to adapt the workload assignments at runtime to the performance of the computing elements that execute the parallel application. Performance-aware malleability, in turn, leverages the dynamic process management interface of MPI to change the number of processes of the application at runtime, which improves the performance of applications that exhibit irregular computation patterns or that execute on computing systems with dynamic availability of resources. A key feature of these techniques is that they require neither user intervention nor prior knowledge of the underlying hardware.

    We have validated and evaluated the performance of the adaptability techniques with three parallel MPI benchmarks in different execution environments with homogeneous and heterogeneous cluster configurations.
    The results show that FLEX-MPI significantly improves the performance of applications when running with the support of dynamic load balancing and malleability, along with a substantial enhancement of their scalability and an improvement of the overall system efficiency.
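    The malleability described above builds on the MPI 2.0 dynamic process management interface. The Python sketch below shows that interface in its bare form via mpi4py; it is a minimal illustration of runtime process spawning, not FLEX-MPI's actual API, and 'worker.py' is a hypothetical child script.

        # parent.py -- spawn two workers at runtime (mpi4py sketch)
        import sys
        from mpi4py import MPI

        # Spawn returns an intercommunicator connecting the parent
        # to the newly created worker processes.
        workers = MPI.COMM_SELF.Spawn(sys.executable,
                                      args=['worker.py'], maxprocs=2)

        # Send a work descriptor to every spawned worker.
        workers.bcast({'rows': 1000}, root=MPI.ROOT)
        workers.Disconnect()

        # worker.py (child side) would do:
        #   from mpi4py import MPI
        #   parent = MPI.Comm.Get_parent()
        #   task = parent.bcast(None, root=0)
        #   parent.Disconnect()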

    27th Annual European Symposium on Algorithms: ESA 2019, September 9-11, 2019, Munich/Garching, Germany
