7,143 research outputs found

    Performance Evaluation of Adaptive Scheduling Algorithm for Shared Heterogeneous Cluster Systems

    Cluster computing systems have recently generated enormous interest for providing easily scalable and cost-effective parallel computing solutions for processing large-scale applications. Various adaptive space-sharing scheduling algorithms have been proposed to improve the performance of dedicated and homogeneous clusters. But commodity clusters are naturally non-dedicated and tend to become heterogeneous over time, as cluster hardware is usually upgraded and new, faster machines are added to improve cluster performance. The existing adaptive policies for dedicated homogeneous and heterogeneous parallel systems are not suitable for such conditions. Most of the existing adaptive policies also assume a priori knowledge of certain job characteristics to make scheduling decisions. However, such information is not readily available without incurring great cost. This paper fills these gaps by designing a robust and effective space-sharing scheduling algorithm for non-dedicated heterogeneous cluster systems that assumes no knowledge of job characteristics and aims to reduce mean job response time. Evaluation results show that the proposed algorithm provides substantial improvement over existing algorithms at moderate to high system utilizations.

    Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors

    Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. In addition, given the growing interest in low-power high performance computing, this type of architecture is also being investigated as a means to improve the throughput-per-Watt of complex scientific applications. In this paper, we design and embed several architecture-aware optimizations into a multi-threaded general matrix multiplication (gemm), a key operation of the BLAS, in order to obtain a high performance implementation for ARM big.LITTLE AMPs. Our solution is based on the reference implementation of gemm in the BLIS library, and integrates a cache-aware configuration as well as asymmetric static and dynamic scheduling strategies that carefully tune and distribute the operation's micro-kernels among the big and LITTLE cores of the target processor. The experimental results on a Samsung Exynos 5422, a system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the big.LITTLE model, show that our cache-aware versions of gemm with asymmetric scheduling attain important gains in performance with respect to their architecture-oblivious counterparts while exploiting all the resources of the AMP to deliver considerable energy efficiency.

    Adaptive Dispatching of Tasks in the Cloud

    The increasingly wide application of Cloud Computing enables the consolidation of tens of thousands of applications in shared infrastructures. Thus, meeting the quality of service requirements of so many diverse applications in such shared resource environments has become a real challenge, especially since the characteristics and workload of applications differ widely and may change over time. This paper presents an experimental system that can exploit a variety of online, quality-of-service-aware adaptive task allocation schemes, and three such schemes are designed and compared. These are: a measurement-driven algorithm that uses reinforcement learning; a "sensible" allocation algorithm that assigns jobs to sub-systems that are observed to provide a lower response time; and an algorithm that splits the job arrival stream into sub-streams at rates computed from the hosts' processing capabilities. All of these schemes are compared via measurements among themselves and with a simple round-robin scheduler, on two experimental test-beds with homogeneous and heterogeneous hosts having different processing capacities. Comment: 10 pages, 9 figures
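
As a rough illustration of two of the allocation ideas named in this abstract, the sketch below splits an arrival stream in proportion to host capacity and derives dispatch weights from observed response times. The function names, the proportional rule, and the inverse-response-time weighting are assumptions made for illustration; the abstract does not give the paper's actual formulas.

```python
def split_rates(arrival_rate, capacities):
    """Split one job arrival stream into per-host sub-stream rates.

    Assumption: each host receives a share proportional to its
    processing capacity. The paper may compute rates differently.
    """
    total = sum(capacities)
    if total <= 0:
        raise ValueError("total capacity must be positive")
    return [arrival_rate * c / total for c in capacities]


def sensible_weights(response_times):
    """Dispatch weights that favour hosts with lower observed response time.

    Assumption: weights are inversely proportional to the measured
    response time and normalised to sum to 1.
    """
    inverses = [1.0 / t for t in response_times]
    total = sum(inverses)
    return [w / total for w in inverses]


if __name__ == "__main__":
    # Three hypothetical hosts with relative capacities 1, 2 and 3.
    print(split_rates(12.0, [1, 2, 3]))
    # Two hypothetical hosts; the first responds twice as fast.
    print(sensible_weights([1.0, 2.0]))
```

A dispatcher could then draw a host for each arriving job using these rates or weights as selection probabilities, for example with `random.choices`.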

    A general framework of multi-population methods with clustering in undetectable dynamic environments

    Copyright © 2011 IEEE. To solve dynamic optimization problems, multiple population methods are used to enhance the population diversity for an algorithm with the aim of maintaining multiple populations in different sub-areas in the fitness landscape. Many experimental studies have shown that locating and tracking multiple relatively good optima rather than a single global optimum is an effective idea in dynamic environments. However, several challenges need to be addressed when multi-population methods are applied, e.g., how to create multiple populations, how to maintain them in different sub-areas, and how to deal with the situation where changes cannot be detected or predicted. To address these issues, this paper investigates a hierarchical clustering method to locate and track multiple optima for dynamic optimization problems. To deal with undetectable dynamic environments, this paper applies the random immigrants method without change detection based on a mechanism that can automatically reduce redundant individuals in the search space throughout the run. These methods are implemented in several research areas, including particle swarm optimization, genetic algorithm, and differential evolution. An experimental study is conducted based on the moving peaks benchmark to compare the performance with several other algorithms from the literature. The experimental results show the efficiency of the clustering method for locating and tracking multiple optima in comparison with other algorithms based on multi-population methods on the moving peaks benchmark.
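
For readers unfamiliar with the random immigrants idea mentioned in this abstract, here is a minimal sketch of the classic form of the technique: every generation, a fraction of the population is replaced with freshly sampled individuals, with no change detection involved. This is the textbook variant and not the paper's mechanism, which is described as being driven by the removal of redundant individuals; the function name, the `ratio` parameter, and the replace-the-worst rule are illustrative assumptions.

```python
import random


def inject_random_immigrants(population, fitness, bounds, ratio=0.2, rng=None):
    """Replace the worst `ratio` fraction of individuals with random ones.

    population: list of individuals, each a list of floats
    fitness:    callable returning a score to maximise (assumed)
    bounds:     list of (low, high) pairs, one per dimension
    """
    rng = rng or random
    n_replace = int(len(population) * ratio)
    # Keep the best individuals, sorted from highest to lowest fitness.
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[:len(population) - n_replace]
    # Sample immigrants uniformly inside the search bounds.
    immigrants = [
        [rng.uniform(low, high) for low, high in bounds]
        for _ in range(n_replace)
    ]
    return survivors + immigrants


if __name__ == "__main__":
    pop = [[random.uniform(-5.0, 5.0)] for _ in range(10)]
    # Hypothetical objective: maximise -(x^2), so the optimum is at x = 0.
    new_pop = inject_random_immigrants(pop, lambda ind: -ind[0] ** 2, [(-5.0, 5.0)])
    print(len(new_pop))
```

Calling this once per generation keeps diversity in the population even when an environmental change goes unnoticed, which is the motivation the abstract gives for avoiding change detection.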

    A Resource Intensive Traffic-Aware Scheme for Cluster-based Energy Conservation in Wireless Devices

    Wireless traffic that is destined for a certain device in a network can be exploited in order to minimize the availability and delay trade-offs, and mitigate the energy consumption. The Energy Conservation (EC) mechanism can be node-centric by considering the traversed nodal traffic in order to prolong the network lifetime. This work describes a quantitative traffic-based approach where a clustered Sleep-Proxy mechanism takes place in order to enable each node to sleep according to the time duration of the active traffic that each node expects and experiences. Sleep-proxies within the clusters are created according to pairwise active-time comparison, where each node expects requested traffic during its active periods. For resource availability and recovery purposes, the caching mechanism takes place in cases where the node for which the traffic is destined is not available. The proposed scheme uses role-based nodes which are assigned to manipulate the traffic in a cluster, through the time-oriented backward difference traffic evaluation scheme. A simulation study is carried out for the proposed backward estimation scheme and the effectiveness of the end-to-end EC mechanism, taking into account a number of metrics and measures for the effects while incrementing the sleep time duration under the proposed framework. Comparative simulation results show that the proposed scheme could be applied to infrastructure-less systems, providing energy-efficient resource exchange with significant minimization in the power consumption of each device. Comment: 6 pages, 8 figures, To appear in the proceedings of IEEE 14th International Conference on High Performance Computing and Communications (HPCC-2012) of the Third International Workshop on Wireless Networks and Multimedia (WNM-2012), 25-27 June 2012, Liverpool, UK

    3E: Energy-Efficient Elastic Scheduling for Independent Tasks in Heterogeneous Computing Systems

    Reducing energy consumption is a major design constraint for modern heterogeneous computing systems to minimize electricity cost, improve system reliability and protect the environment. Conventional energy-efficient scheduling strategies developed on these systems do not sufficiently exploit the system elasticity and adaptability for maximum energy savings, and do not simultaneously take account of user-expected finish times. In this paper, we develop a novel scheduling strategy named energy-efficient elastic (3E) scheduling for aperiodic, independent and non-real-time tasks with user-expected finish times on DVFS-enabled heterogeneous computing systems. The 3E strategy adjusts processors’ supply voltages and frequencies according to the system workload, and makes trade-offs between energy consumption and user-expected finish times. Compared with other energy-efficient strategies, 3E significantly improves the scheduling quality and effectively enhances the system elasticity.