7,143 research outputs found
Performance Evaluation of Adaptive Scheduling Algorithm for Shared Heterogeneous Cluster Systems
Cluster computing systems have recently generated enormous interest for providing easily scalable and cost-effective parallel computing solution for processing large-scale applications. Various adaptive space-sharing scheduling algorithms have been proposed to improve the performance of dedicated and homogeneous clusters. But commodity clusters are naturally non-dedicated and tend to be heterogeneous over the time as cluster hardware is usually upgraded and new fast machines are also added to improve cluster performance. The existing adaptive policies for dedicated homogeneous and heterogeneous parallel systems are not suitable for such conditions. Most of the existing adaptive policies assume a priori knowledge of certain job characteristics to take scheduling decisions. However such information is not readily available without incurring great cost. This paper fills these gaps by designing robust and effective space-sharing scheduling algorithm for non-dedicated heterogeneous cluster systems, assuming no job characteristics to reduce mean job response time. Evaluation results show that the proposed algorithm provide substantial improvement over existing algorithms at moderate to high system utilizations
Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors
Asymmetric multicore processors (AMPs) have recently emerged as an appealing
technology for severely energy-constrained environments, especially in mobile
appliances where heterogeneity in applications is mainstream. In addition,
given the growing interest for low-power high performance computing, this type
of architectures is also being investigated as a means to improve the
throughput-per-Watt of complex scientific applications.
In this paper, we design and embed several architecture-aware optimizations
into a multi-threaded general matrix multiplication (gemm), a key operation of
the BLAS, in order to obtain a high performance implementation for ARM
big.LITTLE AMPs. Our solution is based on the reference implementation of gemm
in the BLIS library, and integrates a cache-aware configuration as well as
asymmetric--static and dynamic scheduling strategies that carefully tune and
distribute the operation's micro-kernels among the big and LITTLE cores of the
target processor. The experimental results on a Samsung Exynos 5422, a
system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the
big.LITTLE model, expose that our cache-aware versions of gemm with asymmetric
scheduling attain important gains in performance with respect to its
architecture-oblivious counterparts while exploiting all the resources of the
AMP to deliver considerable energy efficiency
Adaptive Dispatching of Tasks in the Cloud
The increasingly wide application of Cloud Computing enables the
consolidation of tens of thousands of applications in shared infrastructures.
Thus, meeting the quality of service requirements of so many diverse
applications in such shared resource environments has become a real challenge,
especially since the characteristics and workload of applications differ widely
and may change over time. This paper presents an experimental system that can
exploit a variety of online quality of service aware adaptive task allocation
schemes, and three such schemes are designed and compared. These are a
measurement driven algorithm that uses reinforcement learning, secondly a
"sensible" allocation algorithm that assigns jobs to sub-systems that are
observed to provide a lower response time, and then an algorithm that splits
the job arrival stream into sub-streams at rates computed from the hosts'
processing capabilities. All of these schemes are compared via measurements
among themselves and with a simple round-robin scheduler, on two experimental
test-beds with homogeneous and heterogeneous hosts having different processing
capacities.Comment: 10 pages, 9 figure
A general framework of multi-population methods with clustering in undetectable dynamic environments
Copyright @ 2011 IEEETo solve dynamic optimization problems, multiple
population methods are used to enhance the population diversity for an algorithm with the aim of maintaining multiple populations in different sub-areas in the fitness landscape. Many experimental studies have shown that locating and tracking multiple relatively good optima rather than a single global optimum is an effective idea in dynamic environments. However, several challenges need to be addressed when multi-population methods are applied, e.g.,
how to create multiple populations, how to maintain them in different sub-areas, and how to deal with the situation where changes can not be detected or predicted. To address these issues, this paper investigates a hierarchical clustering method to locate and track multiple optima for dynamic optimization problems. To deal with undetectable dynamic environments, this
paper applies the random immigrants method without change detection based on a mechanism that can automatically reduce redundant individuals in the search space throughout the run. These methods are implemented into several research areas, including particle swarm optimization, genetic algorithm, and differential evolution. An experimental study is conducted based on the moving peaks benchmark to test the performance with several other algorithms from the literature. The experimental
results show the efficiency of the clustering method for locating and tracking multiple optima in comparison with other algorithms based on multi-population methods on the moving peaks
benchmark
A Resource Intensive Traffic-Aware Scheme for Cluster-based Energy Conservation in Wireless Devices
Wireless traffic that is destined for a certain device in a network, can be
exploited in order to minimize the availability and delay trade-offs, and
mitigate the Energy consumption. The Energy Conservation (EC) mechanism can be
node-centric by considering the traversed nodal traffic in order to prolong the
network lifetime. This work describes a quantitative traffic-based approach
where a clustered Sleep-Proxy mechanism takes place in order to enable each
node to sleep according to the time duration of the active traffic that each
node expects and experiences. Sleep-proxies within the clusters are created
according to pairwise active-time comparison, where each node expects during
the active periods, a requested traffic. For resource availability and recovery
purposes, the caching mechanism takes place in case where the node for which
the traffic is destined is not available. The proposed scheme uses Role-based
nodes which are assigned to manipulate the traffic in a cluster, through the
time-oriented backward difference traffic evaluation scheme. Simulation study
is carried out for the proposed backward estimation scheme and the
effectiveness of the end-to-end EC mechanism taking into account a number of
metrics and measures for the effects while incrementing the sleep time duration
under the proposed framework. Comparative simulation results show that the
proposed scheme could be applied to infrastructure-less systems, providing
energy-efficient resource exchange with significant minimization in the power
consumption of each device.Comment: 6 pages, 8 figures, To appear in the proceedings of IEEE 14th
International Conference on High Performance Computing and Communications
(HPCC-2012) of the Third International Workshop on Wireless Networks and
Multimedia (WNM-2012), 25-27 June 2012, Liverpool, U
3E: Energy-Efficient Elastic Scheduling for Independent Tasks in Heterogeneous Computing Systems
Reducing energy consumption is a major design constraint for modern heterogeneous computing systems to minimize electricity cost, improve system reliability and protect environment. Conventional energy-efficient scheduling strategies developed on these systems do not sufficiently exploit the system elasticity and adaptability for maximum energy savings, and do not simultaneously take account of user expected finish time. In this paper, we develop a novel scheduling strategy named energy-efficient elastic (3E) scheduling for aperiodic, independent and non-real-time tasks with user expected finish times on DVFS-enabled heterogeneous computing systems. The 3E strategy adjusts processors’ supply voltages and frequencies according to the system workload, and makes trade-offs between energy consumption and user expected finish times. Compared with other energy-efficient strategies, 3E significantly improves the scheduling quality and effectively enhances the system elasticity
- …