Final Report: Performance Modeling Activities in PERC2

Author: Snavely Allan
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 25/02/2007

Progress in Performance Modeling for PERC2 resulted in:
• Automated modeling tools that are robust, able to characterize large applications running at scale while simultaneously simulating the memory hierarchies of multiple machines in parallel.
• Porting of the requisite tracer tools to multiple platforms.
• Improved performance models by using higher resolution memory models than ever before.
• Adding control-flow and data dependency analysis to the tracers used in performance tools.
• Exploring and developing several new modeling methodologies.
• Using modeling tools to develop performance models for strategic codes.
• Application of modeling methodology to make a large number of “blind” performance predictions on certain mission partner applications, targeting most currently available system architectures.
• Error analysis to correct some systematic biases encountered as part of the large-scale blind prediction exercises.
• Addition of instrumentation capabilities for communication libraries other than MPI.
• Dissemination of the tools and modeling methods to several mission partners, including DoD HPCMO and two DARPA HPCS vendors (Cray and IBM), as well as to the wider HPC community via a series of tutorials.

UNT Digital Library

Crossref

Modulated Branching Processes, Origins of Power Laws and Queueing Duality

Author: Jelenkovic Predrag R.
Tan Jian
Publication date: 01/01/2007

Power law distributions have been repeatedly observed in a wide variety of socioeconomic, biological and technological areas. In many of the observations, e.g., city populations and sizes of living organisms, the objects of interest evolve due to the replication of their many independent components, e.g., births-deaths of individuals and replications of cells. Furthermore, the rates of the replication are often controlled by exogenous parameters causing periods of expansion and contraction, e.g., baby booms and busts, economic booms and recessions, etc. In addition, the sizes of these objects often have reflective lower boundaries, e.g., cities do not fall below a certain size, low income individuals are subsidized by the government, companies are protected by bankruptcy laws, etc. Hence, it is natural to propose reflected modulated branching processes as generic models for many of the preceding observations. Indeed, our main results show that the proposed mathematical models result in power law distributions under quite general polynomial Gärtner-Ellis conditions, the generality of which could explain the ubiquitous nature of power law distributions. In addition, on a logarithmic scale, we establish an asymptotic equivalence between the reflected branching processes and the corresponding multiplicative ones. The latter, as recognized by Goldie (1991), is known to be dual to queueing/additive processes. We emphasize this duality further in the generality of stationary and ergodic processes. Comment: 36 pages, 2 figures; added references; a new theorem in Subsection 4.
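
The duality mentioned in this abstract can be illustrated with a short worked equation. This is only a sketch under a simplifying assumption: a reflected multiplicative recursion with multipliers $J_n$ and a lower boundary at 1. The symbols $M_n$, $J_n$, $Q_n$, $X_n$ and $\alpha$ are illustrative and are not taken from the paper, whose branching model matches a multiplicative one only asymptotically on a logarithmic scale.

```latex
% Reflected multiplicative process with lower boundary 1 (illustrative notation)
M_{n+1} = \max\left( M_n J_n,\; 1 \right)

% Take logarithms: Q_n = \log M_n, \quad X_n = \log J_n
Q_{n+1} = \max\left( Q_n + X_n,\; 0 \right)

% Since P(M > x) = P(Q > \log x), an exponential tail for Q
% is a power-law tail for M:
P(Q > y) \approx e^{-\alpha y}
\quad\Longleftrightarrow\quad
P(M > x) \approx x^{-\alpha}
```

The second line has the form of the Lindley recursion for a queue, which is one way to read the statement that multiplicative processes are dual to queueing/additive ones.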

arXiv.org e-Print Archive

CiteSeerX

Provably Efficient Adaptive Scheduling for Parallel Jobs

Author: He Yuxiong
Hsu Wen Jing
Leiserson Charles E.
Publication date: 01/01/2007

Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level of fairness to user jobs. Various degrees of success have been achieved over the years. However, few existing schemes address both efficiency and fairness over a wide range of workloads. Moreover, in order to obtain analytical results, most of them require prior information about jobs, which may be difficult to obtain in real applications. This paper presents two novel adaptive scheduling algorithms -- GRAD for centralized scheduling, and WRAD for distributed scheduling. Both GRAD and WRAD ensure fair allocation under all levels of workload, and they offer provable efficiency without requiring prior information about the jobs' parallelism. Moreover, they provide effective control over the scheduling overhead and ensure efficient utilization of processors. To the best of our knowledge, they are the first non-clairvoyant scheduling algorithms that offer such guarantees. We also believe that our new approach of resource request-allotment protocol deserves further exploration. Specifically, both GRAD and WRAD are O(1)-competitive with respect to mean response time for batched jobs, and O(1)-competitive with respect to makespan for non-batched jobs with arbitrary release times. The simulation results show that, for non-batched jobs, the makespan produced by GRAD is no more than 1.39 times the optimal on average and never exceeds 4.5 times the optimal. For batched jobs, the mean response time produced by GRAD is no more than 2.37 times the optimal on average and never exceeds 5.5 times the optimal. Singapore-MIT Alliance (SMA)
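
Makespan ratios like those quoted in this abstract are usually measured against a bound, because the true optimum is hard to compute. The following is a minimal sketch of that idea, not the authors' evaluation code. It assumes the standard work and critical-path lower bounds; the `Job` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    release: float  # time at which the job arrives
    work: float     # total work: running time on a single processor
    span: float     # critical-path length: running time on unlimited processors


def makespan_lower_bound(jobs, processors):
    """Lower bound on the makespan of ANY schedule on `processors` processors."""
    # No job can finish earlier than its release time plus its critical path.
    span_bound = max(j.release + j.span for j in jobs)
    # Nothing runs before the first release, and at most `processors`
    # units of work can be completed per unit of time after that.
    first_release = min(j.release for j in jobs)
    work_bound = first_release + sum(j.work for j in jobs) / processors
    return max(span_bound, work_bound)


def makespan_ratio(observed_makespan, jobs, processors):
    """Observed makespan divided by the lower bound.

    Because the optimal makespan is at least the lower bound, this value
    is an upper bound on the ratio to the true optimum.
    """
    return observed_makespan / makespan_lower_bound(jobs, processors)


example_jobs = [Job(release=0, work=8, span=2), Job(release=1, work=4, span=4)]
bound = makespan_lower_bound(example_jobs, processors=2)
ratio = makespan_ratio(9.0, example_jobs, processors=2)
```

A ratio computed this way is pessimistic: a scheduler that looks 1.5 times worse than the bound may be closer than that to the true optimum.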

CiteSeerX

DSpace@MIT

Towards ServMark, an Architecture for Testing Grid Services

Author: Andreica Mugurel Ionut
Dumitrescu Catalin
Epema Dick
Foster Ian
Iosup Alexandru
Raicu Ioan
Ripeanu Matei
Tapus Nicolae
Publication venue: HAL CCSD
Publication date: 01/01/2006

Technical University of Delft - Technical Report ServMark-2006-002, July 2006. Grid computing provides a natural way to aggregate resources from different administrative domains for building large scale distributed environments. The Web Services paradigm proposes a way by which virtual services can be seamlessly integrated into global-scale solutions to complex problems. While the usage of Grid technology ranges from academia and research to the business world and production, two issues must be considered: that the promised functionality can be accurately quantified and that the performance can be evaluated based on well defined means. Without adequate functionality demonstrators, systems cannot be tuned or adequately configured, and Web services cannot be stressed adequately in production environments. Without performance evaluation systems, the system design and procurement processes are hampered, and the performance of Web Services in production cannot be assessed. In this paper, we present ServMark, a carefully researched tool for Grid performance evaluation. While we acknowledge that a lot of ground must be covered to fulfill the requirements of a system for testing Grid environments, and Web (and Grid) Services, we believe that ServMark addresses the minimal set of critical issues.

HAL-ENS-LYON

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

Analysis of a generic model for a bottleneck link in an integrated services communications network

Author: Boucherie Richard J.
Litjens Remco
Publication date: 01/12/2007

University of Twente Research Information

Real-Time Divisible Load Scheduling with Different Processor Available Times

Author: Deogun Jitender
Goddard Steve
Lin Xuan
Lu Ying
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 27/02/2007

Providing QoS and performance guarantees to arbitrarily divisible loads has become a significant problem for many cluster-based research computing facilities. While progress is being made in scheduling arbitrarily divisible loads, some of the proposed approaches may cause Inserted Idle Times (IITs) that are detrimental to system performance. In this paper we propose a new approach that utilizes IITs and thus enhances the system performance. The novelty of our approach is that, to simplify the analysis, a homogeneous system with IITs is transformed to an equivalent heterogeneous system, and that our algorithms can schedule real-time divisible loads with different processor available times. Intensive simulations show that the new approach outperforms the previous approach in all configurations. We also compare the performance of our algorithm to the current practice of manually splitting workloads by users. Simulation results validate the advantages of our approach.
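
The idea of splitting a divisible load across processors that become available at different times can be sketched as follows. This is a deliberately simplified illustration, not the paper's algorithm: it assumes identical processors, ignores data transmission costs, and splits the load so that every processor that is used finishes at the same instant. The function name and parameters are hypothetical.

```python
def partition_load(load, available_times, unit_cost=1.0):
    """Split `load` so that all used processors finish at the same time.

    load            -- total amount of divisible work (must be positive)
    available_times -- time at which each processor becomes free
    unit_cost       -- time to process one unit of load (same for all processors)

    Returns (finish_time, shares), where shares[i] is the load given to
    processor i. Processors that become free too late receive 0.
    """
    if load <= 0 or not available_times:
        raise ValueError("need a positive load and at least one processor")

    # Consider processors in order of availability, earliest first.
    order = sorted(range(len(available_times)), key=lambda i: available_times[i])
    for k in range(1, len(order) + 1):
        used = order[:k]
        # Solve sum((finish - r_i) / unit_cost for i in used) == load for finish.
        finish = (load * unit_cost + sum(available_times[i] for i in used)) / k
        # Stop once the next processor would only become free after we finish.
        if k == len(order) or available_times[order[k]] >= finish:
            break

    shares = [0.0] * len(available_times)
    for i in used:
        shares[i] = (finish - available_times[i]) / unit_cost
    return finish, shares


finish, shares = partition_load(10.0, [0.0, 2.0, 4.0])
late_finish, late_shares = partition_load(1.0, [0.0, 5.0, 9.0])
```

In the first call all three processors are used and the early one receives the largest share. In the second call the load is small enough that the first processor finishes before the others become free, so it takes everything.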

DigitalCommons@University of Nebraska