26 research outputs found

    LOMARC: Look ahead matchmaking for multi-resource coscheduling.

    Hyper-Threading (HT) provides a new possibility for job coscheduling without context switches and without the cost of coordinating the processes of one parallel job. However, HT achieves high processor throughput at the expense of reducing the performance of the individual process. Since the hardware resources are actually shared between two coscheduled jobs, resource contention harms the performance of each job. Most scheduling approaches focus only on the CPU without considering the impact on other resources. In this thesis we present LOMARC, a space-time sharing approach that takes multiple resources, including CPU, I/O, memory and network, into consideration for job coscheduling on HT processors. To improve resource utilization and reduce job response times, LOMARC matches two jobs with complementary resource requirements for coscheduling. Our approach partially reorders the waiting job queue by looking ahead to increase the possibility of finding a good match. LOMARC also generalizes to standard CPUs, using an adjusted matching scheme and focusing only on hiding I/O latency. In addition, LOMARC incorporates standard scheduling approaches such as priority ordering, aging and backfilling. In our simulation experiments, we use a realistic workload model to provide convincing results. Our experimental results demonstrate that LOMARC delivers better performance than the standard space-sharing approach and two other job coscheduling approaches for HT processors. The performance gain is mainly due to an increased possibility of coscheduling two complementary jobs by looking ahead in the waiting queue. Source: Masters Abstracts International, Volume: 43-01, page: 0239. Adviser: Angela Sodan. Thesis (M.Sc.)--University of Windsor (Canada), 2004
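
    A minimal sketch of the lookahead matchmaking idea described above, under assumed job attributes and thresholds (the Job class, the per-resource demand fractions and the contention threshold are illustrative, not taken from the thesis):

```python
# Hypothetical sketch of LOMARC-style lookahead matchmaking (not the thesis code).
# Each job carries per-resource demand fractions; two jobs "complement" each other
# when their combined demand stays below a contention threshold on every resource.

from dataclasses import dataclass

RESOURCES = ("cpu", "io", "memory", "network")

@dataclass
class Job:
    name: str
    demand: dict  # fraction of each resource the job keeps busy, 0.0..1.0

def complementary(a: Job, b: Job, threshold: float = 1.0) -> bool:
    """Return True if coscheduling a and b is unlikely to cause contention."""
    return all(a.demand[r] + b.demand[r] <= threshold for r in RESOURCES)

def lookahead_match(queue: list, depth: int = 8):
    """Pick a partner for the head job from a bounded lookahead window."""
    if not queue:
        return None, None
    head = queue[0]
    for candidate in queue[1:1 + depth]:
        if complementary(head, candidate):
            return head, candidate
    return head, None  # no match found: run the head job alone (space sharing)

jobs = [
    Job("cpu_bound", {"cpu": 0.9, "io": 0.1, "memory": 0.3, "network": 0.1}),
    Job("also_cpu",  {"cpu": 0.8, "io": 0.2, "memory": 0.4, "network": 0.1}),
    Job("io_bound",  {"cpu": 0.1, "io": 0.8, "memory": 0.2, "network": 0.3}),
]
print(lookahead_match(jobs))  # matches cpu_bound with io_bound, skipping also_cpu
```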

    FairGV: Fair and Fast GPU Virtualization

    Increasingly, high-performance computing (HPC) application developers are opting to use cloud resources due to their higher availability. Virtualized GPUs would be an obvious and attractive option for HPC application developers using cloud hosting services. Unfortunately, existing GPU virtualization software is not ready to address the fairness, utilization, and performance limitations associated with consolidating mixed HPC workloads. This paper presents FairGV, a radically redesigned GPU virtualization system that achieves system-wide weighted fair sharing and strong performance isolation in mixed workloads that use GPUs with variable degrees of intensity. To achieve its objectives, FairGV introduces a trap-less GPU processing architecture, a new fair queuing method integrated with work-conserving and GPU-centric co-scheduling policies, and a collaborative scheduling method for non-preemptive GPUs. Our prototype implementation achieves near-ideal fairness (≥ 0.97 Min-Max Ratio) with little performance degradation (≤ 1.02 aggregated overhead) in a range of mixed HPC workloads that leverage GPUs.
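
    The fairness figure quoted above is a Min-Max Ratio over tenants' weighted GPU shares; the following hedged sketch shows one way such a metric could be computed (the function and tenant names are assumptions for illustration, not FairGV code):

```python
# Hedged sketch: computing a Min-Max fairness ratio over weighted GPU shares.
# This is only one reading of the metric quoted in the abstract, not FairGV code.

def min_max_ratio(gpu_time: dict, weight: dict) -> float:
    """Normalized share = observed GPU time / assigned weight.
    A ratio of 1.0 means every tenant received exactly its weighted fair share."""
    shares = [gpu_time[t] / weight[t] for t in gpu_time]
    return min(shares) / max(shares)

# Three tenants with equal weights but slightly uneven measured GPU time.
observed = {"tenant_a": 33.5, "tenant_b": 34.0, "tenant_c": 32.5}
weights = {"tenant_a": 1.0, "tenant_b": 1.0, "tenant_c": 1.0}
print(round(min_max_ratio(observed, weights), 3))  # ~0.956, close to the ideal 1.0
```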

    G-LOMARC-TS: Lookahead group matchmaking for time/space sharing on multi-core parallel machines

    Parallel machines with multi-core nodes are becoming increasingly popular. However, the performance of applications running on these machines can degrade due to resource contention within each node. Research has found that coscheduling different applications with complementary resource characteristics on the same set of nodes (semi time sharing) may improve performance. We propose a scheduling algorithm, G-LOMARC-TS, which incorporates both space-sharing and semi-time-sharing scheduling methods and, where possible, matches groups of jobs for coscheduling. Since matchmaking may select jobs further down the waiting queue, and the jobs at the front of the queue may be delayed as a result, fairness for each individual job is monitored and the delay is kept within a limited bound. Several heuristics are used to solve the NP-complete problem of forming groups. Our experimental results show both utilization gains and average relative response time improvements of G-LOMARC-TS over several other scheduling policies.
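
    A hedged sketch of group formation with a fairness bound, under assumed data structures (form_group, the runtime estimates and delay_bound are illustrative, not the G-LOMARC-TS heuristics):

```python
# Illustrative sketch (assumed names, not the G-LOMARC-TS implementation): greedy
# group formation that pulls jobs from deeper in the waiting queue only while the
# extra delay imposed on jobs that were skipped over stays within a bound.

def form_group(queue, node_capacity, runtime, delay_bound):
    """queue: list of (job_id, nodes_needed); runtime: estimated runtimes.
    Returns the job ids chosen for one coscheduled group."""
    group, used, pulled_ahead_time, saw_skip = [], 0, 0.0, False
    for job_id, nodes in queue:
        if used + nodes > node_capacity:
            saw_skip = True                    # this job has to wait for the group
            continue
        if saw_skip and pulled_ahead_time + runtime[job_id] > delay_bound:
            continue                           # would delay the skipped jobs too much
        group.append(job_id)
        used += nodes
        if saw_skip:
            pulled_ahead_time += runtime[job_id]
    return group

queue = [("A", 4), ("B", 6), ("C", 2), ("D", 2)]
runtime = {"A": 10.0, "B": 30.0, "C": 5.0, "D": 8.0}
print(form_group(queue, node_capacity=8, runtime=runtime, delay_bound=40.0))
# ['A', 'C', 'D']: B does not fit, and C and D are pulled ahead within the bound
```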

    Extending Scojo-PECT by migration based on system-level checkpointing

    In recent years, a significant amount of research has been done on job scheduling in the high-performance computing area. Parallel jobs have different running times and require different numbers of processors, so jobs need to be scheduled and packed to improve system utilization. Scojo-PECT is a job scheduler which provides service guarantees by using coarse-grain time sharing. However, Scojo-PECT does not provide process migration. We extend Scojo-PECT by migrating parallel jobs based on system-level checkpointing. We investigate the different cases in the Scojo-PECT scheduling algorithm where migration based on system-level checkpointing can be used to improve resource utilization and reduce job response time. Our experimental results show a reduction in relative response times for medium jobs compared with the original Scojo-PECT scheduler, while long jobs do not suffer any disadvantage.
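
    A small hedged sketch of a checkpoint-and-migrate decision of the kind described above, with assumed cost figures and a purely illustrative threshold (not Scojo-PECT's actual policy):

```python
# Hedged sketch of a migration decision, assuming simple cost/benefit figures;
# this illustrates the idea only, not Scojo-PECT's actual migration policy.

def worth_migrating(idle_node_seconds: float,
                    remaining_work: float,
                    checkpoint_cost: float,
                    restart_cost: float) -> bool:
    """Migrate a preempted job onto idle nodes only if the work it can finish
    there outweighs the cost of writing and restoring its checkpoint."""
    overhead = checkpoint_cost + restart_cost
    usable = idle_node_seconds - overhead
    return usable > 0 and usable >= 0.1 * remaining_work  # assumed 10% threshold

# Example: 600 idle node-seconds, 4000 s of work left, 60 s checkpoint, 40 s restart
print(worth_migrating(600, 4000, 60, 40))  # True: 500 usable seconds >= 400
```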

    Scalable Resource Management in High Performance Computers

    Abstract: Clusters of workstations have emerged as an important…

    Dynamic multi-resource monitoring for predictive job scheduling.

    Standard job schedulers rely either on the user's estimate or on a few approaches that use performance databases to keep information about job runtimes to predict future runs. Co-scheduling for improved resource utilization, however, requires more detailed information about behavior on multiple resources to make predictions about slowdowns. Thus, information about communication, I/O, and computation at the application level is needed, but is hard for the user to estimate. Furthermore, dynamic adaptive resource allocation requires information about the different processes on different machine nodes. We present an intelligent monitoring tool, ScoPro, which provides such information. To make monitoring more feasible, ScoPro harnesses dynamic instrumentation techniques, which postpone insertion of instrumentation code until the application is executing. To keep intrusion low, we limit monitoring to short test phases. (Abstract shortened by UMI.) Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .L586. Source: Masters Abstracts International, Volume: 44-03, page: 1407. Thesis (M.Sc.)--University of Windsor (Canada), 2005
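
    A hedged sketch of how monitored computation/communication/I/O fractions could feed a simple coscheduling slowdown estimate; the model and names below are assumptions for illustration, not ScoPro's prediction method:

```python
# Hedged sketch, not ScoPro's model: estimate coscheduling slowdown from the
# per-resource activity fractions gathered during a short monitored test phase.

def predicted_slowdown(a: dict, b: dict) -> float:
    """a, b: fractions of time each job spends computing, communicating, doing I/O.
    Demand beyond 100% of a resource is assumed to serialize (a simplification)."""
    slowdown = 1.0
    for resource in ("cpu", "network", "io"):
        overload = a[resource] + b[resource] - 1.0
        if overload > 0:
            slowdown += overload  # the contended fraction roughly adds to runtime
    return slowdown

job_a = {"cpu": 0.7, "network": 0.2, "io": 0.1}
job_b = {"cpu": 0.5, "network": 0.3, "io": 0.2}
print(round(predicted_slowdown(job_a, job_b), 2))  # 1.2: only the CPU is over-committed
```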

    Scheduling of best-effort and soft real-time applications on NOWs

    New types of applications have appeared, such as video on demand, virtual reality and videoconferencing, among others, which are characterized by the need to meet their deadlines. In the literature, these applications are called periodic soft real-time (SRT) applications. This work focuses on the problem of scheduling this new type of application on non-dedicated clusters.

    Extending Scojo-PECT by migration based on application level checkpointing

    In parallel computing, jobs have different runtimes and require different computational resources. Because runtimes are correlated with resources, scheduling these jobs is a packing problem in which utilization and total execution time vary. Sometimes resources sit idle while jobs are preempted or blocked by resource conflicts and have no chance to use them, which wastes system resources to a certain degree. We propose an approach that takes periodic checkpoints of running jobs, so that migration can be exploited to optimize the scheduler during long-term scheduling. We improve our original Scojo-PECT preemptive scheduler, which previously had no checkpoint support. We evaluate the execution time gained minus the overhead of checkpointing/migration, and compare it with the original execution time.
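
    A minimal sketch of the comparison described in the last sentence, with assumed variable names; the formula is a plain cost/benefit reading of the abstract, not the thesis' evaluation code:

```python
# Minimal sketch of the evaluation described above (assumed names):
# net benefit of migration = execution time gained minus checkpoint/migration cost.

def net_gain(runtime_original: float,
             runtime_with_migration: float,
             checkpoint_overhead: float,
             migration_overhead: float) -> float:
    """Positive result: migrating via application-level checkpoints paid off."""
    gained = runtime_original - runtime_with_migration
    return gained - (checkpoint_overhead + migration_overhead)

# Example: a job finishes 300 s earlier after migration, at a 70 s total overhead
print(net_gain(runtime_original=3600, runtime_with_migration=3300,
               checkpoint_overhead=50, migration_overhead=20))  # 230.0 s saved
```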

    Adaptive Resource Relocation in Virtualized Heterogeneous Clusters

    Cluster computing has recently gone through an evolution from single-processor systems to multicore/multi-socket systems. This has resulted in lowering the cost/performance ratio of the compute machines. Compute farms that host these machines tend to become heterogeneous over time due to incremental extensions, hardware upgrades and/or nodes being purchased for users with particular needs. This heterogeneity is not surprising given the wide range of processor, memory and network technologies that become available and the relatively small price difference between these various options. Different CPU architectures, memory capacities, communication and I/O interfaces of the participating compute nodes present many challenges to job scheduling and often result in under- or over-utilization of the compute resources. In general, it is not feasible for application programmers to specifically optimize their programs for such a set of differing compute nodes, due to the difficulty and time-intensiveness of such a task. The trend of heterogeneous compute farms has coincided with a resurgence in virtualization technology. Virtualization technology is receiving widespread adoption, mainly due to the benefits of server consolidation and isolation, load balancing, security and fault tolerance. Virtualization has also generated considerable interest in the High Performance Computing (HPC) community, due to the resulting high availability, fault tolerance, cluster partitioning and accommodation of conflicting user requirements. However, the HPC community is still wary of the potential overheads associated with virtualization, as it results in slower network communications and disk I/O, which need to be addressed. The live migration feature, available in most virtualization technologies, can be leveraged to improve the throughput of a heterogeneous compute farm (HC) used for HPC applications. For this we mitigated the slow network communication in Xen, an open-source virtual machine monitor. We present a detailed analysis of the communication framework of Xen and propose communication configurations that give 50% improvement over the conventional Xen network configuration. From a detailed study of the migration facility in Xen, we propose an improvement to the live migration facility specifically targeting HPC applications. This optimization gives around 50% improvement over the default migration facility of Xen. In this thesis, we also investigate resource scheduling in a heterogeneous compute farm from the perspective of dynamic resource re-mapping. Our approach is to profile each job in the compute farm at runtime, and propose a better resource mapping compared to the initial allocation. We then migrate the job(s) to the best-suited homogeneous sub-cluster to improve the overall throughput of the HC. For this, we develop a novel heterogeneity- and virtualization-aware profiling framework, which is able to predict the CPU and communication characteristics of high-performance scientific applications. The prediction accuracy of our performance estimation model is over 80%. The framework implementation is lightweight, with an overhead of 3%. Our experiments show that we are able to improve the throughput of the compute farm by 25% and that the time saved by the HC with our framework is over 30%. The framework can be readily extended to HCs supporting a cloud computing environment
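
    A hypothetical illustration of profile-driven re-mapping, scoring homogeneous sub-clusters against a job's profiled CPU and communication intensity; the scoring function and numbers are assumptions, not the thesis framework:

```python
# Hypothetical illustration of profile-driven re-mapping (not the thesis framework):
# score each homogeneous sub-cluster against a job's profiled CPU and communication
# intensity, then migrate the VM(s) to the best-scoring sub-cluster.

def score(subcluster: dict, profile: dict) -> float:
    """Weight each sub-cluster's relative CPU speed and network bandwidth by how
    CPU-bound vs. communication-bound the profiled job is."""
    return (profile["cpu_fraction"] * subcluster["cpu_speed"]
            + profile["comm_fraction"] * subcluster["net_bandwidth"])

subclusters = {
    "fast_cpu_slow_net": {"cpu_speed": 1.6, "net_bandwidth": 0.8},
    "slow_cpu_fast_net": {"cpu_speed": 1.0, "net_bandwidth": 2.0},
}
profile = {"cpu_fraction": 0.3, "comm_fraction": 0.7}  # a communication-heavy job
best = max(subclusters, key=lambda name: score(subclusters[name], profile))
print(best)  # slow_cpu_fast_net: this job benefits more from a faster interconnect
```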