Priority-enabled Scheduling for Resizable Parallel Applications
In this paper, we illustrate the impact of dynamic resizability on parallel scheduling.
Our ReSHAPE framework includes an application scheduler that supports dynamic resizing of parallel applications. We propose and evaluate new scheduling policies made possible by the framework, which also provides a platform for experimenting with more sophisticated scheduling policies and scenarios for resizable parallel applications. The proposed policies support scheduling of parallel applications with and without user-assigned priorities. Experimental results show that these scheduling policies significantly improve individual application turnaround time as well as overall cluster utilization.
Bulk Scheduling with the DIANA Scheduler
Results from the research and development of a Data Intensive and Network
Aware (DIANA) scheduling engine, to be used primarily for data intensive
sciences such as physics analysis, are described. In Grid analyses, tasks can
involve thousands of computing, data handling, and network resources. The
central problem in the scheduling of these resources is the coordinated
management of computation and data at multiple locations and not just data
replication or movement. However, this can prove to be a rather costly
operation and efficient scheduling can be a challenge if compute and data resources
are mapped without considering network costs. We have implemented an adaptive
algorithm within the so-called DIANA Scheduler which takes into account data
location and size, network performance and computation capability in order to
enable efficient global scheduling. DIANA is a performance-aware and
economy-guided Meta Scheduler. It iteratively allocates each job to the site
that is most likely to produce the best performance as well as optimizing the
global queue for any remaining jobs. Therefore it is equally suitable whether a
single job is being submitted or bulk scheduling is being performed. Results
indicate that considerable performance improvements can be gained by adopting
the DIANA scheduling approach.
Comment: 12 pages, 11 figures. To be published in the IEEE Transactions in
Nuclear Science, IEEE Press. 200
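The iterative, cost-based placement described in the DIANA abstract can be illustrated with a minimal sketch. This is not the DIANA algorithm itself: the `Site` and `Job` classes, the cost formula (queue wait plus data transfer plus compute time), and all units are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    cpu_rate: float              # work units per second (hypothetical unit)
    bandwidth_mb_s: float        # MB/s between this site and the data location
    holds_data: bool = False     # True if the input data is already local
    queued_seconds: float = 0.0  # time until the site is free again


@dataclass
class Job:
    name: str
    work_units: float
    input_mb: float


def estimated_cost(job: Job, site: Site) -> float:
    """Estimated completion time: queue wait + data transfer + computation."""
    transfer = 0.0 if site.holds_data else job.input_mb / site.bandwidth_mb_s
    compute = job.work_units / site.cpu_rate
    return site.queued_seconds + transfer + compute


def schedule(jobs: list[Job], sites: list[Site]) -> dict[str, str]:
    """Greedily place each job on the site with the lowest estimated cost."""
    placement = {}
    for job in jobs:
        best = min(sites, key=lambda s: estimated_cost(job, s))
        # The chosen site stays busy until this job is estimated to finish.
        best.queued_seconds = estimated_cost(job, best)
        placement[job.name] = best.name
    return placement


sites = [
    Site("A", cpu_rate=10, bandwidth_mb_s=100, holds_data=True),
    Site("B", cpu_rate=20, bandwidth_mb_s=10),
]
jobs = [
    Job("big-input", work_units=100, input_mb=1000),
    Job("small-input", work_units=100, input_mb=10),
]
placement = schedule(jobs, sites)
print(placement)
```

In this toy setup the data-heavy job stays at the slower site that already holds its input, while the job with a small input moves to the faster site, which is the kind of trade-off between network cost and compute capability the abstract refers to.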
G-LOMARC-TS: Lookahead group matchmaking for time/space sharing on multi-core parallel machines
Parallel machines with multi-core nodes are becoming increasingly popular. The performances of applications running on these machines are improved gradually due to the resource competition in each node. Researchers have found that coscheduling different applications with complementary resource characteristics on the same set of nodes (semi time sharing) may improve performance. We propose a scheduling algorithm, G-LOMARC-TS, which incorporates both space sharing and semi time sharing scheduling methods and, where possible, matches groups of jobs for coscheduling. Since matchmaking may select jobs further down the waiting queue and thereby delay the jobs at the front of the queue, fairness for each individual job is monitored and the delay is kept within a limited bound. Several heuristics are used to solve the NP-complete problem of forming groups. Our experimental results show both utilization gains and average relative response time improvements for G-LOMARC-TS over several other scheduling policies.
Model-driven Scheduling for Distributed Stream Processing Systems
Distributed Stream Processing frameworks are being commonly used with the
evolution of the Internet of Things (IoT). These frameworks are designed to adapt to
the dynamic input message rate by scaling in/out. Apache Storm, originally
developed by Twitter, is a widely used stream processing engine, while others
include Flink and Spark Streaming. To run streaming applications
successfully, there is a need to know the optimal resource requirement, as
over-estimation of resources adds extra cost. So we need some strategy to come
up with the optimal resource requirement for a given streaming application. In
this article, we propose a model-driven approach for scheduling streaming
applications that effectively utilizes a priori knowledge of the applications
to provide predictable scheduling behavior. Specifically, we use application
performance models to offer reliable estimates of the resource allocation
required. Further, this intuition also drives resource mapping, and helps
narrow the gap between the estimated and actual dataflow performance and
resource utilization.
Together, this model-driven scheduling approach gives a predictable application
performance and resource utilization behavior for executing a given DSPS
application at a target input stream rate on distributed resources.
Comment: 54 page
Extending Scojo-PECT by migration based on system-level checkpointing
In recent years, a significant amount of research has been done on job scheduling in the high performance computing area. Parallel jobs have different running times and require different numbers of processors, so jobs need to be scheduled and packed to improve system utilization. Scojo-PECT is a job scheduler which provides service guarantees by using coarse-grain time sharing. However, Scojo-PECT does not provide process migration. We extend Scojo-PECT by migrating parallel jobs based on system-level checkpointing. We investigate different cases in the Scojo-PECT scheduling algorithm where migration based on system-level checkpointing can be used to improve resource utilization and reduce job response time. Our experimental results show a reduction in relative response times for medium jobs compared with the original Scojo-PECT scheduler, while long jobs do not suffer any disadvantage.
Fine-Grained Scheduling for Containerized HPC Workloads in Kubernetes Clusters
Containerization technology offers lightweight OS-level virtualization, and
enables portability, reproducibility, and flexibility by packing applications
with low performance overhead and low effort to maintain and scale them.
Moreover, container orchestrators (e.g., Kubernetes) are widely used in the
Cloud to manage large clusters running many containerized applications.
However, scheduling policies that consider the performance nuances of
containerized High Performance Computing (HPC) workloads have not been
well-explored yet. This paper investigates fine-grained scheduling policies for
containerized HPC workloads in Kubernetes clusters, focusing especially on
partitioning each job into a suitable multi-container deployment according to
the application profile. We implement our scheduling schemes on different
layers of management (application and infrastructure), so that each component
has its own focus and algorithms but still collaborates with others. Our
results show that our fine-grained scheduling policies outperform the baseline
policy and the baseline policy with CPU/memory affinity enabled, reducing the
overall response time by 35% and 19%, respectively, and also improving the makespan by
34% and 11%, respectively. They also provide better usability and flexibility
to specify HPC workloads than other comparable HPC Cloud frameworks, while
providing better scheduling efficiency thanks to their multi-layered approach.
Comment: HPCC202
Supercomputer Emulation For Evaluating Scheduling Algorithms
Scheduling algorithms have a significant impact on the optimal
utilization of HPC facilities, yet the vast majority of the
research in this area is done using simulations. In working with
simulations, a great many factors that affect a real
scheduler, such as its scheduling processing time, communication
latencies and the scheduler's intrinsic
implementation complexity, are not considered. As a result,
despite theoretical improvements reported in several articles,
practically no new algorithms proposed have been implemented in
real schedulers, with HPC facilities still using the basic
first-come-first-served (FCFS) with Backfill policy scheduling
algorithm.
A better approach could be, therefore, the use of real schedulers
in an emulation environment to evaluate new algorithms.
This thesis investigates two related challenges in emulations:
computational cost and faithfulness of the results to real
scheduling environments.
It finds that the sampling, shrinking and shuffling of a trace
must be done carefully to keep the classical metrics invariant or
linearly variant in relation to the size and times of the original
workload. This is accomplished by the careful control of the
submission period and the consideration of drifts in the
submission period and trace duration.
This methodology can help researchers to better evaluate their
scheduling algorithms and help HPC administrators to optimize the
parameters of production schedulers.
In order to assess the proposed methodology, we evaluated both
the FCFS with Backfill and Suspend/Resume scheduling algorithms.
The results strongly suggest that Suspend/Resume leads to a
better utilization of a supercomputer when high priorities are
given to big jobs.
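For readers unfamiliar with the FCFS with Backfill policy named in this abstract, the following is a minimal sketch of one common variant (EASY-style backfilling, where only the job at the head of the queue holds a reservation). The thesis does not state which variant it uses, so this choice is an assumption, as are the function name `fcfs_backfill`, the job tuple layout, and the simplifications that all jobs are queued at time 0 with unique names and exactly known runtimes.

```python
import heapq


def fcfs_backfill(jobs, total_procs):
    """Simulate FCFS with EASY-style backfilling.

    jobs: list of (name, procs, runtime) in arrival order, all queued at t=0.
    Returns a dict mapping job name to start time.
    """
    if any(procs > total_procs for _, procs, _ in jobs):
        raise ValueError("a job requests more processors than the machine has")
    queue = list(jobs)
    running = []          # min-heap of (end_time, procs)
    free = total_procs
    now = 0.0
    starts = {}

    def start(job):
        nonlocal free
        name, procs, runtime = job
        starts[name] = now
        heapq.heappush(running, (now + runtime, procs))
        free -= procs

    while queue:
        # Plain FCFS: start jobs from the head of the queue while they fit.
        while queue and queue[0][1] <= free:
            start(queue.pop(0))
        if not queue:
            break
        # The head job does not fit: find its reservation ("shadow") time.
        head_procs = queue[0][1]
        avail, shadow = free, now
        for end, procs in sorted(running):
            avail += procs
            if avail >= head_procs:
                shadow = end
                break
        extra = avail - head_procs  # processors the head job will not need
        # Backfill later jobs that cannot delay the head job's reservation.
        for job in list(queue[1:]):
            _, procs, runtime = job
            ends_in_time = now + runtime <= shadow
            if procs <= free and (ends_in_time or procs <= extra):
                queue.remove(job)
                start(job)
                if not ends_in_time:
                    extra -= procs
        # Advance the clock to the next job completion.
        end, procs = heapq.heappop(running)
        now = end
        free += procs
    return starts


starts = fcfs_backfill(
    [("A", 2, 10), ("B", 4, 5), ("C", 2, 8), ("D", 2, 20)], total_procs=4
)
print(starts)
```

In this example job C is backfilled at time 0 because it finishes before B's reservation at time 10, whereas D is not, since it would run past that reservation and delay B. Under plain FCFS, C would have waited until B finished.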