The Improved Job Scheduling Algorithm of Hadoop Platform
This paper discusses several job scheduling algorithms for the Hadoop platform
and, in view of their shortcomings, proposes a job scheduling optimization
algorithm based on Bayes classification. The proposed algorithm can be
summarized as follows. The jobs in the job queue are classified into good jobs
and bad jobs by Bayes classification. When the JobTracker receives a task
request, it selects a good job from the job queue and allocates tasks from that
job to the requesting TaskTracker; the execution result is then fed back to the
JobTracker.
The algorithm thus refines its job classification by learning from this
feedback, so that the JobTracker selects the most appropriate job to execute on
a TaskTracker each time. Two kinds of feature variables are considered: job
features, which describe a job's resource usage (for instance, its average CPU
usage rate and average memory usage rate), and node features, which describe
how TaskTracker resources affect task execution (such as the CPU usage rate and
the size of idle physical memory). Results show a significant improvement in
the execution efficiency and stability of job scheduling.
Comment: 12 pages, 3 figures
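
The classify, select, and feed-back loop summarized in the abstract can be
illustrated with a minimal sketch. This is not the paper's implementation: it
assumes a naive Bayes classifier with Laplace smoothing over feature values
that have already been discretized into labels such as "low" and "high", and
the names BayesJobClassifier, select_job, job_cpu, and node_cpu are all
hypothetical.

```python
import math
from collections import defaultdict


class BayesJobClassifier:
    """Naive Bayes over discretized job and node features (illustrative only)."""

    LABELS = ("good", "bad")

    def __init__(self):
        self.label_counts = {label: 0 for label in self.LABELS}
        # feature_counts[label][(feature_name, value)] -> number of observations
        self.feature_counts = {label: defaultdict(int) for label in self.LABELS}
        # Distinct values seen per feature, used for Laplace smoothing.
        self.values_seen = defaultdict(set)

    def update(self, features, label):
        """Learn from one execution result fed back to the scheduler."""
        self.label_counts[label] += 1
        for name, value in features.items():
            self.feature_counts[label][(name, value)] += 1
            self.values_seen[name].add(value)

    def _log_score(self, features, label):
        total = sum(self.label_counts.values())
        # Smoothed prior P(label).
        score = math.log((self.label_counts[label] + 1) / (total + len(self.LABELS)))
        for name, value in features.items():
            count = self.feature_counts[label].get((name, value), 0)
            n_values = len(self.values_seen.get(name, set()) | {value})
            # Smoothed likelihood P(feature = value | label).
            score += math.log((count + 1) / (self.label_counts[label] + n_values))
        return score

    def p_good(self, features):
        """Posterior probability that a job with these features is a good job."""
        good = self._log_score(features, "good")
        bad = self._log_score(features, "bad")
        top = max(good, bad)
        e_good, e_bad = math.exp(good - top), math.exp(bad - top)
        return e_good / (e_good + e_bad)


def select_job(classifier, job_queue, node_features, threshold=0.5):
    """Return the id of the queued job most likely to be good on this node.

    job_queue is a list of (job_id, job_features) pairs. Returns None when no
    job reaches the (assumed) threshold for being classified as good.
    """
    best_id, best_p = None, threshold
    for job_id, job_features in job_queue:
        p = classifier.p_good({**job_features, **node_features})
        if p >= best_p:
            best_id, best_p = job_id, p
    return best_id


# Hypothetical feedback: CPU-heavy jobs did badly on a busy node, light jobs did well.
classifier = BayesJobClassifier()
for _ in range(3):
    classifier.update({"job_cpu": "high", "node_cpu": "high"}, "bad")
    classifier.update({"job_cpu": "low", "node_cpu": "high"}, "good")

queue = [("job-a", {"job_cpu": "high"}), ("job-b", {"job_cpu": "low"})]
chosen = select_job(classifier, queue, {"node_cpu": "high"})
print(chosen)
```

In this sketch each call to update plays the role of the execution result fed
back to the JobTracker, and select_job plays the role of choosing a good job
when a task request arrives. How the paper discretizes its feature variables
and labels an execution as good or bad is not stated in the abstract.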