Assessment of Response Time for New Multi Level Feedback Queue Scheduler
Response time, one of the defining characteristics of a scheduler, is a
prominent attribute of any CPU scheduling algorithm. The proposed New Multi
Level Feedback Queue (NMLFQ) Scheduler is compared with two dynamic, real-time,
best-effort schedulers: the Dependent Activity Scheduling Algorithm (DASA) and
Locke's Best Effort Scheduling Algorithm (LBESA). We demonstrate the beneficial
results of the NMLFQ scheduler in comparison with the dynamic best-effort
schedulers with respect to response time.
Comment: 7 pages, 5 figures
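The multilevel feedback queue idea underlying NMLFQ can be illustrated with a minimal simulation. This is a generic textbook MLFQ, not the paper's NMLFQ; the number of queues and the quanta are illustrative assumptions:

```python
from collections import deque

def mlfq_response_times(jobs, quanta=(2, 4, 8)):
    """Simulate a simple multilevel feedback queue (illustrative, not NMLFQ).

    jobs: list of (arrival_time, burst_time). A job that exhausts its
    quantum is demoted one priority level. Returns each job's response
    time (first time on the CPU minus arrival), the metric studied above.
    """
    queues = [deque() for _ in quanta]
    remaining = [burst for _, burst in jobs]
    first_run = [None] * len(jobs)
    arrived = [False] * len(jobs)
    t, done = 0, 0

    def admit(now):
        # Newly arrived jobs enter the highest-priority queue.
        for i, (arr, _) in enumerate(jobs):
            if not arrived[i] and arr <= now:
                arrived[i] = True
                queues[0].append(i)

    admit(t)
    while done < len(jobs):
        level = next((l for l, q in enumerate(queues) if q), None)
        if level is None:  # CPU idle: jump to the next arrival
            t = min(arr for i, (arr, _) in enumerate(jobs) if not arrived[i])
            admit(t)
            continue
        i = queues[level].popleft()
        if first_run[i] is None:
            first_run[i] = t
        run = min(quanta[level], remaining[i])
        t += run
        remaining[i] -= run
        admit(t)
        if remaining[i] == 0:
            done += 1
        else:  # quantum exhausted: demote to a lower-priority queue
            queues[min(level + 1, len(quanta) - 1)].append(i)
    return [first_run[i] - jobs[i][0] for i in range(len(jobs))]
```

A short job arriving behind a long one still gets on the CPU quickly, which is why MLFQ-style schedulers tend to do well on response time: `mlfq_response_times([(0, 10), (0, 3), (4, 1)])` returns `[0, 2, 0]`.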
Throughput Prediction of Asynchronous SGD in TensorFlow
Modern machine learning frameworks can train neural networks using multiple
nodes in parallel, each computing parameter updates with stochastic gradient
descent (SGD) and sharing them asynchronously through a central parameter
server. Due to communication overhead and bottlenecks, the total throughput of
SGD updates in a cluster scales sublinearly, saturating as the number of nodes
increases. In this paper, we present a solution to predicting training
throughput from profiling traces collected from a single-node configuration.
Our approach is able to model the interaction of multiple nodes and the
scheduling of concurrent transmissions between the parameter server and each
node. By accounting for the dependencies between received parts and pending
computations, we predict overlaps between computation and communication and
generate synthetic execution traces for configurations with multiple nodes. We
validate our approach on TensorFlow training jobs for popular image
classification neural networks, on AWS and on our in-house cluster, using nodes
equipped with GPUs or only with CPUs. We also investigate the effects of data
transmission policies used in TensorFlow and the accuracy of our approach when
combined with optimizations of the transmission schedule.
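The sublinear scaling described above can be sketched with a toy analytical model (not the paper's trace-based predictor): each node computes a gradient update in `t_compute` seconds, all updates traverse a shared parameter-server link that serializes them at one update per `t_comm` seconds, and aggregate throughput saturates at the link capacity. The function name and the full-serialization assumption are illustrative:

```python
def predicted_throughput(n_nodes, t_compute, t_comm):
    """Toy throughput model for asynchronous SGD with a central
    parameter server (illustrative assumption, not the paper's model).

    A single node sustains 1 / (t_compute + t_comm) updates/s, but the
    shared server link caps aggregate throughput at 1 / t_comm updates/s,
    so scaling is linear at first and then flattens as nodes are added.
    """
    per_node = 1.0 / (t_compute + t_comm)
    link_cap = 1.0 / t_comm
    return min(n_nodes * per_node, link_cap)

# Linear region, then saturation at the link capacity:
# curve = [predicted_throughput(n, t_compute=0.9, t_comm=0.1)
#          for n in range(1, 21)]
```

The paper's approach replaces the two scalar costs with per-tensor profiling traces and a simulation of overlapping transmissions, but the saturating shape is the same.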
Achieving Accurate Predictions of Future Events Under Hardware Heterogeneity
Heterogeneous hardware is becoming increasingly available in modern systems, and research breakthroughs suggest that heterogeneity will keep increasing in the future. Significant gains in performance and power consumption can be achieved through appropriate utilization of heterogeneity; poor utilization, however, can have a detrimental effect. Intelligent scheduling and resource management is a crucial challenge we need to overcome in order to harvest the full potential of heterogeneous hardware. As systems become larger and include greater levels of hardware diversity, the importance of intelligent scheduling and resource management is further accentuated.

This dissertation presents techniques that aid scheduling and resource management in the presence of heterogeneous hardware by accurately predicting upcoming runtime events. With a proactive and accurate view of the near future, schedulers can utilize the underlying hardware more efficiently and take full advantage of the available benefits.

By adapting a majority element heuristic, this dissertation significantly improves the accuracy of predicting memory addresses about to be accessed, while reducing prediction-related costs by a factor of ten thousand compared to previously proposed predictive approaches. Coupled with novel microarchitectural modifications, accurate address predictions are shown to improve the performance of heterogeneous memory architectures.

Machine learning-based performance predictors are also presented, capable of predicting a program's performance when executed on a given general-purpose core. Trained to model the subtleties of the interaction between hardware and software, these predictors generate highly accurate predictions even for cores with different Instruction Set Architectures. Utilizing these performance predictions for job scheduling is shown to improve overall system performance. The trained predictors are further examined and interpreted in order to visualize the correlations between features picked up and amplified during training.

Finally, this dissertation demonstrates that scheduling algorithms cannot guarantee an optimal schedule in realistic execution scenarios, due to the underlying hardware heterogeneity, the wide range of runtime requirements of software, and prediction error from the performance predictors. In response, deep neural networks are trained to select one scheduling approach from a list of options with varied overheads and correctness guarantees. The chosen approach is the one most likely to return the highest-performance schedule with the lowest overhead for a particular instance of the job-to-core assignment problem.
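The majority element heuristic the dissertation adapts is in the family of the Boyer-Moore majority vote, a one-pass, O(1)-state algorithm. A minimal sketch follows, applied to strides between recent memory addresses; the window-of-deltas framing and the helper names are illustrative assumptions, not the dissertation's exact design:

```python
def majority_vote(items):
    """Boyer-Moore majority vote: one pass, constant state.

    Returns the element that would be the majority if one exists;
    callers should verify the candidate when a strict majority is
    not guaranteed.
    """
    candidate, count = None, 0
    for x in items:
        if count == 0:
            candidate, count = x, 1
        elif x == candidate:
            count += 1
        else:
            count -= 1
    return candidate

def predict_next_address(recent_addresses):
    """Predict the next access as last address + majority stride over
    a recent window (illustrative use of the vote, not the
    dissertation's predictor)."""
    deltas = [b - a for a, b in zip(recent_addresses, recent_addresses[1:])]
    stride = majority_vote(deltas)
    return recent_addresses[-1] + (stride or 0)
```

The appeal in this setting is the cost profile: the vote keeps only a candidate and a counter, so a strided pattern survives occasional irregular accesses without any history table, which is consistent with the large cost reduction claimed over table-based predictive approaches.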