Scheduling Real-Time Jobs in Distributed Systems - Simulation and Performance Analysis
Proceedings of: First International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2014), Porto (Portugal), August 27-28, 2014.

One of the major challenges in ultrascale systems is the effective scheduling of complex jobs within strict timing constraints. The distributed and heterogeneous system resources constitute another critical issue that must be addressed by the employed scheduling strategy. In this paper, we investigate by simulation the performance of various policies for the scheduling of real-time directed acyclic graphs in a heterogeneous distributed environment. We apply bin packing techniques during the processor selection phase of the scheduling process, in order to utilize schedule gaps and thus enhance existing list scheduling methods. The simulation results show that the proposed policies outperform all of the other examined algorithms.

The work presented in this paper has been partially supported by the EU under the COST program Action IC1305, “Network for Sustainable Ultrascale Computing (NESUS)”.
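
A minimal sketch of the gap-filling idea the abstract describes: during processor selection, a ready task is placed into the earliest idle gap large enough to hold it (first-fit, as in bin packing) rather than only at the tail of a processor's schedule. All names and the simplified model (fixed execution times, a single ready time per task) are illustrative assumptions, not the paper's actual simulator.

def first_fit_gap(busy, ready_time, exec_time):
    """busy: sorted list of (start, end) busy intervals on one processor.
    Return the earliest feasible start time for the task."""
    t = ready_time
    for start, end in busy:
        if start - t >= exec_time:  # idle gap before this interval fits
            return t
        t = max(t, end)             # skip past the busy interval
    return t                        # otherwise schedule at the tail

def select_processor(processors, ready_time, exec_times):
    """Pick the processor minimizing the task's finish time.
    processors: {proc_id: sorted busy intervals};
    exec_times: {proc_id: execution time on that processor}."""
    best = None
    for p, busy in processors.items():
        s = first_fit_gap(busy, ready_time, exec_times[p])
        f = s + exec_times[p]
        if best is None or f < best[2]:
            best = (p, s, f)
    return best  # (processor, start, finish)

# Example: a task ready at t=2 needing 3 time units fits into the idle
# gap [4, 8) on processor 0 instead of starting after t=10.
procs = {0: [(0, 4), (8, 10)], 1: [(0, 9)]}
print(select_processor(procs, ready_time=2, exec_times={0: 3, 1: 3}))
# -> (0, 4, 7)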
A Gossip-based optimistic replication for efficient delay-sensitive streaming using an interactive middleware support system
When resources are shared, efficiency degrades substantially because the requested resources are scarce when they must be made available to multiple clients at once. This scarcity is often aggravated by factors such as temporal constraints on availability or node flooding by requested replicated file chunks. Replicated file chunks should therefore be disseminated efficiently, so that resources are available on demand to mobile users. This work considers a cross-layered middleware support system for efficient delay-sensitive streaming that exploits each device's connectivity and social interactions in a cross-layered manner. Collaborative streaming is achieved through an epidemic file chunk replication policy that uses a transition-based approach, following a chained model of an infectious disease with susceptible, infected, recovered, and death states. The Gossip-based stateful model directs each mobile node either to host a file chunk or not, and to purge a chunk once it is no longer needed. The proposed model is thoroughly evaluated through experimental simulation, measuring the effective throughput Eff as a function of the packet loss parameter against the effectiveness of the Gossip-based replication policy.

Comment: IEEE Systems Journal 201
Different aspects of workflow scheduling in large-scale distributed systems
As large-scale distributed systems gain momentum, the scheduling of workflow applications with multiple requirements on such computing platforms has become a crucial area of research. In this paper, we investigate the workflow scheduling problem in large-scale distributed systems from the Quality of Service (QoS) and data locality perspectives. We present a scheduling approach that considers two models of synchronization for the tasks in a workflow application: (a) communication through the network and (b) communication through temporary files. Specifically, we investigate via simulation the performance of a heterogeneous distributed system where multiple soft real-time workflow applications arrive dynamically. The applications are scheduled under various tardiness bounds, taking into account the communication cost in the first case study and the I/O cost and data locality in the second.

The work presented in this paper has been partially supported by the EU, under the COST program Action IC1305, “Network for Sustainable Ultrascale Computing (NESUS)”, and by the Ministerio de Economía y Competitividad, Spain, under the project TIN2013-41350-P, “Scalable Data Management Techniques for High-End Computing Systems”.
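
A short sketch of the two synchronization models just mentioned, under simplified assumptions of my own (uniform bandwidth and I/O rates, one parent per edge): with network communication a child task waits for the data transfer, while with temporary files it pays write and read I/O costs unless data locality places it on the same node as the file.

def ready_time_network(parent_finish, data, bandwidth, same_node):
    # no transfer needed when the child is co-located with the parent
    return parent_finish + (0 if same_node else data / bandwidth)

def ready_time_tempfile(parent_finish, data, write_rate, read_rate, local):
    t = parent_finish + data / write_rate          # parent writes the file
    return t + (0 if local else data / read_rate)  # remote child reads it

def tardiness(finish, deadline):
    # soft real-time: how far past its deadline the workflow finishes
    return max(0.0, finish - deadline)

# Example: 100 MB intermediate result, parent task done at t=50 s.
print(ready_time_network(50, 100, bandwidth=25, same_node=False))  # 54.0
print(ready_time_tempfile(50, 100, write_rate=50, read_rate=50,
                          local=True))                             # 52.0
print(tardiness(finish=58, deadline=55))                           # 3.0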
Exascale machines require new programming paradigms and runtimes
Extreme-scale parallel computing systems will have tens of thousands of optionally accelerator-equipped nodes with hundreds of cores each, as well as deep memory hierarchies and complex interconnect topologies. Such Exascale systems will provide hardware parallelism at multiple levels and will be energy constrained. Their extreme scale and the rapidly deteriorating reliability of their hardware components mean that Exascale systems will exhibit low mean-time-between-failure values. Furthermore, existing programming models already require heroic programming and optimisation efforts to achieve high efficiency on current supercomputers. Invariably, these efforts are platform-specific and non-portable. In this paper we explore the shortcomings of existing programming models and runtime systems for large-scale computing systems. We then propose and discuss important features of programming paradigms and runtime systems for dealing with large-scale computing systems, with a special focus on data-intensive applications and resilience. Finally, we also discuss code sustainability issues and propose several software metrics that are of paramount importance for code development for large-scale computing systems
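
A back-of-the-envelope illustration (my numbers, not the paper's) of why extreme scale drives down mean time between failures: assuming independent, exponentially distributed node failures, the system MTBF is roughly the node MTBF divided by the node count.

HOURS_PER_YEAR = 8766

def system_mtbf_hours(node_mtbf_years, num_nodes):
    # assumes independent, exponentially distributed node failures
    return node_mtbf_years * HOURS_PER_YEAR / num_nodes

# 100,000 nodes that each fail once every 5 years on average:
mtbf = system_mtbf_hours(node_mtbf_years=5, num_nodes=100_000)
print(f"system MTBF ~ {mtbf:.2f} hours ({mtbf * 60:.0f} minutes)")
# -> roughly 0.44 hours (~26 minutes)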