    Survey of dynamic scheduling in manufacturing systems

    Self-organising agent communities for autonomic resource management

    The autonomic computing paradigm addresses the operational challenges presented by increasingly complex software systems by proposing that they be composed of many autonomous components, each responsible for the run-time reconfiguration of its own dedicated hardware and software components. Consequently, regulation of the whole software system becomes an emergent property of local adaptation and learning carried out by these autonomous system elements. Designing appropriate local adaptation policies for the components of such systems remains a major challenge. This is particularly true where the system’s scale and dynamism compromise the efficiency of a central executive and/or prevent components from pooling information to achieve a shared, accurate evidence base for their negotiations and decisions. In this paper, we investigate how a self-regulatory system response may arise spontaneously from local interactions between autonomic system elements tasked with adaptively consuming/providing computational resources or services when the demand for such resources is continually changing. We demonstrate that system performance is not maximised when all system components are able to freely share information with one another. Rather, maximum efficiency is achieved when individual components have only limited knowledge of their peers. Under these conditions, the system self-organises into appropriate community structures. By maintaining information flow at the level of communities, the system is able to remain stable enough to efficiently satisfy service demand in resource-limited environments, and thus minimise unnecessary reconfiguration whilst remaining sufficiently adaptive to reconfigure when service demand changes.
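    The sketch below illustrates the local-adaptation idea the abstract describes: agents each provide one service, see supply only through a small fixed set of known peers, and reconfigure only when their limited local view suggests a clear shortfall. All names, parameters, and the hysteresis rule are illustrative assumptions, not the paper's actual model.

```python
# Hypothetical toy model: agents with limited peer knowledge adapting which
# service they provide as demand shifts. Not the paper's simulation.
import random

N_AGENTS, N_SERVICES, PEERS_PER_AGENT = 50, 4, 5
random.seed(1)

# Each agent starts with a random service and knows only a few peers.
service = [random.randrange(N_SERVICES) for _ in range(N_AGENTS)]
peers = [random.sample([b for b in range(N_AGENTS) if b != a], PEERS_PER_AGENT)
         for a in range(N_AGENTS)]

def step(demand):
    """One round of local adaptation; demand maps service index -> required capacity."""
    reconfigurations = 0
    for a in range(N_AGENTS):
        # The agent's view of supply is limited to itself and its known peers.
        local_supply = [0] * N_SERVICES
        for b in [a] + peers[a]:
            local_supply[service[b]] += 1
        # Shortfall per service, with global demand scaled down to the local view.
        view = (1 + PEERS_PER_AGENT) / N_AGENTS
        shortfall = [demand[s] * view - local_supply[s] for s in range(N_SERVICES)]
        best = max(range(N_SERVICES), key=lambda s: shortfall[s])
        # Hysteresis: reconfigure only when the switch looks clearly worthwhile,
        # which keeps the population stable and avoids constant churn.
        if best != service[a] and shortfall[best] > shortfall[service[a]] + 1:
            service[a] = best
            reconfigurations += 1
    return reconfigurations

# Demand shifts halfway through; the population should track it with few changes.
for t in range(20):
    demand = [10, 30, 5, 5] if t < 10 else [5, 5, 30, 10]
    changes = step(demand)
    counts = [service.count(s) for s in range(N_SERVICES)]
    print(f"t={t:2d} providers per service={counts} reconfigurations={changes}")
```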

    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.
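    As a rough illustration of the underlying idea (a policy scores runnable jobs, picks one, and is updated by policy-gradient RL to reduce average job completion time), here is a minimal single-machine sketch. Decima itself uses graph neural networks over Spark DAGs and handles streaming job arrivals; the features, environment, and update below are simplified stand-ins, not the paper's implementation.

```python
# Toy REINFORCE scheduler: a linear-softmax policy over per-job features
# [remaining_work, time_in_system], trained to minimise average completion time.
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)  # policy weights

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def run_episode(n_jobs=8):
    # Each job: [remaining work, time in system]; single non-preemptive server.
    jobs = [[rng.uniform(1, 10), 0.0] for _ in range(n_jobs)]
    trace, completions, t = [], [], 0.0
    while jobs:
        feats = np.array(jobs)
        probs = softmax(feats @ theta)
        a = rng.choice(len(jobs), p=probs)
        trace.append((feats, probs, a))
        t += jobs[a][0]          # run the chosen job to completion
        completions.append(t)
        jobs.pop(a)
        for j in jobs:
            j[1] = t             # remaining jobs keep waiting
    return trace, float(np.mean(completions))

baseline = None
for episode in range(2000):
    trace, avg_jct = run_episode()
    baseline = avg_jct if baseline is None else 0.9 * baseline + 0.1 * avg_jct
    advantage = -(avg_jct - baseline)   # lower completion time => positive advantage
    for feats, probs, a in trace:
        grad = feats[a] - probs @ feats  # grad of log pi(a) for linear-softmax policy
        theta += 0.001 * advantage * grad

# A negative weight on remaining work approximates shortest-job-first behaviour.
print("learned weights:", theta)
```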