10,183 research outputs found

    Learning Scheduling Algorithms for Data Processing Clusters

    Efficiently scheduling data processing jobs on distributed compute clusters requires complex algorithms. Current systems, however, use simple generalized heuristics and ignore workload characteristics, since developing and tuning a scheduling policy for each workload is infeasible. In this paper, we show that modern machine learning techniques can generate highly efficient policies automatically. Decima uses reinforcement learning (RL) and neural networks to learn workload-specific scheduling algorithms without any human instruction beyond a high-level objective such as minimizing average job completion time. Off-the-shelf RL techniques, however, cannot handle the complexity and scale of the scheduling problem. To build Decima, we had to develop new representations for jobs' dependency graphs, design scalable RL models, and invent RL training methods for dealing with continuous stochastic job arrivals. Our prototype integration with Spark on a 25-node cluster shows that Decima improves the average job completion time over hand-tuned scheduling heuristics by at least 21%, achieving up to 2x improvement during periods of high cluster load.

    Spatial-temporal data modelling and processing for personalised decision support

    The purpose of this research is to model dynamic data without losing any of the temporal relationships, and to predict the likelihood of an outcome as far in advance of its actual occurrence as possible. To this end, a novel computational architecture is proposed for personalised (individualised) modelling of spatio-temporal data based on spiking neural network methods (PMeSNNr), with a three-dimensional visualisation of relationships between variables. In brief, the architecture is able to transfer spatio-temporal data patterns from a multidimensional input stream into internal patterns in the spiking neural network reservoir. These patterns are then analysed to produce a personalised model for either classification or prediction, depending on the specific needs of the situation. The architecture was constructed using MATLAB in several individual modules linked together to form NeuCube (M1). This methodology has been applied to two real-world case studies. Firstly, it has been applied to data for the prediction of stroke occurrences on an individual basis. Secondly, it has been applied to ecological data on aphid pest abundance prediction. The two main objectives when judging the outcomes of the modelling are accurate prediction and making that prediction at the earliest possible time point. The implications of these findings are significant for health care management and environmental control. As the case studies used here represent vastly different application fields, they reveal more of the potential and usefulness of NeuCube (M1) for modelling data in an integrated manner. This in turn can identify previously unknown (or less understood) interactions, thus both increasing the level of reliance that can be placed on the model created and enhancing our human understanding of the complexities of the world around us without the need for oversimplification.
    Keywords: Personalised modelling; Spiking neural network; Spatial-temporal data modelling; Computational intelligence; Predictive modelling; Stroke risk prediction

    A Survey on Scheduling the Task in Fog Computing Environment

    With the rapid growth of the Internet of Things (IoT), the amount of data produced and processed has also increased. Cloud computing facilitates the storage, processing, and analysis of data as needed. However, cloud computing devices are located far from the IoT devices. Fog computing has emerged as a small-scale cloud computing paradigm that sits near the edge devices and handles tasks efficiently. Fog nodes have a smaller storage capacity than cloud nodes, but they are designed and deployed near the edge devices so that requests can be accessed efficiently and executed in time. In this survey paper, we investigate and analyse the main challenges and issues raised in scheduling tasks in a fog computing environment. To the best of our knowledge, there is no comprehensive survey paper on the challenges of task scheduling in the fog computing paradigm. This survey covers research published from 2018 to 2021, with most of the selected papers from 2020-2021. Moreover, this survey paper organizes the task scheduling approaches and technically maps out the identified challenges and issues. Based on the identified issues, we highlight future work directions in the field of task scheduling in fog computing environments.

    Survey of dynamic scheduling in manufacturing systems


    Reinforcement learning based local search for grouping problems: A case study on graph coloring

    Grouping problems aim to partition a set of items into multiple mutually disjoint subsets according to some specific criterion and constraints. Grouping problems cover a large class of important combinatorial optimization problems that are generally computationally difficult. In this paper, we propose a general solution approach for grouping problems, i.e., reinforcement learning based local search (RLS), which combines reinforcement learning techniques with descent-based local search. The viability of the proposed approach is verified on a well-known representative grouping problem (graph coloring), where a very simple descent-based coloring algorithm is applied. Experimental studies on popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves competitive performance compared to a number of well-known coloring algorithms.
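
To illustrate the general idea of pairing reinforcement learning with descent-based local search on graph coloring, here is a minimal Python sketch. It is not the RLS algorithm from the paper, whose details the abstract does not give. The function names, the per-vertex probability table, and the linear reward update with rate `alpha` are all assumptions chosen for illustration.

```python
import random


def conflicts(graph, coloring):
    """Count edges whose two endpoints share a color."""
    return sum(
        1
        for u in graph
        for v in graph[u]
        if u < v and coloring[u] == coloring[v]
    )


def rl_descent_coloring(graph, k, max_iters=1000, alpha=0.1, seed=0):
    """Search for a conflict-free k-coloring; return it, or None on failure.

    graph maps each vertex to a list of its neighbors (undirected).
    Hypothetical scheme: sample a coloring from learned probabilities,
    improve it by descent, then reinforce the colors the descent ended on.
    """
    rng = random.Random(seed)
    nodes = sorted(graph)
    # prob[v][c]: learned preference for placing vertex v in color group c.
    prob = {v: [1.0 / k] * k for v in nodes}

    for _ in range(max_iters):
        # Sample a starting coloring from the current probabilities.
        coloring = {v: rng.choices(range(k), weights=prob[v])[0] for v in nodes}

        # Descent: move a vertex only if that strictly reduces its conflicts,
        # so the total conflict count decreases and the loop terminates.
        improved = True
        while improved:
            improved = False
            for v in nodes:
                counts = [0] * k
                for u in graph[v]:
                    counts[coloring[u]] += 1
                best_c = min(range(k), key=counts.__getitem__)
                if counts[best_c] < counts[coloring[v]]:
                    coloring[v] = best_c
                    improved = True

        if conflicts(graph, coloring) == 0:
            return coloring

        # Reinforcement (assumed linear reward rule): shift probability mass
        # toward the color each vertex holds at the local optimum. Each row
        # still sums to 1 after the update.
        for v in nodes:
            for c in range(k):
                if c == coloring[v]:
                    prob[v][c] += alpha * (1.0 - prob[v][c])
                else:
                    prob[v][c] *= 1.0 - alpha
    return None


# Usage: a 5-cycle, which needs 3 colors.
cycle = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
result = rl_descent_coloring(cycle, 3)
print(result, conflicts(cycle, result))
```

The learned probabilities only bias where each restart begins. The descent step does the actual repair, which is one simple way to divide the work between the two components.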