Learning Scheduling Algorithms for Data Processing Clusters
Efficiently scheduling data processing jobs on distributed compute clusters
requires complex algorithms. Current systems, however, use simple generalized
heuristics and ignore workload characteristics, since developing and tuning a
scheduling policy for each workload is infeasible. In this paper, we show that
modern machine learning techniques can generate highly-efficient policies
automatically. Decima uses reinforcement learning (RL) and neural networks to
learn workload-specific scheduling algorithms without any human instruction
beyond a high-level objective such as minimizing average job completion time.
Off-the-shelf RL techniques, however, cannot handle the complexity and scale of
the scheduling problem. To build Decima, we had to develop new representations
for jobs' dependency graphs, design scalable RL models, and invent RL training
methods for dealing with continuous stochastic job arrivals. Our prototype
integration with Spark on a 25-node cluster shows that Decima improves the
average job completion time over hand-tuned scheduling heuristics by at least
21%, achieving up to 2x improvement during periods of high cluster load.
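The objective named in the abstract above, average job completion time, can be illustrated with a minimal sketch. It is not Decima's method: it only compares two fixed orderings on a single executor, assuming all jobs arrive at time 0. The function name `average_jct` and the job durations are hypothetical.

```python
from itertools import accumulate


def average_jct(durations):
    """Average completion time when jobs run back to back in the given order.

    Assumes one executor and that every job arrives at time 0.
    """
    completion_times = list(accumulate(durations))
    return sum(completion_times) / len(completion_times)


# Hypothetical job durations, in arbitrary time units.
jobs = [8, 1, 3]

fifo = average_jct(jobs)         # run in arrival order
sjf = average_jct(sorted(jobs))  # run shortest job first

print(f"FIFO average JCT: {fifo:.2f}")
print(f"SJF average JCT:  {sjf:.2f}")
```

Even in this toy setting the ordering changes the objective, which is the quantity a learned policy would be trained to minimize.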
Spatial-temporal data modelling and processing for personalised decision support
The purpose of this research is to model dynamic data without losing any of the temporal relationships, and to predict the likelihood of an outcome as far in advance of its actual occurrence as possible. To this end, a novel computational architecture is proposed for personalised (individualised) modelling of spatio-temporal data based on spiking neural network methods (PMeSNNr), with a three-dimensional visualisation of relationships between variables. In brief, the architecture transfers spatio-temporal data patterns from a multidimensional input stream into internal patterns in the spiking neural network reservoir. These patterns are then analysed to produce a personalised model for either classification or prediction, depending on the specific needs of the situation. The architecture was constructed in MATLAB as several individual modules linked together to form NeuCube (M1). This methodology has been applied to two real-world case studies. Firstly, it has been applied to data for the prediction of stroke occurrences on an individual basis. Secondly, it has been applied to ecological data on aphid pest abundance prediction. The two main objectives when judging outcomes of the modelling are accurate prediction and obtaining that prediction at the earliest possible time point. The findings have significant implications for health care management and environmental control. As the case studies represent vastly different application fields, they reveal more of the potential and usefulness of NeuCube (M1) for modelling data in an integrated manner. This in turn can identify previously unknown (or less understood) interactions, both increasing the level of reliance that can be placed on the model created and enhancing our understanding of the complexities of the world around us without the need for oversimplification.
Keywords
Personalised modelling; Spiking neural network; Spatial-temporal data modelling; Computational intelligence; Predictive modelling; Stroke risk prediction
A Survey on Scheduling the Task in Fog Computing Environment
With the rapid growth of the Internet of Things (IoT), the amount of data
produced and processed has also increased. Cloud computing facilitates the
storage, processing, and analysis of data as needed. However, cloud computing
devices are located far away from the IoT devices. Fog computing has emerged as
a small-scale cloud computing paradigm that sits near the edge devices and
handles tasks efficiently. Fog nodes have a smaller storage capacity than
cloud nodes, but they are designed and deployed near the edge devices so that
requests can be served efficiently and executed in time. In this survey paper
we investigate and analyse the main challenges and issues raised in
scheduling tasks in a fog computing environment. To the best of our knowledge
there is no comprehensive survey paper on the challenges of task scheduling in
the fog computing paradigm. This survey covers research published from 2018 to
2021, with most of the selected papers from 2020-2021. Moreover, this
survey paper organizes the task scheduling approaches and lays out the
identified challenges and issues. Based on the identified issues, we
highlight future work directions in the field of task scheduling in fog
computing environments.
Reinforcement learning based local search for grouping problems: A case study on graph coloring
Grouping problems aim to partition a set of items into multiple mutually
disjoint subsets according to some specific criterion and constraints. Grouping
problems cover a large class of important combinatorial optimization problems
that are generally computationally difficult. In this paper, we propose a
general solution approach for grouping problems, i.e., reinforcement learning
based local search (RLS), which combines reinforcement learning techniques with
descent-based local search. The viability of the proposed approach is verified
on a well-known representative grouping problem (graph coloring) where a very
simple descent-based coloring algorithm is applied. Experimental studies on
popular DIMACS and COLOR02 benchmark graphs indicate that RLS achieves
competitive performance compared to a number of well-known coloring
algorithms.
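The descent-based component mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' RLS algorithm: it omits the reinforcement learning part entirely and shows only a plain conflict-reducing descent for a fixed number of colors `k`. The function names, the adjacency-dictionary representation, and the example graph are assumptions made for this sketch.

```python
import random


def count_conflicts(adj, coloring):
    """Number of edges whose two endpoints share a color."""
    return sum(
        1 for u in adj for v in adj[u] if u < v and coloring[u] == coloring[v]
    )


def descent_coloring(adj, k, seed=0):
    """Descent-based local search for k-coloring (illustrative sketch).

    adj maps each vertex to the set of its neighbours (undirected graph,
    no self-loops). Starting from a random coloring, a vertex is recolored
    only when that strictly lowers its number of conflicting neighbours,
    so the total conflict count decreases with every move and the loop
    terminates at a local optimum.
    """
    rng = random.Random(seed)
    coloring = {v: rng.randrange(k) for v in adj}
    improved = True
    while improved:
        improved = False
        for v in adj:
            # Conflicts vertex v would have under each candidate color.
            cost = [sum(1 for u in adj[v] if coloring[u] == c) for c in range(k)]
            best = min(range(k), key=cost.__getitem__)
            if cost[best] < cost[coloring[v]]:
                coloring[v] = best
                improved = True
    return coloring, count_conflicts(adj, coloring)


# Hypothetical example: a triangle, which needs exactly three colors.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
coloring, conflicts = descent_coloring(triangle, k=3)
print(coloring, conflicts)
```

A pure descent like this can stall in a local optimum with conflicts remaining on harder graphs, which is the kind of limitation that combining it with a learning component is meant to address.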