A batch scheduler with high level components
In this article we present the design choices and the evaluation of a batch
scheduler for large clusters, named OAR. This batch scheduler is based upon an
original design that emphasizes low software complexity by using high level
tools. The global architecture is built upon the scripting language Perl and
the relational database engine MySQL. The goal of the OAR project is to prove
that it is possible today to build a complex system for resource management
using such tools without sacrificing efficiency and scalability. Currently, our
system offers most of the important features implemented by other batch
schedulers such as priority scheduling (by queues), reservations, backfilling
and some global computing support. Despite the use of high level tools, our
experiments show that our system performs close to other systems.
Furthermore, OAR is currently used to manage 700 nodes (a
metropolitan GRID) and has shown good efficiency and robustness.
Single-machine scheduling with stepwise tardiness costs and release times
We study a scheduling problem that belongs to the yard operations component of the railroad planning problems, namely the hump sequencing problem. The scheduling problem is characterized as a single-machine problem with stepwise tardiness cost objectives. This is a new scheduling criterion which is also relevant in the context of traditional machine scheduling problems. We produce complexity results that characterize some cases of the problem as pseudo-polynomially solvable. For the difficult-to-solve cases of the problem, we develop mathematical programming formulations and propose heuristic algorithms. We test the formulations and heuristic algorithms on randomly generated single-machine scheduling problems and real-life datasets for the hump sequencing problem. Our experiments show promising results for both sets of problems.
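As a rough illustration of what a stepwise tardiness cost with release times can look like, the sketch below evaluates one job sequence on a single machine. The abstract does not give the cost function, so everything here is an assumption made for illustration: the `Job` fields, the rule that a penalty is added each time tardiness exceeds a threshold, and the example numbers. It is not the formulation or heuristic used by the authors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    # All fields are hypothetical names chosen for this sketch.
    release: int        # earliest time the job may start
    processing: int     # time the job occupies the machine
    due: int            # no cost if the job completes by this time
    thresholds: tuple   # ascending tardiness levels where the cost steps up
    penalties: tuple    # cost added once the matching threshold is exceeded


def stepwise_cost(job, completion):
    """Assumed step function: sum the penalties of every exceeded threshold."""
    tardiness = max(0, completion - job.due)
    return sum(p for t, p in zip(job.thresholds, job.penalties) if tardiness > t)


def schedule_cost(sequence):
    """Total cost of processing jobs in the given order on one machine."""
    clock = 0
    total = 0
    for job in sequence:
        start = max(clock, job.release)   # the machine may idle until release
        clock = start + job.processing    # completion time of this job
        total += stepwise_cost(job, clock)
    return total


# Two example jobs; the numbers are made up for illustration.
a = Job(release=0, processing=3, due=3, thresholds=(0, 2), penalties=(5, 10))
b = Job(release=1, processing=2, due=4, thresholds=(0, 2), penalties=(5, 10))

print(schedule_cost([a, b]))  # a finishes on time, b is 1 unit late
print(schedule_cost([b, a]))  # b finishes on time, a is 3 units late
```

Because the cost is constant between thresholds, a small delay can be free while a slightly larger one triggers a jump, which is what distinguishes this criterion from linear tardiness.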
Many-Task Computing and Blue Waters
This report discusses many-task computing (MTC) generically and in the
context of the proposed Blue Waters system, which is planned to be the largest
NSF-funded supercomputer when it begins production use in 2012. The aim of this
report is to inform the BW project about MTC, including understanding aspects
of MTC applications that can be used to characterize the domain and
understanding the implications of these aspects to middleware and policies.
Many MTC applications do not neatly fit the stereotypes of high-performance
computing (HPC) or high-throughput computing (HTC) applications. Like HTC
applications, by definition MTC applications are structured as graphs of
discrete tasks, with explicit input and output dependencies forming the graph
edges. However, MTC applications have significant features that distinguish
them from typical HTC applications. In particular, different engineering
constraints for hardware and software must be met in order to support these
applications. HTC applications have traditionally run on platforms such as
grids and clusters, through either workflow systems or parallel programming
systems. MTC applications, in contrast, will often demand a short time to
solution, may be communication intensive or data intensive, and may comprise
very short tasks. Therefore, hardware and software for MTC must be engineered
to support the additional communication and I/O and must minimize task dispatch
overheads. The hardware of large-scale HPC systems, with its high degree of
parallelism and support for intensive communication, is well suited for MTC
applications. However, HPC systems often lack a dynamic resource-provisioning
feature, are not ideal for task communication via the file system, and have an
I/O system that is not optimized for MTC-style applications. Hence, additional
software support is likely to be required to gain full benefit from the HPC
hardware.
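The description of MTC applications as graphs of discrete tasks with dependency edges can be sketched with Python's standard `graphlib` module. This is only a minimal illustration of dependency-driven dispatch: the task names and the `dispatch_waves` helper are hypothetical, and a real MTC system would also have to address the dispatch overhead, communication and I/O concerns raised in the report.

```python
from graphlib import TopologicalSorter


def dispatch_waves(dependencies):
    """Group tasks into waves; every task in a wave has all inputs available.

    `dependencies` maps each task name to the set of tasks it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()                      # raises CycleError on cyclic graphs
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose predecessors are done
        waves.append(ready)
        sorter.done(*ready)               # mark as finished to unlock successors
    return waves


# Hypothetical four-task workflow: split -> (align, count) -> merge.
deps = {
    "align": {"split"},
    "count": {"split"},
    "merge": {"align", "count"},
}

for wave in dispatch_waves(deps):
    print(wave)
```

Tasks in the same wave have no dependencies on each other, so a dispatcher could launch them concurrently; with very short tasks, the per-task launch cost in that step is where dispatch overhead would dominate.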
Managing Uncertainty: A Case for Probabilistic Grid Scheduling
The Grid technology is evolving into a global, service-orientated
architecture, a universal platform for delivering future high demand
computational services. Strong adoption of the Grid and the utility computing
concept is leading to an increasing number of Grid installations running a wide
range of applications of different size and complexity. In this paper we
address the problem of delivering deadline/economy based scheduling in a
heterogeneous application environment using statistical properties of
historical job executions and their associated metadata. This approach is motivated
by a study of the six-month computational load generated by Grid applications in a
multi-purpose Grid cluster serving a community of twenty e-Science projects.
The observed job statistics, resource utilisation and user behaviour are
discussed in the context of management approaches and models most suitable for
supporting a probabilistic and autonomous scheduling architecture.
Spatial-temporal data modelling and processing for personalised decision support
The purpose of this research is to undertake the modelling of dynamic data without losing any of the temporal relationships, and to be able to predict likelihood of outcome as far in advance of actual occurrence as possible. To this end a novel computational architecture for personalised (individualised) modelling of spatio-temporal data based on spiking neural network methods (PMeSNNr), with a three-dimensional visualisation of relationships between variables, is proposed. In brief, the architecture is able to transfer spatio-temporal data patterns from a multidimensional input stream into internal patterns in the spiking neural network reservoir. These patterns are then analysed to produce a personalised model for either classification or prediction, depending on the specific needs of the situation. The architecture described above was constructed using MATLAB in several individual modules linked together to form NeuCube (M1). This methodology has been applied to two real-world case studies. Firstly, it has been applied to data for the prediction of stroke occurrences on an individual basis. Secondly, it has been applied to ecological data on aphid pest abundance prediction. The two main objectives for this research when judging outcomes of the modelling are accurate prediction and achieving it at the earliest possible time point. The implications of these findings are not insignificant in terms of health care management and environmental control. As the case studies utilised here represent vastly different application fields, they reveal more of the potential and usefulness of NeuCube (M1) for modelling data in an integrated manner. This in turn can identify previously unknown (or less understood) interactions, thus both increasing the level of reliance that can be placed on the model created and enhancing our human understanding of the complexities of the world around us without the need for oversimplification.
Keywords
Personalised modelling; Spiking neural network; Spatial-temporal data modelling; Computational intelligence; Predictive modelling; Stroke risk prediction