A review of traffic simulation software
Computer simulation of traffic is a widely used method in research on traffic modelling
and in the planning and development of traffic networks and systems. Vehicular traffic systems are of
growing concern and interest globally, and modelling arbitrarily complex traffic systems is a
hard problem. In this article we review some of the traffic simulation software applications,
their features and characteristics, as well as the issues these applications face. Additionally, we
introduce some algorithmic ideas, underpinning data-structural approaches and quantifiable
metrics that can be applied to simulated model systems.
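By way of illustration of the kind of update rule and quantifiable metric such simulators are built around, the sketch below implements a minimal single-lane cellular-automaton traffic model in the Nagel-Schreckenberg style and reports the mean vehicle speed at each step; the road length, speed limit, density and braking probability are illustrative assumptions and are not drawn from any of the reviewed packages.

    // Minimal single-lane cellular-automaton traffic update (Nagel-Schreckenberg style).
    // Illustrative sketch only; road length, vmax and braking probability are assumptions.
    #include <algorithm>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        const int length = 100;      // road cells (circular)
        const int vmax = 5;          // maximum speed in cells per step
        const double p_brake = 0.3;  // random braking probability
        std::mt19937 rng(42);
        std::uniform_real_distribution<double> uni(0.0, 1.0);

        std::vector<int> pos, vel;                 // one entry per vehicle
        for (int i = 0; i < length; i += 5) {      // ~20% density
            pos.push_back(i);
            vel.push_back(0);
        }

        for (int step = 0; step < 100; ++step) {
            // update velocities using the current positions of all cars
            for (std::size_t i = 0; i < pos.size(); ++i) {
                int gap = (pos[(i + 1) % pos.size()] - pos[i] + length) % length; // distance to car ahead
                if (gap == 0) gap = length;                                        // single-car edge case
                vel[i] = std::min(vel[i] + 1, vmax);             // accelerate
                vel[i] = std::min(vel[i], gap - 1);              // avoid collision
                if (vel[i] > 0 && uni(rng) < p_brake) --vel[i];  // random braking
            }
            // move all cars and accumulate a simple quantifiable metric: mean speed
            double flow = 0.0;
            for (std::size_t i = 0; i < pos.size(); ++i) {
                pos[i] = (pos[i] + vel[i]) % length;
                flow += vel[i];
            }
            std::printf("step %d  mean speed %.2f\n", step, flow / pos.size());
        }
    }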
Managing community membership information in a small-world grid
As the Grid matures, the problem of resource discovery across communities,
where resources now include computational services, is becoming more
critical. The number of resources available on a world-wide grid is set to grow
exponentially in much the same way as the number of static web pages on
the WWW. We observe that the world-wide resource discovery problem can
be modelled as a slowly evolving, very large sparse matrix in which individual
matrix elements represent nodes’ knowledge of one another. Blocks in the
matrix arise where nodes offer more than one service. Blocking effects also
arise in the identification of sub-communities in the Grid. The linear algebra
community has long been aware of suitable representations of large, sparse
matrices. However, a matrix the size of the world-wide grid potentially has dimensions
numbering in the billions, making dense approaches completely intractable. Distributed
nodes will not necessarily have the storage capacity to store the addresses of
any significant percentage of the available resources. We discuss ways of modelling
this problem in the regime of a slowly changing service base, including
phenomena such as percolating networks and small-world network effects.
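A minimal sketch of such a row-wise sparse "who knows whom" matrix is given below, with each node keeping only the few entries it has storage capacity for; the node identifiers, weights and capacity limit are assumptions made for illustration and do not reproduce the representation used in the paper.

    // Sparse "who knows whom" matrix stored row-wise: row index = node id,
    // column index = node it knows about, value = strength of that knowledge.
    // Names, weights and the capacity limit are illustrative assumptions.
    #include <cstddef>
    #include <cstdio>
    #include <unordered_map>

    using SparseRow = std::unordered_map<int, double>;
    using SparseMatrix = std::unordered_map<int, SparseRow>;

    void record_knowledge(SparseMatrix& m, int from, int to, double weight, std::size_t capacity) {
        SparseRow& row = m[from];
        if (row.size() >= capacity && row.find(to) == row.end())
            return;                                // node has no room to learn about more peers
        row[to] = weight;
    }

    int main() {
        SparseMatrix knows;
        const std::size_t capacity = 4;            // each node stores at most a few neighbours
        record_knowledge(knows, 1, 2, 1.0, capacity);
        record_knowledge(knows, 1, 3, 0.5, capacity);
        record_knowledge(knows, 2, 1, 1.0, capacity);

        for (const auto& [node, row] : knows)
            for (const auto& [other, w] : row)
                std::printf("node %d knows node %d (weight %.1f)\n", node, other, w);
    }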
Sparse cross-products of metadata in scientific simulation management
Managing scientific data is by no means a trivial task, even in a single-site environment
with a small number of researchers involved. We discuss some issues concerned with posing
well-specified experiments in terms of parameters or instrument settings and the metadata
framework that arises from doing so. We are particularly interested in parallel computer
simulation experiments, where very large quantities of warehouse-able data are involved. We
consider SQL databases and other framework technologies for manipulating experimental data.
Our framework manages the outputs from parallel runs that arise from large cross-products
of parameter combinations. Considerable useful experiment planning and analysis can be done
with the sparse metadata without fully expanding the parameter cross-products. Extra value
can be obtained from simulation output that can subsequently be data-mined. We have
particular interests in running large-scale Monte-Carlo physics model simulations. Finding
ourselves overwhelmed by the problems of managing data and compute resources, we have
built a prototype tool using Java and MySQL that addresses these issues. We use this example
to discuss type-space management and other fundamental ideas for implementing a laboratory
information management system.
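The sketch below illustrates the cross-product idea in its simplest form: the full product of parameter settings is walked with an odometer-style counter, but only the combinations actually run are recorded as metadata rows. The parameter names, values and selection rule are invented for illustration; the prototype described above stores such rows in MySQL via Java rather than in memory.

    // Walk a parameter cross-product lazily and record only the runs actually performed.
    // Parameter axes, values and the selection rule are illustrative assumptions.
    #include <cstddef>
    #include <cstdio>
    #include <map>
    #include <string>
    #include <utility>
    #include <vector>

    int main() {
        // each experiment axis and its settings
        std::vector<std::pair<std::string, std::vector<double>>> axes = {
            {"temperature",  {0.5, 1.0, 1.5, 2.0}},
            {"field",        {0.0, 0.1}},
            {"lattice_size", {64, 128, 256}},
        };

        std::vector<std::size_t> idx(axes.size(), 0);              // odometer over the product
        std::vector<std::map<std::string, double>> completed;      // sparse: only runs we chose to do
        bool more = true;
        while (more) {
            // hypothetical selection rule: only a sparse subset of the product is actually run
            if ((idx[0] + idx[1] + idx[2]) % 3 == 0) {
                std::map<std::string, double> row;
                for (std::size_t a = 0; a < axes.size(); ++a)
                    row[axes[a].first] = axes[a].second[idx[a]];
                completed.push_back(row);
            }
            // advance the odometer without ever materialising the full cross-product
            more = false;
            for (std::size_t a = 0; a < axes.size(); ++a) {
                if (++idx[a] < axes[a].second.size()) { more = true; break; }
                idx[a] = 0;
            }
        }
        std::printf("%zu of %zu combinations recorded\n", completed.size(), std::size_t(4 * 2 * 3));
    }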
Small-world networks, distributed hash tables and the e-resource discovery problem
Resource discovery is one of the most important underpinning problems behind producing a scalable,
robust and efficient global infrastructure for e-Science. A number of approaches to the resource discovery
and management problem have been explored in various computational grid environments and prototypes
over the last decade. Computational resources and services in modern grid and cloud environments can be
modelled as an overlay network superposed on the physical network structure of the Internet and World
Wide Web. We discuss some of the main approaches to resource discovery in the context of the general
properties of such an overlay network. We present some performance data and predicted properties based
on algorithmic approaches such as distributed hash table resource discovery and management. We describe
a prototype system and use its model to explore some of the key known graph aspects of the global
resource overlay network, including its small-world and scale-free properties.
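As a minimal illustration of the distributed hash table approach mentioned above, the sketch below builds a consistent-hash ring in which each resource key is owned by the first node at or after its hash position; the node names and the use of std::hash are assumptions for illustration, and the prototype system itself is not reproduced here.

    // Minimal consistent-hash ring of the kind used in DHT-based resource discovery.
    // Node names and the choice of std::hash are illustrative assumptions.
    #include <cstddef>
    #include <cstdio>
    #include <functional>
    #include <map>
    #include <string>

    class HashRing {
        std::map<std::size_t, std::string> ring_;   // hash position -> node
        std::hash<std::string> hash_;
    public:
        void add_node(const std::string& node)    { ring_[hash_(node)] = node; }
        void remove_node(const std::string& node) { ring_.erase(hash_(node)); }

        // a resource key is owned by the first node at or after its hash, wrapping around
        const std::string& lookup(const std::string& key) const {
            auto it = ring_.lower_bound(hash_(key));
            if (it == ring_.end()) it = ring_.begin();
            return it->second;
        }
    };

    int main() {
        HashRing ring;
        ring.add_node("node-a.example.org");
        ring.add_node("node-b.example.org");
        ring.add_node("node-c.example.org");
        std::printf("'cpu-cluster-07' -> %s\n", ring.lookup("cpu-cluster-07").c_str());
        std::printf("'storage-42'     -> %s\n", ring.lookup("storage-42").c_str());
    }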
Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of
multi-core CPUs and specialist accelerator processors such as graphical processing units
(GPUs). The accelerated computational performance possible from these devices can be very
high for some application paradigms. Software languages and systems such as NVIDIA's
CUDA and Khronos consortium's open compute language (OpenCL) support a number of
individual parallel application programming paradigms. To scale up the performance of some
complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and
very many-core GPUs for data parallelism is necessary. We describe our use of hybrid applications
using threading approaches and multi-core CPUs to control independent GPU devices.
We present speed-up data and discuss multi-threading software issues for the applications-level
programmer, and offer some suggested areas for language development and integration
between coarse-grained and fine-grained multi-thread systems. We discuss results from three
common simulation algorithmic areas: partial differential equations; graph cluster
metric calculations; and random number generation. We report on programming experiences
and selected performance for these algorithms on single and multiple GPUs, multi-core CPUs and
a CellBE, and using OpenCL. We discuss programmer usability issues and the outlook and
trends in multi-core programming for scientific applications developers.
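The coarse-grained half of this hybrid arrangement can be sketched as below: one host thread is created per accelerator and handed its own contiguous slice of the data. The per-device work here is a CPU stub standing in for the cudaSetDevice call and kernel launch that a real hybrid code would issue; the device count and problem size are assumptions for illustration.

    // Coarse-grained side of the hybrid scheme: one controlling host thread per accelerator.
    // The per-device work is a CPU stub; a real hybrid code would call cudaSetDevice(device),
    // copy the slice to the GPU, launch a data-parallel kernel and copy the result back.
    #include <cstdio>
    #include <functional>
    #include <numeric>
    #include <thread>
    #include <vector>

    void run_on_device(int device, std::vector<double>& slice) {
        // placeholder for the GPU work described in the comment above
        for (double& x : slice) x = x * x;
        std::printf("device %d processed %zu elements\n", device, slice.size());
    }

    int main() {
        const int num_devices = 2;                 // assumed number of GPUs
        std::vector<double> data(1 << 20);
        std::iota(data.begin(), data.end(), 0.0);

        // split the data into one contiguous slice per device
        std::vector<std::vector<double>> slices(num_devices);
        const std::size_t chunk = data.size() / num_devices;
        for (int d = 0; d < num_devices; ++d)
            slices[d].assign(data.begin() + d * chunk,
                             d + 1 == num_devices ? data.end() : data.begin() + (d + 1) * chunk);

        // one coarse-grained CPU thread controls each device
        std::vector<std::thread> controllers;
        for (int d = 0; d < num_devices; ++d)
            controllers.emplace_back(run_on_device, d, std::ref(slices[d]));
        for (auto& t : controllers) t.join();
    }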
Parallel containers: a tool for applying parallel computing applications on clusters
Parallel and cluster computing remain somewhat difficult to apply quickly in many application
domains. Recent developments in computer libraries such as the Standard Template
Library of the C++ language and the Message Passing Package associated with the Python
Language provide a way to implement very high level parallel containers in support of application
programming. A parallel container is an implementation of a data structure, such as a
list, vector or set, that has associated with it the necessary methods and state knowledge
to distribute the contents of the structure across the memory of a parallel computer or a
computer cluster. A key idea is that of the parallel iterator which allows a single high level
statement written by the applications programmer to invoke a parallel operation across the
entire data structure’s contents while avoiding the need for knowledge of how the distribution
is actually carried out. This transparency approach means that optimised parallel algorithms
can be separated from the applications domain code, maximising reuse of the parallel computing
infrastructure and libraries. This paper describes our initial experiments with C++
parallel containers.
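A minimal sketch of the parallel container and parallel iterator idea follows: the application writes a single apply() call and the container decides how its contents are partitioned among workers. To stay self-contained the sketch distributes across threads in one address space rather than across a cluster's distributed memory, and the class and method names are illustrative rather than those of our implementation.

    // Thread-based sketch of a parallel container with a parallel-iterator-style apply().
    // Distribution is across threads in one address space, an assumption made to keep
    // the example self-contained; a cluster version would distribute memory instead.
    #include <algorithm>
    #include <cstdio>
    #include <thread>
    #include <vector>

    template <typename T>
    class ParallelVector {
        std::vector<T> data_;
        unsigned workers_;
    public:
        explicit ParallelVector(std::size_t n,
                                unsigned workers = std::thread::hardware_concurrency())
            : data_(n), workers_(workers ? workers : 1) {}

        T& operator[](std::size_t i) { return data_[i]; }
        std::size_t size() const { return data_.size(); }

        // the "parallel iterator": one high-level call applies f to every element,
        // hiding how the contents are partitioned among the workers
        template <typename F>
        void apply(F f) {
            std::vector<std::thread> pool;
            const std::size_t chunk = (data_.size() + workers_ - 1) / workers_;
            for (unsigned w = 0; w < workers_; ++w) {
                const std::size_t lo = w * chunk;
                const std::size_t hi = std::min(data_.size(), lo + chunk);
                if (lo >= hi) break;
                pool.emplace_back([this, f, lo, hi] {
                    for (std::size_t i = lo; i < hi; ++i) f(data_[i]);
                });
            }
            for (auto& t : pool) t.join();
        }
    };

    int main() {
        ParallelVector<double> v(1'000'000);
        for (std::size_t i = 0; i < v.size(); ++i) v[i] = double(i);
        v.apply([](double& x) { x = x * x; });   // one statement, parallel over all contents
        std::printf("v[10] = %.0f\n", v[10]);
    }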
A framework and simulation engine for studying artificial life
The area of computer-generated artificial life-forms is a relatively recent
field of inter-disciplinary study that involves mathematical modelling, physical
intuition, and ideas from chemistry, biology and computational science.
Although the attribution of “life” to non-biological systems is still controversial,
several groups agree that certain emergent properties can be ascribed to
computer-simulated systems that can be constructed to “live” in a simulated
environment. In this paper we discuss some of the issues and infrastructure
necessary to construct a simulation laboratory for the study of computer-generated
artificial life-forms. We review possible technologies and present some
preliminary studies based around simple models.
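A minimal agent-update loop of the kind such a simulation engine iterates is sketched below; the energy, movement and reproduction rules are invented placeholders rather than the models studied in the paper.

    // Minimal agent-update loop for an artificial-life world on a toroidal grid.
    // World size, energy budget, food supply and reproduction rule are placeholders.
    #include <cstdio>
    #include <random>
    #include <vector>

    struct Agent {
        int x, y;
        double energy;
    };

    int main() {
        const int world = 32;                                     // toroidal grid size (assumption)
        std::mt19937 rng(1);
        std::uniform_int_distribution<int> step(-1, 1);
        std::uniform_real_distribution<double> food(0.0, 0.25);   // energy gained per step

        std::vector<Agent> agents(50, Agent{0, 0, 10.0});
        for (auto& a : agents) { a.x = rng() % world; a.y = rng() % world; }

        for (int t = 0; t < 100; ++t) {
            std::vector<Agent> next;
            for (auto& a : agents) {
                a.x = (a.x + step(rng) + world) % world;          // random walk on a torus
                a.y = (a.y + step(rng) + world) % world;
                a.energy += food(rng) - 0.1;                      // feeding minus metabolic cost
                if (a.energy <= 0.0) continue;                    // death
                if (a.energy > 15.0) {                            // reproduction splits energy
                    a.energy /= 2.0;
                    next.push_back(a);
                }
                next.push_back(a);
            }
            agents.swap(next);
        }
        std::printf("surviving agents after 100 steps: %zu\n", agents.size());
    }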
64-bit architectures and compute clusters for high performance simulations
Simulation of large complex systems remains one of the most demanding
of high performance computer systems both in terms of raw compute performance
and efficient memory management. The recent availability of 64-bit
architectures has opened up the possibility of commodity computers accessing
more than the 4-gigabyte memory limit previously enforced by 32-bit
addressing. We report on some performance measurements we have made on
two 64-bit architectures and their consequences for some high performance
simulations. We discuss the performance of our codes for simulations of artificial
life models; computational physics models of point particles on lattices; and
interacting clusters of particles. We have summarised pertinent features
of these codes into benchmark kernels, which we discuss in the context of well-known
benchmark kernels of the 32-bit era. We report on how these
findings were useful in the context of designing 64-bit compute clusters for
high-performance simulations.
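A minimal memory-streaming (triad) kernel in the spirit of those well-known benchmark kernels is sketched below; the array size is an assumption and, on a 64-bit machine, can be raised well past the old 4 GB ceiling simply by enlarging n.

    // Minimal memory-streaming "triad" kernel used to probe the memory behaviour that
    // dominates such simulations. The array size is an illustrative assumption: three
    // arrays of roughly 270 MB each here, but limited only by RAM on a 64-bit machine.
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        const std::size_t n = 1 << 25;
        std::vector<double> a(n), b(n, 1.0), c(n, 2.0);
        const double scalar = 3.0;

        auto t0 = std::chrono::steady_clock::now();
        for (std::size_t i = 0; i < n; ++i)
            a[i] = b[i] + scalar * c[i];           // triad: two loads, one store per element
        auto t1 = std::chrono::steady_clock::now();

        const double secs = std::chrono::duration<double>(t1 - t0).count();
        const double gbytes = 3.0 * n * sizeof(double) / 1e9;   // bytes moved by the kernel
        std::printf("a[n/2] = %.1f, triad bandwidth: %.2f GB/s\n", a[n / 2], gbytes / secs);
    }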
Accelerated face detector training using the PSL framework
We train a face detection system using the PSL framework [1], which combines the AdaBoost
learning algorithm and Haar-like features. We demonstrate the ability of this framework to
overcome some of the challenges inherent in training classifiers that are structured in cascades
of boosted ensembles (CoBE). The PSL classifiers are compared to Viola-Jones-type cascaded
classifiers. We establish the ability of the PSL framework to produce classifiers in a
complex domain in a significantly reduced time frame. They also comprise fewer boosted
ensembles, albeit at the price of increased false detection rates on our test dataset. We also report
on results from a more diverse set of experiments carried out on the PSL framework in
order to shed more light on the effects of variations in its adjustable training parameters.
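The Haar-like feature machinery shared by the PSL and Viola-Jones style detectors can be sketched as below: an integral image makes any rectangle sum a four-lookup operation, and a two-rectangle feature is the difference of two such sums, which a weak learner then thresholds. The window size, image contents and feature placement are illustrative assumptions.

    // Integral image plus a two-rectangle Haar-like feature evaluation.
    // Window size, image contents and feature placement are illustrative assumptions.
    #include <cstdio>
    #include <vector>

    struct IntegralImage {
        std::size_t w, h;
        std::vector<long> s;                       // (w+1) x (h+1) summed-area table
        IntegralImage(const std::vector<int>& img, std::size_t w_, std::size_t h_)
            : w(w_), h(h_), s((w_ + 1) * (h_ + 1), 0) {
            for (std::size_t y = 0; y < h; ++y)
                for (std::size_t x = 0; x < w; ++x)
                    s[(y + 1) * (w + 1) + (x + 1)] = img[y * w + x]
                        + s[y * (w + 1) + (x + 1)] + s[(y + 1) * (w + 1) + x]
                        - s[y * (w + 1) + x];
        }
        // sum of pixels in the rectangle [x, x+rw) x [y, y+rh), four lookups
        long rect(std::size_t x, std::size_t y, std::size_t rw, std::size_t rh) const {
            return s[(y + rh) * (w + 1) + (x + rw)] - s[y * (w + 1) + (x + rw)]
                 - s[(y + rh) * (w + 1) + x] + s[y * (w + 1) + x];
        }
    };

    int main() {
        const std::size_t w = 24, h = 24;          // canonical detection window size
        std::vector<int> img(w * h);
        for (std::size_t i = 0; i < img.size(); ++i) img[i] = int(i % 255);
        IntegralImage ii(img, w, h);

        // two-rectangle (edge) feature: left half minus right half of a patch
        long feature = ii.rect(4, 4, 8, 16) - ii.rect(12, 4, 8, 16);
        std::printf("feature response: %ld\n", feature);   // a weak learner thresholds this value
    }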
A novel bootstrapping method for positive datasets in cascades of boosted ensembles
We present a novel method for efficiently training a face detector using large positive
datasets in a cascade of boosted ensembles. We extend the successful Viola-Jones [1] framework,
which achieved low false acceptance rates through bootstrapping negative samples, with the
capability to also bootstrap large positive datasets, thereby capturing more in-class variation
of the target object. We achieve this form of bootstrapping by way of an additional embedded
cascade within each layer, and term the new structure the Bootstrapped Dual-Cascaded
(BDC) framework. We demonstrate its ability to easily and efficiently train a classifier on
large and complex face datasets which exhibit acute in-class variation.
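A heavily simplified sketch of the positive-bootstrapping idea is given below: after each training round the large positive pool is scanned and examples the current classifier handles badly are pulled into the working set for the next round. The sample representation, scoring stub, thresholds and capacities are placeholders and do not reflect the BDC framework's actual interfaces.

    // Skeleton of bootstrapping hard positives from a large pool between training rounds.
    // Sample type, scoring stub, threshold and capacities are illustrative placeholders.
    #include <cstdio>
    #include <random>
    #include <vector>

    struct Sample { std::vector<float> features; };

    // stand-in for evaluating the classifier trained so far on one sample
    float classifier_score(const Sample& s) { return s.features.empty() ? 0.f : s.features[0]; }

    int main() {
        std::mt19937 rng(7);
        std::uniform_real_distribution<float> uni(0.f, 1.f);

        std::vector<Sample> positive_pool(10000, Sample{{0.f}});   // large positive dataset
        for (auto& s : positive_pool) s.features[0] = uni(rng);

        std::vector<Sample> working_set;                           // what a round actually trains on
        const float accept_threshold = 0.5f;
        const std::size_t round_capacity = 500;

        for (int round = 0; round < 3; ++round) {
            // ... train one boosted ensemble on working_set here ...

            // bootstrap: refill the working set with positives the current classifier rejects
            working_set.clear();
            for (const Sample& s : positive_pool) {
                if (classifier_score(s) < accept_threshold)        // a hard positive
                    working_set.push_back(s);
                if (working_set.size() >= round_capacity) break;
            }
            std::printf("round %d: %zu hard positives selected\n", round, working_set.size());
        }
    }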