830 research outputs found
Recommended from our members
Unconventional computing platforms and nature-inspired methods for solving hard optimisation problems
The search for novel hardware beyond the traditional von Neumann architecture has given rise to a modern area of unconventional computing requiring the efforts of mathematicians, physicists and engineers. Many analogue physical systems, including networks of nonlinear oscillators, lasers, condensates, and superconducting qubits, are proposed and realised to address challenging computational problems from various areas of social and physical sciences and technology. Understanding the underlying physical process by which the system finds the solutions to such problems often leads to new optimisation algorithms. This thesis focuses on studying gain-dissipative systems and nature-inspired algorithms that form a hybrid architecture that may soon rival classical hardware.
Chapter 1 lays the necessary foundation and explains various interdisciplinary terms that are used throughout the dissertation. In particular, connections between the optimisation problems and spin Hamiltonians are established, their computational complexity classes are explained, and the most prominent physical platforms for spin Hamiltonian implementation are reviewed.
Chapter 2 demonstrates a large variety of behaviours encapsulated in networks of polariton condensates, which are a vivid example of a gain-dissipative system we use throughout the thesis. We explain how the variations of experimentally tunable parameters allow the networks of polariton condensates to represent different oscillator models. We derive analytic expressions for the interactions between two spatially separated polariton condensates and show various synchronisation regimes for periodic chains of condensates. An odd number of condensates at the vertices of a regular polygon leads to a spontaneous formation of a giant multiply-quantised vortex at the centre of a polygon. Numerical simulations of all studied configurations of polariton condensates are performed with a mean-field approach with some theoretically proposed physical phenomena supported by the relevant experiments.
Chapter 3 examines the potential of polariton graphs to find the low-energy minima of the spin Hamiltonians. By associating a spin with a condensate phase, the minima of the XY model are achieved for simple configurations of spatially-interacting polariton condensates. We argue that such implementation of gain-dissipative simulators limits their applicability to the classes of easily solvable problems since the parameters of a particular Hamiltonian depend on the node occupancies that are not known a priori. To overcome this difficulty, we propose to adjust pumping intensities and coupling strengths dynamically. We further theoretically suggest how the discrete Ising and -state planar Potts models with or without external fields can be simulated using gain-dissipative platforms. The underlying operational principle originates from a combination of resonant and non-resonant pumping. Spatial anisotropy of pump and dissipation profiles enables an effective control of the sign and intensity of the coupling strength between any two neighbouring sites, which we demonstrate with a two dimensional square lattice of polariton condensates. For an accurate minimisation of discrete and continuous spin Hamiltonians, we propose a fully controllable polaritonic XY-Ising machine based on a network of geometrically isolated polariton condensates.
In Chapter 4, we look at classical computing rivals and study nature-inspired methods for optimising spin Hamiltonians. Based on the operational principles of gain-dissipative machines, we develop a novel class of gain-dissipative algorithms for the optimisation of discrete and continuous problems and show its performance in comparison with traditional optimisation techniques. Besides looking at traditional heuristic methods for Ising minimisation, such as the Hopfield-Tank neural networks and parallel tempering, we consider a recent physics-inspired algorithm, namely chaotic amplitude control, and exact commercial solver, Gurobi. For a proper evaluation of physical simulators, we further discuss the importance of detecting easy instances of hard combinatorial optimisation problems. The Ising model for certain interaction matrices, that are commonly used for evaluating the performance of unconventional computing machines and assumed to be exponentially hard, is shown to be solvable in polynomial time including the Mobius ladder graphs and Mattis spin glasses.
In Chapter 5 we discuss possible future applications of unconventional computing platforms including emulation of search algorithms such as PageRank, realisation of a proof-of-work protocol for blockchain technology, and reservoir computing
IST Austria Thesis
In this thesis we present a computer-aided programming approach to concurrency. Our approach helps the programmer by automatically fixing concurrency-related bugs, i.e. bugs that occur when the program is executed using an aggressive preemptive scheduler, but not when using a non-preemptive (cooperative) scheduler. Bugs are program behaviours that are incorrect w.r.t. a specification. We consider both user-provided explicit specifications in the form of assertion
statements in the code as well as an implicit specification. The implicit specification is inferred from the non-preemptive behaviour. Let us consider sequences of calls that the program makes to an external interface. The implicit specification requires that any such sequence produced under a preemptive scheduler should be included in the set of sequences produced under a non-preemptive scheduler. We consider several semantics-preserving fixes that go beyond atomic sections typically explored in the synchronisation synthesis literature. Our synthesis is able to place locks, barriers and wait-signal statements and last, but not least reorder independent statements. The latter may be useful if a thread is released to early, e.g., before some initialisation is completed. We guarantee that our synthesis does not introduce deadlocks and that the synchronisation inserted is optimal w.r.t. a given objective function. We dub our solution trace-based synchronisation synthesis and it is loosely based on counterexample-guided inductive synthesis (CEGIS). The synthesis works by discovering a trace that is incorrect w.r.t. the specification and identifying ordering constraints crucial to trigger the specification violation. Synchronisation may be placed immediately (greedy approach) or delayed until all incorrect traces are found (non-greedy approach). For the non-greedy approach we construct a set of global constraints over synchronisation placements. Each model of the global constraints set corresponds to a correctness-ensuring synchronisation placement. The placement that is optimal w.r.t. the given objective function is chosen as the synchronisation solution. We evaluate our approach on a number of realistic (albeit simplified) Linux device-driver
benchmarks. The benchmarks are versions of the drivers with known concurrency-related bugs. For the experiments with an explicit specification we added assertions that would detect the bugs in the experiments. Device drivers lend themselves to implicit specification, where the device and the operating system are the external interfaces. Our experiments demonstrate that our synthesis method is precise and efficient. We implemented objective functions for coarse-grained and fine-grained locking and observed that different synchronisation placements are produced for our experiments, favouring e.g. a minimal number of synchronisation operations or maximum concurrency
Ensuring performance and correctness for legacy parallel programs
Modern computers are based on manycore architectures, with multiple processors on
a single silicon chip. In this environment programmers are required to make use of
parallelism to fully exploit the available cores. This can either be within a single chip,
normally using shared-memory programming or at a larger scale on a cluster of chips,
normally using message-passing.
Legacy programs written using either paradigm face issues when run on modern
manycore architectures. In message-passing the problem is performance related,
with clusters based on manycores introducing necessarily tiered topologies that unaware
programs may not fully exploit. In shared-memory it is a correctness problem,
with modern systems employing more relaxed memory consistency models, on which
legacy programs were not designed to operate. Solutions to this correctness problem
exist, but introduce a performance problem as they are necessarily conservative. This
thesis focuses on addressing these problems, largely through compile-time analysis
and transformation.
The first technique proposed is a method for statically determining the communication
graph of an MPI program. This is then used to optimise process placement in
a cluster of CMPs. Using the 64-process versions of the NAS parallel benchmarks,
we see an average of 28% (7%) improvement in communication localisation over by-rank
scheduling for 8-core (12-core) CMP-based clusters, representing the maximum
possible improvement.
Secondly, we move into the shared-memory paradigm, identifying and proving
necessary conditions for a read to be an acquire. This can be used to improve solutions
in several application areas, two of which we then explore.
We apply our acquire signatures to the problem of fence placement for legacy well-synchronised
programs. We find that applying our signatures, we can reduce the number
of fences placed by an average of 62%, leading to a speedup of up to 2.64x over an
existing practical technique.
Finally, we develop a dynamic synchronisation detection tool known as SyncDetect.
This proof of concept tool leverages our acquire signatures to more accurately
detect ad hoc synchronisations in running programs and provides the programmer with
a report of their locations in the source code. The tool aims to assist programmers with
the notoriously difficult problem of parallel debugging and in manually porting legacy
programs to more modern (relaxed) memory consistency models
The MOLDY short-range molecular dynamics package
We describe a parallelised version of the MOLDY molecular dynamics program.
This Fortran code is aimed at systems which may be described by short-range
potentials and specifically those which may be addressed with the embedded atom
method. This includes a wide range of transition metals and alloys. MOLDY
provides a range of options in terms of the molecular dynamics ensemble used
and the boundary conditions which may be applied. A number of standard
potentials are provided, and the modular structure of the code allows new
potentials to be added easily. The code is parallelised using OpenMP and can
therefore be run on shared memory systems, including modern multicore
processors. Particular attention is paid to the updates required in the main
force loop, where synchronisation is often required in OpenMP implementations
of molecular dynamics. We examine the performance of the parallel code in
detail and give some examples of applications to realistic problems, including
the dynamic compression of copper and carbon migration in an iron-carbon alloy
Metaheuristic approaches for Complete Network Signal Setting Design (CNSSD)
2014 - 2015In order to mitigate the urban traffic congestion and increase the travelers’ surplus, several policies can be adopted which may be applied in short or long time horizon. With regards to the short term policies, one of the most straightforward is control through traffic lights at single junction or network level. The main goal of traffic control is avoiding that incompatible approaches have green at the same time. With respect to this aim existing methodologies for Signal Setting Design (NSSD) can be divided into two classes as in following described
Approach-based (or Phase-based) methods address the signal setting as a periodic scheduling problem: the cycle length, and for each approach the start and the end of the green are considered as decision variables, some binary variables (or some non-linear constraints) are included to avoid incompatible approaches having green at the same time (see for instance Improta and Cantarella, 1987). If needed the stage composition and sequence may easily be obtained from decision variables. Commercial software codes following this methodology are available for single junction control only, such Oscady Pro® (TRL, UK; Burrow, 1987). Once the green timing and scheduling have been carried out for each junction, offsets can be optimized (coordination) using the stage matrices obtained from single junction optimization (possibly together with green splits again) through one of codes mentioned below.
Stage-based signal setting methods dealt with that by dividing the cycle length into stages, each one being a time interval during which some mutually compatible approaches have green. Stage composition, say which approaches have green, and sequence, say their order, can be represented through the approach-stage incidence matrix, or stage matrix for short. Once the stage matrix is given for each junction, the cycle length, the green splits and the offsets can be optimised (synchronisation) through some well established commercial software codes. Two of the most commonly used codes are: TRANSYT14® (TRL, UK) (recently TRANSYT15® has been released) and TRANSYT-7F® (FHWA, USA). Both allow to compute the green splits, the offsets and the cycle length by combining a traffic flow model and a signal setting optimiser. Both may be used for coordination (optimisation of offsets only, once green splits are known) or synchronisation. TRANSYT14® generates several (but not all) significant stage sequences to be tested but the optimal solution is not endogenously generated, while TRANSYT-7F® is able to optimise the stage sequence for each single junction starting from the ring and barrier NEMA (i.e. National Electrical Manufacturers Association) phases. Still these methods do not allow for stage matrix optimisation; moreover the effects of stage composition and sequence on network performance are not well analysed in literature... [edited by Author]XIV n.s
- …