Developing interest management techniques in distributed interactive simulation using Java
Bandwidth consumption in distributed (networked) real-time simulation is a major problem as the number of participants and the sophistication of joint simulation exercises grow. The paper briefly reviews distributed real-time simulation and bandwidth-reduction techniques, and introduces the Generic Runtime Infrastructure for Distributed Simulation (GRIDS) as a research architecture for studying such problems. GRIDS uses Java abstract classes to promote distributed services called thin agents, a novel approach to implementing distributed simulation services such as user-defined bandwidth-reduction mechanisms, and to distributing executable code across the simulation. Thin agents offer the advantages of traditional agents without the overhead imposed by mobility or continuous state, both of which are unnecessary in this context. We present our implementation and some predicted results from message-reduction studies using thin agents.
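GRIDS implements thin agents as Java abstract classes; purely to illustrate the idea, here is a minimal Python sketch of a thin agent as an abstract service carrying behaviour but no mobility or persistent state. All class and method names are hypothetical, not the actual GRIDS API.

```python
from abc import ABC, abstractmethod

class ThinAgent(ABC):
    """A stateless, non-mobile service deployed by the runtime.

    Unlike a traditional mobile agent, a thin agent is just executable
    behaviour: the runtime instantiates it where needed and calls it on
    each batch of outgoing messages.
    """

    @abstractmethod
    def filter(self, outgoing):
        """Return the subset of `outgoing` messages worth transmitting."""

class InterestFilter(ThinAgent):
    """Example service: a user-defined bandwidth-reduction filter that
    drops updates falling outside the receiver's region of interest."""

    def __init__(self, x_range, y_range):
        self.x_range, self.y_range = x_range, y_range

    def filter(self, outgoing):
        return [m for m in outgoing
                if self.x_range[0] <= m["x"] <= self.x_range[1]
                and self.y_range[0] <= m["y"] <= self.y_range[1]]

# Usage: the runtime would ship this class to each node and apply it
# before sending entity-state updates.
agent = InterestFilter(x_range=(0, 100), y_range=(0, 100))
updates = [{"id": 1, "x": 10, "y": 20}, {"id": 2, "x": 500, "y": 20}]
print(agent.filter(updates))  # only the update inside the region survives
```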
LUNES: Agent-based Simulation of P2P Systems (Extended Version)
We present LUNES, an agent-based Large Unstructured NEtwork Simulator that makes it possible to simulate complex networks composed of a large number of nodes. LUNES is modular: it separates the three phases of network topology creation, protocol simulation, and performance evaluation, which makes it easy to integrate external software tools into the main software architecture. The simulation of the interaction protocols among network nodes is performed via a simulation middleware that supports both sequential and parallel/distributed simulation approaches. In the latter case, a specific mechanism for communication overhead reduction is used, which guarantees high levels of performance and scalability. To demonstrate the efficiency of LUNES, we test the simulator with gossip protocols executed on top of networks (representing peer-to-peer overlays) generated with different topologies. Results demonstrate the effectiveness of the proposed approach.
Comment: Proceedings of the International Workshop on Modeling and Simulation of Peer-to-Peer Architectures and Systems (MOSPAS 2011), part of the 2011 International Conference on High Performance Computing and Simulation (HPCS 2011).
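As an illustration of the kind of protocol LUNES targets, below is a minimal Python sketch of push-gossip dissemination over a randomly generated overlay. This is not LUNES code; the topology model and the fanout/round parameters are arbitrary choices for the example.

```python
import random

def random_overlay(n_nodes, degree, seed=42):
    """Build an undirected overlay where each node links to `degree` random peers."""
    rng = random.Random(seed)
    neighbours = {v: set() for v in range(n_nodes)}
    for v in range(n_nodes):
        for u in rng.sample([u for u in range(n_nodes) if u != v], degree):
            neighbours[v].add(u)
            neighbours[u].add(v)
    return neighbours

def push_gossip(neighbours, source, fanout=3, rounds=10, seed=42):
    """Each informed node forwards the message to `fanout` random
    neighbours per round. Returns coverage and total messages sent."""
    rng = random.Random(seed)
    informed, messages = {source}, 0
    for _ in range(rounds):
        for v in list(informed):
            peers = rng.sample(sorted(neighbours[v]), min(fanout, len(neighbours[v])))
            messages += len(peers)
            informed.update(peers)
    return len(informed) / len(neighbours), messages

overlay = random_overlay(n_nodes=1000, degree=6)
coverage, sent = push_gossip(overlay, source=0)
print(f"coverage={coverage:.2%}, messages={sent}")
```

Measuring coverage against messages sent, as the last lines do, is exactly the kind of performance-evaluation phase the abstract describes as decoupled from topology creation and protocol simulation.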
Resource Optimized Quantum Architectures for Surface Code Implementations of Magic-State Distillation
Quantum computers capable of solving classically intractable problems are
under construction, and intermediate-scale devices are approaching completion.
Current efforts to design large-scale devices require allocating immense
resources to error correction, with the majority dedicated to the production of
high-fidelity ancillary states known as magic-states. Leading techniques focus
on dedicating a large, contiguous region of the processor as a single
"magic-state distillation factory" responsible for meeting the magic-state
demands of applications. In this work we design and analyze a set of optimized
factory architectural layouts that divide a single factory into spatially
distributed factories located throughout the processor. We find that
distributed factory architectures minimize the space-time volume overhead
imposed by distillation. Additionally, we find that the number of distributed
components in each optimal configuration is sensitive to application
characteristics and underlying physical device error rates. More specifically,
we find that the rate at which T-gates are demanded by an application has a
significant impact on the optimal distillation architecture. We develop an
optimization procedure that discovers the optimal number of factory
distillation rounds and number of output magic states per factory, as well as
an overall system architecture that interacts with the factories. This yields
between a 10x and 20x resource reduction compared to commonly accepted single
factory designs. Performance is analyzed across representative application
classes such as quantum simulation and quantum chemistry.
Comment: 16 pages, 14 figures.
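The optimization procedure the abstract describes amounts to searching a configuration space (number of distributed factories, distillation rounds, outputs per factory) for the layout minimizing space-time volume under a given T-gate demand rate. The Python sketch below illustrates such a search with a made-up toy cost model; the constants and cost formula are placeholders, not the paper's.

```python
from itertools import product

def spacetime_volume(n_factories, rounds, outputs_per_factory, t_rate):
    """Toy cost model (illustrative only): space grows with factories,
    rounds and outputs; time stretches when the application's T-gate
    demand rate exceeds the factories' aggregate throughput."""
    space = n_factories * (50 * rounds + 10 * outputs_per_factory)
    throughput = n_factories * outputs_per_factory / rounds
    time = max(1.0, t_rate / throughput)
    return space * time

def optimize(t_rate):
    """Exhaustively search factory counts, distillation rounds, and
    magic-state outputs per factory for the cheapest configuration."""
    best = None
    for n, r, k in product(range(1, 9), range(1, 4), range(1, 17)):
        cost = spacetime_volume(n, r, k, t_rate)
        if best is None or cost < best[0]:
            best = (cost, n, r, k)
    return best

for rate in (1.0, 10.0, 100.0):  # T-gates demanded per unit time
    cost, n, r, k = optimize(rate)
    print(f"rate={rate}: {n} factories, {r} rounds, {k} outputs (cost {cost:.1f})")
```

Even with this toy model, the chosen configuration shifts as the demand rate grows, mirroring the abstract's observation that the optimal distillation architecture is sensitive to how fast an application consumes T-gates.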
Neighbour coverage: a dynamic probabilistic route discovery for mobile ad hoc networks
Blind flooding is extensively used in ad hoc routing protocols for on-demand route discovery, where a mobile node blindly rebroadcasts received route request (RREQ) packets until a route to a particular destination is established. This can lead to high channel contention, redundant retransmissions, and thus excessive packet collisions in the network, a phenomenon known as the broadcast storm problem, which has been shown to greatly increase network communication overhead and end-to-end delay. In this paper, we show that the deleterious impact of this problem can be reduced if measures are taken during the dissemination of RREQ packets. We propose a generic probabilistic method for route discovery that is simple to implement and can significantly reduce the overhead associated with the dissemination of RREQs. Our analysis reveals that equipping AODV with probabilistic route discovery can significantly reduce routing control overhead while achieving good throughput.
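The core idea, setting each node's rebroadcast probability from local neighbour information rather than flooding blindly, can be sketched in a few lines of Python. The threshold and scaling below are illustrative choices, not the paper's calibrated values.

```python
import random

def rebroadcast_probability(n_neighbours, avg_neighbours=8, p_min=0.2, p_max=1.0):
    """Dynamic probabilistic route discovery: sparse nodes (few
    neighbours) rebroadcast with high probability to keep the network
    reachable, while dense nodes rebroadcast with low probability to
    suppress redundant retransmissions."""
    if n_neighbours <= 0:
        return p_max
    p = avg_neighbours / n_neighbours
    return max(p_min, min(p_max, p))

def should_rebroadcast(n_neighbours, rng=random):
    """Decide whether this node forwards the received RREQ."""
    return rng.random() < rebroadcast_probability(n_neighbours)

# A node with 4 neighbours always forwards the RREQ; a node with 32
# neighbours forwards it only about a quarter of the time.
print(rebroadcast_probability(4), rebroadcast_probability(32))
```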
Assessing load-sharing within optimistic simulation platforms
The advent of multi-core machines has led to the need to revise the architecture of modern simulation platforms. A recent proposal of ours explored the viability of load-sharing for optimistic simulators run on top of this type of machine. In this article, we provide an extensive experimental study assessing the effects on run-time dynamics of a load-sharing architecture implemented within ROOT-Sim, an open source simulation platform adhering to the optimistic synchronization paradigm. The study is aimed at evaluating possible sources of overhead when supporting load-sharing, and is based on differentiated workloads that allow us to generate different execution profiles in terms of, e.g., granularity and locality of the simulation events.
A fine-grain time-sharing Time Warp system
Although Parallel Discrete Event Simulation (PDES) platforms relying on the Time Warp (optimistic) synchronization protocol already allow parallelism to be exploited, several techniques have been proposed to further improve performance. Among them we can mention optimized approaches for state restore, as well as techniques for load balancing or for (dynamically) controlling the degree of speculation, the latter being specifically targeted at reducing the incidence of causality errors that lead to wasted computation. However, in state-of-the-art Time Warp systems, event processing is not preemptable, which may prevent prompt reaction to the injection of higher-priority (i.e., lower-timestamp) events. Delaying the processing of these events may, in turn, increase the incidence of incorrect speculation. In this article we present the design and realization of a fine-grain time-sharing Time Warp system, to be run on multi-core Linux machines, which makes systematic use of event preemption in order to dynamically reassign the CPU to higher-priority events/tasks. Our proposal is based on a truly dual-mode execution, application vs. platform, which includes timer-interrupt-based support for bringing control back to platform mode for possible CPU reassignment at very fine-grain periods. The latter facility is offered by an ad-hoc timer-interrupt management module for Linux, which we release, together with the overall time-sharing support, within the open source ROOT-Sim platform. An experimental assessment based on the classical PHOLD benchmark and two real-world models is presented, showing how our proposal effectively reduces the incidence of causality errors, compared to traditional Time Warp, especially when running at higher degrees of parallelism.
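The paper's mechanism relies on a kernel-level timer-interrupt module; purely as a user-level illustration of the control-transfer idea, the following Python sketch (POSIX only) uses an interval timer to periodically re-enter "platform mode," where the running event is preempted if a lower-timestamp event has arrived in the meantime. The event names, timings, and preemption policy are all illustrative.

```python
import heapq
import signal
import time

class Preempted(Exception):
    """Raised from the timer handler to return control to 'platform mode'."""

pending = []                 # min-heap of (timestamp, event_id)
current_ts = float("inf")    # timestamp of the event being processed
in_event = False             # True while in 'application mode'

def on_timer(signum, frame):
    """Fine-grain timer interrupt: preempt the running event if a
    lower-timestamp (higher-priority) event is now pending."""
    global in_event
    if in_event and pending and pending[0][0] < current_ts:
        in_event = False     # leave application mode before unwinding
        raise Preempted()

signal.signal(signal.SIGALRM, on_timer)
signal.setitimer(signal.ITIMER_REAL, 0.001, 0.001)  # 1 ms time-sharing period

def process(ts, eid):
    """Emulate 10 ms of CPU-bound event computation in 'application mode'."""
    global current_ts, in_event
    current_ts, in_event = ts, True
    t_end = time.monotonic() + 0.01
    while time.monotonic() < t_end:
        pass                 # busy work; the timer can interrupt it
    in_event = False

heapq.heappush(pending, (5.0, "e1"))
injected = False
while pending:
    ts, eid = heapq.heappop(pending)
    if eid == "e1" and not injected:
        injected = True      # a lower-timestamp event arrives mid-run
        heapq.heappush(pending, (1.0, "e0"))
    try:
        process(ts, eid)
        print(f"committed {eid} at t={ts}")
    except Preempted:
        heapq.heappush(pending, (ts, eid))  # a real system would roll back
        print(f"preempted {eid} at t={ts} for a lower-timestamp event")
signal.setitimer(signal.ITIMER_REAL, 0, 0)  # disarm the timer
```

Running this prints a preemption of e1 followed by the commits of e0 and then e1: the timer brings control back to platform code mid-event, which is the behaviour the kernel module provides at much finer granularity and without cooperative checks.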
Optimizing Lossy Compression Rate-Distortion from Automatic Online Selection between SZ and ZFP
With ever-increasing volumes of scientific data produced by HPC applications, significantly reducing data size is critical because of the limited capacity of storage space and potential bottlenecks on I/O or networks when writing/reading or transferring data. SZ and ZFP are the two leading lossy compressors available for compressing scientific data sets. However, their performance is not consistent across different data sets, or even across different fields of the same data set: some fields are better compressed by SZ, while others are better compressed with ZFP. This situation calls for an automatic online (during compression) selection between SZ and ZFP with minimal overhead. In this paper, the automatic selection optimizes the rate-distortion, an important statistical quality metric based on the signal-to-noise ratio. To optimize for rate-distortion, we investigate the principles of SZ and ZFP. We then propose an efficient online, low-overhead selection algorithm that accurately predicts the compression quality of the two compressors in early processing stages and selects the best-fit compressor for each data field. We implement the selection algorithm in an open-source library, and we evaluate the effectiveness of our proposed solution against plain SZ and ZFP in a parallel environment with 1,024 cores. Evaluation results on three data sets representing about 100 fields show that our selection algorithm improves the compression ratio by up to 70% at the same level of data distortion, thanks to very accurate selection (around 99%) of the best-fit compressor, with little overhead (less than 7% in the experiments).
Comment: 14 pages, 9 figures, first revision.