TaskPoint: sampled simulation of task-based programs
Sampled simulation is a mature technique for reducing simulation time of single-threaded programs, but it is not directly applicable to simulation of multi-threaded architectures. Recent multi-threaded sampling techniques assume that the workload assigned to each thread does not change across multiple executions of a program. This assumption does not hold for dynamically scheduled task-based programming models. Task-based programming models allow the programmer to specify program segments as tasks which are instantiated many times and scheduled dynamically to available threads. Due to system noise and variation in scheduling decisions, two consecutive executions on the same machine typically result in different instruction streams processed by each thread. In this paper, we propose TaskPoint, a sampled simulation technique for dynamically scheduled task-based programs. We leverage task instances as sampling units and simulate only a fraction of all task instances in detail. Between detailed simulation intervals we employ a novel fast-forward mechanism for dynamically scheduled programs. We evaluate the proposed technique on a set of 19 task-based parallel benchmarks and two different architectures. Compared to detailed simulation, TaskPoint accelerates architectural simulation with 64 simulated threads by an average factor of 19.1 at an average error of 1.8% and a maximum error of 15.0%.
This work has been supported by the Spanish Government (Severo Ochoa grants SEV2015-0493, SEV-2011-00067), the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272), the RoMoL ERC Advanced Grant (GA 321253), the European HiPEAC Network of Excellence and the Mont-Blanc project (EU-FP7-610402 and EU-H2020-671697). M. Moreto has been partially supported by the Ministry of Economy and Competitiveness under Juan de la Cierva postdoctoral fellowship JCI-2012-15047. M. Casas is supported by the Ministry of Economy and Knowledge of the Government of Catalonia and the Cofund programme of the Marie Curie Actions of the EU FP7 (contract 2013BP B 00243). T. Grass has been partially supported by the AGAUR of the Generalitat de Catalunya (grant 2013FI B 0058).
Peer Reviewed. Postprint (author's final draft)
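The idea of using task instances as sampling units can be illustrated with a toy sketch. This is not the TaskPoint mechanism itself: the function names, the fixed number of detailed samples per task type, and the cost model below are all assumptions made for illustration. The first few instances of each task type are "simulated in detail" (here, a hypothetical cost function is called), and the remaining instances are fast-forwarded using the mean of the detailed samples.

```python
from collections import defaultdict

def sampled_total_cycles(task_stream, detailed_cost, samples_per_type=3):
    """Estimate total cycles while calling detailed_cost on only a few instances.

    task_stream: iterable of (task_type, instance_id) pairs in execution order.
    detailed_cost: hypothetical stand-in for detailed simulation of one instance.
    """
    history = defaultdict(list)   # detailed samples collected per task type
    total = 0.0
    detailed_count = 0
    for task_type, instance_id in task_stream:
        samples = history[task_type]
        if len(samples) < samples_per_type:
            # Detailed mode: pay the full simulation cost and record the sample.
            cycles = detailed_cost(task_type, instance_id)
            samples.append(cycles)
            detailed_count += 1
        else:
            # Fast-forward mode: reuse the mean of the recorded samples.
            cycles = sum(samples) / len(samples)
        total += cycles
    return total, detailed_count

def toy_cost(task_type, instance_id):
    """Made-up per-instance cost: type 'a' varies slightly, type 'b' is constant."""
    return 100 + instance_id % 3 if task_type == "a" else 250

stream = [("a" if i % 2 == 0 else "b", i) for i in range(100)]
estimate, detailed_count = sampled_total_cycles(stream, toy_cost)
exact = sum(toy_cost(t, i) for t, i in stream)
relative_error = abs(estimate - exact) / exact
```

In this toy setting only 6 of the 100 instances go through the detailed path. A real simulator would also have to deal with warm-up of microarchitectural state and with scheduling variation, which the sketch ignores.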
Deep Fluids: A Generative Network for Parameterized Fluid Simulations
This paper presents a novel generative model to synthesize fluid simulations
from a set of reduced parameters. A convolutional neural network is trained on
a collection of discrete, parameterizable fluid simulation velocity fields. Due
to the capability of deep learning architectures to learn representative
features of the data, our generative model is able to accurately approximate
the training data set, while providing plausible interpolated in-betweens. The
proposed generative model is optimized for fluids by a novel loss function that
guarantees divergence-free velocity fields at all times. In addition, we
demonstrate that we can handle complex parameterizations in reduced spaces, and
advance simulations in time by integrating in the latent space with a second
network. Our method models a wide variety of fluid behaviors, thus enabling
applications such as fast construction of simulations, interpolation of fluids
with different parameters, time re-sampling, latent space simulations, and
compression of fluid simulation data. Reconstructed velocity fields are
generated up to 700x faster than re-simulating the data with the underlying CPU
solver, while achieving compression rates of up to 1300x.
Comment: Computer Graphics Forum (Proceedings of EUROGRAPHICS 2019),
additional materials: http://www.byungsoo.me/project/deep-fluids
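The abstract does not spell out how the divergence-free guarantee is obtained. One standard construction, shown here as an assumption and not as the paper's loss function, is to derive a 2D velocity field from a scalar stream function, $u = \partial\psi/\partial y$, $v = -\partial\psi/\partial x$. The sketch below checks numerically that such a field has zero discrete divergence on a periodic grid; the grid size and the choice of $\psi$ are arbitrary.

```python
import math

def velocity_from_stream(psi, h):
    """Return (u, v) = (d psi/dy, -d psi/dx) using periodic central differences.

    psi[i][j] is sampled at x = i*h, y = j*h.
    """
    nx, ny = len(psi), len(psi[0])
    u = [[(psi[i][(j + 1) % ny] - psi[i][(j - 1) % ny]) / (2 * h)
          for j in range(ny)] for i in range(nx)]
    v = [[-(psi[(i + 1) % nx][j] - psi[(i - 1) % nx][j]) / (2 * h)
          for j in range(ny)] for i in range(nx)]
    return u, v

def max_abs_divergence(u, v, h):
    """Largest |du/dx + dv/dy| over the grid, same difference stencil as above."""
    nx, ny = len(u), len(u[0])
    worst = 0.0
    for i in range(nx):
        for j in range(ny):
            du_dx = (u[(i + 1) % nx][j] - u[(i - 1) % nx][j]) / (2 * h)
            dv_dy = (v[i][(j + 1) % ny] - v[i][(j - 1) % ny]) / (2 * h)
            worst = max(worst, abs(du_dx + dv_dy))
    return worst

n = 32
h = 2 * math.pi / n
# Arbitrary smooth, periodic stream function chosen for illustration.
psi = [[math.sin(i * h) * math.cos(2 * j * h) for j in range(n)] for i in range(n)]
u, v = velocity_from_stream(psi, h)
div = max_abs_divergence(u, v, h)
max_speed = max(abs(val) for row in u for val in row)
```

Because the two central-difference operators commute, the divergence vanishes up to floating-point roundoff regardless of which stream function is used.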
Optimal photonic indistinguishability tests in multimode networks
Particle indistinguishability is at the heart of quantum statistics that
regulates fundamental phenomena such as the electronic band structure of
solids, Bose-Einstein condensation and superconductivity. Moreover, it is
necessary in practical applications such as linear optical quantum computation
and simulation, in particular for Boson Sampling devices. It is thus crucial to
develop tools to certify genuine multiphoton interference between multiple
sources. Here we show that so-called Sylvester interferometers are near-optimal
for the task of discriminating the behaviors of distinguishable and
indistinguishable photons. We report the first implementations of integrated
Sylvester interferometers with 4 and 8 modes with an efficient, scalable and
reliable 3D-architecture. We perform two-photon interference experiments
capable of identifying indistinguishable photon behaviour with a Bayesian
approach using very small data sets. Furthermore, we employ experimentally this
new device for the assessment of scattershot Boson Sampling. These results open
the way to the application of Sylvester interferometers for the optimal
assessment of multiphoton interference experiments.
Comment: 9+10 pages, 6+6 figures, added supplementary material, completed and
updated bibliography
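A Bayesian test between the two hypotheses (indistinguishable versus distinguishable photons) can be sketched as a sequential likelihood-ratio update. The outcome labels and probabilities below are made up for illustration and do not come from a Sylvester interferometer; in practice they would be the output distributions predicted for each hypothesis.

```python
import math

def posterior_indistinguishable(events, p_ind, p_dis, prior=0.5):
    """Posterior probability of the 'indistinguishable' hypothesis.

    events: observed outcome labels.
    p_ind, p_dis: dicts mapping each label to its (nonzero) probability
                  under the indistinguishable / distinguishable hypothesis.
    """
    # Accumulate log-odds so that long event sequences do not underflow.
    log_odds = math.log(prior) - math.log(1.0 - prior)
    for outcome in events:
        log_odds += math.log(p_ind[outcome]) - math.log(p_dis[outcome])
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical output distributions over three detection patterns.
p_ind = {"a": 0.7, "b": 0.2, "c": 0.1}
p_dis = {"a": 0.3, "b": 0.3, "c": 0.4}
events = ["a", "a", "b", "a", "c", "a"]
posterior = posterior_indistinguishable(events, p_ind, p_dis)
```

The further apart the two predicted distributions are, the fewer events are needed before the posterior moves decisively, which is why the choice of interferometer matters for small data sets.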
Autonomic log/restore for advanced optimistic simulation systems
In this paper we address state recoverability in optimistic simulation systems by presenting an autonomic log/restore architecture. Our proposal is unique in that it jointly provides the following features: (i) log/restore operations are carried out in a completely transparent manner to the application programmer, (ii) the simulation-object state can be scattered across dynamically allocated non-contiguous memory chunks, (iii) two differentiated operating modes, incremental vs non-incremental, coexist via transparent, optimized run-time management of dual versions of the same application layer, with dynamic selection of the best suited operating mode in different phases of the optimistic simulation run, and (iv) determination of the best suited mode for any time frame is carried out on the basis of an innovative modeling/optimization approach that takes into account stability of each operating mode vs variations of the model execution parameters. © 2010 IEEE
Fast Neural Network Predictions from Constrained Aerodynamics Datasets
Incorporating computational fluid dynamics in the design process of jets,
spacecraft, or gas turbine engines is often challenged by the required
computational resources and simulation time, which depend on the chosen
physics-based computational models and grid resolutions. An ongoing problem in
the field is how to simulate these systems faster but with sufficient accuracy.
While many approaches involve simplified models of the underlying physics,
others are model-free and make predictions based only on existing simulation
data. We present a novel model-free approach in which we reformulate the
simulation problem to effectively increase the size of constrained pre-computed
datasets and introduce a novel neural network architecture (called a cluster
network) with an inductive bias well-suited to highly nonlinear computational
fluid dynamics solutions. Compared to the state-of-the-art in model-based
approximations, we show that our approach is nearly as accurate, an order of
magnitude faster, and easier to apply. Furthermore, we show that our method
outperforms other model-free approaches.
Frequency-modulated continuous-wave LiDAR compressive depth-mapping
We present an inexpensive architecture for converting a frequency-modulated
continuous-wave LiDAR system into a compressive-sensing based depth-mapping
camera. Instead of raster scanning to obtain depth-maps, compressive sensing is
used to significantly reduce the number of measurements. Ideally, our approach
requires two difference detectors, but it can operate with only one at the cost
of doubling the number of measurements. Due to the large flux entering the
detectors, the signal amplification from heterodyne detection, and the effects
of background subtraction from compressive sensing, the system can obtain
higher signal-to-noise ratios over detector-array based schemes while scanning
a scene faster than is possible through raster-scanning. We show how
a single total-variation minimization and two fast least-squares minimizations,
instead of a single complex nonlinear minimization, can efficiently recover
high-resolution depth-maps with minimal computational overhead. Moreover, by
efficiently storing only data points from measurements of an
pixel scene, we can easily extract depths by solving only two linear equations
with efficient convex-optimization methods.
Transferable neural networks for enhanced sampling of protein dynamics
Variational auto-encoder frameworks have demonstrated success in reducing
complex nonlinear dynamics in molecular simulation to a single non-linear
embedding. In this work, we illustrate how this non-linear latent embedding can
be used as a collective variable for enhanced sampling, and present a simple
modification that allows us to rapidly perform sampling in multiple related
systems. We first demonstrate our method is able to describe the effects of
force field changes in capped alanine dipeptide after learning a model using
AMBER99. We further provide a simple extension to variational dynamics encoders
that allows the model to be trained in a more efficient manner on larger
systems by encoding the outputs of a linear transformation using time-structure
based independent component analysis (tICA). Using this technique, we show how
such a model trained for one protein, the WW domain, can efficiently be
transferred to perform enhanced sampling on a related mutant protein, the GTT
mutation. This method shows promise for its ability to rapidly sample related
systems using a single transferable collective variable and is generally
applicable to sets of related simulations, enabling us to probe the effects of
variation in increasingly large systems of biophysical interest.
Comment: 20 pages, 10 figures
Model-Based Calibration of Filter Imperfections in the Random Demodulator for Compressive Sensing
The random demodulator is a recent compressive sensing architecture providing
efficient sub-Nyquist sampling of sparse band-limited signals. The compressive
sensing paradigm requires an accurate model of the analog front-end to enable
correct signal reconstruction in the digital domain. In practice, hardware
devices such as filters deviate from their desired design behavior due to
component variations. Existing reconstruction algorithms are sensitive to such
deviations, which fall into the more general category of measurement matrix
perturbations. This paper proposes a model-based technique that aims to
calibrate filter model mismatches to facilitate improved signal reconstruction
quality. The mismatch is considered to be an additive error in the discretized
impulse response. We identify the error by sampling a known calibrating signal,
enabling least-squares estimation of the impulse response error. The error
estimate and the known system model are used to calibrate the measurement
matrix. Numerical analysis demonstrates the effectiveness of the calibration
method even for highly deviating low-pass filter responses. The proposed method
performance is also compared to a state of the art method based on discrete
Fourier transform trigonometric interpolation.
Comment: 10 pages, 8 figures, submitted to IEEE Transactions on Signal
Processing
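The calibration step described above (sample a known signal, then estimate the additive impulse-response error by least squares) can be sketched in a simplified setting. The sketch assumes a plain noise-free FIR filter and leaves out the random demodulator's chipping and sub-sampling stages; the tap values, signal length, and function names are illustrative assumptions.

```python
import random

def fir_filter(x, h):
    """Causal FIR filtering: y[n] = sum_k h[k] * x[n-k], with zero initial state."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def solve_linear(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][k] * z[k] for k in range(r + 1, n))) / M[r][r]
    return z

def estimate_error(x, y, h_nominal):
    """Least-squares estimate of e in y = x * (h_nominal + e)."""
    taps, N = len(h_nominal), len(x)
    # Residual left after removing what the nominal model already explains.
    y_nom = fir_filter(x, h_nominal)
    r = [yi - yn for yi, yn in zip(y, y_nom)]
    # Convolution matrix of the known calibration signal: X[n][k] = x[n-k].
    X = [[x[n - k] if n - k >= 0 else 0.0 for k in range(taps)] for n in range(N)]
    # Normal equations (X^T X) e = X^T r.
    XtX = [[sum(X[n][i] * X[n][j] for n in range(N)) for j in range(taps)]
           for i in range(taps)]
    Xtr = [sum(X[n][i] * r[n] for n in range(N)) for i in range(taps)]
    return solve_linear(XtX, Xtr)

random.seed(0)
h_nominal = [0.5, 0.3, 0.2]        # designed impulse response (assumed)
e_true = [0.05, -0.02, 0.01]       # hardware deviation to be recovered (assumed)
h_actual = [a + b for a, b in zip(h_nominal, e_true)]
x = [random.gauss(0.0, 1.0) for _ in range(64)]   # known calibrating signal
y = fir_filter(x, h_actual)
e_hat = estimate_error(x, y, h_nominal)
h_calibrated = [a + b for a, b in zip(h_nominal, e_hat)]
```

The calibrated response `h_calibrated` would then replace the nominal one when building the measurement matrix used for reconstruction.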
Adversarial Deformation Regularization for Training Image Registration Neural Networks
We describe an adversarial learning approach to constrain convolutional
neural network training for image registration, replacing heuristic smoothness
measures of displacement fields often used in these tasks. Using
minimally-invasive prostate cancer intervention as an example application, we
demonstrate the feasibility of utilizing biomechanical simulations to
regularize a weakly-supervised anatomical-label-driven registration network for
aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural
transrectal ultrasound (TRUS) images. A discriminator network is optimized to
distinguish the registration-predicted displacement fields from the motion data
simulated by finite element analysis. During training, the registration network
simultaneously aims to maximize similarity between anatomical labels that
drives image alignment and to minimize an adversarial generator loss that
measures divergence between the predicted- and simulated deformation. The
end-to-end trained network enables efficient and fully-automated registration
that only requires an MR and TRUS image pair as input, without anatomical
labels or simulated data during inference. 108 pairs of labelled MR and TRUS
images from 76 prostate cancer patients and 71,500 nonlinear finite-element
simulations from 143 different patients were used for this study. We show that,
with only gland segmentation as training labels, the proposed method can help
predict physically plausible deformation without any other smoothness penalty.
Based on cross-validation experiments using 834 pairs of independent validation
landmarks, the proposed adversarial-regularized registration achieved a target
registration error of 6.3 mm that is significantly lower than those from
several other regularization methods.
Comment: Accepted to MICCAI 201