From Quantity to Quality: Massive Molecular Dynamics Simulation of Nanostructures under Plastic Deformation in Desktop and Service Grid Distributed Computing Infrastructure
A distributed computing infrastructure (DCI) based on BOINC and EDGeS-bridge
technologies for high-performance distributed computing is used to port a
sequential molecular dynamics (MD) application to a parallel version for a DCI
combining Desktop Grids (DGs) and Service Grids (SGs). The actual metrics of
the working DG-SG DCI were measured: host performances follow a normal
distribution, while other per-host characteristics (CPUs, RAM, and HDD) show
signs of log-normal distributions. The practical feasibility and high
efficiency of MD simulations on the DG-SG DCI were demonstrated in an
experiment with massive MD simulations of a large ensemble of aluminum
nanocrystals. Statistical
analysis (Kolmogorov-Smirnov test, moment analysis, and bootstrapping) of the
defect density distribution over the ensemble of nanocrystals showed that a
change of plastic deformation mode is accompanied by a qualitative change in
the type of defect density distribution over the ensemble. Some limitations
(fluctuating performance, unpredictable availability of resources, etc.) of a
typical DG-SG DCI were outlined, and some advantages (high efficiency, high
speedup, and low cost) were demonstrated. Deploying on a DG DCI allows new
scientific results to be obtained from the simulation of numerous
configurations by harnessing sufficient computational power to undertake MD
simulations over a wider range of physical parameters
(configurations) in a much shorter timeframe.
Comment: 13 pages, 11 figures (http://journals.agh.edu.pl/csci/article/view/106
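The statistical analysis named above (a Kolmogorov-Smirnov goodness-of-fit test plus a bootstrap estimate over the ensemble) can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's actual pipeline; the variable names and the log-normal hypothesis are assumptions for the example.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for defect densities over an ensemble of simulated nanocrystals
# (synthetic data; the real values come from the MD simulations).
defect_density = rng.lognormal(mean=0.0, sigma=0.5, size=500)

def lognorm_cdf(x, mu, sigma):
    """CDF of a log-normal distribution with log-mean mu and log-std sigma."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

# Fit by the method of moments in log-space, then compute the KS statistic
# as the largest gap between the empirical and hypothesized CDFs.
logs = np.log(defect_density)
mu, sigma = logs.mean(), logs.std()
xs = np.sort(defect_density)
n = xs.size
cdf = np.array([lognorm_cdf(x, mu, sigma) for x in xs])
ks_stat = max(np.abs(np.arange(1, n + 1) / n - cdf).max(),
              np.abs(cdf - np.arange(0, n) / n).max())

# Bootstrap 95% confidence interval for the mean defect density.
boot_means = np.array([
    rng.choice(defect_density, size=n, replace=True).mean() for _ in range(2000)
])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])

print(f"KS statistic = {ks_stat:.3f}")
print(f"bootstrap 95% CI for mean: [{ci_low:.3f}, {ci_high:.3f}]")
```

A small KS statistic here is consistent with the log-normal hypothesis; a qualitative change of distribution type over the ensemble would show up as the fitted hypothesis being rejected in one deformation mode but not the other.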
Integrated Laboratory Demonstrations of Multi-Object Adaptive Optics on a Simulated 10-Meter Telescope at Visible Wavelengths
One important frontier for astronomical adaptive optics (AO) involves methods
such as Multi-Object AO and Multi-Conjugate AO that have the potential to give
a significantly larger field of view than conventional AO techniques. A second
key emphasis over the next decade will be to push astronomical AO to visible
wavelengths. We have conducted the first laboratory simulations of wide-field,
laser guide star adaptive optics at visible wavelengths on a 10-meter-class
telescope. These experiments, utilizing the UCO/Lick Observatory's Multi-Object
/ Laser Tomography Adaptive Optics (MOAO/LTAO) testbed, demonstrate new
techniques in wavefront sensing and control that are crucial to future on-sky
MOAO systems. We (1) test and confirm the feasibility of highly accurate
atmospheric tomography with laser guide stars, (2) demonstrate key innovations
allowing open-loop operation of Shack-Hartmann wavefront sensors (with errors
of ~30 nm) as will be needed for MOAO, and (3) build a complete error budget
model describing system performance. The AO system maintains a performance of
32.4% Strehl on-axis, with 24.5% and 22.6% at 10" and 15", respectively, at a
science wavelength of 710 nm (R-band) over the equivalent of 0.8 seconds of
simulation. The MOAO-corrected field of view is ~25 times larger in area than
that limited by anisoplanatism at R-band. Our error budget is composed of terms
verified through independent, empirical experiments. Error terms arising from
calibration inaccuracies and optical drift are comparable in magnitude to
traditional terms like fitting error and tomographic error. This makes a strong
case for implementing additional calibration facilities in future AO systems,
including accelerometers on powered optics, 3D turbulators, telescope and LGS
simulators, and external calibration ports for deformable mirrors.
Comment: 29 pages, 11 figures, submitted to PAS
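The error-budget bookkeeping described above can be sketched in a few lines: independent wavefront error terms add in quadrature, and the total converts to a Strehl ratio via the extended Maréchal approximation, S ≈ exp(-(2πσ/λ)²). The individual term values below are illustrative placeholders, not the paper's measured budget.

```python
import math

wavelength_nm = 710.0  # science wavelength quoted in the abstract (R-band)

# Hypothetical error terms in nm RMS wavefront error (illustrative values;
# the ~30 nm open-loop sensing figure is the one quoted above).
error_terms_nm = {
    "fitting": 80.0,
    "tomography": 90.0,
    "open-loop WFS": 30.0,
    "calibration/drift": 70.0,
}

# Independent terms combine in quadrature.
total_nm = math.sqrt(sum(e ** 2 for e in error_terms_nm.values()))

# Extended Marechal approximation for the Strehl ratio.
strehl = math.exp(-(2.0 * math.pi * total_nm / wavelength_nm) ** 2)
print(f"total WFE = {total_nm:.1f} nm RMS -> Strehl ~ {strehl:.1%}")
```

With these placeholder terms the total is ~142 nm RMS, illustrating why calibration and drift terms of the same magnitude as fitting and tomographic error matter at a 710 nm science wavelength.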
Throughput-Distortion Computation Of Generic Matrix Multiplication: Toward A Computation Channel For Digital Signal Processing Systems
The generic matrix multiply (GEMM) function is the core element of
high-performance linear algebra libraries used in many
computationally-demanding digital signal processing (DSP) systems. We propose
an acceleration technique for GEMM based on dynamically adjusting the
imprecision (distortion) of computation. Our technique employs adaptive scalar
companding and rounding to input matrix blocks followed by two forms of packing
in floating-point that allow for concurrent calculation of multiple results.
Since the adaptive companding process controls the increase of concurrency (via
packing), the increase in processing throughput (and the corresponding increase
in distortion) depends on the input data statistics. To demonstrate this, we
derive the optimal throughput-distortion control framework for GEMM for the
broad class of zero-mean, independent identically distributed, input sources.
Our approach converts matrix multiplication in programmable processors into a
computation channel: when increasing the processing throughput, the output
noise (error) increases due to (i) coarser quantization and (ii) computational
errors caused by exceeding the machine-precision limitations. We show that,
under certain distortion in the GEMM computation, the proposed framework can
significantly surpass 100% of the peak performance of a given processor. The
practical benefits of our proposal are shown in a face recognition system and a
multi-layer perceptron system trained for metadata learning from a large music
feature database.
Comment: IEEE Transactions on Signal Processing (vol. 60, 2012
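The core "packing" idea behind this computation channel can be illustrated with a toy example: two small fixed-point operands are packed into one double so that a single floating-point multiply yields two partial products at once. The scaling constant and value ranges below are illustrative assumptions; the paper's companding and packing scheme is more general and handles the resulting distortion adaptively.

```python
# Separation between the two packed operands; chosen so both partial
# products fit exactly within a double's 53-bit significand.
SHIFT = 2 ** 24

def packed_multiply(x1, x2, y):
    """Compute x1*y and x2*y with a single floating-point multiplication.

    Assumes small non-negative integer inputs so that x2*y < SHIFT and the
    packed product stays exactly representable in double precision.
    """
    packed = x1 * SHIFT + x2              # pack two operands into one number
    product = float(packed) * y           # one hardware multiply, two results
    p1, p2 = divmod(int(product), SHIFT)  # unpack the two partial products
    return p1, p2

print(packed_multiply(3, 5, 7))  # -> (21, 35)
```

Doubling the number of results per multiply is what lets throughput exceed the nominal peak; the quantization needed to keep operands small is exactly the distortion the framework trades against that throughput.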
High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures
This article presents two highly efficient parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory-access dependence, and the control dependence. The CAVLC pipeline is divided into three stages (two scans, coding, and lag packing) and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM. The other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70 times is obtained on STORM and over 50 times on the GPU. The STORM implementation achieves real-time processing of 1080p @ 30 fps, and the GPU-based version satisfies the requirements for 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms
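The throughput argument for the three-stage pipeline above can be sketched with a simple model: once the stages are overlapped, steady-state frame rate is set by the slowest stage rather than the sum of all stages. The per-stage times below are hypothetical values for illustration, not measurements from the paper.

```python
def pipeline_fps(stage_times_ms):
    """Steady-state frames per second of a fully overlapped pipeline.

    With stages running concurrently on different frames, a new frame
    completes every `max(stage_times)` milliseconds.
    """
    bottleneck = max(stage_times_ms)  # the slowest stage dominates
    return 1000.0 / bottleneck

# Hypothetical per-frame stage times (ms) for the three CAVLC stages.
stages = {"two scans": 12.0, "coding": 25.0, "lag packing": 8.0}
fps = pipeline_fps(stages.values())
print(f"steady-state throughput ~ {fps:.1f} fps")
```

Under these placeholder numbers the coding stage is the bottleneck, so further speedup would come from parallelizing that stage, which is precisely where eliminating the context-based data dependence pays off.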