
    From Quantity to Quality: Massive Molecular Dynamics Simulation of Nanostructures under Plastic Deformation in Desktop and Service Grid Distributed Computing Infrastructure

    The distributed computing infrastructure (DCI), based on BOINC and EDGeS-bridge technologies for high-performance distributed computing, is used to port a sequential molecular dynamics (MD) application to a parallel version running on a DCI that combines Desktop Grids (DGs) and Service Grids (SGs). The actual metrics of the working DG-SG DCI were measured: host performances follow a normal distribution, while other host characteristics (CPUs, RAM, and HDD per host) show signs of log-normal distributions. The practical feasibility and high efficiency of MD simulations on the DG-SG DCI were demonstrated in an experiment with massive MD simulations of a large ensemble of aluminum nanocrystals (~10^2-10^3). Statistical analysis (Kolmogorov-Smirnov test, moment analysis, and bootstrapping analysis) of the defect density distribution over the ensemble of nanocrystals showed that a change of plastic deformation mode is accompanied by a qualitative change in the type of defect density distribution over the ensemble. Some limitations of the typical DG-SG DCI (fluctuating performance, unpredictable availability of resources, etc.) were outlined, and some advantages (high efficiency, high speedup, and low cost) were demonstrated. Deploying on a DG DCI makes it possible to obtain new scientific quality from the simulated quantity of configurations by harnessing enough computational power to undertake MD simulations in a wider range of physical parameters (configurations) in a much shorter timeframe. Comment: 13 pages, 11 figures (http://journals.agh.edu.pl/csci/article/view/106
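    The distribution-type comparison described above can be sketched as follows; this is a minimal illustration with synthetic data (the sample, sample size, and distribution parameters are assumptions, not the paper's results), fitting both a normal and a log-normal model and comparing one-sample Kolmogorov-Smirnov statistics:

    ```python
    # Sketch: deciding whether a defect-density sample is better described by a
    # normal or a log-normal distribution via the Kolmogorov-Smirnov statistic.
    # The data are synthetic and purely illustrative.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    densities = rng.lognormal(mean=0.0, sigma=0.5, size=300)  # mock ensemble

    # Fit both candidate distributions, then test each fit with a one-sample KS test.
    mu, sd = densities.mean(), densities.std(ddof=1)
    ks_norm = stats.kstest(densities, 'norm', args=(mu, sd))
    shape, loc, scale = stats.lognorm.fit(densities, floc=0)
    ks_logn = stats.kstest(densities, 'lognorm', args=(shape, loc, scale))

    print(f"normal:     D={ks_norm.statistic:.3f}  p={ks_norm.pvalue:.3g}")
    print(f"log-normal: D={ks_logn.statistic:.3f}  p={ks_logn.pvalue:.3g}")
    # The smaller D (larger p) indicates the better-fitting distribution type.
    ```

    A qualitative change of distribution type over the ensemble, as reported in the abstract, would show up here as the better-fitting model switching between deformation regimes.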

    Integrated Laboratory Demonstrations of Multi-Object Adaptive Optics on a Simulated 10-Meter Telescope at Visible Wavelengths

    One important frontier for astronomical adaptive optics (AO) involves methods such as Multi-Object AO and Multi-Conjugate AO that have the potential to give a significantly larger field of view than conventional AO techniques. A second key emphasis over the next decade will be to push astronomical AO to visible wavelengths. We have conducted the first laboratory simulations of wide-field, laser guide star adaptive optics at visible wavelengths on a 10-meter-class telescope. These experiments, utilizing the UCO/Lick Observatory's Multi-Object / Laser Tomography Adaptive Optics (MOAO/LTAO) testbed, demonstrate new techniques in wavefront sensing and control that are crucial to future on-sky MOAO systems. We (1) test and confirm the feasibility of highly accurate atmospheric tomography with laser guide stars, (2) demonstrate key innovations allowing open-loop operation of Shack-Hartmann wavefront sensors (with errors of ~30 nm) as will be needed for MOAO, and (3) build a complete error budget model describing system performance. The AO system maintains a performance of 32.4% Strehl on-axis, with 24.5% and 22.6% at 10" and 15", respectively, at a science wavelength of 710 nm (R-band) over the equivalent of 0.8 seconds of simulation. The MOAO-corrected field of view is ~25 times larger in area than that limited by anisoplanatism at R-band. Our error budget is composed of terms verified through independent, empirical experiments. Error terms arising from calibration inaccuracies and optical drift are comparable in magnitude to traditional terms like fitting error and tomographic error. This makes a strong case for implementing additional calibration facilities in future AO systems, including accelerometers on powered optics, 3D turbulators, telescope and LGS simulators, and external calibration ports for deformable mirrors.Comment: 29 pages, 11 figures, submitted to PAS
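    The relationship between a quadrature-summed error budget and the reported Strehl ratios can be illustrated with the Marechal approximation; the individual term values below are hypothetical placeholders, not the paper's actual budget, and only the 710 nm science wavelength is taken from the abstract:

    ```python
    # Sketch: combining AO error-budget terms in quadrature and converting the
    # residual rms wavefront error to a Strehl ratio (Marechal approximation,
    # S ~ exp(-(2*pi*sigma/lambda)^2)). Term magnitudes are illustrative.
    import math

    terms_nm = {            # hypothetical rms wavefront errors, in nm
        "fitting": 80.0,
        "tomography": 90.0,
        "open_loop_wfs": 30.0,   # open-loop Shack-Hartmann term (~30 nm in the abstract)
        "calibration": 70.0,
    }

    sigma_nm = math.sqrt(sum(v ** 2 for v in terms_nm.values()))  # quadrature sum
    wavelength_nm = 710.0  # R-band science wavelength from the experiment

    strehl = math.exp(-(2 * math.pi * sigma_nm / wavelength_nm) ** 2)
    print(f"total rms error: {sigma_nm:.1f} nm -> Strehl {strehl:.3f}")
    ```

    This structure is why calibration and drift terms of the same magnitude as fitting or tomographic error matter: they enter the quadrature sum on equal footing and directly depress the achievable Strehl.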

    Throughput-Distortion Computation Of Generic Matrix Multiplication: Toward A Computation Channel For Digital Signal Processing Systems

    The generic matrix multiply (GEMM) function is the core element of high-performance linear algebra libraries used in many computationally-demanding digital signal processing (DSP) systems. We propose an acceleration technique for GEMM based on dynamically adjusting the imprecision (distortion) of computation. Our technique employs adaptive scalar companding and rounding to input matrix blocks followed by two forms of packing in floating-point that allow for concurrent calculation of multiple results. Since the adaptive companding process controls the increase of concurrency (via packing), the increase in processing throughput (and the corresponding increase in distortion) depends on the input data statistics. To demonstrate this, we derive the optimal throughput-distortion control framework for GEMM for the broad class of zero-mean, independent identically distributed, input sources. Our approach converts matrix multiplication in programmable processors into a computation channel: when increasing the processing throughput, the output noise (error) increases due to (i) coarser quantization and (ii) computational errors caused by exceeding the machine-precision limitations. We show that, under certain distortion in the GEMM computation, the proposed framework can significantly surpass 100% of the peak performance of a given processor. The practical benefits of our proposal are shown in a face recognition system and a multi-layer perceptron system trained for metadata learning from a large music feature database.Comment: IEEE Transactions on Signal Processing (vol. 60, 2012
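    The packing idea behind this throughput-distortion tradeoff can be sketched in miniature: two small-integer operands are packed into one double so that a single floating-point multiply yields two products. The shift constant and operand ranges below are illustrative assumptions; exceeding the 53-bit significand of a double is exactly the machine-precision error source the abstract mentions:

    ```python
    # Sketch: packing two operands into one floating-point value so one hardware
    # multiply computes two products concurrently. K and the operand ranges are
    # illustrative; overflowing the double's 53-bit significand would introduce
    # the computational error analyzed in the paper.
    K = 2 ** 20  # packing offset; must exceed the largest partial product

    def packed_scale(a, b, c):
        """Compute (c*a, c*b) with a single multiply, for small non-negative ints."""
        packed = a * K + b          # pack two operands into one double
        prod = float(packed) * c    # single floating-point multiply
        hi = int(prod) // K         # unpack c*a
        lo = int(prod) % K          # unpack c*b
        return hi, lo

    print(packed_scale(123, 456, 7))  # -> (861, 3192)
    ```

    Coarser companding (fewer significant bits per operand) allows more operands per word and hence higher throughput, at the cost of higher distortion, which is the control knob the proposed framework optimizes.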

    High-Efficient Parallel CAVLC Encoders on Heterogeneous Multicore Architectures

    This article presents two highly efficient parallel realizations of context-based adaptive variable length coding (CAVLC) on heterogeneous multicore processors. By optimizing the architecture of the CAVLC encoder, three kinds of dependences are eliminated or weakened: the context-based data dependence, the memory-access dependence, and the control dependence. The CAVLC pipeline is divided into three stages: two scans, coding, and lag packing, and is implemented on two typical heterogeneous multicore architectures. One is a block-based SIMD parallel CAVLC encoder on the multicore stream processor STORM; the other is a component-oriented SIMT parallel encoder on a massively parallel GPU architecture. Both exploit rich data-level parallelism. Experimental results show that, compared with the CPU version, a speedup of more than 70x is obtained on STORM and over 50x on the GPU. The STORM-based encoder achieves real-time processing of 1080p @ 30 fps, and the GPU-based version satisfies the requirements of 720p real-time encoding. The throughput of the presented CAVLC encoders is more than 10 times higher than that of published software encoders on DSP and multicore platforms
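    The data-level parallelism exploited here comes from the fact that the scan stage is independent per 4x4 block. A minimal sketch of that per-block scan, counting the standard CAVLC symbols TotalCoeffs and TrailingOnes over zig-zag-ordered coefficients (the input layout and function names are illustrative, not the paper's implementation):

    ```python
    # Sketch: the per-block "scan" work that a SIMD/SIMT realization can run on
    # every 4x4 block independently, counting TotalCoeffs and TrailingOnes
    # (capped at 3, per the H.264 CAVLC definition).
    def scan_block(zigzag_coeffs):
        """Return (total_coeffs, trailing_ones) for one zig-zag-ordered block."""
        nonzero = [c for c in zigzag_coeffs if c != 0]
        total = len(nonzero)
        t1 = 0
        for c in reversed(nonzero):       # count trailing +/-1s, at most 3
            if abs(c) == 1 and t1 < 3:
                t1 += 1
            else:
                break
        return total, t1

    blocks = [[0, 3, 0, 1, -1, -1, 0, 1] + [0] * 8,   # textbook example block
              [0] * 16]                                # all-zero block
    print([scan_block(b) for b in blocks])  # -> [(5, 3), (0, 0)]
    ```

    Because no block's result depends on another's coefficients, this stage maps naturally onto stream-processor lanes or GPU threads; only the later bitstream packing reintroduces serial ordering, which is what the "lag packing" stage defers.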