
    Developing a pattern discovery method in time series data and its GPU acceleration

    The Dynamic Time Warping (DTW) algorithm is widely used to find the global alignment of time series, and many time series data mining and analysis problems can be solved with it. However, using DTW to find similar subsequences is either computationally expensive or unable to produce accurate results. The literature therefore uses parallelisation to speed up DTW, but because of the nature of the algorithm, parallelising it remains an open challenge. In this paper, we first propose a novel method that finds similar local subsequences. Our algorithm first searches for the possible start positions of a subsequence and then finds the best-matching alignment from these positions. We then parallelise the proposed algorithm on GPUs using CUDA and propose an optimisation technique that improves the performance of our parallel GPU implementation. We conducted extensive experiments to evaluate the proposed method. The experimental results demonstrate that the proposed algorithm discovers time series subsequences efficiently and that the proposed GPU-based parallelisation technique further speeds up the processing.
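To make the idea of subsequence matching concrete, the following is a minimal sketch of the textbook open-begin, open-end subsequence DTW, which returns the cost, start and end position of the best-matching subsequence. It is not the two-phase method or the GPU implementation proposed in the paper; the function name `subsequence_dtw`, the absolute-difference local cost and the sequential dynamic program are all assumptions made for illustration.

```python
import math


def subsequence_dtw(query, series):
    """Return (cost, start, end) of the subsequence of `series` closest to `query`.

    Illustrative textbook subsequence DTW, not the paper's algorithm.
    """
    n, m = len(query), len(series)
    # cost[i][j]: minimal cost of aligning query[:i+1] with a subsequence ending at series[j]
    cost = [[math.inf] * m for _ in range(n)]
    # start[i][j]: index in `series` where that best alignment begins
    start = [[0] * m for _ in range(n)]

    # Open begin: the first query point may match any position at no extra cost.
    for j in range(m):
        cost[0][j] = abs(query[0] - series[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Predecessor cells: (i-1, j), and when j > 0 also (i, j-1) and (i-1, j-1).
            best_cost, best_start = cost[i - 1][j], start[i - 1][j]
            if j > 0:
                for c, s in ((cost[i][j - 1], start[i][j - 1]),
                             (cost[i - 1][j - 1], start[i - 1][j - 1])):
                    if c < best_cost:
                        best_cost, best_start = c, s
            cost[i][j] = abs(query[i] - series[j]) + best_cost
            start[i][j] = best_start

    # Open end: pick the cheapest end position in the last row.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


if __name__ == "__main__":
    print(subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0]))
```

Each cell depends on its left, upper and upper-left neighbours, so only cells on the same anti-diagonal can be computed independently. This dependency is one common reason why DTW is hard to parallelise.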

    Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

    In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculation using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones. Comment: 44 pages. 1 of USQCD whitepapers

    A GPU implementation of the Correlation Technique for Real-time Fourier Domain Pulsar Acceleration Searches

    The study of binary pulsars enables tests of general relativity. Orbital motion in binary systems causes the apparent pulsar spin frequency to drift, reducing the sensitivity of periodicity searches. Acceleration searches are methods that account for the effect of orbital acceleration. Existing methods are computationally expensive, and the vast amount of data that will be produced by next generation instruments such as the Square Kilometre Array (SKA) necessitates real-time acceleration searches, which in turn require the use of High Performance Computing (HPC) platforms. We present our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs). The correlation technique is applied as a convolution with multiple Finite Impulse Response filters in the Fourier domain. Two approaches are compared: the first uses the NVIDIA cuFFT library for applying Fast Fourier Transforms (FFTs) on the GPU, and the second contains a custom FFT implementation in GPU shared memory. We find that the FFT shared memory implementation performs between 1.5 and 3.2 times faster than our cuFFT-based application for smaller but sufficient filter sizes. It is also 4 to 6 times faster than the existing GPU and OpenMP implementations of FDAS. This work is part of the AstroAccelerate project, a many-core accelerated time-domain signal processing library for radio astronomy. Comment: 20 pages, 9 figures. Accepted for publication in ApJ
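The convolution step described above relies on the convolution theorem: convolving a series with a filter is equivalent to multiplying their transforms element-wise and transforming back. The sketch below illustrates this for a small bank of filters. It is not the AstroAccelerate code: the names `dft`, `circular_convolve_direct` and `filter_bank_convolve` are hypothetical, a naive O(n^2) transform stands in for cuFFT or a shared-memory FFT, and the convolution is circular, whereas a real pipeline would typically process long inputs in overlapping segments.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (a stand-in for an FFT library)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def circular_convolve_direct(x, h):
    """Reference circular convolution computed directly from the definition."""
    n = len(x)
    return [sum(h[k] * x[(i - k) % n] for k in range(len(h))) for i in range(n)]


def filter_bank_convolve(x, filters):
    """Convolve `x` with every filter via element-wise products of transforms."""
    n = len(x)
    X = dft(x)  # the input is transformed once and reused for all filters
    results = []
    for h in filters:
        padded = list(h) + [0] * (n - len(h))  # zero-pad each filter to the input length
        H = dft(padded)
        Y = [a * b for a, b in zip(X, H)]
        results.append(dft(Y, inverse=True))
    return results


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, 0, 0, 0]
    bank = [[1, 1], [1, -1, 0.5]]
    for h, y in zip(bank, filter_bank_convolve(signal, bank)):
        print([round(v.real, 6) for v in y], circular_convolve_direct(signal, h))
```

Transforming the input once and reusing it for every filter is what makes a filter bank cheap in this formulation; only the per-filter product and inverse transform are repeated.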

    Towards a Scalable Hardware/Software Co-Design Platform for Real-time Pedestrian Tracking Based on a ZYNQ-7000 Device

    Currently, most designers working on hardware/software co-design face the daunting task of researching different design flows and learning the intricacies of specific software from various manufacturers. Creating a scalable hardware/software co-design platform has therefore become an urgent need and a key strategic element for developing hardware/software integrated systems. In this paper, we propose a new design flow for building a scalable co-design platform on an FPGA-based system-on-chip. We employ an integrated approach to implement a histogram of oriented gradients (HOG) descriptor and a support vector machine (SVM) classifier on a programmable device for pedestrian tracking. We report a hardware resource analysis and also analyse the precision and success rates of pedestrian tracking on nine open access image data sets. Finally, our proposed design flow can be used for any real-time image-processing-related product on programmable ZYNQ-based embedded systems, reducing design time and providing a scalable solution for embedded image processing products.

    Molecular dynamics simulation: a tool for exploration and discovery using simple models

    Emergent phenomena share the fascinating property of not being obvious consequences of the design of the system in which they appear. This characteristic is no less relevant when attempting to simulate such phenomena, given that the outcome is not always a foregone conclusion. The present survey focuses on several simple model systems that exhibit surprisingly rich emergent behavior, all studied by MD simulation. The examples are taken from the disparate fields of fluid dynamics, granular matter and supramolecular self-assembly. In studies of fluids modeled at the detailed microscopic level using discrete particles, the simulations demonstrate that complex hydrodynamic phenomena in rotating and convecting fluids, the Taylor-Couette and Rayleigh-Bénard instabilities, can not only be observed within the limited length and time scales accessible to MD, but even quantitative agreement can be achieved. Simulation of highly counterintuitive segregation phenomena in granular mixtures, again using MD methods, but now augmented by forces producing damping and friction, leads to results that resemble experimentally observed axial and radial segregation in the case of a rotating cylinder, and to a novel form of horizontal segregation in a vertically vibrated layer. Finally, when modeling self-assembly processes analogous to the formation of the polyhedral shells that package spherical viruses, simulation of suitably shaped particles reveals the ability to produce complete, error-free assembly, and leads to the important general observation that reversible growth steps contribute to the high yield. While there are limitations to the MD approach, both computational and conceptual, the results offer a tantalizing hint of the kinds of phenomena that can be explored, and what might be discovered when sufficient resources are brought to bear on a problem. Comment: 21 pages, 20 figures (v2 - minor text addition