Overview of Parallel Platforms for Common High Performance Computing
The paper surveys parallel platforms used for high-performance computing in the signal processing domain. Specifically, it considers methods that exploit multicore central processing units, such as the Message Passing Interface (MPI) and OpenMP. The properties of these programming methods are verified experimentally on a fast Fourier transform (FFT) and a discrete cosine transform (DCT), and are compared with MATLAB's built-in functions and with Texas Instruments digital signal processors (DSPs) based on very long instruction word (VLIW) architectures. New FFT and DCT implementations were proposed and tested. The implementations were compared with CPU-based computing methods and with the Texas Instruments digital signal processing library on C6747 floating-point DSPs. An optimal combination of computing methods for the signal processing domain, together with an implementation of new, fast routines, is also proposed.
Type-II/III DCT/DST algorithms with reduced number of arithmetic operations
We present algorithms for the discrete cosine transform (DCT) and discrete
sine transform (DST), of types II and III, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~ 2N log_2 N to ~ (17/9) N log_2 N for a power-of-two transform size N.
Furthermore, we show that a further N multiplications may be saved by a certain
rescaling of the inputs or outputs, generalizing a well-known technique for N=8
by Arai et al. These results are derived by considering the DCT to be a special
case of a DFT of length 4N, with certain symmetries, and then pruning redundant
operations from a recent improved fast Fourier transform algorithm (based on a
recursive rescaling of the conjugate-pair split radix algorithm). The improved
algorithms for DCT-III, DST-II, and DST-III follow immediately from the
improved count for the DCT-II. Comment: 9 pages
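
The abstract above views the DCT as a length-4N DFT with certain symmetries. The following is a minimal sketch of that relationship only, not of the paper's reduced-operation algorithm. It assumes the unnormalized DCT-II definition X_k = sum_n x_n cos(pi (n + 1/2) k / N) and uses a naive O(N^2) DFT for clarity. The function names are illustrative.

```python
import cmath
import math


def dct2_direct(x):
    """Unnormalized DCT-II evaluated straight from its definition."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * (j + 0.5) * k / n) for j in range(n))
        for k in range(n)
    ]


def dft(y):
    """Naive O(M^2) DFT, used here only to keep the sketch self-contained."""
    m = len(y)
    return [
        sum(y[j] * cmath.exp(-2j * cmath.pi * j * k / m) for j in range(m))
        for k in range(m)
    ]


def dct2_via_dft4n(x):
    """DCT-II obtained from a length-4N DFT of a real, even-symmetric sequence.

    The inputs are placed at the odd indices 2n+1 and mirrored to 4N-2n-1;
    all even indices stay zero. The first N DFT outputs are then real and
    equal to twice the DCT-II coefficients.
    """
    n = len(x)
    y = [0.0] * (4 * n)
    for j, value in enumerate(x):
        y[2 * j + 1] = value
        y[4 * n - 2 * j - 1] = value
    spectrum = dft(y)
    return [spectrum[k].real / 2 for k in range(n)]


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0, 4.0]
    print(dct2_direct(sample))
    print(dct2_via_dft4n(sample))
```

A fast algorithm would replace the naive DFT with an FFT and prune the operations made redundant by the zeros and the mirror symmetry, which is the step the abstract refers to.
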
Fast Computation of Voigt Functions via Fourier Transforms
This work presents a method of computing Voigt functions and their
derivatives, to high accuracy, on a uniform grid. It is based on an adaptation
of Fourier-transform based convolution. The relative error of the result
decreases as the fourth power of the computational effort. Because of its use
of highly vectorizable operations for its core, it can be implemented very
efficiently in scripting language environments which provide fast vector
libraries. The availability of the derivatives makes it suitable as a function
generator for non-linear fitting procedures. Comment: 8 pages, 1 figure
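
As a rough illustration of Fourier-transform based convolution for Voigt profiles, the sketch below samples the product of the Gaussian and Lorentzian transforms and inverts it on a uniform grid. This is the plain, unadapted approach, so it does not reproduce the paper's adaptation or its fourth-power error behaviour. The parameter names (`sigma`, `gamma`, `m`, `dx`), the unit-area normalization, and the small recursive FFT are assumptions made for this example.

```python
import cmath
import math


def fft(a, sign=-1):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    sign=+1 gives the unnormalized inverse transform.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], sign)
    odd = fft(a[1::2], sign)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def voigt_on_grid(sigma, gamma, m=1024, dx=0.05):
    """Unit-area Voigt profile on a uniform grid centred at x = 0.

    sigma is the Gaussian standard deviation and gamma the Lorentzian
    half-width at half-maximum. The result is periodic with period m*dx,
    so the slowly decaying Lorentzian tails wrap around the grid.
    """
    df = 1.0 / (m * dx)
    # Frequencies in FFT order: non-negative first, then negative.
    freqs = [(k if k < m // 2 else k - m) * df for k in range(m)]
    # Transform of a convolution = product of the individual transforms.
    spectrum = [
        math.exp(-2.0 * (math.pi * sigma * f) ** 2 - 2.0 * math.pi * gamma * abs(f))
        for f in freqs
    ]
    wrapped = [w.real * df for w in fft(spectrum, sign=+1)]
    half = m // 2
    xs = [(j - half) * dx for j in range(m)]
    vs = wrapped[half:] + wrapped[:half]  # move x = 0 to the grid centre
    return xs, vs


if __name__ == "__main__":
    xs, vs = voigt_on_grid(sigma=1.0, gamma=1.0)
    print(xs[len(xs) // 2], vs[len(vs) // 2])
```

With these defaults the wrap-around of the Lorentzian tails limits the accuracy at the peak to a few parts in ten thousand. Widening the grid reduces that error only slowly, which is the kind of limitation an adapted scheme has to address.
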
Low-power Programmable Processor for Fast Fourier Transform Based on Transport Triggered Architecture
This paper describes a low-power processor tailored for fast Fourier
transform computations, in which the transport triggering template is exploited. The
processor is software-programmable while retaining an energy efficiency
comparable to existing fixed-function implementations. The power savings are
achieved by compressing the computation kernel into one instruction word. The
word is stored in an instruction loop buffer, which is more power-efficient
than regular instruction memory storage. The processor supports all
power-of-two FFT sizes from 64 to 16384 and, given 1 mJ of energy, can
compute 20916 transforms of size 1024. Comment: 5 pages, 4 figures, 1 table, ICASSP 2019 conference
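
The energy figure quoted above can be turned into a per-transform cost with simple arithmetic. The snippet below only restates the two numbers from the abstract (1 mJ and 20916 transforms of size 1024). The variable names are illustrative, and the per-sample figure is a plain division, not a value reported by the authors.

```python
ENERGY_BUDGET_J = 1e-3   # 1 mJ, as stated in the abstract
TRANSFORMS = 20916       # number of 1024-point FFTs computed with that energy
FFT_SIZE = 1024

# Energy per transform in nanojoules, and per input sample in picojoules.
energy_per_transform_nj = ENERGY_BUDGET_J / TRANSFORMS * 1e9
energy_per_sample_pj = energy_per_transform_nj / FFT_SIZE * 1e3

print(f"{energy_per_transform_nj:.1f} nJ per 1024-point FFT")
print(f"{energy_per_sample_pj:.1f} pJ per input sample")
```

This works out to roughly 47.8 nJ per 1024-point transform.
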
Sound propagation over uneven ground and irregular topography
Theoretical, computational, and experimental techniques for predicting the effects of irregular topography on long-range sound propagation in the atmosphere were developed. Irregular topography here is understood to imply a ground surface that is not idealized as being perfectly flat or that is not idealized as having a constant specific acoustic impedance. The interest focuses on circumstances where the propagation is similar to what might be expected for noise from low-altitude air vehicles flying over suburban or rural terrain, such that rays from the source arrive at angles close to grazing incidence.
XMDS2: Fast, scalable simulation of coupled stochastic partial differential equations
XMDS2 is a cross-platform, GPL-licensed, open source package for numerically
integrating initial value problems that range from a single ordinary
differential equation up to systems of coupled stochastic partial differential
equations. The equations are described in a high-level XML-based script, and
the package generates low-level optionally parallelised C++ code for the
efficient solution of those equations. It combines the advantages of high-level
simulations, namely fast and low-error development, with the speed, portability
and scalability of hand-written code. XMDS2 is a complete redesign of the XMDS
package, and features support for a much wider problem space while also
producing faster code. Comment: 9 pages, 5 figures