An Architectural Framework for VLSI Time-Recursive Computation with Applications
The time-recursive computation model has proven to be a particularly useful tool in real-time data processing architectures for audio, video, radar and sonar. Unlike FFT-based architectures, time-recursive ones require only local communication, have linear implementation cost, and operate in a single-input multiple-output (SIMO) manner. This suits the above applications, since the data are supplied serially. Time-recursive architectures are also modular and regular and allow a high degree of parallelism, which makes them well suited to VLSI implementation. In this dissertation, we establish an architectural framework for parallel time-recursive computation. We consider a class of linear operators (or signal transformers) that are characterized by discrete-time, time-invariant, compactly supported, but otherwise arbitrary kernel functions. We specify the properties of linear operators that can be implemented efficiently in a time-recursive way. Based on these properties, we develop a systematic routine that produces a time-recursive architectural implementation for a given operator. We demonstrate the use and effectiveness of this routine by means of specific examples, namely the Discrete Cosine Transform (DCT), the Discrete Fourier Transform (DFT) and the Discrete Wavelet Transform (DWT). Using this architectural framework, we obtain novel architectures for the uniform-DFT QMF bank, the cosine-modulated QMF bank, the 1-D and 2-D Modulated Lapped Transform (MLT), as well as an Extended Lapped Transform (ELT). Furthermore, architectural implementations of the Cepstral Transform and a Short Time Fourier Transform are considered, based on the time-recursive architecture of the DFT. All of the above designs are modular and regular, with local communication and linear cost in operator counts. In particular, the 1-D MLT requires 1N + 3 adders and N - 1 rotation circuits, where N denotes the data block size. The 2-D MLT requires three 1-D MLT circuits and no matrix transposition. The ELT has basis length equal to 4N and requires 3N + 4 multipliers, 4N + 4 adders and N + 2 rotation circuits. These results are expected to have a significant impact on real-time audio and video data compression, on frequency-domain adaptive filtering and on spectrum analysis.
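To make the idea of time-recursive computation concrete, the following is a minimal sketch of a sliding DFT, in which the spectrum of each new length-N window is obtained from the previous one using only the sample that leaves and the sample that enters. It is a textbook recurrence written in Python for illustration, not the dissertation's architecture; the function names `dft` and `sliding_dft` are hypothetical.

```python
import cmath


def dft(block):
    """Direct O(N^2) DFT of one block, used only to start the recursion."""
    n = len(block)
    return [
        sum(block[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]


def sliding_dft(samples, n):
    """Yield the DFT of every length-n window of `samples`.

    After the first window, each spectrum is updated in O(n) operations:
        X_k(t+1) = (X_k(t) - x(t) + x(t+n)) * exp(+2j*pi*k/n)
    Each bin k depends only on its own previous value and on the serial
    input, so the n bins can be updated independently of one another.
    """
    if len(samples) < n or n <= 0:
        raise ValueError("need at least one full window")
    twiddle = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]
    spectrum = dft(samples[:n])
    yield list(spectrum)
    for t in range(n, len(samples)):
        leaving = samples[t - n]   # oldest sample of the previous window
        entering = samples[t]      # newest sample of the current window
        spectrum = [
            (spectrum[k] - leaving + entering) * twiddle[k] for k in range(n)
        ]
        yield list(spectrum)


if __name__ == "__main__":
    data = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.25, 1.5]
    for start, spec in enumerate(sliding_dft(data, 4)):
        print(start, [round(abs(c), 3) for c in spec])
```

One input sample arrives per step and a full set of N coefficients leaves, which matches the single-input multiple-output behaviour described in the abstract. In floating point the recursion accumulates rounding error over long streams, so a practical design would need to account for that.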
Parallel Construction of Wavelet Trees on Multicore Architectures
The wavelet tree has become a very useful data structure to efficiently
represent and query large volumes of data in many different domains, from
bioinformatics to geographic information systems. One problem with wavelet
trees is their construction time. In this paper, we introduce two algorithms
that reduce the time complexity of wavelet tree construction by taking
advantage of today's ubiquitous multicore machines.
Our first algorithm constructs all the levels of the wavelet tree in parallel,
with time and working-space bounds expressed in terms of the size of the input
sequence and the size of the alphabet. Our second algorithm constructs the
wavelet tree in a domain-decomposition fashion, using our first algorithm in
each segment; its time and extra-space bounds additionally depend on the
number of available cores. Both algorithms are practical and report good
speedup for large real datasets.
Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Skłodowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
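The observation that every level of a wavelet tree can be built without waiting for the level above can be sketched as follows. This is an illustrative levelwise construction in Python, not the paper's algorithm: it assumes integer symbols in the range 0..sigma-1, uses a stable sort per level instead of a counting pass, and the names `build_level`, `build_wavelet_tree` and `access` are hypothetical.

```python
def build_level(seq, level, bits):
    """Bitmap for one level of a levelwise (pointerless) wavelet tree.

    Symbols are grouped by their top `level` bits, which identifies the
    node they belong to at this depth; the stored bit is the next bit.
    The result depends only on `seq`, so levels are mutually independent.
    """
    shift = bits - level
    ordered = sorted(seq, key=lambda s: s >> shift)  # sorted() is stable
    return [(s >> (shift - 1)) & 1 for s in ordered]


def build_wavelet_tree(seq, sigma):
    """Build all levels; each call could run on its own core."""
    bits = max(1, (sigma - 1).bit_length())
    return [build_level(seq, level, bits) for level in range(bits)]


def access(levels, i):
    """Recover the i-th symbol of the original sequence from the bitmaps."""
    lo, hi = 0, len(levels[0])
    symbol = 0
    for bitmap in levels:
        node = bitmap[lo:hi]          # bits of the current node
        bit = node[i]
        zeros = node.count(0)
        if bit == 0:
            i = node[:i].count(0)     # rank of position i among the zeros
            hi = lo + zeros           # descend to the left child
        else:
            i = node[:i].count(1)     # rank of position i among the ones
            lo = lo + zeros           # descend to the right child
        symbol = (symbol << 1) | bit
    return symbol


if __name__ == "__main__":
    text = [3, 0, 2, 1, 3, 3, 0, 2]
    tree = build_wavelet_tree(text, sigma=4)
    print(tree)
    print([access(tree, i) for i in range(len(text))])
```

The linear scans in `access` stand in for the constant-time rank structures a real implementation would use; they keep the sketch short at the cost of query speed.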
Task-based adaptive multiresolution for time-space multi-scale reaction-diffusion systems on multi-core architectures
A new solver featuring time-space adaptation and error control has been
recently introduced to tackle the numerical solution of stiff
reaction-diffusion systems. Based on operator splitting, finite volume adaptive
multiresolution and high order time integrators with specific stability
properties for each operator, this strategy yields high computational
efficiency for large multidimensional computations on standard architectures
such as powerful workstations. However, the data structure of the original
implementation, based on trees of pointers, provides limited opportunities for
efficiency enhancements, while posing serious challenges in terms of parallel
programming and load balancing. The present contribution proposes a new
implementation of the whole set of numerical methods including Radau5 and
ROCK4, relying on a fully different data structure together with the use of a
specific library, TBB, for shared-memory, task-based parallelism with
work-stealing. The performance of our implementation is assessed in a series of
test-cases of increasing difficulty in two and three dimensions on multi-core
and many-core architectures, demonstrating high scalability.
A Deep Representation for Invariance And Music Classification
Representations in the auditory cortex might be based on mechanisms similar
to the visual ventral stream: modules for building invariance to
transformations and multiple layers for compositionality and selectivity. In
this paper we propose the use of such computational modules for extracting
invariant and discriminative audio representations. Building on a theory of
invariance in hierarchical architectures, we propose a novel, mid-level
representation for acoustical signals, using the empirical distributions of
projections on a set of templates and their transformations. Under the
assumption that, by construction, this dictionary of templates is composed from
similar classes, and samples the orbit of variance-inducing signal
transformations (such as shift and scale), the resulting signature is
theoretically guaranteed to be unique, invariant to transformations and stable
to deformations. Modules of projection and pooling can then constitute layers
of deep networks, for learning composite representations. We present the main
theoretical and computational aspects of a framework for unsupervised learning
of invariant audio representations, empirically evaluated on music genre
classification.
Comment: 5 pages, CBMM Memo No. 002, (to appear) IEEE 2014 International
Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014)
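The projection-and-pooling module described above can be illustrated with a toy example: project a signal onto every circular shift of each template, then pool the projections into a histogram. Because the set of projections is the same for any circularly shifted copy of the signal, the histogram does not change. This is only a sketch under simplifying assumptions (circular shifts as the sole transformation, fixed histogram range); the names `circular_shifts` and `signature` and the parameters `bins`, `lo`, `hi` are hypothetical and not taken from the paper.

```python
def circular_shifts(vector):
    """All circular shifts of a vector: the orbit under the shift group."""
    return [vector[k:] + vector[:k] for k in range(len(vector))]


def signature(signal, templates, bins=8, lo=-20.0, hi=20.0):
    """Concatenated histograms of projections onto each template's orbit.

    For every template, the signal is projected (dot product) onto all
    shifted copies of the template, and the projections are pooled into
    an empirical distribution with `bins` equal-width bins on [lo, hi].
    """
    result = []
    for template in templates:
        hist = [0] * bins
        orbit = circular_shifts(template)
        for shifted in orbit:
            projection = sum(a * b for a, b in zip(signal, shifted))
            position = int((projection - lo) / (hi - lo) * bins)
            hist[min(bins - 1, max(0, position))] += 1
        result.extend(count / len(orbit) for count in hist)
    return result


if __name__ == "__main__":
    x = [1, 4, -2, 0, 3, -1]
    templates = [[1, 0, -1, 2, 0, 0], [0, 1, 1, 0, -1, 0]]
    print(signature(x, templates))
    print(signature(x[2:] + x[:2], templates))  # same signature after a shift
```

Pooling with a histogram is one choice among several; pooling the projections with a mean or a maximum would give coarser signatures that are invariant for the same reason.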
Wavelet/shearlet hybridized neural networks for biomedical image restoration
Recently, new programming paradigms have emerged that combine parallelism and numerical computations with algorithmic differentiation. This approach allows for the hybridization of neural network techniques for inverse imaging problems with more traditional methods such as wavelet-based sparsity modelling techniques. The benefits are twofold. On the one hand, traditional methods with well-known properties can be integrated in neural networks, either as separate layers or tightly integrated in the network. On the other hand, parameters in traditional methods can be trained end-to-end from datasets in a neural network "fashion" (e.g., using Adagrad or Adam optimizers). In this paper, we explore these hybrid neural networks in the context of shearlet-based regularization for the purpose of biomedical image restoration. Due to the reduced number of parameters, this approach seems a promising strategy, especially when dealing with small training data sets.
The Cascading Haar Wavelet algorithm for computing the Walsh-Hadamard Transform
We propose a novel algorithm for computing the Walsh-Hadamard Transform (WHT)
which consists entirely of Haar wavelet transforms. We prove that the
algorithm, which we call the Cascading Haar Wavelet (CHW) algorithm, shares
precisely the same serial complexity as the popular divide-and-conquer
algorithm for the WHT. We also propose a natural way of parallelizing the
algorithm which has a number of attractive features.
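For reference, the divide-and-conquer baseline mentioned in the abstract can be sketched as the usual butterfly-style fast Walsh-Hadamard transform. The code below is that standard algorithm, not the CHW algorithm itself; it assumes an unnormalized transform in natural (Hadamard) ordering, and the function names `fwht` and `naive_wht` are hypothetical.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform, unnormalized, Hadamard ordering.

    Uses log2(n) passes of n/2 butterflies each, so n*log2(n) additions
    and subtractions in total. The input length must be a power of two.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y   # sum and difference butterfly
        h *= 2
    return a


def naive_wht(values):
    """O(n^2) reference: entry (i, j) of the matrix is (-1)^popcount(i & j)."""
    return [
        sum(v * (-1) ** bin(i & j).count("1") for j, v in enumerate(values))
        for i in range(len(values))
    ]


if __name__ == "__main__":
    data = [1, 0, 1, 0, 0, 1, 1, 0]
    print(fwht(data))
    print(fwht(data) == naive_wht(data))
```

Each butterfly computes a sum and a difference of a pair of values, which is also the elementary step of an unnormalized Haar transform; that shared building block gives some intuition for why a WHT can be assembled from Haar stages.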