Analysis, Visualization, and Transformation of Audio Signals Using Dictionary-based Methods
date-added: 2014-01-07 09:15:58 +0000; date-modified: 2014-01-07 09:15:58 +0000
A Panorama on Multiscale Geometric Representations, Intertwining Spatial, Directional and Frequency Selectivity
The richness of natural images makes the quest for optimal representations in
image processing and computer vision challenging. The latter observation has
not prevented the design of image representations, which trade off between
efficiency and complexity, while achieving accurate rendering of smooth regions
as well as reproducing faithful contours and textures. The most recent ones,
proposed in the past decade, share a hybrid heritage highlighting the
multiscale and oriented nature of edges and patterns in images. This paper
presents a panorama of the aforementioned literature on decompositions in
multiscale, multi-orientation bases or dictionaries. They typically exhibit
redundancy to improve sparsity in the transformed domain and sometimes its
invariance with respect to simple geometric deformations (translation,
rotation). Oriented multiscale dictionaries extend traditional wavelet
processing and may offer rotation invariance. Highly redundant dictionaries
require specific algorithms to simplify the search for an efficient (sparse)
representation. We also discuss the extension of multiscale geometric
decompositions to non-Euclidean domains such as the sphere or arbitrary meshed
surfaces. The etymology of panorama suggests an overview, based on a choice of
partially overlapping "pictures". We hope that this paper will contribute to
the appreciation and apprehension of a stream of current research directions in
image understanding. Comment: 65 pages, 33 figures, 303 references
Efficient compression of motion compensated residuals
EThOS - Electronic Theses Online Service, GB, United Kingdom
Image interpolation using Shearlet based iterative refinement
This paper proposes an image interpolation algorithm exploiting sparse
representation for natural images. It involves three main steps: (a) obtaining
an initial estimate of the high resolution image using linear methods like FIR
filtering, (b) promoting sparsity in a selected dictionary through iterative
thresholding, and (c) extracting high frequency information from the
approximation to refine the initial estimate. For the sparse modeling, a
shearlet dictionary is chosen to yield a multiscale directional representation.
The proposed algorithm is compared to several state-of-the-art methods to
assess its objective as well as subjective performance. Compared to the cubic
spline interpolation method, an average PSNR gain of around 0.8 dB is observed
over a dataset of 200 images.
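Step (b) above, sparsity promotion by iterative thresholding, can be illustrated with a small sketch. This is not the paper's algorithm: it works on a 1-D signal, swaps the shearlet dictionary for an orthonormal DCT so that it stays self-contained, uses linear interpolation as the initial estimate, and re-imposes the known samples after every iteration. The function names, the linearly decreasing threshold schedule, and the iteration count are all assumptions made for illustration.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis; a stand-in for the shearlet dictionary."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)])
    return rows

def analyze(basis, x):
    """Transform coefficients: inner product of x with each basis vector."""
    return [sum(b * v for b, v in zip(row, x)) for row in basis]

def synthesize(basis, coeffs):
    """Inverse transform (the basis is orthonormal, so this is the transpose)."""
    n = len(basis[0])
    return [sum(coeffs[k] * basis[k][i] for k in range(len(basis))) for i in range(n)]

def linear_fill(known, n):
    """Initial estimate: linear interpolation between the known samples."""
    idx = sorted(known)
    out = [0.0] * n
    for i in range(n):
        left = max((j for j in idx if j <= i), default=idx[0])
        right = min((j for j in idx if j >= i), default=idx[-1])
        if left == right:
            out[i] = known[left]
        else:
            w = (i - left) / (right - left)
            out[i] = (1 - w) * known[left] + w * known[right]
    return out

def interpolate(known, n, n_iter=30):
    """Iterative hard thresholding with data consistency (illustrative only)."""
    basis = dct_matrix(n)
    estimate = linear_fill(known, n)
    t_max = max(abs(c) for c in analyze(basis, estimate))
    for it in range(n_iter):
        threshold = t_max * (1 - it / n_iter)  # assumed linear decay schedule
        coeffs = analyze(basis, estimate)
        coeffs = [c if abs(c) >= threshold else 0.0 for c in coeffs]
        estimate = synthesize(basis, coeffs)
        for i, v in known.items():  # keep the observed samples untouched
            estimate[i] = v
    return estimate

# Usage: recover a 16-sample signal from its even-indexed samples.
n = 16
truth = [math.cos(math.pi * (i + 0.5) * 2 / n) for i in range(n)]
known = {i: truth[i] for i in range(0, n, 2)}
result = interpolate(known, n)
```

The sketch guarantees only that the known samples are preserved; how well the missing samples are recovered depends on how sparse the signal is in the chosen dictionary, which is the motivation for a directional multiscale dictionary in the image case.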
Analysis, visualization, and transformation of audio signals using dictionary-based methods
This article provides an overview of dictionary-based methods (DBMs), and reviews recent work in the application of such methods to audio and music signals. As Fourier analysis is to additive synthesis, DBMs can be seen as the analytical counterpart to a generalized granular synthesis, where a sound is built by combining heterogeneous atoms selected from a user-defined dictionary. As such, DBMs provide novel ways of analyzing and visualizing audio signals, creating multiresolution descriptions of their contents, and designing sound transformations unique to a description of audio in terms of atoms.
MP3D: Highly Scalable Video Coding Scheme Based on Matching Pursuit
This paper describes a novel video coding scheme based on a three-dimensional Matching Pursuit algorithm. In addition to good compression performance at low bit rates, the proposed coder allows for flexible spatial, temporal and rate scalability thanks to its progressive coding structure. The Matching Pursuit algorithm generates a sparse decomposition of a video sequence into a series of spatio-temporal atoms, taken from an overcomplete dictionary of three-dimensional basis functions. The dictionary is generated by shifting, scaling and rotating two different mother atoms in order to cover the whole frequency cube. An embedded stream is then produced from the series of atoms. The atoms are first distributed into sets through the set-partitioned position map algorithm (SPPM) to form the index-map, inspired by bit-plane encoding. Scalar quantization is then applied to the coefficients, which are finally arithmetic coded. A complete MP3D codec has been implemented, and its performance is shown to compare favorably with other scalable coders like MPEG-4 FGS and SPIHT-3D. In addition, the MP3D streams offer incomparable flexibility for multiresolution streaming or adaptive decoding.
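The greedy decomposition at the core of Matching Pursuit can be sketched in a few lines. The example below is a generic 1-D version, not the MP3D coder: the function name `matching_pursuit`, the toy dictionary, and the test signal are assumptions chosen for illustration, and the atoms are assumed to have unit norm.

```python
import math

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse decomposition of `signal` over unit-norm atoms."""
    residual = list(signal)
    decomposition = []  # (atom index, coefficient) pairs, in selection order
    for _ in range(n_atoms):
        # Correlate the current residual with every atom.
        correlations = [sum(r * a for r, a in zip(residual, atom))
                        for atom in dictionary]
        # Pick the atom with the largest absolute correlation.
        best = max(range(len(dictionary)), key=lambda k: abs(correlations[k]))
        coeff = correlations[best]
        if coeff == 0:
            break  # nothing left that any atom can explain
        decomposition.append((best, coeff))
        # Subtract the selected atom's contribution from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return decomposition, residual

# Toy overcomplete dictionary in R^4: the standard basis plus two pair atoms.
s = 1 / math.sqrt(2)
dictionary = [
    [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1],
    [s, s, 0, 0], [0, 0, s, s],
]
signal = [3 * s, 3 * s, 0.0, 1.0]  # 3 * atom 4 + 1 * atom 3
decomposition, residual = matching_pursuit(signal, dictionary, 2)
```

Because the atoms are emitted in decreasing order of importance, truncating the list at any point still yields a usable approximation, which is the property a progressive, scalable stream relies on.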
State of the art in 2D content representation and compression
Deliverable D1.3 of the ANR PERSEE project. This report was produced as part of the ANR PERSEE project (no. ANR-09-BLAN-0170). Specifically, it corresponds to deliverable D3.1 of the project.
Simultaneous Codeword Optimization (SimCO) for Dictionary Update and Learning
We consider the data-driven dictionary learning problem. The goal is to seek
an over-complete dictionary from which every training signal can be best
approximated by a linear combination of only a few codewords. This task is
often achieved by iteratively executing two operations: sparse coding and
dictionary update. In the literature, there are two benchmark mechanisms to
update a dictionary. The first approach, such as the MOD algorithm, is
characterized by searching for the optimal codewords while fixing the sparse
coefficients. In the second approach, represented by the K-SVD method, one
codeword and the related sparse coefficients are simultaneously updated while
all other codewords and coefficients remain unchanged. We propose a novel
framework that generalizes the aforementioned two methods. The unique feature
of our approach is that one can update an arbitrary set of codewords and the
corresponding sparse coefficients simultaneously: when sparse coefficients are
fixed, the underlying optimization problem is similar to that in the MOD
algorithm; when only one codeword is selected for update, it can be proved that
the proposed algorithm is equivalent to the K-SVD method; and more importantly,
our method allows us to update all codewords and all sparse coefficients
simultaneously, hence the term simultaneous codeword optimization (SimCO).
Under the proposed framework, we design two algorithms, namely, primitive and
regularized SimCO. We implement these two algorithms based on a simple gradient
descent mechanism. Simulations are provided to demonstrate the performance of
the proposed algorithms, as compared with two baseline algorithms MOD and
K-SVD. Results show that regularized SimCO is particularly appealing in terms
of both learning performance and running speed. Comment: 13 pages
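For reference, the two baseline dictionary updates mentioned in this abstract can be written compactly. The notation is an assumption, not taken from the paper: $Y$ holds the training signals as columns, $D$ is the dictionary with unit-norm codewords $d_j$ as columns, $X$ is the sparse coefficient matrix with rows $x^j$, and $s$ is the sparsity level.

```latex
% Dictionary learning objective (assumed notation)
\min_{D,\,X} \; \lVert Y - D X \rVert_F^2
\quad \text{s.t.} \quad \lVert x_i \rVert_0 \le s \;\; \forall i,
\qquad \lVert d_j \rVert_2 = 1 \;\; \forall j

% MOD: all codewords updated with X fixed (least-squares solution,
% followed by renormalizing the columns of D)
D \leftarrow Y X^{\top} \left( X X^{\top} \right)^{-1}

% K-SVD: one codeword d_j and its coefficients x^j updated together,
% as the best rank-one approximation of the residual without atom j,
% restricted to the signals that currently use atom j
E_j = Y - \sum_{k \ne j} d_k \, x^k,
\qquad
(d_j,\, x^j) \leftarrow \arg\min_{d,\,x} \lVert E_j - d\, x \rVert_F^2
```

In these terms, the framework described above lets an arbitrary subset of the columns of $D$, together with the corresponding rows of $X$, be updated at once, with the two updates shown here as the special cases the abstract names.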