Simple and fast subband de-blocking technique by discarding the high band signals
In this paper, we propose a simple and fast post-processing de-blocking technique to reduce blocking artifacts. The block-based coded image is first decomposed into several subbands. Only the low-frequency subband signals are retained; the high-frequency subband signals are discarded. The remaining subband signals are then used to reconstruct a less blocky image. The idea is demonstrated with a cosine filter bank and a modulated sine filter bank. Simulation results show that the proposed algorithm is effective in reducing blocking artifacts.
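The retain-low, discard-high idea can be illustrated with a one-dimensional toy. This is only a sketch under loud assumptions: it uses a plain DCT-II as the subband decomposition rather than the cosine or modulated sine filter banks of the paper, and the names `dct2`, `idct2`, and `discard_high_bands` are hypothetical helpers introduced here.

```python
import math

def dct2(x):
    """Unnormalized DCT-II; coefficient k plays the role of the k-th band."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct2(c):
    """Inverse of dct2 above (a scaled DCT-III)."""
    n = len(c)
    return [c[0] / n
            + (2.0 / n) * sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                              for k in range(1, n))
            for i in range(n)]

def discard_high_bands(x, keep):
    """Keep the `keep` lowest-frequency coefficients, zero the rest, reconstruct."""
    coeffs = dct2(x)
    kept = [v if k < keep else 0.0 for k, v in enumerate(coeffs)]
    return idct2(kept)

# Two adjacent 8-sample "blocks" with different levels: a blocking edge at index 8.
blocky = [0.0] * 8 + [1.0] * 8
smoothed = discard_high_bands(blocky, keep=4)

jump_before = blocky[8] - blocky[7]      # size of the step at the block boundary
jump_after = smoothed[8] - smoothed[7]   # same step after discarding high bands
```

In this toy the step across the block boundary shrinks, at the cost of some ripple elsewhere in the signal; how many bands to keep is a trade-off the sketch does not try to optimize.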
Type-IV DCT, DST, and MDCT algorithms with reduced numbers of arithmetic operations
We present algorithms for the type-IV discrete cosine transform (DCT-IV) and
discrete sine transform (DST-IV), as well as for the modified discrete cosine
transform (MDCT) and its inverse, that achieve a lower count of real
multiplications and additions than previously published algorithms, without
sacrificing numerical accuracy. Asymptotically, the operation count is reduced
from ~2N log N to ~(17/9) N log N for a power-of-two transform size N, and the exact
count is strictly lowered for all N > 4. These results are derived by
considering the DCT to be a special case of a DFT of length 8N, with certain
symmetries, and then pruning redundant operations from a recent improved fast
Fourier transform algorithm (based on a recursive rescaling of the
conjugate-pair split radix algorithm). The improved algorithms for DST-IV and
MDCT follow immediately from the improved count for the DCT-IV.
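One way to see why an MDCT algorithm follows from a DCT-IV algorithm is the standard folding identity: an MDCT of 2N inputs equals a DCT-IV of N folded inputs. The sketch below checks that identity numerically. It assumes the common definitions of the two transforms and uses a naive O(N^2) `dct4`, not the reduced-operation algorithm of the paper; all function names are hypothetical.

```python
import math

def dct4(u):
    """Naive DCT-IV of length N (no normalization)."""
    n = len(u)
    return [sum(u[i] * math.cos(math.pi / n * (i + 0.5) * (k + 0.5))
                for i in range(n))
            for k in range(n)]

def mdct_direct(x):
    """MDCT from its definition: 2N inputs -> N outputs."""
    n = len(x) // 2
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                for i in range(2 * n))
            for k in range(n)]

def mdct_via_dct4(x):
    """MDCT computed by folding 2N inputs to N and applying a DCT-IV.

    With x split into quarters (a, b, c, d) and r denoting reversal,
    the folded block is (-c_r - d, a - b_r). Requires N to be even.
    """
    n = len(x) // 2
    h = n // 2
    a, b, c, d = x[:h], x[h:n], x[n:n + h], x[n + h:]
    folded = ([-cv - dv for cv, dv in zip(reversed(c), d)]
              + [av - bv for av, bv in zip(a, reversed(b))])
    return dct4(folded)

samples = [math.sin(0.3 * i) + 0.1 * i for i in range(16)]  # 2N = 16, N = 8
direct = mdct_direct(samples)
folded_result = mdct_via_dct4(samples)
```

Because the folding step costs only N additions, any saving in the DCT-IV carries over directly to the MDCT.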
Superposition frames for adaptive time-frequency analysis and fast reconstruction
In this article we introduce a broad family of adaptive, linear
time-frequency representations termed superposition frames, and show that they
admit desirable fast overlap-add reconstruction properties akin to standard
short-time Fourier techniques. This approach stands in contrast to many
adaptive time-frequency representations in the extant literature, which, while
more flexible than standard fixed-resolution approaches, typically fail to
provide efficient reconstruction and often lack the regular structure necessary
for precise frame-theoretic analysis. Our main technical contributions come
through the development of properties which ensure that this construction
provides for a numerically stable, invertible signal representation. Our
primary algorithmic contributions come via the introduction and discussion of
specific signal adaptation criteria in deterministic and stochastic settings,
based respectively on time-frequency concentration and nonstationarity
detection. We conclude with a short speech enhancement example that serves to
highlight potential applications of our approach.
Design of near-perfect-reconstructed transmultiplexer using different modulation techniques: A comparative study
In this paper, an efficient iterative method is proposed for designing a near-perfect-reconstruction transmultiplexer (NPR TMUX) with a prescribed roll-off factor (RF) and stopband attenuation (As). A windowing technique is used to design the prototype filter, and different modulation techniques are used to build the multi-channel transmultiplexer (TMUX). Inter-channel interference (ICI) is minimized iteratively so that it approaches the ideal value of zero. A design example illustrates the superiority of the proposed method over earlier reported work. A comparative study of the performance of the different modulation techniques for TMUX design is also presented.
A Note on the MDCT/MDST and Pseudoinverse Matrix
The modified discrete cosine transform (MDCT) and modified discrete sine transform (MDST), for both evenly and oddly stacked systems, are perfect reconstruction cosine/sine-modulated filter banks based on time domain aliasing cancellation (TDAC). They are employed in current international audio coding standards and commercial audio compression products. Based on the matrix representation of MDCTs and MDSTs, it is shown that the transposed MDCT and MDST matrices are actually the pseudoinverses of their corresponding forward transform matrices. The pseudoinverse matrix and its properties provide an elegant mathematical tool for characterizing the MDCT/MDST as analysis/synthesis filter banks in matrix representation.
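The pseudoinverse claim can be checked numerically for a small case. The sketch below assumes the commonly used MDCT definition without a window and a 1/sqrt(N) scaling chosen here so that the rows of the N x 2N matrix are orthonormal; the paper's own normalization may differ. The helpers `mdct_matrix`, `matmul`, `transpose`, and `max_abs_diff` are hypothetical names.

```python
import math

def mdct_matrix(n):
    """N x 2N MDCT matrix, scaled by 1/sqrt(N) (assumed normalization)."""
    scale = 1.0 / math.sqrt(n)
    return [[scale * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
             for i in range(2 * n)]
            for k in range(n)]

def matmul(a, b):
    """Plain triple-loop matrix product on lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

N = 4
C = mdct_matrix(N)          # forward transform, N x 2N
Ct = transpose(C)           # candidate pseudoinverse, 2N x N
I_N = [[1.0 if i == j else 0.0 for j in range(N)] for i in range(N)]
I_2N = [[1.0 if i == j else 0.0 for j in range(2 * N)] for i in range(2 * N)]

CCt = matmul(C, Ct)         # N x N, should be the identity
P = matmul(Ct, C)           # 2N x 2N projection, not the identity
```

`C` times `Ct` is the identity, while `Ct` times `C` is only a symmetric projection: a single block is not recovered on its own, which is the time-domain aliasing that overlap-add of neighboring blocks cancels.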
Multiplierless perfect reconstruction modulated filter banks with sum-of-powers-of-two coefficients
This paper proposes an efficient class of perfect reconstruction (PR) modulated filter banks (MFB) using sum-of-powers-of-two (SOPOT) coefficients. This is based on a modified factorization of the DCT-IV matrix and the lossless lattice structure of the prototype filter, which allows the coefficients to be represented in SOPOT form without affecting the PR condition. A genetic algorithm (GA) is then used to search for these SOPOT coefficients. Design examples show that SOPOT MFB with a good frequency characteristic can be designed with very low implementation complexity. The usefulness of the approach is demonstrated with a 16-channel design example.
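To make the SOPOT idea concrete, the sketch below approximates a real coefficient by a few signed powers of two and then multiplies an integer by it using only shifts and additions. It is an illustration under assumptions: the greedy search here stands in for the genetic algorithm used in the paper, and `sopot_approx`, `sopot_value`, and `sopot_multiply` are hypothetical names.

```python
import math

def sopot_approx(value, terms, min_exp=-12):
    """Greedy SOPOT approximation: list of (sign, exponent) pairs.

    Each step picks a signed power of two close to the remaining residual.
    This is a simple heuristic, not the paper's GA search.
    """
    result = []
    residual = value
    for _ in range(terms):
        if abs(residual) < 2.0 ** (min_exp - 1):
            break  # residual is below the finest allowed power of two
        exp = max(min_exp, round(math.log2(abs(residual))))
        sign = 1 if residual > 0 else -1
        result.append((sign, exp))
        residual -= sign * 2.0 ** exp
    return result

def sopot_value(sopot):
    """Real value represented by a list of (sign, exponent) pairs."""
    return sum(sign * 2.0 ** exp for sign, exp in sopot)

def sopot_multiply(x, sopot, frac_bits=12):
    """Multiply integer x by a SOPOT coefficient with shifts and adds only.

    The result is in fixed point, scaled by 2**frac_bits.
    Assumes every exponent is >= -frac_bits.
    """
    acc = 0
    for sign, exp in sopot:
        shifted = x << (exp + frac_bits)
        acc = acc + shifted if sign > 0 else acc - shifted
    return acc

coeff = math.cos(math.pi / 8)            # example coefficient, chosen arbitrarily
sopot = sopot_approx(coeff, terms=4)
product = sopot_multiply(1000, sopot) / 2 ** 12
```

With four terms the example coefficient is matched to within about one part in a thousand, and the multiplication needs no general-purpose multiplier.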
Low Delay Filter Banks with Perfect Reconstruction
The design of modulated filter banks with a low system delay and with perfect reconstruction will be shown. The filter length K can be chosen arbitrarily. The well-known orthogonal filter banks have a system delay of K - 1 samples. The proposed filter banks can reduce this delay to N - 1 samples, where N is the number of bands. The design method uses a decomposition or factorization of the polyphase matrix into cascades of simple matrices. Several factorizations with different properties will be shown. A factorization will be introduced which is more general and needs fewer multiplications than previous approaches (K/2 + N). The resulting filter banks can have analysis and synthesis frequency responses that differ from each other, leading to biorthogonal filter banks. An optimization algorithm for the frequency response of the resulting filter banks will be given. Examples show the feasibility of designing even large filter banks with many bands, low system delay, and high stopband attenuation.
Audio Coding Based on Integer Transforms
In recent years audio coding has become a very popular field for
research and applications. Especially perceptual audio coding schemes,
such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are
widely used for efficient storage and transmission of music
signals. Nevertheless, for professional applications, such as archiving
and transmission in studio environments, lossless audio coding schemes
are considered more appropriate.
Traditionally, the technical approaches used in perceptual and lossless
audio coding have been separate worlds. In perceptual audio coding, the
use of filter banks, such as the lapped orthogonal transform "Modified
Discrete Cosine Transform" (MDCT), has been the approach of choice in
many state-of-the-art coding schemes. On the other hand,
lossless audio coding schemes mostly employ predictive coding of
waveforms to remove redundancy. Only a few attempts have been made so far
to use transform coding for the purpose of lossless audio coding.
This work presents a new approach of applying the lifting scheme to
lapped transforms used in perceptual audio coding. This allows for an
invertible integer-to-integer approximation of the original transform,
e.g. the IntMDCT as an integer approximation of the MDCT. The same
technique can also be applied to low-delay filter banks. A generalized,
multi-dimensional lifting approach and a noise-shaping technique are
introduced, which allow the accuracy of the approximation to the
original transform to be optimized further.
Based on these new integer transforms, this work presents new audio
coding schemes and applications. The audio coding applications cover
lossless audio coding, scalable lossless enhancement of a perceptual
audio coder and fine-grain scalable perceptual and lossless audio
coding. Finally, an approach to data hiding with high data rates in
uncompressed audio signals based on integer transforms is described.
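The basic building block behind such integer transforms can be sketched as a plane rotation split into three lifting steps, each followed by rounding. This is a minimal illustration of the lifting idea, not the thesis's IntMDCT itself: the function names are hypothetical, Python's built-in `round` is an assumed choice of rounding rule, and the rotation angle must have a nonzero sine.

```python
import math

def _lifting_coeffs(angle):
    """Lifting coefficients for a rotation by `angle` (sin(angle) must be nonzero)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c - 1.0) / s, s

def int_rotate(x, y, angle):
    """Integer-to-integer approximation of a rotation via three lifting steps.

    The exact rotation [[c, -s], [s, c]] factors into shear, shear, shear;
    rounding inside each step keeps the values integer.
    """
    p, s = _lifting_coeffs(angle)
    x += round(p * y)
    y += round(s * x)
    x += round(p * y)
    return x, y

def int_rotate_inverse(x, y, angle):
    """Exact inverse: undo the three lifting steps in reverse order."""
    p, s = _lifting_coeffs(angle)
    x -= round(p * y)
    y -= round(s * x)
    x -= round(p * y)
    return x, y

angle = math.pi / 5                      # arbitrary example angle
forward = int_rotate(12, -7, angle)
restored = int_rotate_inverse(forward[0], forward[1], angle)
```

Each step changes only one variable by a rounded function of the other, so subtracting the same rounded value undoes it exactly. That is why the integer approximation is losslessly invertible even though it only approximates the real-valued rotation.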
Unified Theory for Biorthogonal Modulated Filter Banks
Modulated filter banks (MFBs) are practical signal decomposition tools for M-channel multirate systems. They combine high subfilter selectivity with efficient realization based on polyphase filters and block transforms. Consequently, the O(M^2) computational burden of a general filter bank (FB) is reduced to O(M log2 M), a complexity order comparable to that of FFT-like transforms.
Often hidden from plain sight, these versatile digital signal processing tools play an important role in various professional and everyday applications of information and communications technology, including audiovisual communications and media storage (e.g., audio codecs for low-energy music playback in portable devices, as well as communication waveform processing and channelization). The algorithmic efficiency implies low cost, small size, and extended battery life, bringing the devices close to our skin.
The main objective of this thesis is to formulate a generalized and unified approach to the MFBs, which includes, in addition to the deep theoretical background behind these banks, both their design using appropriate optimization techniques and their efficient algorithmic realizations. The FBs discussed in this thesis are discrete-time time-frequency decomposition/reconstruction systems, or equivalently analysis-synthesis systems, in which the subfilters are generated through modulation from either one or two prototype filters. The perfect reconstruction (PR) property is a particularly important characteristic of the MFBs and is the core theme of this thesis.
In the presented biorthogonal arbitrary-delay exponentially modulated filter bank (EMFB), the PR property can also be maintained for complex-valued signals.
The EMFB concept is quite flexible, since it can meet the various requirements placed on a subband processing system: low-delay PR prototype design, subfilters with symmetric impulse responses, and efficient algorithms. The definition also covers odd- and even-stacked cosine-modulated FBs as special cases. Oversampling schemes for the subsignals prove to be advantageous in subband processing problems that require phase information about the localized frequency components. In addition, the MFBs have strong connections with lapped transform (LT) theory, especially with the class of LTs grounded in parametric window functions.