112 research outputs found
Tilings of the time-frequency plane: construction of arbitrary orthogonal bases and fast tiling algorithms
The authors consider expansions which give arbitrary orthonormal tilings of the time-frequency plane. These differ from the short-time Fourier transform, wavelet transform, and wavelet packets tilings in that they change over time. They show how this can be achieved using time-varying orthogonal tree structures, which preserve orthogonality, even across transitions. The method is based on the construction of boundary and transition filters; these allow us to construct essentially arbitrary tilings. Time-varying modulated lapped transforms are a special case, where both boundary and overlapping solutions are possible with filters obtained by modulation. They present a double-tree algorithm which for a given signal decides on the best binary segmentation in both time and frequency. That is, it is a joint optimization of time and frequency splitting. The algorithm is optimal for additive cost functions (e.g., rate-distortion), and results in time-varying best bases, the main application of which is for compression of nonstationary signals. Experiments on test signals are presente
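A minimal sketch of best-basis selection under an additive cost, for illustration only. It prunes a single binary tree (frequency splitting with Haar filters), which is simpler than the double-tree algorithm the abstract describes; the Haar filters, the l1 cost, and the names `haar_split`, `cost`, and `best_basis` are assumptions, not taken from the paper. The input length is assumed to be a power of two.

```python
import math

S = 1 / math.sqrt(2)  # orthonormal Haar scaling


def haar_split(x):
    """Split x into low-pass and high-pass halves (each decimated by 2)."""
    low = [(x[2 * i] + x[2 * i + 1]) * S for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) * S for i in range(len(x) // 2)]
    return low, high


def cost(coeffs):
    """Additive cost of a coefficient block (assumed here: l1 norm)."""
    return sum(abs(v) for v in coeffs)


def best_basis(x, depth):
    """Return (total_cost, leaf_blocks) of the cheapest pruned tree.

    Because the cost is additive, each node can be decided locally:
    keep the node if it is no more expensive than its best children.
    """
    own = cost(x)
    if depth == 0 or len(x) < 2:
        return own, [x]
    low, high = haar_split(x)
    cost_low, leaves_low = best_basis(low, depth - 1)
    cost_high, leaves_high = best_basis(high, depth - 1)
    if cost_low + cost_high < own:
        return cost_low + cost_high, leaves_low + leaves_high
    return own, [x]


if __name__ == "__main__":
    total, leaves = best_basis([1.0] * 8, depth=3)
    print(total, [len(block) for block in leaves])
```

The bottom-up comparison is valid only because the cost adds across blocks; a non-additive cost would require comparing whole trees rather than one node at a time.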
Audio Source Separation Using Sparse Representations
This is the author's final version of the article, first published as A. Nesbit, M. G. Jafari, E. Vincent and M. D. Plumbley. Audio Source Separation Using Sparse Representations. In W. Wang (Ed), Machine Audition: Principles, Algorithms and Systems. Chapter 10, pp. 246-264. IGI Global, 2011. ISBN 978-1-61520-919-4. DOI: 10.4018/978-1-61520-919-4.ch010
The authors address the problem of audio source separation, namely, the recovery of audio signals from recordings of mixtures of those signals. The sparse component analysis framework is a powerful method for achieving this. Sparse orthogonal transforms, in which only a few transform coefficients differ significantly from zero, are developed; once the signal has been transformed, energy is apportioned from each transform coefficient to each estimated source, and, finally, the signal is reconstructed using the inverse transform. The overriding aim of this chapter is to demonstrate how this framework, as exemplified here by two different decomposition methods which adapt to the signal to represent it sparsely, can be used to solve different problems in different mixing scenarios. To address the instantaneous (neither delays nor echoes) and underdetermined (more sources than mixtures) mixing model, a lapped orthogonal transform is adapted to the signal by selecting a basis from a library of predetermined bases. This method is closely related to the windowing methods used in the MPEG audio coding framework. In considering the anechoic (delays but no echoes) and determined (equal number of sources and mixtures) mixing case, a greedy adaptive transform is used based on orthogonal basis functions that are learned from the observed data, instead of being selected from a predetermined library of bases. This is found to encode the signal characteristics by introducing a feedback system between the bases and the observed data. Experiments on mixtures of speech and music signals demonstrate that these methods give good signal approximations and separation performance, and indicate promising directions for future research
Representation of perfectly reconstructed octave decomposition filter banks with set of decimators {2,4,4} via tree structure
In this letter, we prove that a filter bank with set of decimators {2,4,4} achieves perfect reconstruction if and only if it can be represented via a tree structure and each branch of the tree structure achieves perfect reconstruction
Realization of perfect reconstruction non-uniform filter banks via a tree structure
Obviously, a tree structure filter bank can be realized via a non-uniform filter bank, and perfect reconstruction is achieved if and only if each branch of the tree structure can provide perfect reconstruction. In this paper, the converse of this problem is studied. We show that a perfect reconstruction non-uniform filter bank with decimation ratio {2,4,4} can be realized via a tree structure and each branch of the tree structure achieves perfect reconstruction
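As an illustration of the easy direction (a tree whose branches each achieve perfect reconstruction yields a non-uniform bank with decimators {2,4,4}), here is a minimal sketch using two-channel Haar stages. The Haar filters and the function names are assumptions chosen for simplicity; the papers treat general filters and prove the converse, which this sketch does not address. The input length is assumed to be a multiple of 4.

```python
import math

S = 1 / math.sqrt(2)  # orthonormal Haar scaling


def analyze(x):
    """Two-channel Haar analysis: filter, then decimate by 2."""
    low = [(x[2 * i] + x[2 * i + 1]) * S for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) * S for i in range(len(x) // 2)]
    return low, high


def synthesize(low, high):
    """Two-channel Haar synthesis: exact inverse of analyze."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) * S)
        x.append((l - h) * S)
    return x


def tree_analyze(x):
    """Octave tree: the high band is decimated by 2; the low band is
    split again, giving two bands decimated by 4."""
    low, high = analyze(x)
    low_low, low_high = analyze(low)
    return high, low_low, low_high


def tree_synthesize(high, low_low, low_high):
    """Invert the tree branch by branch, innermost stage first."""
    low = synthesize(low_low, low_high)
    return synthesize(low, high)


if __name__ == "__main__":
    x = [3.0, 1.0, -2.0, 5.0, 0.5, 4.0, -1.0, 2.0]
    bands = tree_analyze(x)
    print([len(b) for b in bands])
    print(tree_synthesize(*bands))
```

The three subbands together hold as many coefficients as the input (1/2 + 1/4 + 1/4 = 1), so the bank is critically sampled.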
Streaming Reconstruction from Non-uniform Samples
We present an online algorithm for reconstructing a signal from a set of
non-uniform samples. By representing the signal using compactly supported basis
functions, we show how estimating the expansion coefficients using
least-squares can be implemented in a streaming manner: as batches of samples
over subsequent time intervals are presented, the algorithm forms an initial
estimate of the signal over the sampling interval then updates its estimates
over previous intervals. We give conditions under which this reconstruction
procedure is stable and show that the least-squares estimates in each interval
converge exponentially, meaning that the updates can be performed with finite
memory with almost no loss in accuracy. We also discuss how our framework
extends to more general types of measurements including time-varying
convolution with a compactly supported kernel
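The least-squares step can be sketched for one simple case. This is a batch solve, not the streaming algorithm of the paper, and the basis (piecewise-linear "hat" functions on integer knots) and all names are assumptions. Because each hat function overlaps only its neighbours, the normal equations are tridiagonal and can be solved with one forward sweep and one backward sweep. Sample times are assumed to lie in [0, K-1], with enough samples in every knot interval for the system to be invertible.

```python
def evaluate(coeffs, t):
    """Evaluate sum_k coeffs[k] * hat(t - k) for t in [0, len(coeffs) - 1]."""
    k = min(int(t), len(coeffs) - 2)
    w = t - k
    return (1 - w) * coeffs[k] + w * coeffs[k + 1]


def fit_hat_coefficients(times, samples, num_knots):
    """Least-squares fit of hat-function coefficients to non-uniform samples."""
    diag = [0.0] * num_knots        # main diagonal of A^T A
    off = [0.0] * (num_knots - 1)   # off-diagonal of A^T A
    rhs = [0.0] * num_knots         # A^T y
    for t, y in zip(times, samples):
        k = min(int(t), num_knots - 2)
        w = t - k                   # each sample touches only knots k and k + 1
        diag[k] += (1 - w) ** 2
        diag[k + 1] += w ** 2
        off[k] += w * (1 - w)
        rhs[k] += (1 - w) * y
        rhs[k + 1] += w * y
    # Forward elimination (Thomas algorithm for a symmetric tridiagonal system).
    for k in range(1, num_knots):
        m = off[k - 1] / diag[k - 1]
        diag[k] -= m * off[k - 1]
        rhs[k] -= m * rhs[k - 1]
    # Back substitution.
    coeffs = [0.0] * num_knots
    coeffs[-1] = rhs[-1] / diag[-1]
    for k in range(num_knots - 2, -1, -1):
        coeffs[k] = (rhs[k] - off[k] * coeffs[k + 1]) / diag[k]
    return coeffs


if __name__ == "__main__":
    true_c = [0.0, 1.0, -1.0, 2.0, 0.5]
    times = [0.1, 0.4, 0.9, 1.2, 1.7, 2.05, 2.5, 2.95, 3.3, 3.6, 3.99]
    samples = [evaluate(true_c, t) for t in times]
    print(fit_hat_coefficients(times, samples, len(true_c)))
```

The forward sweep consumes intervals in time order and the backward sweep revises earlier coefficients, which loosely mirrors the "estimate, then update previous intervals" pattern in the abstract; the finite-memory truncation the authors analyse is not implemented here.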
Superposition frames for adaptive time-frequency analysis and fast reconstruction
In this article we introduce a broad family of adaptive, linear
time-frequency representations termed superposition frames, and show that they
admit desirable fast overlap-add reconstruction properties akin to standard
short-time Fourier techniques. This approach stands in contrast to many
adaptive time-frequency representations in the extant literature, which, while
more flexible than standard fixed-resolution approaches, typically fail to
provide efficient reconstruction and often lack the regular structure necessary
for precise frame-theoretic analysis. Our main technical contributions come
through the development of properties which ensure that this construction
provides for a numerically stable, invertible signal representation. Our
primary algorithmic contributions come via the introduction and discussion of
specific signal adaptation criteria in deterministic and stochastic settings,
based respectively on time-frequency concentration and nonstationarity
detection. We conclude with a short speech enhancement example that serves to
highlight potential applications of our approach.
Factorability of lossless time-varying filters and filter banks
We study the factorability of linear time-varying (LTV) lossless filters and filter banks. We give a complete characterization of all degree-one lossless LTV systems and show that all degree-one lossless systems can be decomposed into a time-dependent unitary matrix followed by a lossless dyadic-based LTV system. The lossless dyadic-based system has several properties that make it useful in the factorization of lossless LTV systems. The traditional lapped orthogonal transform (LOT) is also generalized to the LTV case. We identify two classes of TVLOTs, namely, the invertible inverse lossless (IIL) and noninvertible inverse lossless (NIL) TVLOTs. The minimum number of delays required to implement a TVLOT is shown to be a nondecreasing function of time, and it is a constant if and only if the TVLOT is IIL. We also show that all IIL TVLOTs can be factorized uniquely into the proposed degree-one lossless building block. The factorization is minimal in terms of the delay elements. For NIL TVLOTs, there are factorable and unfactorable examples. Both necessary and sufficient conditions for the factorability of lossless LTV systems are given. We also introduce the concept of strong eternal reachability (SER) and strong eternal observability (SEO) of LTV systems. The SER and SEO of an implementation of LTV systems imply the minimality of the structure. Using these concepts, we are able to show that the cascade structure for a factorable IIL LTV system is minimal. That implies that if an IIL LTV system is factorable in terms of the lossless dyadic-based building blocks, the factorization is minimal in terms of delays as well as the number of building blocks. We also prove the BIBO stability of the LTV normalized IIR lattice