Theoretical Error Performance Analysis for Deep Neural Network Based Regression Functional Approximation
Based on Kolmogorov's superposition theorem and the universal approximation theorems of Cybenko and Barron, any vector-to-scalar function can be approximated by a multi-layer perceptron (MLP) within certain bounds. These theorems motivate us to study deep neural network (DNN) based vector-to-vector regression. This dissertation aims to establish theoretical foundations for DNN based vector-to-vector functional approximation, and to bridge the gap between DNN based applications and their theoretical understanding in terms of representation and generalization powers.
Concerning the representation power, we extend the classical universal approximation theorems and put forth a new upper bound for vector-to-vector regression. More specifically, we first derive upper bounds for the artificial neural network (ANN), and then we generalize the concepts to DNN based architectures. Our theorems suggest that a wider top hidden layer and a deeper model structure give DNN based vector-to-vector regression greater expressive power, which is illustrated with speech enhancement experiments.
As for the generalization power of DNN based vector-to-vector regression, we employ a well-known error decomposition technique, which decomposes an expected loss into the sum of an approximation error, an estimation error, and an optimization error. Since the approximation error is associated with our attained upper bound on the expressive power, we concentrate our research on deriving upper bounds for the estimation error and the optimization error based on statistical learning theory and non-convex optimization. Moreover, we demonstrate that mean absolute error (MAE) satisfies the property of Lipschitz continuity and exhibits better performance than mean squared error (MSE). Speech enhancement experiments with DNN models corroborate these theorems.
Finally, since an over-parameterized setting for DNN is expected to ensure our theoretical upper bounds on the generalization power, we put forth a novel deep tensor learning framework, namely the tensor-train deep neural network (TT-DNN), to deal with an explosive DNN model size and realize effective deep regression with much smaller model complexity. Our speech enhancement experiments demonstrate that a TT-DNN can maintain or even improve performance with far fewer model parameters than even an over-parameterized DNN.
Ph.D.
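To make the model-size claim concrete, the following minimal sketch counts the weights of a dense layer against a tensor-train (TT) layer in the standard TT-matrix format, where core k has shape (r_{k-1}, m_k, n_k, r_k) and the boundary ranks are 1. The layer size, mode factorization, and ranks are hypothetical values chosen for illustration; they are not taken from the dissertation.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Weights in an ordinary fully connected layer (bias omitted)."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Weights in a tensor-train layer whose k-th core has shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1])."""
    if not (len(in_modes) == len(out_modes) == len(ranks) - 1):
        raise ValueError("need exactly one more rank than modes")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical 1024 x 1024 layer, factorized as 4 * 8 * 8 * 4 on both sides.
in_modes = (4, 8, 8, 4)
out_modes = (4, 8, 8, 4)
ranks = (1, 4, 4, 4, 1)  # assumed TT ranks, for illustration only

n_dense = dense_params(in_modes, out_modes)
n_tt = tt_params(in_modes, out_modes, ranks)
print(n_dense, n_tt)
```

The TT count grows with the sum of the core sizes rather than the product of the layer dimensions, which is where the reduction in parameters comes from; whether accuracy is preserved at a given rank is an empirical question that this sketch does not address.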
Learning Dictionaries with Bounded Self-Coherence
Sparse coding in learned dictionaries has been established as a successful
approach for signal denoising, source separation and solving inverse problems
in general. A dictionary learning method adapts an initial dictionary to a
particular signal class by iteratively computing an approximate factorization
of a training data matrix into a dictionary and a sparse coding matrix. The
learned dictionary is characterized by two properties: the coherence of the
dictionary to observations of the signal class, and the self-coherence of the
dictionary atoms. A high coherence to the signal class enables the sparse
coding of signal observations with a small approximation error, while a low
self-coherence of the atoms guarantees atom recovery and a more rapid residual
error decay rate for the sparse coding algorithm. The two goals of high signal
coherence and low self-coherence are typically in conflict, so one seeks
a trade-off between them, depending on the application. We present a dictionary
learning method with an effective control over the self-coherence of the
trained dictionary, enabling a trade-off between maximizing the sparsity of
codings and approximating an equiangular tight frame.
Comment: 4 pages, 2 figures; IEEE Signal Processing Letters, vol. 19, no. 12,
201
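As an illustration of the quantity being controlled, the sketch below measures self-coherence as the largest absolute inner product between distinct unit-norm atoms and compares it with the Welch bound, which an equiangular tight frame attains. This definition is the common mutual-coherence one and is an assumption here; the paper may use a different measure. The two example dictionaries are hypothetical.

```python
import math


def normalize(atom):
    """Scale an atom to unit Euclidean norm."""
    norm = math.sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]


def self_coherence(atoms):
    """Largest |<d_i, d_j>| over distinct unit-normalized atoms."""
    unit = [normalize(a) for a in atoms]
    return max(
        abs(sum(x * y for x, y in zip(unit[i], unit[j])))
        for i in range(len(unit))
        for j in range(i + 1, len(unit))
    )


def welch_bound(dim, n_atoms):
    """Lower bound on the coherence of n_atoms >= dim unit vectors in R^dim."""
    return math.sqrt((n_atoms - dim) / (dim * (n_atoms - 1)))


# Three atoms in the plane, 120 degrees apart: an equiangular tight frame.
etf = [
    [math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3)]
    for k in range(3)
]
mu_etf = self_coherence(etf)
bound = welch_bound(2, 3)

# A dictionary with two nearly parallel atoms has coherence close to 1.
skewed = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
mu_skewed = self_coherence(skewed)

print(mu_etf, bound, mu_skewed)
```

The first dictionary meets the bound, while the second sits far above it, which is the kind of gap a self-coherence constraint is meant to close.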
A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition
This article provides a unifying Bayesian network view on various approaches
for acoustic model adaptation, missing feature, and uncertainty decoding that
are well-known in the literature of robust automatic speech recognition. The
representatives of these classes can often be deduced from a Bayesian network
that extends the conventional hidden Markov models used in speech recognition.
These extensions, in turn, can in many cases be motivated from an underlying
observation model that relates clean and distorted feature vectors. By
converting the observation models into a Bayesian network representation, we
formulate the corresponding compensation rules leading to a unified view on
known derivations as well as to new formulations for certain approaches. The
generic Bayesian perspective provided in this contribution thus highlights
structural differences and similarities between the analyzed approaches.
Time and spectral domain relative entropy: A new approach to multivariate spectral estimation
The concept of spectral relative entropy rate is introduced for jointly
stationary Gaussian processes. Using classical information-theoretic results,
we establish a remarkable connection between time and spectral domain relative
entropy rates. This naturally leads to a new spectral estimation technique
where a multivariate version of the Itakura-Saito distance is employed. It may
be viewed as an extension of the approach, called THREE, introduced by Byrnes,
Georgiou and Lindquist in 2000 which, in turn, followed in the footsteps of the
Burg-Jaynes Maximum Entropy Method. Spectral estimation is here recast in the
form of a constrained spectrum approximation problem where the distance is
equal to the processes' relative entropy rate. The corresponding solution
entails a complexity upper bound which improves on the one so far available in
the multichannel framework. Indeed, it is equal to the one featured by THREE in
the scalar case. The solution is computed via a globally convergent matricial
Newton-type algorithm. Simulations suggest the effectiveness of the new
technique in tackling multivariate spectral estimation tasks, especially in the
case of short data records.
Comment: 32 pages, submitted for publicatio
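For orientation, the sketch below evaluates the classical scalar Itakura-Saito distance on a discretized frequency grid. It is not the multivariate version the paper employs, and the two first-order autoregressive spectra, the grid size, and the function names are assumptions made for illustration.

```python
import math


def itakura_saito(p, q):
    """Discretized scalar Itakura-Saito distance between two positive
    spectra sampled on the same frequency grid."""
    if len(p) != len(q):
        raise ValueError("spectra must share a frequency grid")
    total = 0.0
    for pk, qk in zip(p, q):
        if pk <= 0 or qk <= 0:
            raise ValueError("spectra must be strictly positive")
        ratio = pk / qk
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)


def ar1_spectrum(a, n_freqs=256):
    """Spectrum of x[t] = a * x[t-1] + e[t] with unit-variance noise,
    i.e. 1 / |1 - a * exp(-j w)|^2, sampled on [0, pi]."""
    freqs = (math.pi * k / (n_freqs - 1) for k in range(n_freqs))
    return [1.0 / (1.0 - 2.0 * a * math.cos(w) + a * a) for w in freqs]


p = ar1_spectrum(0.5)
q = ar1_spectrum(0.8)

d_pp = itakura_saito(p, p)  # zero when the spectra coincide
d_pq = itakura_saito(p, q)
d_qp = itakura_saito(q, p)  # differs from d_pq: the distance is asymmetric
print(d_pp, d_pq, d_qp)
```

The distance is nonnegative and vanishes only when the two spectra agree, but it is not symmetric, so the order of the arguments matters when it is used as an approximation criterion.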
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image / video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
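The idea of a graph whose weights reflect image structure can be sketched as follows. The example builds a 4-connected pixel graph with Gaussian weights on intensity differences and evaluates the Laplacian quadratic form, a standard smoothness measure in GSP. The weight function, the value of `sigma`, and the toy patch are assumptions for illustration, not choices taken from the article.

```python
import math


def grid_graph_edges(patch, sigma=10.0):
    """4-connected pixel graph; each edge weight is
    exp(-(intensity difference)^2 / sigma^2)."""
    rows, cols = len(patch), len(patch[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            # Link each pixel to its right and lower neighbours.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    diff = patch[r][c] - patch[rr][cc]
                    weight = math.exp(-(diff * diff) / (sigma * sigma))
                    edges.append((r * cols + c, rr * cols + cc, weight))
    return edges


def laplacian_quadratic_form(signal, edges):
    """x^T L x = sum over edges of w_ij * (x_i - x_j)^2, where L = D - W
    is the combinatorial graph Laplacian."""
    return sum(w * (signal[i] - signal[j]) ** 2 for i, j, w in edges)


# Toy 4 x 4 patch with a sharp vertical edge between columns 2 and 3.
patch = [[10, 10, 200, 200] for _ in range(4)]
signal = [value for row in patch for value in row]

edges = grid_graph_edges(patch)
uniform_edges = [(i, j, 1.0) for i, j, _ in edges]

smooth_adaptive = laplacian_quadratic_form(signal, edges)
smooth_uniform = laplacian_quadratic_form(signal, uniform_edges)
print(len(edges), smooth_adaptive, smooth_uniform)
```

On the structure-adaptive graph the piecewise-constant patch counts as smooth, because the edges crossing the intensity discontinuity receive negligible weight; on the uniform grid graph the same patch is penalized heavily. That difference is what lets graph-based regularizers preserve sharp image edges.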