7,216 research outputs found

Theoretical Error Performance Analysis for Deep Neural Network Based Regression Functional Approximation

Author: Qi Jun
Publication venue: Georgia Institute of Technology
Publication date: 14/01/2022

Based on Kolmogorov's superposition theorem and the universal approximation theorems of Cybenko and Barron, any vector-to-scalar function can be approximated by a multi-layer perceptron (MLP) within certain bounds. These theorems inspire us to exploit deep neural network (DNN) based vector-to-vector regression. This dissertation aims at establishing theoretical foundations for DNN based vector-to-vector functional approximation, and at bridging the gap between DNN based applications and their theoretical understanding in terms of representation and generalization powers. Concerning the representation power, we extend the classical universal approximation theorems and put forth a new upper bound for vector-to-vector regression. More specifically, we first derive upper bounds for the artificial neural network (ANN), and then generalize the concepts to DNN based architectures. Our theorems suggest that a wider top hidden layer and a deep model structure give DNN based vector-to-vector regression more expressive power, which is illustrated with speech enhancement experiments. As for the generalization power of DNN based vector-to-vector regression, we employ a well-known error decomposition technique, which decomposes an expected loss into the sum of an approximation error, an estimation error, and an optimization error. Since the approximation error is associated with our attained upper bound on the expressive power, we concentrate our research on deriving upper bounds for the estimation error and the optimization error based on statistical learning theory and non-convex optimization. Moreover, we demonstrate that mean absolute error (MAE) is Lipschitz continuous and exhibits better performance than mean squared error (MSE). Speech enhancement experiments with DNN models are used to corroborate these theorems.
Finally, since an over-parameterized setting for DNN is expected to ensure our theoretical upper bounds on the generalization power, we put forth a novel deep tensor learning framework, namely the tensor-train deep neural network (TT-DNN), to deal with an explosive DNN model size and realize effective deep regression with much smaller model complexity. Our speech enhancement experiments demonstrate that a TT-DNN can maintain or even exceed the performance of an over-parameterized DNN with far fewer model parameters.
Ph.D.
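The model-size claim for the TT-DNN can be made concrete with a parameter count. The sketch below is not taken from the dissertation: it assumes the usual tensor-train layout for a weight matrix, in which core k has shape (r_{k-1}, m_k, n_k, r_k) with boundary ranks equal to 1. The layer size, mode factorization, and ranks are hypothetical values chosen only for illustration.

```python
from math import prod


def dense_param_count(in_modes, out_modes):
    """Weights in an ordinary dense layer of size prod(in_modes) x prod(out_modes)."""
    return prod(in_modes) * prod(out_modes)


def tt_param_count(in_modes, out_modes, ranks):
    """Weights in the same layer when stored as tensor-train cores.

    Core k is assumed to have shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]),
    and both boundary ranks must be 1.
    """
    if len(in_modes) != len(out_modes) or len(ranks) != len(in_modes) + 1:
        raise ValueError("mode and rank lists have inconsistent lengths")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


# Hypothetical 1024 x 1024 layer, factorized as (4*8*8*4) x (4*8*8*4).
in_modes = out_modes = [4, 8, 8, 4]
ranks = [1, 4, 4, 4, 1]

dense = dense_param_count(in_modes, out_modes)
tt = tt_param_count(in_modes, out_modes, ranks)
print(dense, tt)
```

With these illustrative settings the dense layer stores 1,048,576 weights and the tensor-train form stores 2,176. Whether accuracy is preserved at a given rank is an empirical question that the count alone does not answer.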

Scholarly Materials And Research @ Georgia Tech

Learning Dictionaries with Bounded Self-Coherence

Author: Buhmann Joachim M.
Dikk Tomas
Sigg Christian D.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 17/10/2012

Sparse coding in learned dictionaries has been established as a successful approach for signal denoising, source separation and solving inverse problems in general. A dictionary learning method adapts an initial dictionary to a particular signal class by iteratively computing an approximate factorization of a training data matrix into a dictionary and a sparse coding matrix. The learned dictionary is characterized by two properties: the coherence of the dictionary to observations of the signal class, and the self-coherence of the dictionary atoms. A high coherence to the signal class enables the sparse coding of signal observations with a small approximation error, while a low self-coherence of the atoms guarantees atom recovery and a more rapid residual error decay rate for the sparse coding algorithm. The two goals of high signal coherence and low self-coherence are typically in conflict, so one seeks a trade-off between them, depending on the application. We present a dictionary learning method with effective control over the self-coherence of the trained dictionary, enabling a trade-off between maximizing the sparsity of codings and approximating an equiangular tight frame.
Comment: 4 pages, 2 figures; IEEE Signal Processing Letters, vol. 19, no. 12, 201
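The self-coherence mentioned in the abstract can be illustrated with a small computation. This sketch assumes self-coherence is measured in the usual way, as the largest absolute inner product between two distinct unit-norm atoms, and compares it with the Welch bound, which an equiangular tight frame meets exactly. The function names and both toy dictionaries are illustrative and do not come from the paper.

```python
from math import sqrt


def normalize(atom):
    """Scale an atom (a list of floats) to unit Euclidean norm."""
    norm = sqrt(sum(v * v for v in atom))
    return [v / norm for v in atom]


def self_coherence(atoms):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    unit = [normalize(a) for a in atoms]
    return max(
        abs(sum(x * y for x, y in zip(unit[i], unit[j])))
        for i in range(len(unit))
        for j in range(i + 1, len(unit))
    )


def welch_bound(dim, n_atoms):
    """Lower bound on self-coherence for n_atoms unit vectors in R^dim."""
    return sqrt((n_atoms - dim) / (dim * (n_atoms - 1)))


# Three atoms in R^2 spaced 120 degrees apart: an equiangular tight frame.
etf = [[0.0, 1.0], [-sqrt(3) / 2, -0.5], [sqrt(3) / 2, -0.5]]
# A toy dictionary in which two atoms are nearly parallel.
skewed = [[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]]

print(self_coherence(etf), welch_bound(2, 3))
print(self_coherence(skewed))
```

The first dictionary attains the bound of 0.5, while the second has a self-coherence close to 1 because of the near-duplicate pair of atoms.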

arXiv.org e-Print Archive

Crossref

A Bayesian Network View on Acoustic Model-Based Techniques for Robust Speech Recognition

Author: Huemmer Christian
Kellermann Walter
Maas Roland
Sehr Armin
Publication date: 22/09/2014

This article provides a unifying Bayesian network view on various approaches for acoustic model adaptation, missing feature, and uncertainty decoding that are well-known in the literature of robust automatic speech recognition. The representatives of these classes can often be deduced from a Bayesian network that extends the conventional hidden Markov models used in speech recognition. These extensions, in turn, can in many cases be motivated by an underlying observation model that relates clean and distorted feature vectors. By converting the observation models into a Bayesian network representation, we formulate the corresponding compensation rules, leading to a unified view on known derivations as well as to new formulations for certain approaches. The generic Bayesian perspective provided in this contribution thus highlights structural differences and similarities between the analyzed approaches.

arXiv.org e-Print Archive

Time and spectral domain relative entropy: A new approach to multivariate spectral estimation

Author: Ferrante Augusto
Masiero Chiara
Pavon Michele
Publication date: 29/09/2011

The concept of spectral relative entropy rate is introduced for jointly stationary Gaussian processes. Using classical information-theoretic results, we establish a remarkable connection between time and spectral domain relative entropy rates. This naturally leads to a new spectral estimation technique where a multivariate version of the Itakura-Saito distance is employed. It may be viewed as an extension of the approach, called THREE, introduced by Byrnes, Georgiou and Lindquist in 2000 which, in turn, followed in the footsteps of the Burg-Jaynes Maximum Entropy Method. Spectral estimation is here recast in the form of a constrained spectrum approximation problem where the distance is equal to the processes' relative entropy rate. The corresponding solution entails a complexity upper bound which improves on the one so far available in the multichannel framework. Indeed, it is equal to the one featured by THREE in the scalar case. The solution is computed via a globally convergent matricial Newton-type algorithm. Simulations suggest the effectiveness of the new technique in tackling multivariate spectral estimation tasks, especially in the case of short data records.
Comment: 32 pages, submitted for publicatio
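For orientation, the classical scalar Itakura-Saito distance between two sampled power spectra can be written in a few lines. The paper uses a multivariate version, which is not reproduced here; the sketch below covers only the scalar case on a uniform frequency grid, and the function name and sample spectra are hypothetical.

```python
from math import log


def itakura_saito(p, q):
    """Scalar Itakura-Saito distance between two sampled power spectra.

    Averages p/q - log(p/q) - 1 over frequency bins. Both spectra must be
    strictly positive and sampled on the same frequency grid.
    """
    if len(p) != len(q):
        raise ValueError("spectra must have the same number of bins")
    if any(v <= 0 for v in p) or any(v <= 0 for v in q):
        raise ValueError("spectra must be strictly positive")
    return sum(a / b - log(a / b) - 1 for a, b in zip(p, q)) / len(p)


# Hypothetical three-bin spectra, for illustration only.
p = [1.0, 1.0, 4.0]
q = [2.0, 2.0, 2.0]

d_pp = itakura_saito(p, p)  # identical spectra
d_pq = itakura_saito(p, q)
d_qp = itakura_saito(q, p)  # arguments swapped
print(d_pp, d_pq, d_qp)
```

The value is zero for identical spectra, positive otherwise, and changes when the arguments are swapped, so it is a divergence rather than a metric.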

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova

Graph Spectral Image Processing

Author: Cheung Gene
Magli Enrico
Ng Michael
Tanaka Yuichi
Publication date: 16/01/2018

The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image/video processing. The topics covered include image compression, image restoration, image filtering and image segmentation.
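The idea of treating an image patch as a signal on a graph can be sketched as follows. This is an illustration under stated assumptions, not the article's method: pixels are joined to their 4-connected neighbours, edge weights use a Gaussian kernel on intensity differences with a hypothetical bandwidth `sigma`, and smoothness is measured by the Laplacian quadratic form x^T L x with L = D - W.

```python
from math import exp


def grid_edges(patch, sigma):
    """Weighted edges between 4-connected pixels of a 2D patch.

    Neighbouring pixels with similar intensity get a weight near 1;
    pixels on opposite sides of a sharp edge get a weight near 0.
    """
    rows, cols = len(patch), len(patch[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbours
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    diff = patch[r][c] - patch[rr][cc]
                    weight = exp(-(diff * diff) / (sigma * sigma))
                    edges.append(((r, c), (rr, cc), weight))
    return edges


def laplacian_quadratic_form(signal, edges):
    """x^T L x, computed as the sum over edges of w_ij * (x_i - x_j)^2."""
    return sum(
        w * (signal[i[0]][i[1]] - signal[j[0]][j[1]]) ** 2
        for i, j, w in edges
    )


# Hypothetical 4x4 patch containing one sharp vertical edge.
patch = [[10.0, 10.0, 200.0, 200.0] for _ in range(4)]
edges = grid_edges(patch, sigma=30.0)

# The patch is smooth with respect to its own structure-aware graph.
smooth = laplacian_quadratic_form(patch, edges)

# Horizontal stripes cut across the strongly weighted vertical links.
stripes = [[10.0] * 4, [200.0] * 4, [10.0] * 4, [200.0] * 4]
rough = laplacian_quadratic_form(stripes, edges)
print(smooth, rough)
```

A small quadratic form means the signal's energy sits in the low graph frequencies of that graph, which is the property that graph-based compression and restoration schemes try to exploit.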

arXiv.org e-Print Archive

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)