68 research outputs found
Deep learning methods for solving linear inverse problems: Research directions and paradigms
The linear inverse problem is fundamental to the development of various scientific areas. Innumerable attempts have been carried out to solve different variants of the linear inverse problem in different applications. Nowadays, the rapid development of deep learning provides a fresh perspective for solving the linear inverse problem, which has various well-designed network architectures results in state-of-the-art performance in many applications. In this paper, we present a comprehensive survey of the recent progress in the development of deep learning for solving various linear inverse problems. We review how deep learning methods are used in solving different linear inverse problems, and explore the structured neural network architectures that incorporate knowledge used in traditional methods. Furthermore, we identify open challenges and potential future directions along this research line
Sparse and low rank approximations for action recognition
Action recognition is crucial area of research in computer vision with wide range of
applications in surveillance, patient-monitoring systems, video indexing, Human-
Computer Interaction and many more. These applications require automated
action recognition. Robust classification methods are sought-after despite influential
research in this field over past decade. The data resources have grown
tremendously owing to the advances in the digital revolution which cannot be
compared to the meagre resources in the past. The main limitation on a system
when dealing with video data is the computational burden due to large dimensions
and data redundancy. Sparse and low rank approximation methods have evolved
recently which aim at concise and meaningful representation of data. This thesis
explores the application of sparse and low rank approximation methods in the
context of video data classification with the following contributions.
1. An approach for solving the problem of action and gesture classification is
proposed within the sparse representation domain, effectively dealing with
large feature dimensions,
2. Low rank matrix completion approach is proposed to jointly classify more
than one action
3. Deep features are proposed for robust classification of multiple actions
within matrix completion framework which can handle data deficiencies.
This thesis starts with the applicability of sparse representations based classifi-
cation methods to the problem of action and gesture recognition. Random projection
is used to reduce the dimensionality of the features. These are referred
to as compressed features in this thesis. The dictionary formed with compressed
features has proved to be efficient for the classification task achieving comparable
results to the state of the art.
Next, this thesis addresses the more promising problem of simultaneous classifi-
cation of multiple actions. This is treated as matrix completion problem under
transduction setting. Matrix completion methods are considered as the generic
extension to the sparse representation methods from compressed sensing point
of view. The features and corresponding labels of the training and test data are
concatenated and placed as columns of a matrix. The unknown test labels would
be the missing entries in that matrix. This is solved using rank minimization
techniques based on the assumption that the underlying complete matrix would
be a low rank one. This approach has achieved results better than the state of the art on datasets with varying complexities.
This thesis then extends the matrix completion framework for joint classification
of actions to handle the missing features besides missing test labels. In
this context, deep features from a convolutional neural network are proposed.
A convolutional neural network is trained on the training data and features are
extracted from train and test data from the trained network. The performance
of the deep features has proved to be promising when compared to the state of
the art hand-crafted features
Adaptive sparse coding and dictionary selection
Grant no. D000246/1.The sparse coding is approximation/representation of signals with the minimum number of
coefficients using an overcomplete set of elementary functions. This kind of approximations/
representations has found numerous applications in source separation, denoising, coding and
compressed sensing. The adaptation of the sparse approximation framework to the coding
problem of signals is investigated in this thesis. Open problems are the selection of appropriate
models and their orders, coefficient quantization and sparse approximation method. Some of
these questions are addressed in this thesis and novel methods developed. Because almost all
recent communication and storage systems are digital, an easy method to compute quantized
sparse approximations is introduced in the first part.
The model selection problem is investigated next. The linear model can be adapted to better
fit a given signal class. It can also be designed based on some a priori information about the
model. Two novel dictionary selection methods are separately presented in the second part
of the thesis. The proposed model adaption algorithm, called Dictionary Learning with the
Majorization Method (DLMM), is much more general than current methods. This generality
allowes it to be used with different constraints on the model. Particularly, two important cases
have been considered in this thesis for the first time, Parsimonious Dictionary Learning (PDL)
and Compressible Dictionary Learning (CDL). When the generative model order is not given,
PDL not only adapts the dictionary to the given class of signals, but also reduces the model
order redundancies. When a fast dictionary is needed, the CDL framework helps us to find a
dictionary which is adapted to the given signal class without increasing the computation cost
so much.
Sometimes a priori information about the linear generative model is given in format of a parametric
function. Parametric Dictionary Design (PDD) generates a suitable dictionary for sparse
coding using the parametric function. Basically PDD finds a parametric dictionary with a minimal
dictionary coherence, which has been shown to be suitable for sparse approximation and
exact sparse recovery.
Theoretical analyzes are accompanied by experiments to validate the analyzes. This research
was primarily used for audio applications, as audio can be shown to have sparse structures.
Therefore, most of the experiments are done using audio signals
Sparse Modeling for Image and Vision Processing
In recent years, a large amount of multi-disciplinary research has been
conducted on sparse models and their applications. In statistics and machine
learning, the sparsity principle is used to perform model selection---that is,
automatically selecting a simple model among a large collection of them. In
signal processing, sparse coding consists of representing data with linear
combinations of a few dictionary elements. Subsequently, the corresponding
tools have been widely adopted by several scientific communities such as
neuroscience, bioinformatics, or computer vision. The goal of this monograph is
to offer a self-contained view of sparse modeling for visual recognition and
image processing. More specifically, we focus on applications where the
dictionary is learned and adapted to data, yielding a compact representation
that has been successful in various contexts.Comment: 205 pages, to appear in Foundations and Trends in Computer Graphics
and Visio
Textural Difference Enhancement based on Image Component Analysis
In this thesis, we propose a novel image enhancement method to magnify the textural differences in the images with respect to human visual characteristics. The method is intended to be a preprocessing step to improve the performance of the texture-based image segmentation algorithms.
We propose to calculate the six Tamura's texture features (coarseness, contrast, directionality, line-likeness, regularity and roughness) in novel measurements. Each feature follows its original understanding of the certain texture characteristic, but is measured by some local low-level features, e.g., direction of the local edges, dynamic range of the local pixel intensities, kurtosis and skewness of the local image histogram. A discriminant texture feature selection method based on principal component analysis (PCA) is then proposed to find the most representative characteristics in describing textual differences in the image.
We decompose the image into pairwise components representing the texture characteristics strongly and weakly, respectively. A set of wavelet-based soft thresholding methods are proposed as the dictionaries of morphological component analysis (MCA) to sparsely highlight the characteristics strongly and weakly from the image. The wavelet-based thresholding methods are proposed in pair, therefore each of the resulted pairwise components can exhibit one certain characteristic either strongly or weakly.
We propose various wavelet-based manipulation methods to enhance the components separately. For each component representing a certain texture characteristic, a non-linear function is proposed to manipulate the wavelet coefficients of the component so that the component is enhanced with the corresponding characteristic accentuated independently while having little effect on other characteristics.
Furthermore, the above three methods are combined into a uniform framework of image enhancement. Firstly, the texture characteristics differentiating different textures in the image are found. Secondly, the image is decomposed into components exhibiting these texture characteristics respectively. Thirdly, each component is manipulated to accentuate the corresponding texture characteristics exhibited there. After re-combining these manipulated components, the image is enhanced with the textural differences magnified with respect to the selected texture characteristics.
The proposed textural differences enhancement method is used prior to both grayscale and colour image segmentation algorithms. The convincing results of improving the performance of different segmentation algorithms prove the potential of the proposed textural difference enhancement method
Textural Difference Enhancement based on Image Component Analysis
In this thesis, we propose a novel image enhancement method to magnify the textural differences in the images with respect to human visual characteristics. The method is intended to be a preprocessing step to improve the performance of the texture-based image segmentation algorithms.
We propose to calculate the six Tamura's texture features (coarseness, contrast, directionality, line-likeness, regularity and roughness) in novel measurements. Each feature follows its original understanding of the certain texture characteristic, but is measured by some local low-level features, e.g., direction of the local edges, dynamic range of the local pixel intensities, kurtosis and skewness of the local image histogram. A discriminant texture feature selection method based on principal component analysis (PCA) is then proposed to find the most representative characteristics in describing textual differences in the image.
We decompose the image into pairwise components representing the texture characteristics strongly and weakly, respectively. A set of wavelet-based soft thresholding methods are proposed as the dictionaries of morphological component analysis (MCA) to sparsely highlight the characteristics strongly and weakly from the image. The wavelet-based thresholding methods are proposed in pair, therefore each of the resulted pairwise components can exhibit one certain characteristic either strongly or weakly.
We propose various wavelet-based manipulation methods to enhance the components separately. For each component representing a certain texture characteristic, a non-linear function is proposed to manipulate the wavelet coefficients of the component so that the component is enhanced with the corresponding characteristic accentuated independently while having little effect on other characteristics.
Furthermore, the above three methods are combined into a uniform framework of image enhancement. Firstly, the texture characteristics differentiating different textures in the image are found. Secondly, the image is decomposed into components exhibiting these texture characteristics respectively. Thirdly, each component is manipulated to accentuate the corresponding texture characteristics exhibited there. After re-combining these manipulated components, the image is enhanced with the textural differences magnified with respect to the selected texture characteristics.
The proposed textural differences enhancement method is used prior to both grayscale and colour image segmentation algorithms. The convincing results of improving the performance of different segmentation algorithms prove the potential of the proposed textural difference enhancement method
- …