26 research outputs found
Statistics on the (compact) Stiefel manifold: Theory and Applications
Stiefel manifolds of the compact type are encountered in many fields of
engineering, including signal and image processing, machine learning, and
numerical optimization. The Stiefel manifold is a Riemannian homogeneous space
but not a symmetric space. In previous work, researchers have defined
probability distributions on symmetric spaces and performed statistical
analysis of data residing in these spaces. In this paper, we define
Gaussian distributions on a homogeneous space and
show that the maximum-likelihood estimate of the location parameter of a
Gaussian distribution on the homogeneous space yields the Fréchet mean (FM)
of the samples drawn from this distribution. Further, we present an algorithm
to sample from the Gaussian distribution on the Stiefel manifold and
recursively compute the FM of these samples. We also prove the weak consistency
of this recursive FM estimator. Several synthetic and real data experiments are
then presented, demonstrating the superior computational performance of this
estimator over the gradient descent based non-recursive counterpart as well as
the stochastic gradient descent based method prevalent in the literature.
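The recursive FM estimator described in this abstract can be illustrated with a minimal sketch. It assumes the unit sphere as a simple stand-in for the Stiefel manifold (the sphere is the special case of a single orthonormal column), because its geodesics have a closed form; it is not the authors' Stiefel implementation. The function names `slerp` and `recursive_frechet_mean` are hypothetical, and the sketch assumes that no sample is antipodal to the running estimate.

```python
import math

def slerp(p, q, t):
    """Point at fraction t along the shortest great-circle arc from p to q."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    theta = math.acos(dot)          # geodesic distance between p and q
    if theta < 1e-12:               # p and q coincide, nothing to move
        return list(p)
    s = math.sin(theta)
    a = math.sin((1.0 - t) * theta) / s
    b = math.sin(t * theta) / s
    return [a * x + b * y for x, y in zip(p, q)]

def recursive_frechet_mean(samples):
    """One-pass estimate: move the running mean toward the k-th sample by 1/k."""
    it = iter(samples)
    mean = list(next(it))           # the first sample initializes the estimate
    for k, x in enumerate(it, start=2):
        mean = slerp(mean, x, 1.0 / k)
    return mean
```

Each sample is visited once and the update costs a single geodesic evaluation, which is the kind of saving over batch gradient descent that the abstract refers to.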
ManifoldNorm: Extending normalizations on Riemannian Manifolds
Many measurements in computer vision and machine learning manifest as
non-Euclidean data samples. Several researchers recently extended a number of
deep neural network architectures to manifold valued data samples. Researchers
have proposed models for manifold valued spatial data, which are common in
medical image processing, for example in diffusion tensor imaging (DTI),
where images are fields of symmetric positive definite matrices, or in
representations based on the orientation distribution field (ODF), where the
data are fields on the hypersphere. Other, sequential
models for manifold valued data have recently been shown to be
effective for group difference analysis in studies of neurodegenerative
diseases. Although several of these methods deal effectively with
manifold valued data, a bottleneck is the instability of optimization
in deeper networks. To deal with these instabilities, researchers
have proposed residual connections for manifold valued data. Another
remedy for such instabilities, including gradient explosion, is to use
normalization techniques such as batch norm and group norm.
So far, however, no normalization techniques are applicable to manifold valued
data. In this work, we propose a general normalization technique for manifold
valued data. We show that our proposed manifold normalization technique has
the popular batch norm and group norm techniques as special cases. On the
experimental side, we focus on two types of manifold valued data: the
manifold of symmetric positive definite matrices and the hypersphere. We show
performance gains in one synthetic experiment on the moving MNIST dataset and on one
real brain image dataset represented in terms of the orientation
distribution field (ODF).
Dictionary Learning and Sparse Coding on Statistical Manifolds
In this paper, we propose a novel information theoretic framework for
dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the
manifold of probability distributions). Unlike the traditional DL and SC
framework, our new formulation does not explicitly incorporate any sparsity
inducing norm in the cost function being optimized, yet it yields sparse codes.
Our algorithm approximates the data points on the statistical manifold (which
are probability distributions) by the weighted Kullback-Leibler center/mean
(KL-center) of the dictionary atoms. The KL-center is defined as the minimizer
of the maximum KL-divergence between itself and members of the set whose center
is being sought. Further, we prove that the weighted KL-center is a sparse
combination of the dictionary atoms. This result also holds for the case when
the KL-divergence is replaced by the well known Hellinger distance. From an
applications perspective, we present an extension of the aforementioned
framework to the manifold of symmetric positive definite matrices (which can be
identified with the manifold of zero-mean Gaussian distributions).
We present experiments involving a variety of dictionary-based
reconstruction and classification problems in Computer Vision. Performance of
the proposed algorithm is demonstrated by comparing it to several
state-of-the-art methods in terms of reconstruction and classification accuracy
as well as sparsity of the chosen representation.
Comment: arXiv admin note: substantial text overlap with arXiv:1604.0693
Generative Adversarial Network based Autoencoder: Application to fault detection problem for closed loop dynamical systems
The fault detection problem for closed loop uncertain dynamical systems is
investigated in this paper using different deep learning based methods.
Traditional classifier based methods do not perform well because of the
inherent difficulty of detecting system level faults in a closed loop dynamical
system. Specifically, the acting controller in any closed loop dynamical system
works to reduce the effect of system level faults. A novel Generative
Adversarial based deep Autoencoder is designed to classify datasets under
normal and faulty operating conditions. The proposed network performs
significantly better than the available classifier based methods and,
moreover, does not require labeled fault incorporated datasets for
training. Finally, the network's performance is tested on a high
complexity building energy system dataset.
Comment: 9 pages, 2 figures
A GMM based algorithm to generate point-cloud and its application to neuroimaging
Recent years have witnessed the emergence of 3D medical imaging techniques
with the development of 3D sensors and technology. Due to the presence of noise
in image acquisition and registration, researchers have focused on alternative ways to
represent medical images. One alternative way to analyze medical images is by
understanding the 3D shapes represented in terms of point-clouds. Though in the
medical imaging community, 3D point-cloud processing is not a "go-to" choice,
it is a "natural" way to capture 3D shapes. However, as the number of samples
of medical images is small, researchers have used pre-trained models and
fine-tuned them on medical images. Furthermore, due to the different modalities of medical
images, standard generative models cannot be used to generate new samples of
medical images. In this work, we take advantage of the point-cloud
representation of 3D structures in medical images and propose a Gaussian
mixture model-based generation scheme. Our proposed method is robust to
outliers. Experimental validation has been performed to show that the proposed
scheme can generate new 3D structures using interpolation techniques, i.e.,
given two 3D structures represented as point-clouds, we can generate
point-clouds in between. We have also generated new point-clouds for subjects
with and without dementia and show that the generated samples
closely match the respective training samples from the same class.
An "augmentation-free" rotation invariant classification scheme on point-cloud and its application to neuroimaging
Recent years have witnessed the emergence and increasing popularity of 3D
medical imaging techniques with the development of 3D sensors and technology.
However, achieving geometric invariance in the processing of 3D medical images
is computationally expensive but nonetheless essential due to the presence of
possible errors caused by rigid registration techniques. An alternative way to
analyze medical imaging is by understanding the 3D shapes represented in terms
of point-cloud. Though in the medical imaging community, 3D point-cloud
processing is not a "go-to" choice, it is a canonical way to preserve rotation
invariance. Unfortunately, due to the discrete topology of point-clouds, one
cannot use the standard convolution operator on them. To the best of our
knowledge, the existing ways to do "convolution" cannot preserve rotation
invariance without explicit data augmentation. Therefore, we propose a rotation
invariant convolution operator by inducing topology from the hypersphere.
Experimental validation has been performed on the publicly available OASIS dataset
in terms of classification accuracy between subjects with and without dementia,
demonstrating the usefulness of our proposed method in terms of model
complexity, classification accuracy, and, last but most important, invariance to
rotations.
Comment: arXiv admin note: text overlap with arXiv:1910.13050 and
arXiv:1911.0170
ManifoldNet: A Deep Network Framework for Manifold-valued Data
Deep neural networks have become the main workhorse for many tasks involving
learning from data in a variety of applications in Science and Engineering.
Traditionally, the inputs to these networks lie in a vector space and the
operations employed within the network are well defined on vector spaces. In
the recent past, due to technological advances in sensing, it has become
possible to acquire manifold-valued data sets either directly or indirectly.
Examples include, but are not limited to, data from omnidirectional cameras on
automobiles and drones, synthetic aperture radar imaging, and diffusion magnetic
resonance imaging, elastography, and conductance imaging in the Medical Imaging
domain. Thus, there is a need to generalize deep neural networks
to cope with input data that reside on curved manifolds where vector space
operations are not naturally admissible. In this paper, we present a novel
theoretical framework to generalize the widely popular convolutional neural
networks (CNNs) to high dimensional manifold-valued data inputs. We call these
networks, ManifoldNets.
In ManifoldNets, the convolution operation on data residing on Riemannian
manifolds is achieved via a provably convergent recursive computation of the
weighted Fréchet Mean (wFM) of the given data, where the weights make up the
convolution mask to be learned. Further, we prove that the proposed wFM layer
achieves a contraction mapping and hence ManifoldNet does not need the
non-linear ReLU unit used in standard CNNs. We present experiments, using the
ManifoldNet framework, to achieve dimensionality reduction by computing the
principal linear subspaces that naturally reside on a Grassmannian. The
experimental results demonstrate the efficacy of ManifoldNets in the context of
classification and reconstruction accuracy.
SurReal: Complex-Valued Learning as Principled Transformations on a Scaling and Rotation Manifold
Complex-valued data is ubiquitous in signal and image processing
applications, and complex-valued representations in deep learning have
appealing theoretical properties. While these aspects have long been
recognized, complex-valued deep learning continues to lag far behind its
real-valued counterpart.
We propose a principled geometric approach to complex-valued deep learning.
Complex-valued data could often be subject to arbitrary complex-valued scaling;
as a result, real and imaginary components could co-vary. Instead of treating
complex values as two independent channels of real values, we recognize their
underlying geometry: We model the space of complex numbers as a product
manifold of non-zero scaling and planar rotations. Arbitrary complex-valued
scaling naturally becomes a group of transitive actions on this manifold.
We propose to extend the property instead of the form of real-valued
functions to the complex domain. We define convolution as weighted Fréchet
mean on the manifold that is equivariant to the group of scaling/rotation
actions, and define distance transform on the manifold that is invariant to the
action group. The manifold perspective also allows us to define nonlinear
activation functions such as tangent ReLU and G-transport, as well as residual
connections on the manifold-valued data.
We dub our model SurReal, as our experiments on MSTAR and RadioML deliver
high performance with only a fraction of the size of real-valued and complex-valued
baseline models.
Comment: 12 pages, accepted to TNNLS journal
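The manifold view of complex numbers in this abstract can be made concrete with a small sketch. It assumes a product metric in which magnitudes are compared on a log scale and phases along the circle, weights that are positive and normalized to sum to one, and samples whose phases lie within a half-circle so that the circular mean is unambiguous. The function names `wrap` and `weighted_frechet_mean` are hypothetical and are not taken from the SurReal code.

```python
import cmath
import math

def wrap(angle):
    """Wrap an angle into the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def weighted_frechet_mean(zs, ws):
    """Weighted mean of non-zero complex numbers on the scaling/rotation manifold."""
    total = sum(ws)
    ws = [w / total for w in ws]                     # normalize to a convex combination
    # Scaling part: weighted mean of log-magnitudes (a weighted geometric mean).
    log_r = sum(w * math.log(abs(z)) for w, z in zip(ws, zs))
    # Rotation part: weighted mean of phase offsets measured from the first sample.
    base = cmath.phase(zs[0])
    offset = sum(w * wrap(cmath.phase(z) - base) for w, z in zip(ws, zs))
    return cmath.rect(math.exp(log_r), base + offset)
```

Multiplying every input by the same non-zero complex number multiplies the output by that number, which is the equivariance to scaling and rotation that the abstract describes.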
An Online Riemannian PCA for Stochastic Canonical Correlation Analysis
We present an efficient stochastic algorithm (RSG+) for canonical correlation
analysis (CCA) using a reparametrization of the projection matrices. We show
how this reparametrization (into structured matrices), simple in hindsight,
directly presents an opportunity to repurpose/adjust mature techniques for
numerical optimization on Riemannian manifolds. Our developments nicely
complement existing methods for this problem which either require $O(d^3)$ time
complexity per iteration with $O(1/\sqrt{t})$ convergence rate (where $d$
is the dimensionality) or only extract the top $1$ component with $O(1/t)$
convergence rate. In contrast, our algorithm offers a strict
improvement for this classical problem: it achieves $O(d^2 k)$ runtime
complexity per iteration for extracting the top $k$ canonical components with
$O(1/t)$ convergence rate. While the paper primarily focuses on the
formulation and technical analysis of its properties, our experiments show that
the empirical behavior on common datasets is quite promising. We also explore a
potential application in training fair models where the label of the protected
attribute is missing or otherwise unavailable.
A CNN for homogeneous Riemannian manifolds with applications to Neuroimaging
Convolutional neural networks are ubiquitous in Machine Learning applications
for solving a variety of problems. However, they cannot be used in their native
form when the domain of the data is a commonly encountered manifold such as the
sphere, the special orthogonal group, the Grassmannian, the manifold of
symmetric positive definite matrices and others. Most recently, generalizations
of CNNs to data domains such as the 2-sphere have been reported by some research
groups; these are referred to as spherical CNNs (SCNNs). The key property of
SCNNs distinct from CNNs is that they exhibit the rotational equivariance
property that allows for sharing learned weights within a layer. In this paper,
we theoretically generalize the CNNs to Riemannian homogeneous manifolds, that
include but are not limited to the aforementioned example manifolds. Our key
contributions in this work are: (i) A theorem stating that linear group
equivariant systems are fully characterized by correlation of functions on the
domain manifold and vice-versa. This is fundamental to the characterization of
all linear group equivariant systems and parallels the widely used result in
linear system theory for vector spaces. (ii) As a corollary, we prove the
equivariance of the correlation operation to group actions admitted by the
input domains which are Riemannian homogeneous manifolds. (iii) We present the
first end-to-end deep network architecture for classification of diffusion
magnetic resonance image (dMRI) scans acquired from a cohort of 44 Parkinson's
disease patients and 50 control/normal subjects. (iv) A proof of concept
experiment involving synthetic data generated on the manifold of symmetric
positive definite matrices is presented to demonstrate the applicability of our
network to other types of domains.
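The equivariance of correlation to group actions, stated in contributions (i) and (ii), can be checked numerically in a toy setting. The sketch below assumes the finite cyclic group of integers modulo n acting on itself by shifts, which is a discrete stand-in and not one of the Riemannian homogeneous manifolds treated in the paper. The function names `shift` and `correlate` are hypothetical.

```python
def shift(f, s):
    """Act on a function on Z_n by the group element s: (s.f)(x) = f(x - s)."""
    n = len(f)
    return [f[(x - s) % n] for x in range(n)]

def correlate(f, w):
    """Group correlation on Z_n: (f * w)(g) = sum over x of f(x) w(x - g)."""
    n = len(f)
    return [sum(f[x] * w[(x - g) % n] for x in range(n)) for g in range(n)]

# Equivariance: correlating a shifted input equals shifting the correlated output.
f = [1, 2, 3, 4, 5]
w = [1, 0, -1, 0, 2]
equivariant = all(
    correlate(shift(f, s), w) == shift(correlate(f, w), s) for s in range(len(f))
)
```

Because the filter `w` is applied identically at every group element, the same weights are shared across the whole domain, which mirrors the weight-sharing property the abstract attributes to equivariant layers.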