Robust Object Classification Approach using Spherical Harmonics
Point clouds produced by either 3D scanners or multi-view images are often imperfect and contain noise or outliers. This paper presents an end-to-end robust spherical harmonics approach to classifying 3D objects. The proposed framework first uses a voxel grid of concentric spheres to learn features over the unit ball. We then limit the spherical harmonics order to suppress the effect of noise and outliers. In addition, the entire classification operation is performed in the Fourier domain. As a result, our proposed model learns features that are less sensitive to data perturbations and corruptions. We tested our proposed model against several types of data perturbations and corruptions, such as noise and outliers. Our results show that the proposed model has fewer parameters, competes with state-of-the-art networks in terms of robustness to data inaccuracies, and is faster than other robust methods. Our implementation code is also publicly available.
On incorporating inductive biases into deep neural networks
A machine learning (ML) algorithm can be interpreted as a system that learns to capture patterns in data distributions. Before the modern deep learning era, emulating the human brain, the use of structured representations and strong inductive bias was prevalent in building ML models, partly due to expensive computational resources and the limited availability of data. In contrast, armed with increasingly cheaper hardware and abundant data, deep learning has made unprecedented progress during the past decade, showcasing strong performance on a diverse set of ML tasks. Unlike classical ML models, deep learning seeks to minimize structured representations and inductive bias when learning, implicitly favoring the flexibility of learning over manual intervention. Despite the impressive performance, attention is being drawn towards enhancing the (relatively) weaker areas of deep models, such as learning with limited resources, robustness, minimal overhead to realize simple relationships, and the ability to generalize the learned representations beyond the training conditions, which were (arguably) the forte of classical ML. Consequently, a hybrid trend has recently surfaced that aims to blend structured representations and substantial inductive bias into deep models, with the hope of improving them. Based on the above motivation, this thesis investigates methods to improve the performance of deep models using inductive bias and structured representations across multiple problem domains. To this end, we inject a priori knowledge into deep models in the form of enhanced feature extraction techniques, geometrical priors, engineered features, and optimization constraints. In particular, we show that by leveraging prior knowledge about the task at hand and the structure of the data, the performance of deep learning models can be significantly elevated. We begin by exploring equivariant representation learning.
In general, real-world observations are prone to fundamental transformations (e.g., translation, rotation), and deep models typically demand expensive data augmentation and a high number of filters to tackle such variance. In comparison, carefully designed equivariant filters possess this ability by nature. Hence, we propose a novel volumetric convolution operation that can convolve arbitrary functions in the unit ball while preserving rotational equivariance by projecting the input data onto the Zernike basis. We conduct extensive experiments and show that our formulations can be used to construct significantly cheaper ML models. Next, we study generative modeling of 3D objects and propose a principled approach to synthesize 3D point clouds in the spectral domain by obtaining a structured representation of 3D points as functions on the unit sphere. Using prior knowledge about the spectral moments and the output data manifold, we design an architecture that can maximally utilize the information in the inputs and generate high-resolution point clouds with minimal computational overhead. Finally, we propose a framework to build normalizing flows (NF) based on increasing triangular maps and Bernstein-type polynomials. Compared to existing NF approaches, our framework has favorable characteristics for fusing inductive bias within the model, i.e., theoretical upper bounds for the approximation error, robustness, higher interpretability, suitability for compactly supported densities, and the ability to employ higher-degree polynomials without training instability. Most importantly, we present a constructive universality proof, which permits us to analytically derive the optimal model coefficients for known transformations without training.
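As a rough illustration of why Bernstein-type polynomials suit increasing maps, the sketch below builds a one-dimensional Bernstein polynomial on [0, 1] whose coefficients are forced to increase, which makes the polynomial itself increasing. This is not the thesis's construction: the function names `bernstein_map` and `increasing_coeffs`, the softplus parameterisation, and the degree are all assumptions chosen for the example.

```python
from math import comb

import numpy as np


def bernstein_map(x, coeffs):
    """Evaluate sum_k coeffs[k] * C(n, k) * x^k * (1 - x)^(n - k) on [0, 1]."""
    x = np.asarray(x, dtype=float)
    n = len(coeffs) - 1
    basis = np.stack(
        [comb(n, k) * x**k * (1.0 - x) ** (n - k) for k in range(n + 1)]
    )
    return np.tensordot(np.asarray(coeffs, dtype=float), basis, axes=1)


def increasing_coeffs(raw):
    """Map unconstrained parameters to strictly increasing coefficients."""
    raw = np.asarray(raw, dtype=float)
    steps = np.log1p(np.exp(raw[1:]))  # softplus keeps every step positive
    return np.concatenate([[raw[0]], raw[0] + np.cumsum(steps)])


# Hypothetical example: a degree-7 polynomial with random parameters.
rng = np.random.default_rng(0)
coeffs = increasing_coeffs(rng.normal(size=8))
grid = np.linspace(0.0, 1.0, 201)
values = bernstein_map(grid, coeffs)
```

Because the derivative of a Bernstein polynomial is a positive combination of the coefficient differences, increasing coefficients are sufficient for an increasing (hence invertible) map, and the endpoint values equal the first and last coefficients.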
Mesh Neural Networks for SE(3)-Equivariant Hemodynamics Estimation on the Artery Wall
Computational fluid dynamics (CFD) is a valuable asset for patient-specific
cardiovascular-disease diagnosis and prognosis, but its high computational
demands hamper its adoption in practice. Machine-learning methods that estimate
blood flow in individual patients could accelerate or replace CFD simulation to
overcome these limitations. In this work, we consider the estimation of
vector-valued quantities on the wall of three-dimensional geometric artery
models. We employ group-equivariant graph convolution in an end-to-end
SE(3)-equivariant neural network that operates directly on triangular surface
meshes and makes efficient use of training data. We run experiments on a large
dataset of synthetic coronary arteries and find that our method estimates
directional wall shear stress (WSS) with an approximation error of 7.6% and
normalised mean absolute error (NMAE) of 0.4% while up to two orders of
magnitude faster than CFD. Furthermore, we show that our method is powerful
enough to accurately predict transient, vector-valued WSS over the cardiac
cycle while conditioned on a range of different inflow boundary conditions.
These results demonstrate the potential of our proposed method as a plugin
replacement for CFD in the personalised prediction of hemodynamic vector and
scalar fields. Comment: Preprint. Under Revie
Structure-preserving machine learning for inverse problems
Inverse problems naturally arise in many scientific settings, and the study of these problems has been crucial in the development of important technologies such as medical imaging. In inverse problems, the goal is to estimate an underlying ground truth $u^{*}$, typically an image, from corresponding measurements $y$, where $u^{*}$ and $y$ are related by
$$y = N(A(u^{*})) \tag{1}$$
for some forward operator $A$ and noise-generating process $N$ (both of which are generally assumed to be known). Variational regularisation is a well-established approach that can be used to approximately solve inverse problems such as Problem (1). In this approach an image is reconstructed from measurements $y$ by solving a minimisation problem such as
$$\hat{u} = \operatorname*{argmin}_{u} \; d(A(u), y) + \alpha J(u). \tag{2}$$
While this approach has proven very successful, it generally requires the parts that make up the optimisation problem to be carefully chosen, and the optimisation problem may require considerable computational effort to solve. There is an active line of research into overcoming these issues using data-driven approaches, which aim to use multiple instances of data to inform a method that can be used on similar data. In this dissertation we investigate ways in which favourable properties of the variational regularisation approach can be combined with a data-driven approach to solving inverse problems.
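To make the minimisation problem (2) concrete, here is a minimal sketch for one simple special case. It assumes a linear forward operator given as a matrix, a squared-error data term d, the Tikhonov regulariser J(u) = 0.5 * ||u||^2, and additive Gaussian noise; none of these choices come from the dissertation, and the sizes and names are hypothetical. Under these assumptions the minimiser has a closed form.

```python
import numpy as np


def tikhonov_reconstruct(A, y, alpha):
    """Minimise 0.5 * ||A u - y||^2 + 0.5 * alpha * ||u||^2 in closed form."""
    n = A.shape[1]
    # Setting the gradient to zero gives (A^T A + alpha I) u = A^T y.
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)


def objective(A, u, y, alpha):
    """Value of the variational objective for the assumed d and J."""
    return 0.5 * np.sum((A @ u - y) ** 2) + 0.5 * alpha * np.sum(u**2)


rng = np.random.default_rng(0)
A = rng.normal(size=(30, 20))                 # assumed linear forward operator
u_true = rng.normal(size=20)                  # assumed ground truth
y = A @ u_true + 0.05 * rng.normal(size=30)   # assumed additive Gaussian noise
alpha = 0.1
u_hat = tikhonov_reconstruct(A, y, alpha)
```

For non-smooth regularisers such as total variation there is no closed form, and an iterative solver would take the place of `tikhonov_reconstruct`.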
In the first chapter of the dissertation, we propose a bilevel optimisation framework that can be used to optimise sampling patterns and regularisation parameters for variational image reconstruction in accelerated magnetic resonance imaging. We use this framework to learn sampling patterns that result in better image reconstructions than standard random variable density sampling patterns that sample with the same rate.
In the second chapter of the dissertation, we study the use of group symmetries in learned reconstruction methods for inverse problems. We show that group invariance of a functional implies that the corresponding proximal operator satisfies a group equivariance property. Applying this idea to model proximal operators as roto-translationally equivariant in an unrolled iterative reconstruction method, we show that reconstruction performance is more robust when tested on images in orientations not seen during training (compared to similar methods that model proximal operators to just be translationally equivariant) and that good methods can be learned with less training data.
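The link between invariance of a functional and equivariance of its proximal operator can be checked numerically on a toy case. The sketch below assumes J(u) = lam * ||u||_2 on R^2, which is invariant under rotations, and uses its known proximal operator (block soft-thresholding). It is a small stand-in for the roto-translational image setting described above, not the dissertation's method; the function names are hypothetical.

```python
import numpy as np


def prox_l2_norm(u, lam):
    """Proximal operator of J(u) = lam * ||u||_2 (block soft-thresholding)."""
    norm = np.linalg.norm(u)
    if norm <= lam:
        return np.zeros_like(u)
    return (1.0 - lam / norm) * u


def rotation_2d(theta):
    """Rotation matrix acting on vectors in R^2."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


u = np.array([3.0, 4.0])
R = rotation_2d(0.7)
lam = 1.0
lhs = prox_l2_norm(R @ u, lam)   # rotate first, then apply the prox
rhs = R @ prox_l2_norm(u, lam)   # apply the prox first, then rotate
```

Here `lhs` and `rhs` agree up to floating-point error, which is the equivariance property in this toy setting.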
In the final chapter of the dissertation, we propose a ResNet-style neural network architecture that is provably nonexpansive. This architecture can be thought of as composing discretisations of gradient flows along learnable convex potentials. Appealing to a classical result on the numerical integration of ODEs, we show that constraining the operator norms of the weight operators is sufficient to give nonexpansiveness, and additional analysis in the case that the numerical integrator is the forward Euler method shows that the neural network is an averaged operator. This guarantees that its fixed point iterations are convergent, and makes it a natural candidate for a learned denoiser in a Plug-and-Play approach to solving inverse problems. Cantab Capital Institute for the Mathematics of Informatio
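One standard way to obtain such a layer is a forward Euler step along the gradient of a convex potential, with the weight matrix rescaled to have operator norm at most 1. The sketch below is written under these assumptions and is not the dissertation's exact architecture: the potential psi(u) = sum(phi(W u + b)) with phi' = ReLU, the step size, and all names are chosen for illustration. It empirically checks nonexpansiveness on random input pairs.

```python
import numpy as np


def gradient_step_layer(u, W, b, tau):
    """Forward Euler step u - tau * grad psi(u), with
    grad psi(u) = W^T relu(W u + b) for the assumed convex potential."""
    return u - tau * W.T @ np.maximum(W @ u + b, 0.0)


def normalise_operator_norm(W):
    """Rescale W so that its spectral norm is at most 1."""
    return W / max(1.0, np.linalg.norm(W, 2))


rng = np.random.default_rng(0)
W = normalise_operator_norm(rng.normal(size=(16, 8)))
b = rng.normal(size=16)
tau = 1.0  # any step in (0, 2 / ||W||_2^2] keeps this step nonexpansive

# Empirical Lipschitz ratios ||F(x) - F(y)|| / ||x - y|| on random pairs.
ratios = []
for _ in range(1000):
    x, y = rng.normal(size=8), rng.normal(size=8)
    fx = gradient_step_layer(x, W, b, tau)
    fy = gradient_step_layer(y, W, b, tau)
    ratios.append(np.linalg.norm(fx - fy) / np.linalg.norm(x - y))
max_ratio = max(ratios)
```

The gradient of this potential is Lipschitz with constant ||W||_2^2, so bounding the operator norm of W is what bounds the admissible step size; several such layers can be composed and the composition remains nonexpansive.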
Geometric deep learning and equivariant neural networks
We survey the mathematical foundations of geometric deep learning, focusing on group equivariant and gauge equivariant neural networks. We develop gauge equivariant convolutional neural networks on arbitrary manifolds M using principal bundles with structure group K and equivariant maps between sections of associated vector bundles. We also discuss group equivariant neural networks for homogeneous spaces M = G/K, which are instead equivariant with respect to the global symmetry G on M. Group equivariant layers can be interpreted as intertwiners between induced representations of G, and we show their relation to gauge equivariant convolutional layers. We analyze several applications of this formalism, including semantic segmentation and object detection networks. We also discuss the case of spherical networks in great detail, corresponding to the case M = S^2 = SO(3)/SO(2). Here we emphasize the use of Fourier analysis involving Wigner matrices, spherical harmonics and Clebsch–Gordan coefficients for G = SO(3), illustrating the power of representation theory for deep learning.