DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition
Being symmetric positive-definite (SPD), the covariance matrix has traditionally
been used to represent a set of local descriptors in visual recognition. Recent
studies show that a kernel matrix can give a considerably better representation by
modelling the nonlinearity in the local descriptor set. Nevertheless, neither
the descriptors nor the kernel matrix is deeply learned. Worse, they are
considered separately, hindering the pursuit of an optimal SPD representation.
This work proposes a deep network that jointly learns local descriptors,
kernel-matrix-based SPD representation, and the classifier via an end-to-end
training process. We derive the derivatives for the mapping from a local
descriptor set to the SPD representation to carry out backpropagation. Also, we
exploit the Daleckii-Krein formula in operator theory to give a concise and
unified result on differentiating SPD matrix functions, including the matrix
logarithm to handle the Riemannian geometry of the kernel matrix. Experiments not
only show the superiority of kernel-matrix-based SPD representation with deep
local descriptors, but also verify the advantage of the proposed deep network
in pursuing better SPD representations for fine-grained image recognition
tasks.
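As a rough illustration of the kernel-matrix SPD representation described above, the sketch below builds an RBF kernel matrix over the channels of a descriptor set and applies the matrix logarithm; the kernel choice, bandwidth, and regularizer are illustrative assumptions, not the paper's exact pipeline.

```python
import numpy as np

def kspd_representation(descriptors, gamma=0.5, eps=1e-6):
    """Map an n x d set of local descriptors to a d x d log-mapped
    kernel-matrix SPD representation (illustrative sketch)."""
    X = descriptors - descriptors.mean(axis=0)        # center the descriptors
    # RBF kernel between feature channels (one possible kernel choice)
    sq = np.sum(X ** 2, axis=0)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X.T @ X))
    K += eps * np.eye(K.shape[0])                     # enforce strict positive-definiteness
    w, V = np.linalg.eigh(K)                          # K is SPD, so eigenvalues > 0
    return (V * np.log(w)) @ V.T                      # matrix log via eigendecomposition

G = kspd_representation(np.random.randn(100, 8))      # 8 x 8 symmetric output
```

The matrix logarithm is what maps the kernel matrix off its Riemannian manifold into a flat space where standard classifiers apply.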
A novel family of geometrical transformations: Polyrigid transformations. Application to the registration of histological slices
We present in this report a novel kind of geometric transformation, which we have named polyrigid. Within this framework, it is possible to define local rigid deformations in a given number of simple regions while simultaneously guaranteeing the smoothness and invertibility of the global transformation. Entirely parametric, this new type of tool is highly suitable for inference, and it is successfully applied to the non-rigid registration of histological slices. These general transformations are an attractive alternative to classical B-spline transformations (which do not guarantee invertibility). In future work, other applications will be considered, for instance in 3D registration.
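A toy version of the polyrigid idea can be sketched by blending the matrix logarithms (velocities) of the local rigid transformations with normalized spatial weights and integrating the resulting flow, which is what preserves smoothness and invertibility. The Gaussian weights, Euler integration, and 2D setting below are simplifying assumptions, not the report's exact formulation.

```python
import numpy as np
from scipy.linalg import logm

def polyrigid_transform(points, rigids, centers, sigma=1.0, steps=400):
    """Toy 2D polyrigid fusion: blend the matrix logs of homogeneous 3x3
    rigid transforms with Gaussian weights and integrate the blended
    velocity field with Euler steps (illustrative sketch)."""
    logs = [np.real(logm(M)) for M in rigids]   # principal log of each rigid transform
    pts = np.asarray(points, dtype=float).copy()
    dt = 1.0 / steps
    for _ in range(steps):
        h = np.hstack([pts, np.ones((len(pts), 1))])         # homogeneous coordinates
        w = np.stack([np.exp(-np.sum((pts - c) ** 2, axis=1) / (2 * sigma ** 2))
                      for c in centers], axis=1)
        w /= w.sum(axis=1, keepdims=True)                     # normalized region weights
        v = sum(w[:, [i]] * (h @ L.T)[:, :2] for i, L in enumerate(logs))
        pts = pts + dt * v                                    # Euler step of the flow
    return pts
```

With a single region the flow integrates back to the rigid transform itself; with several regions the local transforms blend smoothly while the global map remains invertible.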
Towards ultra-high resolution 3D reconstruction of a whole rat brain from 3D-PLI data
3D reconstruction of the fiber connectivity of the rat brain at microscopic
scale enables gaining detailed insight about the complex structural
organization of the brain. We introduce a new method for registration and 3D
reconstruction of high- and ultra-high resolution (64 µm and 1.3 µm
pixel size) histological images of a Wistar rat brain acquired by 3D polarized
light imaging (3D-PLI). Our method exploits multi-scale and multi-modal 3D-PLI
data up to cellular resolution. We propose a new feature transform-based
similarity measure and a weighted regularization scheme for accurate and robust
non-rigid registration. To transform the 1.3 µm ultra-high resolution data
to the reference blockface images, a feature-based registration method followed
by a non-rigid registration is proposed. Our approach has been successfully
applied to 278 histological sections of a rat brain and the performance has
been quantitatively evaluated using landmarks manually placed by an expert.
Comment: 9 pages, accepted at the 2nd International Workshop on Connectomics in
NeuroImaging (CNI), MICCAI'2018
Statistically Motivated Second Order Pooling
Second-order pooling, a.k.a.~bilinear pooling, has proven effective for deep
learning based visual recognition. However, the resulting second-order networks
yield a final representation that is orders of magnitude larger than that of
standard, first-order ones, making them memory-intensive and cumbersome to
deploy. Here, we introduce a general, parametric compression strategy that can
produce more compact representations than existing compression techniques, yet
outperform both compressed and uncompressed second-order models. Our approach
is motivated by a statistical analysis of the network's activations, relying on
operations that lead to a Gaussian-distributed final representation, as
inherently used by first-order deep networks. As evidenced by our experiments,
this lets us outperform the state-of-the-art first-order and second-order
models on several benchmark recognition datasets.
Comment: Accepted to ECCV 2018. Camera-ready version. 14 pages, 5 figures, 3
tables
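For context, plain second-order (bilinear) pooling of d-dimensional local features already yields a d x d output, which is the size blow-up the compression strategy above targets. This minimal sketch shows only the uncompressed baseline, not the paper's method.

```python
import numpy as np

def second_order_pool(features):
    """Uncompressed second-order pooling: average outer product of the
    n local feature vectors, giving a symmetric d x d matrix
    (d*d dimensions when flattened)."""
    n, d = features.shape
    return features.T @ features / n

F = np.random.randn(196, 512)    # e.g. a 14x14 grid of 512-channel activations
G = second_order_pool(F)         # 512 x 512 -> 262,144 dimensions when flattened
```

A first-order (average-pooled) representation of the same features would be only 512-dimensional, which is the orders-of-magnitude gap the abstract refers to.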
Diffeomorphic registration using geodesic shooting and Gauss-Newton optimisation
This paper presents a nonlinear image registration algorithm based on the setting of Large Deformation Diffeomorphic Metric Mapping (LDDMM), but with a more efficient optimisation scheme, both in terms of the memory required and the number of iterations needed to reach convergence. Rather than performing a variational optimisation on a series of velocity fields, the algorithm is formulated to use a geodesic shooting procedure, so that only an initial velocity is estimated. A Gauss-Newton optimisation strategy is used to achieve faster convergence. The algorithm was evaluated using freely available manually labelled datasets and found to compare favourably with other inter-subject registration algorithms evaluated on the same data. (C) 2011 Elsevier Inc. All rights reserved.
Bi-invariant Means in Lie Groups. Application to Left-invariant Polyaffine Transformations
In this work, we present a general framework to define rigorously a novel type of mean in Lie groups, called the bi-invariant mean. This mean enjoys many desirable invariance properties, which generalize to the non-linear case the properties of the arithmetic mean: it is invariant with respect to left- and right-multiplication, as well as inversion. Previously, this type of mean was only defined in Lie groups endowed with a bi-invariant Riemannian metric, like compact Lie groups such as the group of rotations. But Riemannian bi-invariant metrics do not always exist. In particular, we prove in this work that such metrics do not exist in any dimension for rigid transformations, which form the simplest Lie group involved in bio-medical image registration. To overcome the lack of bi-invariant Riemannian metrics for many Lie groups, we propose in this article to define bi-invariant means in any finite-dimensional real Lie group via a general barycentric equation, whose solution is by definition the bi-invariant mean. We show the existence and uniqueness of this novel type of mean, provided the dispersion of the data is small enough, and we also prove the convergence of an efficient iterative algorithm for computing it. The intuition of the existence of such a mean was first given by R. P. Woods (without any precise definition), along with an efficient algorithm for computing it (without proof of convergence), in the case of matrix groups. In the case of rigid transformations, we give a simple criterion for the general existence and uniqueness of the bi-invariant mean, which happens to be the same as for rotations. We also give closed forms for the bi-invariant mean in a number of simple but instructive cases, including 2D rigid transformations.
Interestingly, for general linear transformations, we show that similarly to the Log-Euclidean mean, which we proposed in recent work, the bi-invariant mean is a generalization of the (scalar) geometric mean, since the determinant of the bi-invariant mean is exactly equal to the geometric mean of the determinants of the data. Last but not least, we use this new type of mean to define a novel class of polyaffine transformations, called left-invariant polyaffine, which makes it possible to fuse local rigid or affine components arbitrarily far away from the identity, contrary to Log-Euclidean polyaffine fusion, which we have recently introduced.
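On matrix Lie groups, the barycentric fixed-point iteration described above can be sketched as m ← m · exp((1/N) Σᵢ log(m⁻¹ xᵢ)). The implementation below is a generic sketch for small, well-concentrated data, not the article's algorithm verbatim.

```python
import numpy as np
from scipy.linalg import expm, logm

def bi_invariant_mean(mats, iters=100, tol=1e-12):
    """Fixed-point iteration for the bi-invariant mean of matrices in a
    matrix Lie group; converges when the data are concentrated enough."""
    m = mats[0].copy()
    for _ in range(iters):
        # average the data in the Lie algebra, as seen from the current estimate
        delta = sum(np.real(logm(np.linalg.inv(m) @ x)) for x in mats) / len(mats)
        m = m @ expm(delta)
        if np.linalg.norm(delta) < tol:
            break
    return m
```

For 2D rotations this recovers the rotation by the mean angle, and the determinant of the mean equals the geometric mean of the determinants, consistent with the property stated above.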
Estimation of Fiber Orientations Using Neighborhood Information
Data from diffusion magnetic resonance imaging (dMRI) can be used to
reconstruct fiber tracts, for example, in muscle and white matter. Estimation
of fiber orientations (FOs) is a crucial step in the reconstruction process and
these estimates can be corrupted by noise. In this paper, a new method called
Fiber Orientation Reconstruction using Neighborhood Information (FORNI) is
described and shown to reduce the effects of noise and improve FO estimation
performance by incorporating spatial consistency. FORNI uses a fixed tensor
basis to model the diffusion weighted signals, which has the advantage of
providing an explicit relationship between the basis vectors and the FOs. FO
spatial coherence is encouraged using weighted l1-norm regularization terms,
which contain the interaction of directional information between neighbor
voxels. Data fidelity is encouraged using a squared error between the observed
and reconstructed diffusion weighted signals. After appropriate weighting of
these competing objectives, the resulting objective function is minimized using
a block coordinate descent algorithm, and a straightforward parallelization
strategy is used to speed up processing. Experiments were performed on a
digital crossing phantom, ex vivo tongue dMRI data, and in vivo brain dMRI data
for both qualitative and quantitative evaluation. The results demonstrate that
FORNI improves the quality of FO estimation over other state-of-the-art
algorithms.
Comment: Journal paper accepted in Medical Image Analysis. 35 pages and 16
figures
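The fixed tensor basis mentioned above can be illustrated with a toy dictionary of prolate-tensor signals. The diffusivities, b-value, and non-negative least-squares fit below are hypothetical stand-ins used only to show why the basis gives an explicit link between coefficients and FOs; the neighborhood regularization and block coordinate descent are omitted.

```python
import numpy as np
from scipy.optimize import nnls

def tensor_basis(gradients, basis_dirs, b=1000.0, d_par=1.7e-3, d_perp=3e-4):
    """Columns are diffusion signals of prolate tensors aligned with each
    basis direction; a voxel's signal is modeled as a non-negative mixture,
    so large coefficients point at the fiber orientations (toy parameters)."""
    G = gradients / np.linalg.norm(gradients, axis=1, keepdims=True)
    V = basis_dirs / np.linalg.norm(basis_dirs, axis=1, keepdims=True)
    cos2 = (G @ V.T) ** 2          # squared cosine between gradient and basis dir
    return np.exp(-b * (d_perp + (d_par - d_perp) * cos2))

rng = np.random.default_rng(0)
grads = rng.normal(size=(64, 3))   # 64 diffusion gradient directions
basis = rng.normal(size=(20, 3))   # fixed basis of 20 candidate FOs
A = tensor_basis(grads, basis)
y = A[:, 7]                        # noiseless voxel with a single FO (basis dir 7)
f, _ = nnls(A, y)                  # per-voxel data-fidelity fit
```

In this noiseless toy case the largest coefficient should recover basis direction 7; the paper's weighted l1 neighborhood terms then couple such per-voxel fits across adjacent voxels.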
Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions
We present a comparative evaluation of various techniques for action
recognition while keeping as many variables as possible controlled. We employ
two categories of Riemannian manifolds: symmetric positive definite matrices
and linear subspaces. For both categories we use their corresponding nearest
neighbour classifiers, kernels, and recent kernelised sparse representations.
We compare against traditional action recognition techniques based on Gaussian
mixture models and Fisher vectors (FVs). We evaluate these action recognition
techniques under ideal conditions, as well as their sensitivity in more
challenging conditions (variations in scale and translation). Despite recent
advancements in handling manifolds, manifold-based techniques obtain the
lowest performance, and their kernel representations are less stable in the
presence of challenging conditions. The FV approach obtains the highest
accuracy under ideal conditions. Moreover, FV best deals with moderate scale
and translation changes.
Second-order Democratic Aggregation
Aggregated second-order features extracted from deep convolutional networks
have been shown to be effective for texture generation, fine-grained
recognition, material classification, and scene understanding. In this paper,
we study a class of orderless aggregation functions designed to minimize
interference or equalize contributions in the context of second-order features
and we show that they can be computed just as efficiently as their first-order
counterparts while having favorable properties over aggregation by summation.
Another line of work has shown that matrix power normalization after
aggregation can significantly improve the generalization of second-order
representations. We show that matrix power normalization implicitly equalizes
contributions during aggregation, thus establishing a connection between matrix
normalization techniques and prior work on minimizing interference. Based on
this analysis, we present {\gamma}-democratic aggregators that interpolate
between sum ({\gamma}=1) and democratic pooling ({\gamma}=0), outperforming both
on several classification tasks. Moreover, unlike power normalization, the
{\gamma}-democratic aggregations can be computed in a low-dimensional space by
sketching, which allows the use of very high-dimensional second-order features.
This results in state-of-the-art performance on several datasets.
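The matrix power normalization discussed above amounts to raising the eigenvalues of the symmetric pooled matrix to a power p while keeping its eigenvectors. A minimal sketch follows (p = 0.5, the common matrix square root; the clipping epsilon is an assumption for numerical safety), without the sketching-based low-dimensional computation that is the paper's contribution.

```python
import numpy as np

def matrix_power_normalize(A, p=0.5, eps=1e-12):
    """Matrix power normalization of a symmetric PSD matrix: keep the
    eigenvectors, raise the eigenvalues to the power p."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, eps, None) ** p    # guard against tiny negative eigenvalues
    return (V * w) @ V.T

B = np.random.randn(6, 4)
A = B @ B.T + 1e-3 * np.eye(6)        # a symmetric positive-definite matrix
M = matrix_power_normalize(A)         # matrix square root of A when p = 0.5
```

Powers p < 1 shrink the dominant eigenvalues relative to the rest, which is the eigenvalue-equalizing effect the abstract connects to democratic aggregation.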