
    DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-grained Image Recognition

    Being symmetric positive-definite (SPD), the covariance matrix has traditionally been used to represent a set of local descriptors in visual recognition. Recent studies show that a kernel matrix can give a considerably better representation by modelling the nonlinearity in the local descriptor set. Nevertheless, neither the descriptors nor the kernel matrix is deeply learned. Worse, they are considered separately, hindering the pursuit of an optimal SPD representation. This work proposes a deep network that jointly learns the local descriptors, the kernel-matrix-based SPD representation, and the classifier via an end-to-end training process. We derive the derivatives of the mapping from a local descriptor set to the SPD representation in order to carry out backpropagation. We also exploit the Daleckii-Krein formula from operator theory to give a concise and unified result on differentiating SPD matrix functions, including the matrix logarithm used to handle the Riemannian geometry of the kernel matrix. Experiments not only show the superiority of the kernel-matrix-based SPD representation with deep local descriptors, but also verify the advantage of the proposed deep network in pursuing better SPD representations for fine-grained image recognition tasks.
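
    To make the representation concrete, here is a minimal NumPy sketch (an illustration under assumed conventions, not the authors' implementation) of a kernel-matrix-based SPD descriptor: an RBF kernel matrix is built over the feature channels of a local descriptor set and then mapped through the matrix logarithm, the operation whose derivative the paper handles via the Daleckii-Krein formula.

        import numpy as np

        def kernel_spd_representation(X, sigma=1.0, eps=1e-6):
            """X: (n, d) array of n local descriptors with d channels.
            Returns a (d, d) log-mapped kernel-matrix representation."""
            C = X.T                                            # (d, n): one row per feature channel
            sq = np.sum(C ** 2, axis=1)
            d2 = sq[:, None] + sq[None, :] - 2.0 * C @ C.T     # pairwise squared distances between channels
            K = np.exp(-d2 / (2.0 * sigma ** 2))               # RBF kernel matrix (SPD up to numerics)
            K += eps * np.eye(K.shape[0])                      # small ridge for strict positive-definiteness
            w, V = np.linalg.eigh(K)                           # K is symmetric, so eigh applies
            return (V * np.log(w)) @ V.T                       # matrix logarithm: V diag(log w) V^T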

    A novel family of geometrical transformations: Polyrigid transformations. Application to the registration of histological slices

    We present in this report a novel kind of geometrical transformation, which we have named polyrigid. Within this framework, it is possible to define local rigid deformations in a given number of simple regions while simultaneously guaranteeing the smoothness and invertibility of the global transformation. Entirely parametric, this new type of tool is highly suitable for inference, and it is successfully applied to the non-rigid registration of histological slices. These general transformations are an attractive alternative to classical B-spline transformations, which do not guarantee invertibility. In future work, other applications will be considered, for instance in 3D registration.
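
    As a rough illustration of the idea (a simplified log-based fusion under assumed conventions, not the ODE formulation developed in the report), the following sketch blends a set of 2D rigid transformations using smooth spatial weights so that the resulting map stays smooth and invertible:

        import numpy as np
        from scipy.linalg import expm, logm

        def fuse_rigid_2d(point, transforms, weights):
            """point: (2,) 2D point; transforms: list of (3, 3) homogeneous rigid matrices;
            weights: nonnegative weights of the regions at this point, summing to 1."""
            # Weighted combination of the matrix logarithms of the local rigid transforms.
            V = sum(w * np.real(logm(T)) for w, T in zip(weights, transforms))
            p = np.array([point[0], point[1], 1.0])            # homogeneous coordinates
            return (expm(V) @ p)[:2]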

    Towards ultra-high resolution 3D reconstruction of a whole rat brain from 3D-PLI data

    3D reconstruction of the fiber connectivity of the rat brain at microscopic scale enables detailed insight into the complex structural organization of the brain. We introduce a new method for registration and 3D reconstruction of high- and ultra-high resolution (64 μm and 1.3 μm pixel size) histological images of a Wistar rat brain acquired by 3D polarized light imaging (3D-PLI). Our method exploits multi-scale and multi-modal 3D-PLI data up to cellular resolution. We propose a new feature transform-based similarity measure and a weighted regularization scheme for accurate and robust non-rigid registration. To transform the 1.3 μm ultra-high resolution data to the reference blockface images, a feature-based registration method followed by a non-rigid registration is proposed. Our approach has been successfully applied to 278 histological sections of a rat brain, and the performance has been quantitatively evaluated using landmarks placed manually by an expert. Comment: 9 pages, accepted at the 2nd International Workshop on Connectomics in NeuroImaging (CNI), MICCAI'201
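
    The kind of energy such a scheme minimizes can be sketched as follows (a schematic with assumed variable shapes and an assumed quadratic regularizer; the paper's actual similarity measure and weighting are specific to 3D-PLI and not reproduced here):

        import numpy as np

        def registration_energy(u, F_ref, F_mov_warped, weight_map, alpha=0.1):
            """u: (H, W, 2) displacement field; F_ref, F_mov_warped: (H, W, C) feature maps,
            with F_mov_warped already resampled through u; weight_map: (H, W) regularizer weights."""
            data_term = np.mean((F_ref - F_mov_warped) ** 2)   # feature-based similarity
            dy, dx = np.gradient(u, axis=(0, 1))               # spatial gradients of the field
            smoothness = np.mean(weight_map[..., None] * (dx ** 2 + dy ** 2))
            return data_term + alpha * smoothness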

    Statistically Motivated Second Order Pooling

    Second-order pooling, a.k.a. bilinear pooling, has proven effective for deep-learning-based visual recognition. However, the resulting second-order networks yield a final representation that is orders of magnitude larger than that of standard, first-order ones, making them memory-intensive and cumbersome to deploy. Here, we introduce a general, parametric compression strategy that can produce more compact representations than existing compression techniques, yet outperform both compressed and uncompressed second-order models. Our approach is motivated by a statistical analysis of the network's activations, relying on operations that lead to a Gaussian-distributed final representation, as inherently used by first-order deep networks. As evidenced by our experiments, this lets us outperform state-of-the-art first-order and second-order models on several benchmark recognition datasets. Comment: accepted to ECCV 2018 (camera-ready version); 14 pages, 5 figures, 3 tables.
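
    A generic sketch of the compression idea (a learnable projection applied before the outer-product pooling; this is only an assumed, minimal stand-in, not the paper's exact sequence of operations or its statistical derivation):

        import numpy as np

        def compressed_second_order_pool(X, W):
            """X: (n, d) local activations; W: (d, k) learned projection with k << d.
            Returns a flattened (k*k,) pooled representation instead of d*d."""
            Z = X @ W                          # (n, k) compressed features
            M = (Z.T @ Z) / Z.shape[0]         # (k, k) second-order (bilinear) pooling
            return M.reshape(-1)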

    Diffeomorphic registration using geodesic shooting and Gauss-Newton optimisation

    This paper presents a nonlinear image registration algorithm based on the setting of Large Deformation Diffeomorphic Metric Mapping (LDDMM), but with a more efficient optimisation scheme, both in terms of the memory required and the number of iterations needed to reach convergence. Rather than performing a variational optimisation on a series of velocity fields, the algorithm is formulated to use a geodesic shooting procedure, so that only an initial velocity needs to be estimated. A Gauss-Newton optimisation strategy is used to achieve faster convergence. The algorithm was evaluated using freely available, manually labelled datasets and found to compare favourably with other inter-subject registration algorithms evaluated on the same data.
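
    For orientation, a generic LDDMM shooting energy (written here in standard form; the paper's exact functional and priors may differ) is

        E(v_0) = \frac{1}{2}\,\lVert v_0 \rVert_V^2 + \frac{1}{2\sigma^2}\,\lVert I_0 \circ \varphi_1^{-1} - I_1 \rVert_{L^2}^2 ,

    where the diffeomorphism φ_1 is obtained by integrating the geodesic evolution determined by the initial velocity v_0, so only v_0 needs to be estimated; the least-squares structure of the matching term is what a Gauss-Newton scheme exploits.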

    Bi-invariant Means in Lie Groups. Application to Left-invariant Polyaffine Transformations

    In this work, we present a general framework to rigorously define a novel type of mean in Lie groups, called the bi-invariant mean. This mean enjoys many desirable invariance properties, which generalize to the non-linear case the properties of the arithmetic mean: it is invariant with respect to left- and right-multiplication, as well as inversion. Previously, this type of mean was only defined in Lie groups endowed with a bi-invariant Riemannian metric, such as compact Lie groups like the group of rotations. But bi-invariant Riemannian metrics do not always exist. In particular, we prove in this work that such metrics do not exist in any dimension for rigid transformations, which form arguably the simplest Lie group involved in bio-medical image registration. To overcome the lack of bi-invariant Riemannian metrics for many Lie groups, we propose in this article to define bi-invariant means in any finite-dimensional real Lie group via a general barycentric equation, whose solution is by definition the bi-invariant mean. We show the existence and uniqueness of this novel type of mean, provided the dispersion of the data is small enough, and we also show the convergence of an efficient iterative algorithm for computing it. The intuition for the existence of such a mean was first given by R. P. Woods (without any precise definition), along with an efficient algorithm for computing it (without proof of convergence), in the case of matrix groups. In the case of rigid transformations, we give a simple criterion for the general existence and uniqueness of the bi-invariant mean, which happens to be the same as for rotations. We also give closed forms for the bi-invariant mean in a number of simple but instructive cases, including 2D rigid transformations. Interestingly, for general linear transformations, we show that, similarly to the Log-Euclidean mean we proposed in recent work, the bi-invariant mean is a generalization of the (scalar) geometric mean, since the determinant of the bi-invariant mean is exactly equal to the geometric mean of the determinants of the data. Last but not least, we use this new type of mean to define a novel class of polyaffine transformations, called left-invariant polyaffine, which makes it possible to fuse local rigid or affine components arbitrarily far from the identity, contrary to the Log-Euclidean polyaffine fusion we recently introduced.
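
    A compact sketch of the fixed-point iteration for the barycentric equation in the case of matrix Lie groups (assumed here to use SciPy's matrix exponential and logarithm; as stated above, convergence is only guaranteed when the data are sufficiently concentrated):

        import numpy as np
        from scipy.linalg import expm, logm

        def bi_invariant_mean(Xs, weights=None, n_iter=100, tol=1e-10):
            """Xs: list of (d, d) group elements (matrices); weights: optional, summing to 1."""
            Xs = [np.asarray(X, dtype=float) for X in Xs]
            w = np.full(len(Xs), 1.0 / len(Xs)) if weights is None else np.asarray(weights, dtype=float)
            M = Xs[0].copy()                                      # initialize at one data point
            for _ in range(n_iter):
                Minv = np.linalg.inv(M)
                # Barycentric update: M <- M exp( sum_i w_i log(M^{-1} X_i) )
                delta = sum(wi * np.real(logm(Minv @ Xi)) for wi, Xi in zip(w, Xs))
                M = M @ expm(delta)
                if np.linalg.norm(delta) < tol:                   # sum_i w_i log(M^{-1} X_i) ~ 0 at the mean
                    break
            return M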

    Estimation of Fiber Orientations Using Neighborhood Information

    Data from diffusion magnetic resonance imaging (dMRI) can be used to reconstruct fiber tracts, for example in muscle and white matter. Estimation of fiber orientations (FOs) is a crucial step in the reconstruction process, and these estimates can be corrupted by noise. In this paper, a new method called Fiber Orientation Reconstruction using Neighborhood Information (FORNI) is described and shown to reduce the effects of noise and improve FO estimation performance by incorporating spatial consistency. FORNI uses a fixed tensor basis to model the diffusion-weighted signals, which has the advantage of providing an explicit relationship between the basis vectors and the FOs. FO spatial coherence is encouraged using weighted l1-norm regularization terms, which capture the interaction of directional information between neighboring voxels. Data fidelity is encouraged using a squared error between the observed and reconstructed diffusion-weighted signals. After appropriate weighting of these competing objectives, the resulting objective function is minimized using a block coordinate descent algorithm, and a straightforward parallelization strategy is used to speed up processing. Experiments were performed on a digital crossing phantom, ex vivo tongue dMRI data, and in vivo brain dMRI data for both qualitative and quantitative evaluation. The results demonstrate that FORNI improves the quality of FO estimation over other state-of-the-art algorithms. Comment: journal paper accepted in Medical Image Analysis; 35 pages and 16 figures.
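
    Schematically, the per-voxel trade-off described above can be written as the following objective (a sketch with assumed variable shapes; the exact construction of the neighbor-informed weights in the paper is not reproduced here):

        import numpy as np

        def forni_style_objective(f, y, D, l1_weights, lam=0.1):
            """f: (K,) nonnegative mixture fractions over the fixed tensor basis;
            y: (m,) diffusion-weighted signal; D: (m, K) basis signal matrix;
            l1_weights: (K,) weights derived from neighboring voxels' directional information."""
            data_fidelity = 0.5 * np.sum((y - D @ f) ** 2)   # squared reconstruction error
            spatial = np.sum(l1_weights * np.abs(f))         # neighbor-informed weighted l1 term
            return data_fidelity + lam * spatial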

    Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions

    We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite (SPD) matrices and linear subspaces. For both categories we use their corresponding nearest-neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models (GMMs) and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity to more challenging conditions (variations in scale and translation). Despite recent advances in handling manifolds, the manifold-based techniques obtain the lowest performance, and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions and also deals best with moderate scale and translation changes.
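
    As a minimal example of the manifold-based side of the comparison (nearest-neighbour classification of SPD matrices under the standard log-Euclidean distance; the kernels and sparse representations evaluated in the paper are not shown):

        import numpy as np

        def logm_spd(S):
            """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
            w, V = np.linalg.eigh(S)
            return (V * np.log(w)) @ V.T

        def nn_classify_spd(query, train_spds, train_labels):
            """query: (d, d) SPD matrix; train_spds: list of (d, d) SPD matrices."""
            Lq = logm_spd(query)
            dists = [np.linalg.norm(Lq - logm_spd(S)) for S in train_spds]
            return train_labels[int(np.argmin(dists))]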

    Second-order Democratic Aggregation

    Aggregated second-order features extracted from deep convolutional networks have been shown to be effective for texture generation, fine-grained recognition, material classification, and scene understanding. In this paper, we study a class of orderless aggregation functions designed to minimize interference or equalize contributions in the context of second-order features, and we show that they can be computed just as efficiently as their first-order counterparts while having favorable properties over aggregation by summation. Another line of work has shown that matrix power normalization after aggregation can significantly improve the generalization of second-order representations. We show that matrix power normalization implicitly equalizes contributions during aggregation, thus establishing a connection between matrix normalization techniques and prior work on minimizing interference. Based on this analysis, we present γ-democratic aggregators that interpolate between sum pooling (γ = 1) and democratic pooling (γ = 0), outperforming both on several classification tasks. Moreover, unlike power normalization, the γ-democratic aggregations can be computed in a low-dimensional space by sketching, which allows the use of very high-dimensional second-order features. This results in state-of-the-art performance on several datasets.
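
    For reference, matrix power normalization of a second-order aggregate, the operation the analysis connects to contribution equalization, can be sketched as follows (the γ-democratic aggregator itself, which requires solving for per-feature scalings, is not reproduced here):

        import numpy as np

        def power_normalized_second_order(X, p=0.5, eps=1e-12):
            """X: (n, d) local features. Returns A^p for A = X^T X / n (symmetric PSD)."""
            A = (X.T @ X) / X.shape[0]
            w, V = np.linalg.eigh(A)
            w = np.clip(w, eps, None)          # guard against tiny negative eigenvalues
            return (V * (w ** p)) @ V.T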