Locally orderless tensor networks for classifying two- and three-dimensional medical images
Tensor networks are factorisations of high rank tensors into networks of
lower rank tensors and have primarily been used to analyse quantum many-body
problems. Tensor networks have seen a recent surge of interest in relation to
supervised learning tasks with a focus on image classification. In this work,
we improve upon the matrix product state (MPS) tensor networks that can operate
on one-dimensional vectors to be useful for working with 2D and 3D medical
images. We treat small image regions as orderless, squeeze their spatial
information into feature dimensions and then perform MPS operations on these
locally orderless regions. These local representations are then aggregated in a
hierarchical manner to retain global structure. The proposed locally orderless
tensor network (LoTeNet) is compared with relevant methods on three datasets.
The architecture of LoTeNet is fixed in all experiments and we show it requires
fewer computational resources to attain performance on par with or superior to
the compared methods.
Comment: Accepted for publication at the Journal of Machine Learning for
Biomedical Imaging (MELBA) (see https://melba-journal.org). Source code at
https://github.com/raghavian/LoTeNet_pytorch
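The squeeze step described in this abstract, moving the spatial information of
a small region into the feature dimension, can be sketched in plain Python.
This is an illustration under stated assumptions, not the LoTeNet
implementation: the function name `squeeze`, the patch size `k`, and the
row-major ordering of pixels within a patch are all hypothetical choices.

```python
def squeeze(image, k):
    """Rearrange a 2D image (list of rows) into non-overlapping k x k patches.

    Returns a grid of shape (H // k, W // k) whose entries are flat lists of
    k * k pixel values: the spatial extent shrinks, the feature dimension grows.
    """
    height, width = len(image), len(image[0])
    if height % k or width % k:
        raise ValueError("image sides must be divisible by k")
    grid = []
    for top in range(0, height, k):
        row = []
        for left in range(0, width, k):
            # Flatten one patch; the within-patch order is an assumed convention.
            patch = [image[top + i][left + j] for i in range(k) for j in range(k)]
            row.append(patch)
        grid.append(row)
    return grid


image = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4 x 4 toy image
patches = squeeze(image, 2)  # 2 x 2 grid of 4-dimensional feature vectors
```

Each entry of `patches` is the kind of local feature vector on which an MPS
operation could then act; the MPS contraction itself is not shown here.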
Locally Orderless Registration
Image registration is an important tool for medical image analysis and is
used to bring images into the same reference frame by warping the coordinate
field of one image, such that some similarity measure is minimized. We study
similarity in image registration in the context of Locally Orderless Images
(LOI), which is the natural way to study density estimates and reveals the three
fundamental scales: the measurement scale, the intensity scale, and the
integration scale.
This paper has three main contributions: Firstly, we rephrase a large set of
popular similarity measures into a common framework, which we refer to as
Locally Orderless Registration, and which makes full use of the features of
local histograms. Secondly, we extend the theoretical understanding of the
local histograms. Thirdly, we use our framework to compare two state-of-the-art
intensity density estimators for image registration: the Parzen Window (PW) and
the Generalized Partial Volume (GPV), and we demonstrate their differences on a
popular similarity measure, Normalized Mutual Information (NMI).
We conclude that complicated similarity measures such as NMI may be
evaluated almost as fast as simple measures such as Sum of Squared Distances
(SSD) regardless of the choice of PW or GPV. Also, GPV is an asymmetric
measure, and PW is our preferred choice.
Comment: submitte
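As a point of reference for the NMI similarity measure named in this abstract,
the sketch below computes NMI = (H(A) + H(B)) / H(A, B) from a joint histogram.
It uses simple hard binning, which is an assumption made for brevity; it is
neither the Parzen Window nor the Generalized Partial Volume estimator, and the
names `nmi`, `entropy`, `bins`, `lo`, and `hi` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a histogram given as bin counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def nmi(a, b, bins=4, lo=0.0, hi=1.0):
    """NMI from a hard-binned joint histogram; inputs assumed to lie in [lo, hi]."""
    def bin_of(v):
        # Clamp so that v == hi falls into the last bin.
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    joint = Counter((bin_of(x), bin_of(y)) for x, y in zip(a, b))
    total = sum(joint.values())
    marg_a, marg_b = Counter(), Counter()
    for (i, j), c in joint.items():
        marg_a[i] += c
        marg_b[j] += c
    h_joint = entropy(joint.values(), total)
    if h_joint == 0:
        raise ValueError("constant images: NMI is undefined")
    return (entropy(marg_a.values(), total) + entropy(marg_b.values(), total)) / h_joint


fixed = [0.1, 0.3, 0.6, 0.9, 0.1, 0.6]
same = nmi(fixed, fixed)                                      # perfectly aligned intensities
unrelated = nmi([0.1, 0.1, 0.9, 0.9], [0.1, 0.9, 0.1, 0.9])   # independent intensities
```

With this definition NMI ranges from 1 for statistically independent
intensities to 2 for identical images, so a registration routine would maximize
it or minimize its negative.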
Multi-scale Orderless Pooling of Deep Convolutional Activation Features
Deep convolutional neural networks (CNN) have shown their promise as a
universal representation for recognition. However, global CNN activations lack
geometric invariance, which limits their robustness for classification and
matching of highly variable scenes. To improve the invariance of CNN
activations without degrading their discriminative power, this paper presents a
simple but effective scheme called multi-scale orderless pooling (MOP-CNN).
This scheme extracts CNN activations for local patches at multiple scale
levels, performs orderless VLAD pooling of these activations at each level
separately, and concatenates the result. The resulting MOP-CNN representation
can be used as a generic feature for either supervised or unsupervised
recognition tasks, from image classification to instance-level retrieval; it
consistently outperforms global CNN activations without requiring any joint
training of prediction layers for a particular target dataset. In absolute
terms, it achieves state-of-the-art results on the challenging SUN397 and MIT
Indoor Scenes classification datasets, and competitive results on
ILSVRC2012/2013 classification and INRIA Holidays retrieval datasets.
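The pooling scheme in this abstract (orderless VLAD pooling at each scale
level, followed by concatenation) can be sketched as follows. This is a toy
illustration, not the MOP-CNN pipeline: the descriptors stand in for CNN
activations, the codebook centers are assumed to be given, and the function
names `vlad` and `multi_scale_pool` are hypothetical.

```python
import math


def vlad(descriptors, centers):
    """Orderless VLAD pooling: sum residuals to the nearest center, then L2-normalize."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    for d in descriptors:
        # Hard-assign the descriptor to its nearest codebook center.
        nearest = min(
            range(len(centers)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, centers[k])),
        )
        for i in range(dim):
            sums[nearest][i] += d[i] - centers[nearest][i]
    flat = [v for row in sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def multi_scale_pool(levels, centers_per_level):
    """Pool each scale level separately and concatenate the results."""
    out = []
    for descriptors, centers in zip(levels, centers_per_level):
        out.extend(vlad(descriptors, centers))
    return out


centers = [[0.0, 0.0], [1.0, 1.0]]                 # toy 2-word codebook
descs = [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2]]       # toy patch descriptors
pooled = vlad(descs, centers)
pooled_reversed = vlad(list(reversed(descs)), centers)
feature = multi_scale_pool([descs, descs[:2]], [centers, centers])
```

Because the residuals are summed, the pooled vector does not depend on the
order in which patches are visited, which is the sense in which the pooling is
orderless.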
Information-Theoretic Registration with Explicit Reorientation of Diffusion-Weighted Images
We present an information-theoretic approach to the registration of images
with directional information, especially Diffusion-Weighted Images
(DWI), with explicit optimization over the directional scale. We call it
Locally Orderless Registration with Directions (LORD). We focus on normalized
mutual information as a robust information-theoretic similarity measure for
DWI. The framework is an extension of the LOR-DWI density-based hierarchical
scale-space model that varies and optimizes the integration, spatial,
directional, and intensity scales. As affine transformations are insufficient
for inter-subject registration, we extend the model to non-rigid deformations.
We illustrate that the proposed model deforms orientation distribution
functions (ODFs) correctly and is capable of handling the classic complex
challenges in DWI registration, such as the registration of fiber crossings
along with kissing, fanning, and interleaving fibers. Our experimental results
clearly illustrate a novel, promising regularizing effect that comes from the
nonlinear orientation-based cost function. We show the properties of the
different image scales, and we show that including orientational information in
our model makes it better at retrieving deformations than standard
scalar-based registration.
Comment: 16 pages, 19 figure
What-and-Where to Match: Deep Spatially Multiplicative Integration Networks for Person Re-identification
Matching pedestrians across disjoint camera views, known as person
re-identification (re-id), is a challenging problem that is of importance to
visual recognition and surveillance. Most existing methods exploit local
regions, obtained through spatial manipulation, to perform matching by local
correspondence. However, they essentially extract fixed representations from
pre-divided regions of each image and then match on the extracted
representations. Models in this pipeline cannot capture the finer local
patterns that are crucial for distinguishing positive pairs from negative ones,
which degrades their performance. In this paper, we propose a novel deep
multiplicative integration gating function, which answers the question of
what-and-where to match for effective person re-id. To address what to match,
our deep network emphasizes common local patterns by learning joint
representations in a multiplicative way. The network comprises two
Convolutional Neural Networks (CNNs) to extract convolutional activations, and
generates relevant descriptors for pedestrian matching. This leads to flexible
representations for pair-wise images. To address where to match, we combat the
spatial misalignment by performing
spatially recurrent pooling via a four-directional recurrent neural network to
impose spatial dependency over all positions with respect to the entire image.
The proposed network is designed to be end-to-end trainable to characterize
local pairwise feature interactions in a spatially aligned manner. To
demonstrate the superiority of our method, extensive experiments are conducted
over three benchmark data sets: VIPeR, CUHK03 and Market-1501.
Comment: Published at Pattern Recognition, Elsevie