3D Shape Similarity Using Vectors of Locally Aggregated Tensors
IEEE International Conference on Image Processing, 2013. International audience.
In this paper, we present an efficient 3D object retrieval method invariant to scale, orientation and pose. Our approach is based on the dense extraction of discriminative local descriptors from 2D views. We aggregate the descriptors into a single vector signature using tensor products. The similarity between 3D models can then be computed efficiently with a simple dot product. Experiments on the commonly used SHREC'12 benchmark demonstrate that our approach obtains superior performance in searching for generic shapes.
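The aggregation-and-dot-product idea can be sketched in a few lines. This is a toy illustration of tensor-product aggregation, not the authors' implementation: the local descriptors and cluster centres are random stand-ins, and the block structure (one outer-product block per centre) is an assumption about the general scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def vlat_signature(descriptors, centres):
    """Aggregate local descriptors into a single vector signature using
    tensor (outer) products of residuals, one block per cluster centre."""
    # assign each descriptor to its nearest centre
    assign = np.argmin(((descriptors[:, None, :] - centres[None]) ** 2).sum(-1), axis=1)
    blocks = []
    for k, c in enumerate(centres):
        res = descriptors[assign == k] - c    # residuals to centre k
        blocks.append((res.T @ res).ravel())  # sum of outer products, flattened
    sig = np.concatenate(blocks)
    return sig / (np.linalg.norm(sig) + 1e-12)  # L2-normalise

# two "objects", each described by random stand-in local descriptors
a = vlat_signature(rng.normal(size=(200, 8)), rng.normal(size=(4, 8)))
b = vlat_signature(rng.normal(size=(200, 8)), rng.normal(size=(4, 8)))
similarity = float(a @ b)  # similarity reduces to a simple dot product
```

Because the signatures are L2-normalised, the dot product is a cosine similarity in [-1, 1], which is what makes large-scale retrieval cheap.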
Enhancing Deep Learning Models through Tensorization: A Comprehensive Survey and Framework
The burgeoning growth of public domain data and the increasing complexity of
deep learning model architectures have underscored the need for more efficient
data representation and analysis techniques. This paper is motivated by the
work of (Helal, 2023) and aims to present a comprehensive overview of
tensorization. This transformative approach bridges the gap between the
inherently multidimensional nature of data and the simplified 2-dimensional
matrices commonly used in linear algebra-based machine learning algorithms.
This paper explores the steps involved in tensorization, multidimensional data
sources, various multiway analysis methods employed, and the benefits of these
approaches. A small example of Blind Source Separation (BSS) is presented
comparing 2-dimensional algorithms and a multiway algorithm in Python. Results
indicate that multiway analysis is more expressive. Contrary to the intuition suggested by the curse of dimensionality, utilising multidimensional datasets in their native form and applying multiway analysis methods grounded in multilinear algebra reveal a profound capacity to capture intricate interrelationships among dimensions while, surprisingly, reducing the number of model parameters and accelerating processing. A survey of multiway analysis methods and their integration with various deep neural network models is presented using case studies in different application domains.
Comment: 34 pages, 8 figures, 4 tables
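The basic bridge between native multiway data and the flattened matrices used by 2-D algorithms is the mode-n unfolding. A minimal sketch (illustrative only; the paper's own BSS example is not reproduced here):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest,
    turning an N-way tensor into a matrix. Flattening like this is exactly
    the step that discards multiway structure, which tensorization avoids."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

X = np.arange(24).reshape(2, 3, 4)   # a small 3-way tensor
print(unfold(X, 0).shape)  # (2, 12)
print(unfold(X, 1).shape)  # (3, 8)
print(unfold(X, 2).shape)  # (4, 6)
```

Multiway decompositions such as CP or Tucker operate on these unfoldings while keeping all modes coupled, which is the source of the parameter savings the paper describes.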
Image Completion with Heterogeneously Filtered Spectral Hints
Image completion with large-scale free-form missing regions is one of the
most challenging tasks for the computer vision community. While researchers
pursue better solutions, drawbacks such as pattern unawareness, blurry
textures, and structure distortion remain noticeable, and thus leave space for
improvement. To overcome these challenges, we propose a new StyleGAN-based
image completion network, Spectral Hint GAN (SH-GAN), inside which a carefully
designed spectral processing module, Spectral Hint Unit, is introduced. We also
propose two novel 2D spectral processing strategies, Heterogeneous Filtering
and Gaussian Split that well-fit modern deep learning models and may further be
extended to other tasks. From our inclusive experiments, we demonstrate that
our model can reach FID scores of 3.4134 and 7.0277 on the benchmark datasets
FFHQ and Places2, and therefore outperforms prior works and reaches a new
state-of-the-art. We also prove the effectiveness of our design via ablation
studies, from which one may notice that the aforementioned challenges, i.e.
pattern unawareness, blurry textures, and structure distortion, can be
noticeably resolved. Our code will be open-sourced at:
https://github.com/SHI-Labs/SH-GAN
Comment: wacv2
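The general idea of processing 2-D signals in the spectral domain can be illustrated with a toy frequency-domain filter. This is only a stand-in for the notion of 2-D spectral processing, not the SH-GAN Spectral Hint Unit or its Heterogeneous Filtering strategy:

```python
import numpy as np

def spectral_lowpass(image, radius):
    """Illustrative spectral filtering: transform to the 2-D frequency
    domain, keep frequencies within `radius` of the spectrum centre,
    and transform back. (Toy example, not the SH-GAN module.)"""
    f = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.ogrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

img = np.random.default_rng(0).normal(size=(64, 64))
smooth = spectral_lowpass(img, radius=8)  # high frequencies removed
```

Learned, frequency-dependent (heterogeneous) versions of such masks are one plausible reading of how spectral hints help recover global patterns that purely spatial convolutions miss.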
The Role of Riemannian Manifolds in Computer Vision: From Coding to Deep Metric Learning
A diverse range of tasks in computer vision and machine learning benefit from data representations that are compact yet discriminative, informative, and robust to noisy measurements.
Two notable representations are offered by Region Covariance
Descriptors (RCovD) and linear subspaces which are naturally
analyzed through the manifold of Symmetric Positive Definite
(SPD) matrices and the Grassmann manifold, respectively, two
widely used types of Riemannian manifolds in computer vision.
As our first objective, we examine image- and video-based recognition applications where the local descriptors have the aforementioned Riemannian structures, namely the SPD or linear-subspace structure. Initially, we provide a solution for computing a Riemannian version of the conventional Vector of Locally Aggregated Descriptors (VLAD), using the geodesic distance of the underlying manifold as the nearness measure. Next, taking a closer look at the resulting codes, we formulate a new concept which we name Local Difference Vectors (LDVs). LDVs enable us to elegantly extend our Riemannian coding techniques to any arbitrary metric, and they also provide intrinsic solutions to Riemannian sparse coding and its variants when structured local descriptors are considered.
We then turn our attention to two special types of covariance descriptors, namely infinite-dimensional RCovDs and rank-deficient covariance matrices, for which the underlying Riemannian structure, i.e., the manifold of SPD matrices, is to a great extent out of reach. Generally speaking, infinite-dimensional RCovDs offer better discriminatory power than their low-dimensional counterparts.
To overcome this difficulty, we propose to approximate the infinite-dimensional RCovDs by making use of two feature mappings, namely random Fourier features and the Nyström method. As for the rank-deficient covariance matrices, unlike most existing approaches that rely on inference tools with predefined regularizers, we derive positive definite kernels that can be decomposed into kernels on the cone of SPD matrices and kernels on the Grassmann manifold, and we show their effectiveness on the image-set classification task.
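The first of the two feature mappings, random Fourier features, is a standard construction (Rahimi and Recht) and can be sketched directly; here it approximates an RBF kernel, which is an assumption about the kernel in play:

```python
import numpy as np

rng = np.random.default_rng(0)

def rff(X, n_features, gamma):
    """Random Fourier features approximating the RBF kernel
    k(x, y) = exp(-gamma ||x - y||^2), so that z(x) . z(y) ~ k(x, y)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = rng.normal(size=(6, 4))
Z = rff(X, n_features=4000, gamma=0.5)
K_approx = Z @ Z.T  # explicit finite-dimensional features
K_exact = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))
err = np.abs(K_approx - K_exact).max()  # shrinks as n_features grows
```

Mapping data into such an explicit finite-dimensional feature space is what makes covariance descriptors of infinite-dimensional features computable in practice.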
Furthermore, inspired by the attractive properties of Riemannian optimization techniques, we extend the recently introduced Keep It Simple and Straightforward MEtric learning (KISSME) method to scenarios where the input data are non-linearly distributed. To this end, we make use of infinite-dimensional covariance matrices and propose techniques for projecting onto the positive cone in a Reproducing Kernel Hilbert Space (RKHS).
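In the finite-dimensional case, projecting a symmetric matrix onto the positive cone is typically done by clipping its negative eigenvalues; a minimal sketch (the thesis's RKHS version is more involved):

```python
import numpy as np

def project_to_psd(M, eps=1e-8):
    """Project a symmetric matrix onto the cone of positive
    (semi-)definite matrices by clipping its eigenvalues at eps."""
    S = (M + M.T) / 2  # symmetrise first
    w, V = np.linalg.eigh(S)
    return (V * np.clip(w, eps, None)) @ V.T

M = np.array([[1.0, 2.0], [2.0, -3.0]])   # indefinite
P = project_to_psd(M)
print(np.all(np.linalg.eigvalsh(P) > 0))  # True
```

This is the nearest-PSD projection in Frobenius norm, which is why it appears whenever a learned matrix must serve as a valid metric.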
We also address the sensitivity of KISSME to the input dimensionality. The KISSME algorithm depends heavily on Principal Component Analysis (PCA) as a preprocessing step, which can lead to difficulties, especially when the dimensionality is not set meticulously. To address this issue, building on the KISSME algorithm, we develop a Riemannian framework that jointly learns a dimensionality-reducing mapping and a metric in the induced space.
Lastly, in line with the recent trend in metric learning, we devise end-to-end learning of a generic deep network for metric learning using our derivations.
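The core of the original KISSME recipe that the thesis builds on is short enough to sketch: estimate covariances of pairwise difference vectors for similar and dissimilar pairs, take the difference of their inverses, and project the result onto the PSD cone so it defines a valid Mahalanobis metric. The toy data below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def kissme_metric(diff_sim, diff_dis):
    """KISSME: M = inv(Cov_similar) - inv(Cov_dissimilar), computed from
    pairwise difference vectors, then clipped onto the PSD cone."""
    cov_s = diff_sim.T @ diff_sim / len(diff_sim)
    cov_d = diff_dis.T @ diff_dis / len(diff_dis)
    M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
    w, V = np.linalg.eigh((M + M.T) / 2)  # keep M a valid (PSD) metric
    return (V * np.clip(w, 0, None)) @ V.T

# toy pairs: similar pairs differ by small noise, dissimilar by large
d = 5
diff_sim = rng.normal(scale=0.3, size=(500, d))
diff_dis = rng.normal(scale=2.0, size=(500, d))
M = kissme_metric(diff_sim, diff_dis)

def dist(x, y):
    """Squared Mahalanobis distance under the learned metric M."""
    v = x - y
    return float(v @ M @ v)
```

The PCA sensitivity mentioned above enters through the covariance inverses, which are ill-conditioned in high dimensions; the thesis's joint dimensionality-reduction-and-metric framework sidesteps exactly that step.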