Comparison of model selection techniques for seafloor scattering statistics
In quantitative analysis of seafloor imagery, it is common to model the
collection of individual pixel intensities scattered by the seafloor as a
random variable with a given statistical distribution. There is a considerable
literature on statistical models for seafloor scattering, mostly focused on
areas with statistically homogeneous properties (i.e. exhibiting spatial
stationarity). For more complex seafloors, the pixel intensity distribution is
more appropriately modeled using a mixture of simple distributions. For very
complex seafloors, fitting 3 or more mixture components makes physical sense,
but the statistical model becomes much more complex in these cases. The number
of mixture components must therefore be chosen, either from a priori
information or with a data-driven approach. A priori information, however, is
time-consuming to collect and depends on the skill and experience of the human
analyst. A data-driven approach is therefore advantageous and is explored in
this work. Model selection criteria must always balance goodness of fit to the
data against model complexity. In this work, we compare several statistical
model selection criteria, e.g., the Bayesian information criterion.
Examples are given for SAS data from the HISAS-1032 synthetic aperture sonar
system, collected by an autonomous underwater vehicle in a rocky environment
off the coast of Bergen, Norway.
Comment: Paper presented at the 5th International Conference on Synthetic
Aperture Radar and Sonar, Lyric Italy, September 202
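The trade-off between fit and complexity described in the abstract above can be illustrated with the Bayesian information criterion, BIC = k ln(n) - 2 ln(L), where lower is better. The following is a minimal sketch only, not the paper's procedure: the function names, the log-likelihood values, and the sample count are hypothetical, and the parameter counts assume two free parameters per mixture component plus the mixing weights.

```python
import math


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k*ln(n) - 2*ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


def select_n_components(candidates, n_samples):
    """Pick the component count with the lowest BIC.

    candidates maps a number of mixture components to a tuple of
    (maximized log-likelihood, number of free parameters).
    """
    scores = {m: bic(ll, k, n_samples) for m, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits (illustrative numbers, not results from the paper).
# With 2 parameters per component and m - 1 free mixing weights,
# an m-component mixture has 3*m - 1 free parameters.
candidates = {
    1: (-5210.0, 2),
    2: (-5050.0, 5),
    3: (-5046.0, 8),
}
best, scores = select_n_components(candidates, n_samples=4000)
print(best, scores)
```

In this made-up case the third component raises the log-likelihood only slightly, so its extra parameters are penalized and the two-component model is selected.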
Affine-Transformation-Invariant Image Classification by Differentiable Arithmetic Distribution Module
Although Convolutional Neural Networks (CNNs) have achieved promising results
in image classification, they are still vulnerable to affine transformations,
including rotation, translation, flip, and shuffle. This drawback motivates us
to design a module that alleviates the impact of different affine
transformations. Thus, in this work, we introduce a more robust substitute by
incorporating distribution learning techniques, focusing particularly on
learning the spatial distribution information of pixels in images. Prior
distribution learning methods that rely on traditional histograms are
non-differentiable; to rectify this, we adopt Kernel Density Estimation (KDE) to
formulate differentiable histograms. On this foundation, we present a novel
Differentiable Arithmetic Distribution Module (DADM), which is designed to
extract the intrinsic probability distributions from images. The proposed
approach is able to enhance the model's robustness to affine transformations
without sacrificing its feature extraction capabilities, thus bridging the gap
between traditional CNNs and distribution-based learning. We validate the
effectiveness of the proposed approach through an ablation study and
comparative experiments with LeNet.
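The idea of a KDE-based histogram mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the DADM itself: the Gaussian kernel, the bin centers, the bandwidth, and the function name `soft_histogram` are all hypothetical choices. Each pixel contributes a smooth weight to every bin, so the output varies smoothly with the pixel values, whereas hard binning changes in jumps. The sketch does not compute gradients; it only shows the smooth binning and its insensitivity to pixel order.

```python
import math
import random


def soft_histogram(pixels, centers, bandwidth):
    """Gaussian-kernel (KDE-style) histogram, normalized to sum to 1.

    Every pixel adds exp(-0.5 * ((p - c) / bandwidth)**2) to the bin
    centered at c, instead of a hard 0/1 count.
    """
    hist = [0.0] * len(centers)
    for p in pixels:
        for k, c in enumerate(centers):
            z = (p - c) / bandwidth
            hist[k] += math.exp(-0.5 * z * z)
    total = sum(hist)
    return [h / total for h in hist]


# Hypothetical pixel intensities in [0, 1] and four bin centers.
pixels = [0.05, 0.10, 0.12, 0.50, 0.52, 0.90]
centers = [0.125, 0.375, 0.625, 0.875]
h = soft_histogram(pixels, centers, bandwidth=0.1)

# Shuffling the pixels leaves the histogram unchanged (up to rounding),
# because the histogram depends only on the set of intensity values.
shuffled = pixels[:]
random.Random(0).shuffle(shuffled)
h_shuffled = soft_histogram(shuffled, centers, bandwidth=0.1)
print(h)
print(h_shuffled)
```

Because the histogram discards pixel positions, it is the same for any rearrangement of the same pixels, which is the kind of invariance the abstract refers to.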
S-Adapter: Generalizing Vision Transformer for Face Anti-Spoofing with Statistical Tokens
Face Anti-Spoofing (FAS) aims to detect malicious attempts to invade a face
recognition system by presenting spoofed faces. State-of-the-art FAS techniques
predominantly rely on deep learning models, but their cross-domain
generalization capabilities are often hindered by the domain shift problem,
which arises due to different distributions between training and testing data.
In this study, we develop a generalized FAS method under the Efficient
Parameter Transfer Learning (EPTL) paradigm, where we adapt the pre-trained
Vision Transformer models for the FAS task. During training, the adapter
modules are inserted into the pre-trained ViT model, and the adapters are
updated while other pre-trained parameters remain fixed. We find that previous
vanilla adapters are limited because they are based on linear layers, which
lack a spoofing-aware inductive bias and thus restrict cross-domain
generalization. To address this limitation and achieve
cross-domain generalized FAS, we propose a novel Statistical Adapter
(S-Adapter) that gathers local discriminative and statistical information from
localized token histograms. To further improve the generalization of the
statistical tokens, we propose a novel Token Style Regularization (TSR), which
aims to reduce domain style variance by regularizing Gram matrices extracted
from tokens across different domains. Our experimental results demonstrate that
our proposed S-Adapter and TSR provide significant benefits in both zero-shot
and few-shot cross-domain testing, outperforming state-of-the-art methods on
several benchmark tests. We will release the source code upon acceptance.
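The Gram matrices mentioned in the abstract above can be illustrated with a short sketch. The abstract does not give the exact form of the TSR loss, so everything here is an assumption made for illustration: the normalization by the number of tokens, the mean squared difference used as a "style gap", and the names `gram_matrix` and `style_gap`.

```python
def gram_matrix(tokens):
    """Gram matrix of a token set: G[i][j] = mean over tokens of t[i] * t[j].

    tokens is a list of N token vectors, each of length D; the result is D x D.
    """
    n = len(tokens)
    d = len(tokens[0])
    return [
        [sum(t[i] * t[j] for t in tokens) / n for j in range(d)]
        for i in range(d)
    ]


def style_gap(tokens_a, tokens_b):
    """Mean squared difference between two Gram matrices (assumed penalty)."""
    ga, gb = gram_matrix(tokens_a), gram_matrix(tokens_b)
    d = len(ga)
    return sum(
        (ga[i][j] - gb[i][j]) ** 2 for i in range(d) for j in range(d)
    ) / (d * d)


# Hypothetical 2-dimensional tokens from two domains.
domain_a = [[1.0, 2.0], [3.0, 4.0]]
domain_b = [[1.0, 0.0], [0.0, 1.0]]
print(gram_matrix(domain_a))
print(style_gap(domain_a, domain_b))
```

A regularizer in this spirit would push the gap toward zero so that token statistics from different domains look alike; the form actually used by the authors may differ.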
Computational Aesthetics for Fashion
The online fashion industry is growing fast, and with it the need for advanced systems that can solve a range of tasks automatically and accurately. With the rapid advance of digital technologies, Deep Learning has come to play an important role in Computational Aesthetics, an interdisciplinary area that tries to bridge fine art, design, and computer science. Specifically, Computational Aesthetics aims to automate human aesthetic judgments with computational methods. In this thesis, we focus on three applications of computer vision in fashion, and we discuss how Computational Aesthetics helps solve them accurately.