188 research outputs found
Facial feature point extraction method based on combination of shape extraction and pattern matching
Occlusion Sensitivity Analysis with Augmentation Subspace Perturbation in Deep Feature Space
Deep learning with neural networks has gained prominence in multiple
life-critical applications like medical diagnoses and autonomous vehicle
accident investigations. However, concerns about model transparency and biases
persist. Explainable methods are viewed as the solution to address these
challenges. In this study, we introduce the Occlusion Sensitivity Analysis with
Deep Feature Augmentation Subspace (OSA-DAS), a novel perturbation-based
interpretability approach for computer vision. While traditional perturbation
methods use only occlusions to explain model predictions, OSA-DAS
extends standard occlusion sensitivity analysis by enabling the integration
with diverse image augmentations. Distinctly, our method utilizes the output
vector of a DNN to build low-dimensional subspaces within the deep feature
vector space, offering a more precise explanation of the model prediction. The
structural similarity between these subspaces encompasses the influence of
diverse augmentations and occlusions. We test extensively on ImageNet-1k,
and our class- and model-agnostic approach outperforms commonly used
interpreters, setting it apart in the realm of explainable AI. Comment: Accepted at WACV 202
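For orientation, the standard occlusion sensitivity analysis that OSA-DAS extends can be sketched as follows. This is only the baseline technique, not OSA-DAS itself (no augmentations and no subspaces are involved). The names `occlusion_sensitivity` and `toy_score`, the patch size, and the fill value are assumptions made for illustration; in practice `score_fn` would wrap a trained classifier.

```python
import numpy as np

def occlusion_sensitivity(image, score_fn, patch=2, stride=2, fill=0.0):
    """Heat map of score drops when each patch is replaced by `fill`."""
    h, w = image.shape[:2]
    base = score_fn(image)
    heat = np.zeros((h, w), dtype=float)
    counts = np.zeros((h, w), dtype=float)
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            drop = base - score_fn(occluded)
            # Credit the drop to every pixel the patch covered.
            heat[top:top + patch, left:left + patch] += drop
            counts[top:top + patch, left:left + patch] += 1
    # Average over overlapping patches; untouched pixels stay at zero.
    return heat / np.maximum(counts, 1)

def toy_score(img):
    """Hypothetical stand-in for a model: only the central 2x2 block matters."""
    return float(img[2:4, 2:4].mean())

image = np.ones((6, 6))
heat = occlusion_sensitivity(image, toy_score)
```

Regions with a large value in `heat` are those whose occlusion lowers the score the most, which is the sense in which they are judged important for the prediction.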
Hand-Shape Recognition Using the Distributions of Multi-Viewpoint Image Sets
This paper proposes a method for recognizing hand-shapes by using multi-viewpoint image sets. The recognition of a hand-shape is a difficult problem, as the appearance of the hand changes greatly depending on viewpoint, illumination conditions and individual characteristics. To overcome this problem, we apply the Kernel Orthogonal Mutual Subspace Method (KOMSM) to shift-invariant features obtained from multi-viewpoint images of a hand. When applying KOMSM to hand recognition with many learning images from each class, it is necessary to consider how to run KOMSM given the heavy computational cost of the kernel trick. We propose a new method that can drastically reduce the computational cost of KOMSM by adopting centroids and the number of images belonging to the centroids, which are obtained by using k-means clustering. The validity of the proposed method is demonstrated through evaluation experiments using multi-viewpoint image sets of 30 classes of hand-shapes.
Time-series Anomaly Detection based on Difference Subspace between Signal Subspaces
This paper proposes a new method for anomaly detection in time-series data by
incorporating the concept of difference subspace into the singular spectrum
analysis (SSA). The key idea is to monitor slight temporal variations of the
difference subspace between two signal subspaces corresponding to the past and
present time-series data, as the anomaly score. It is a natural generalization of
the conventional SSA-based method which measures the minimum angle between the
two signal subspaces as the degree of changes. By replacing the minimum angle
with the difference subspace, our method boosts the performance while using the
SSA-based framework as it can capture the whole structural difference between
the two subspaces in its magnitude and direction. We demonstrate our method's
effectiveness through performance evaluations on public time-series datasets. Comment: 8 pages, an acknowledgement was added to v
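The contrast between the conventional minimum-angle score and a score that uses all canonical angles can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact anomaly score: the function names, the window length, the subspace rank, and the choice of summing 2(1 - cos θ_i) over all canonical angles (the squared lengths of the difference vectors between paired canonical vectors) are assumptions made for this example.

```python
import numpy as np

def trajectory_matrix(x, window):
    """Stack lagged windows of a 1-D series as columns (Hankel matrix)."""
    n = len(x) - window + 1
    return np.column_stack([x[i:i + window] for i in range(n)])

def signal_subspace(x, window, rank):
    """Orthonormal basis of the leading left singular vectors (SSA signal subspace)."""
    u, _, _ = np.linalg.svd(trajectory_matrix(x, window), full_matrices=False)
    return u[:, :rank]

def canonical_cosines(u1, u2):
    """Cosines of the canonical angles between two orthonormal bases, largest first."""
    s = np.linalg.svd(u1.T @ u2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def min_angle_score(u1, u2):
    """Conventional score: 1 - cos^2 of the smallest canonical angle."""
    return float(1.0 - canonical_cosines(u1, u2)[0] ** 2)

def difference_magnitude(u1, u2):
    """Assumed magnitude of the structural difference: sum of 2 * (1 - cos theta_i)."""
    return float(np.sum(2.0 * (1.0 - canonical_cosines(u1, u2))))

t = np.arange(100)
past = signal_subspace(np.sin(0.2 * t), window=20, rank=2)
same = signal_subspace(np.sin(0.2 * (t + 100)), window=20, rank=2)
changed = signal_subspace(np.sin(0.5 * t), window=20, rank=2)

score_same = difference_magnitude(past, same)
score_changed = difference_magnitude(past, changed)
```

In this toy setting the past and present windows share a signal subspace when the frequency is unchanged, so both scores stay near zero, while a frequency change moves the subspace and raises them.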
Comparison between Constrained Mutual Subspace Method and Orthogonal Mutual Subspace Method – From the viewpoint of orthogonalization of subspaces –
This paper compares the performance of the constrained mutual subspace method (CMSM) and the orthogonal mutual subspace method (OMSM), and also of their nonlinear extensions, namely kernel CMSM (KCMSM) and kernel OMSM (KOMSM). Although the principles of feature extraction in these methods are different, their effectiveness is commonly derived from the orthogonalization of subspaces, which is widely used to measure the performance of subspace-based methods. CMSM makes the relation between class subspaces similar to an orthogonal relation by projecting the class subspaces onto the generalized difference subspaces. KCMSM is also based on this projection in the nonlinear feature space. On the other hand, OMSM orthogonalizes class subspaces directly by whitening the distribution of the class subspaces. KOMSM also utilizes this orthogonalization method in the nonlinear feature space. The experimental results show that both kernel methods (KCMSM and KOMSM) perform much better than their linear counterparts (CMSM and OMSM), and that their performance levels are of the same order in spite of their different principles of orthogonalization.
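The whitening-based orthogonalization attributed to OMSM above can be sketched in the linear case as follows. This is an illustrative sketch, not the authors' implementation: the function names, the tolerance, and the random test subspaces are assumptions. It forms the sum of the class projection matrices, whitens with its nonzero eigenvalues, and checks that the transformed class subspaces become mutually orthogonal when the original subspaces are linearly independent.

```python
import numpy as np

def canonical_cosines(u1, u2):
    """Cosines of the canonical angles between two orthonormal bases."""
    s = np.linalg.svd(u1.T @ u2, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

def whiten_class_subspaces(bases, tol=1e-10):
    """Whiten the sum of projection matrices and re-orthonormalize each class basis."""
    g = sum(u @ u.T for u in bases)            # G = sum of class projection matrices
    lam, b = np.linalg.eigh(g)                 # eigenvalues in ascending order
    keep = lam > tol                           # drop the null space of G
    o = (b[:, keep] / np.sqrt(lam[keep])).T    # whitening matrix Lambda^(-1/2) B^T
    return [np.linalg.qr(o @ u)[0] for u in bases]

rng = np.random.default_rng(0)
# Two hypothetical 2-D class subspaces in a 10-D feature space.
bases = [np.linalg.qr(rng.standard_normal((10, 2)))[0] for _ in range(2)]

before = float(canonical_cosines(bases[0], bases[1]).max())
white = whiten_class_subspaces(bases)
after = float(canonical_cosines(white[0], white[1]).max())
```

Here `before` is the largest canonical cosine between the two classes in the original space and `after` is the same quantity after whitening, which should be numerically zero.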
Controllable Multi-domain Semantic Artwork Synthesis
We present a novel framework for multi-domain synthesis of artwork from
semantic layouts. One of the main limitations of this challenging task is the
lack of publicly available segmentation datasets for art synthesis. To address
this problem, we propose a dataset, which we call ArtSem, that contains 40,000
images of artwork from 4 different domains with their corresponding semantic
label maps. We generate the dataset by first extracting semantic maps from
landscape photography and then propose a conditional Generative Adversarial
Network (GAN)-based approach to generate high-quality artwork from the semantic
maps without necessitating paired training data. Furthermore, we propose an
artwork synthesis model that uses domain-dependent variational encoders for
high-quality multi-domain synthesis. The model is improved and complemented
with a simple but effective normalization method, based on normalizing both the
semantic and style jointly, which we call Spatially STyle-Adaptive
Normalization (SSTAN). In contrast to previous methods that only take semantic
layout as input, our model is able to learn a joint representation of both
style and semantic information, which leads to better generation quality for
synthesizing artistic images. Results indicate that our model learns to
separate the domains in the latent space, and thus, by identifying the
hyperplanes that separate the different domains, we can also perform
fine-grained control of the synthesized artwork. By combining our proposed
dataset and approach, we are able to generate user-controllable artwork that is
of higher quality than existing… Comment: 15 pages, accepted by CVMJ, to appea
Adaptive occlusion sensitivity analysis for visually explaining video recognition networks
This paper proposes a method for visually explaining the decision-making
process of video recognition networks with a temporal extension of occlusion
sensitivity analysis, called Adaptive Occlusion Sensitivity Analysis (AOSA).
The key idea here is to occlude a specific volume of data by a 3D mask in an
input 3D temporal-spatial data space and then measure the degree of change in the
output score. The occluded volume data that produces a larger change is
regarded as a more critical element for classification. However, while the
occlusion sensitivity analysis is commonly used to analyze single image
classification, applying this idea to video classification is not so
straightforward as a simple fixed cuboid cannot deal with complicated motions.
To solve this issue, we adaptively set the shape of a 3D occlusion mask while
referring to motions. Our flexible mask adaptation is performed by considering
the temporal continuity and spatial co-occurrence of the optical flows
extracted from the input video data. We further propose a novel method to
reduce the computational cost of the proposed method with the first-order
approximation of the output score with respect to an input video. We
demonstrate the effectiveness of our method through various and extensive
comparisons with the conventional methods in terms of the deletion/insertion
metric and the pointing metric on the UCF101 dataset and the Kinetics-400 and
700 datasets. Comment: 11 page
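The idea of replacing a full forward pass per occlusion with a first-order approximation of the output score can be sketched as follows. This is a toy sketch under stated assumptions, not AOSA itself: the "network" is a hypothetical tanh of a weighted sum with an analytic gradient, the mask is a fixed cuboid rather than a flow-adapted one, and all names are illustrative. With a real network the gradient would come from automatic differentiation.

```python
import numpy as np

def score(video, weights):
    """Hypothetical stand-in for a video classifier's output score."""
    return float(np.tanh(np.sum(weights * video)))

def score_gradient(video, weights):
    """Analytic gradient of the toy score with respect to the input video."""
    s = np.tanh(np.sum(weights * video))
    return (1.0 - s ** 2) * weights

def exact_change(video, mask, weights, fill=0.0):
    """Score change obtained by actually occluding the masked volume."""
    occluded = np.where(mask, fill, video)
    return score(occluded, weights) - score(video, weights)

def first_order_change(video, mask, weights, fill=0.0):
    """First-order Taylor estimate: gradient dotted with the input perturbation."""
    grad = score_gradient(video, weights)
    delta = np.where(mask, fill - video, 0.0)
    return float(np.sum(grad * delta))

video = np.ones((4, 4, 4))                  # (time, height, width)
weights = np.full(video.shape, 0.01)
mask = np.zeros(video.shape, dtype=bool)
mask[0:2, 0:2, 0:2] = True                  # a 2x2x2 spatio-temporal cuboid

exact = exact_change(video, mask, weights)
approx = first_order_change(video, mask, weights)
```

One gradient evaluation is reused for every mask, so the estimate costs a single backward pass instead of one forward pass per occluded volume, at the price of an approximation error that grows with the size of the perturbation.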
Discriminant feature extraction by generalized difference subspace
This paper reveals the discriminant ability of the orthogonal projection of data onto a generalized difference subspace (GDS) both theoretically and experimentally. In our previous work, we demonstrated that GDS projection works as a quasi-orthogonalization of class subspaces. Interestingly, GDS projection also works as a discriminant feature extraction through a mechanism similar to Fisher discriminant analysis (FDA). A direct proof of the connection between GDS projection and FDA is difficult due to the significant difference in their formulations. To avoid this difficulty, we first introduce geometrical Fisher discriminant analysis (gFDA) based on a simplified Fisher criterion. gFDA can work stably even with few samples, bypassing the small sample size (SSS) problem of FDA. Next, we prove that gFDA is equivalent to GDS projection with a small correction term. This equivalence ensures that GDS projection inherits the discriminant ability of FDA via gFDA. Furthermore, we discuss two useful extensions of these methods: 1) a nonlinear extension by the kernel trick, and 2) the combination with convolutional neural network (CNN) features. The equivalence and the effectiveness of the extensions have been verified through extensive experiments on the extended Yale B+, CMU face database, ALOI, ETH80, MNIST and CIFAR10, focusing on the SSS problem.
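The quasi-orthogonalization effect of GDS projection can be sketched with a small example. This is an illustrative sketch under assumptions, not the paper's procedure: the function names, the number of removed principal directions, and the three toy class vectors (which share a strong common component) are chosen only for demonstration. The GDS is taken here as the span of the eigenvectors of the sum of class projection matrices that remain after discarding those with the largest eigenvalues.

```python
import numpy as np

def generalized_difference_subspace(bases, n_remove, tol=1e-10):
    """Basis of the sum space with the `n_remove` most dominant directions discarded."""
    g = sum(u @ u.T for u in bases)       # sum of class projection matrices
    lam, vecs = np.linalg.eigh(g)         # eigenvalues in ascending order
    vecs = vecs[:, lam > tol]             # keep the sum space only
    return vecs[:, : vecs.shape[1] - n_remove]

def mean_abs_cosine(vectors):
    """Average absolute cosine over all pairs of vectors."""
    unit = [v / np.linalg.norm(v) for v in vectors]
    pairs = [(i, j) for i in range(len(unit)) for j in range(i + 1, len(unit))]
    return float(np.mean([abs(unit[i] @ unit[j]) for i, j in pairs]))

# Three hypothetical 1-D class subspaces in 3-D sharing a common component.
angles = np.deg2rad([0.0, 120.0, 240.0])
raw = [np.array([1.0, 0.3 * np.cos(a), 0.3 * np.sin(a)]) for a in angles]
bases = [(v / np.linalg.norm(v)).reshape(-1, 1) for v in raw]

gds = generalized_difference_subspace(bases, n_remove=1)
before = mean_abs_cosine([u[:, 0] for u in bases])
after = mean_abs_cosine([gds.T @ u[:, 0] for u in bases])
```

Removing the shared direction leaves only the components that distinguish the classes, so the projected class vectors are less aligned with each other than the originals.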