6 research outputs found
Multimodal Subspace Support Vector Data Description
In this paper, we propose a novel method for projecting data from multiple
modalities to a new subspace optimized for one-class classification. The
proposed method iteratively transforms the data from the original feature space
of each modality to a new common feature space along with finding a joint
compact description of data coming from all the modalities. For data in each
modality, we define a separate transformation to map the data from the
corresponding feature space to the new optimized subspace by exploiting the
available information from the class of interest only. We also propose
different regularization strategies for the proposed method and provide both
linear and non-linear formulations. The proposed Multimodal Subspace Support
Vector Data Description outperforms all the competing methods using data from a
single modality or fusing data from all modalities in four out of five
datasets.
Comment: 26 pages manuscript (6 tables, 2 figures), 24 pages supplementary
material (27 tables, 10 figures). The manuscript and supplementary material
are combined as a single .pdf (50 pages) file
Multimodal subspace support vector data description
Highlights
• A novel method for transforming the multimodal data into a common feature space is proposed.
• The shared subspace optimized for one-class classification yields better results than traditional concatenation of multimodal data for one-class classification.
• Different regularization strategies, along with linear and non-linear formulations, provide more freedom of choice for optimizing a model according to a specific evaluation metric.
In this paper, we propose a novel method for projecting data from multiple modalities to a new subspace optimized for one-class classification. The proposed method iteratively transforms the data from the original feature space of each modality to a new common feature space along with finding a joint compact description of data coming from all the modalities. For data in each modality, we define a separate transformation to map the data from the corresponding feature space to the new optimized subspace by exploiting the available information from the class of interest only. We also propose different regularization strategies for the proposed method and provide both linear and non-linear formulations. The proposed Multimodal Subspace Support Vector Data Description outperforms all the competing methods using data from a single modality or fusing data from all modalities in four out of five datasets.
Convolutional autoencoder-based multimodal one-class classification
One-class classification refers to approaches of learning using data from a
single class only. In this paper, we propose a deep learning one-class
classification method suitable for multimodal data, which relies on two
convolutional autoencoders jointly trained to reconstruct the positive input
data while obtaining the data representations in the latent space as compact as
possible. During inference, the distance of the latent representation of an
input to the origin can be used as an anomaly score. Experimental results using
a multimodal macroinvertebrate image classification dataset show that the
proposed multimodal method yields better results as compared to the unimodal
approach. Furthermore, we study the effect of different input image sizes, and we
investigate how recently proposed feature diversity regularizers affect the
performance of our approach. We show that such regularizers improve
performance.
Comment: 5 pages, 1 figure, 4 tables
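The scoring rule mentioned in this abstract, the distance of a latent representation to the origin, can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the latent vectors are made-up numbers standing in for the outputs of the two trained encoders, and both the way the two modalities are combined (concatenation) and the threshold are assumptions made for this example.

```python
import math


def anomaly_score(latents):
    """Euclidean distance to the origin of the concatenated latent codes.

    `latents` is a list of per-modality latent vectors (lists of floats).
    Concatenating the modalities is an assumption made for this sketch.
    """
    return math.sqrt(sum(v * v for z in latents for v in z))


def is_anomalous(latents, threshold):
    # A larger distance from the origin means the sample looks less like
    # the positive (training) class.
    return anomaly_score(latents) > threshold


# Hypothetical latent codes for one sample seen through two modalities.
normal_sample = [[0.1, -0.2], [0.0, 0.2]]
odd_sample = [[3.0, 0.0], [0.0, 4.0]]
```

In practice the threshold would be chosen on held-out positive data; the value used to separate the two samples above is arbitrary.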
Boosting rare benthic macroinvertebrates taxa identification with one-class classification
Insect monitoring is crucial for understanding the consequences of rapid
ecological changes, but taxa identification currently requires tedious manual
expert work and cannot be scaled up efficiently. Deep convolutional neural
networks (CNNs) provide a viable way to significantly increase the
biomonitoring volumes. However, taxa abundances are typically very imbalanced
and the amounts of training images for the rarest classes are simply too low
for deep CNNs. As a result, the samples from the rare classes are often
completely missed, while detecting them has biological importance. In this
paper, we propose combining the trained deep CNN with one-class classifiers to
improve the rare species identification. One-class classification models are
traditionally trained with far fewer samples, and they can provide a mechanism
to indicate samples potentially belonging to the rare classes for human
inspection. Our experiments confirm that the proposed approach may indeed
support moving towards partial automation of the taxa identification task.
Comment: 5 pages, 1 figure, 2 tables
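The idea of using one-class models to point out possible rare-class samples for human inspection can be sketched as follows. This is only an illustration under stated assumptions: `CentroidOneClass` is a toy stand-in (a mean and a radius) rather than any of the one-class classifiers evaluated in the paper, and the feature vectors, taxon names, and the rule "flag when a rare-class model accepts a sample the CNN labelled otherwise" are hypothetical.

```python
import math


class CentroidOneClass:
    """Toy one-class model: accept points within a radius of the training mean."""

    def fit(self, samples):
        dim = len(samples[0])
        self.center = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
        # Radius is the largest training distance, so every training point is accepted.
        self.radius = max(self._dist(s) for s in samples)
        return self

    def _dist(self, x):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, self.center)))

    def accepts(self, x):
        return self._dist(x) <= self.radius


def flag_for_inspection(features, cnn_label, rare_models):
    """Return the rare taxa whose one-class model accepts a sample the CNN labelled otherwise."""
    return [
        taxon
        for taxon, model in rare_models.items()
        if taxon != cnn_label and model.accepts(features)
    ]


# Hypothetical feature vectors for one rare taxon.
rare_models = {
    "rare_taxon_A": CentroidOneClass().fit([[1.0, 1.0], [1.2, 0.8], [0.8, 1.2]]),
}
```

A sample flagged this way is not relabelled automatically; it is only queued for an expert to check.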
Subspace Support Vector Data Description and Extensions
Machine learning deals with discovering the knowledge that governs the learning process. The science of machine learning helps create techniques that enhance the capabilities of a system through the use of data. Typical machine learning techniques identify or predict different patterns in the data. In classification tasks, a machine learning model is trained using some training data to identify the unknown function that maps the input data to the output labels. The classification task gets challenging if the data from some categories are either unavailable or so diverse that they cannot be modelled statistically. For example, to train a model for anomaly detection, it is usually challenging to collect anomalous data for training, but the normal data is available in abundance. In such cases, it is possible to use One-Class Classification (OCC) techniques where the model is trained by using data only from one class.
OCC algorithms are practical in situations where it is vital to identify one of the categories, but the examples from that specific category are scarce. Numerous OCC techniques have been proposed in the literature that model the data in the given feature space; however, such data can be high-dimensional or may not provide discriminative information for classification. In order to avoid the curse of dimensionality, standard dimensionality reduction techniques are commonly used as a preprocessing step in many machine learning algorithms. Principal Component Analysis (PCA) is an example of a widely used algorithm to transform data into a subspace suitable for the task at hand while maintaining the meaningful features of a given dataset.
This thesis provides a new paradigm that jointly optimizes a subspace and data description for one-class classification via Support Vector Data Description (SVDD). We initiated the idea of subspace learning for one-class classification by proposing a novel Subspace Support Vector Data Description (SSVDD) method, which was further extended to Ellipsoidal Subspace Support Vector Data Description (ESSVDD). ESSVDD generalizes the hyperspherical description of SSVDD to an ellipsoidal data description, and it converges faster than SSVDD. It is important to train a joint model for multimodal data when data is collected from multiple sources. Therefore, we also proposed a multimodal approach, namely Multimodal Subspace Support Vector Data Description (MSSVDD), for transforming the data from multiple modalities to a common shared space for OCC. An important contribution of this thesis is to provide a framework unifying the subspace learning methods for SVDD. The proposed Graph-Embedded Subspace Support Vector Data Description (GESSVDD) framework helps reveal novel insights into the previously proposed methods and allows deriving novel variants that incorporate different optimization goals.
The main focus of the thesis is on generic novel methods which can be adapted to different application domains. We experimented with standard datasets from different domains such as robotics, healthcare, and economics and achieved better performance than competing methods in most cases. We also proposed a taxa identification framework for rare benthic macroinvertebrates. Benthic macroinvertebrate taxa distribution is typically very imbalanced. The amounts of training images for the rarest classes are too low for properly training deep learning-based methods, while these rarest classes can be central in biodiversity monitoring. We show that the classic one-class classifiers in general, and the proposed methods in particular, can enhance a deep neural network classification performance for imbalanced datasets.
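For orientation, the joint optimization of a subspace and a data description mentioned in this abstract can be written as the usual soft-margin SVDD problem applied to linearly projected data. The formula below is a sketch under assumptions: the abstract gives no notation, so the symbols (projection matrix $\mathbf{Q}$, center $\mathbf{a}$, radius $R$, slack variables $\xi_i$, trade-off parameter $C$, and $N$ training samples $\mathbf{x}_i$) are chosen here for illustration, and any regularization terms used in the thesis are omitted.

```latex
\min_{R,\,\mathbf{a},\,\mathbf{Q},\,\boldsymbol{\xi}} \;
  R^{2} + C \sum_{i=1}^{N} \xi_{i}
\quad \text{s.t.} \quad
  \lVert \mathbf{Q}\mathbf{x}_{i} - \mathbf{a} \rVert_{2}^{2} \le R^{2} + \xi_{i},
  \qquad \xi_{i} \ge 0, \qquad i = 1, \dots, N
```

With $\mathbf{Q}$ fixed to the identity this reduces to standard SVDD in the original feature space; learning $\mathbf{Q}$ together with $R$ and $\mathbf{a}$ is what makes the description subspace-based.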
Credit Card Fraud Detection using One-Class Classification Algorithms
Similar to most things in everyday life, the advent of payment cards has both good and bad sides. It undoubtedly made life easier by bringing the whole payment system to a single card, but it also paved the way for a new set of illegal activities and frauds. Credit card fraud has been carried out since payment cards came into existence, and the trend in such frauds has been increasing ever since. Therefore, a quest to attenuate the losses caused by such frauds began. For this purpose, many preventive and detective measures have been taken in the past, and new ways are sought to further improve the policies. These measures, however, reduce the losses only temporarily and have not yet succeeded in converting the uptrend in losses into a downtrend, because fraudsters always come up with new ways of tricking people and the system. Thus, a new way of solving this ever-existing challenge is needed, one that can detect even those fraudulent instances that are executed with techniques and methods yet to be invented by fraudsters. Moreover, normal (non-fraudulent) credit card transactions occur far more often than fraudulent ones, and therefore the data for credit card fraud detection is highly imbalanced. Another challenge in credit card fraud detection systems is the high dimensionality of datasets. Therefore, to address the imbalanced nature of the data, to cope with the curse of dimensionality by letting the model regulate and extract the discriminative features, and to detect fraud carried out by yet-to-be-invented techniques, we implemented a set of novel and state-of-the-art subspace learning-based One-Class Classification algorithms. We experimented with integrating a projection matrix and geometric data information in the training phase to improve credit card fraud detection.
We also experimented with a maximization-update rule for the projection matrix in place of the classical minimization-update rule used in subspace learning-based data description. We found that the linear version of Graph-Embedded Subspace Support Vector Data Description with a kNN graph, gradient-based solution, and minimization-update rule works better than all other models.
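The difference between the two update rules can be illustrated with a single gradient step on a projection matrix. This is a sketch under assumptions, not the thesis code: the abstract does not give the update equations, so the sketch assumes the two rules differ only in the sign of the gradient step, and the function name `update_projection`, the learning rate, and the gradient values are hypothetical. In a real run the gradient would come from the data-description objective.

```python
def update_projection(Q, grad, lr, rule="min"):
    """Apply one gradient step to the projection matrix Q (a list of rows).

    rule="min": Q - lr * grad  (step against the gradient, minimization-update)
    rule="max": Q + lr * grad  (step along the gradient, maximization-update)
    """
    if rule not in ("min", "max"):
        raise ValueError("rule must be 'min' or 'max'")
    sign = -1.0 if rule == "min" else 1.0
    return [
        [q + sign * lr * g for q, g in zip(q_row, g_row)]
        for q_row, g_row in zip(Q, grad)
    ]


# Hypothetical 1x2 projection matrix and gradient.
Q = [[1.0, 0.0]]
grad = [[0.5, -0.5]]
Q_min = update_projection(Q, grad, lr=0.1, rule="min")
Q_max = update_projection(Q, grad, lr=0.1, rule="max")
```

The two calls move `Q` in opposite directions along the same gradient, which is the only difference between the rules in this sketch.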