218 research outputs found
A Generally Semisupervised Dimensionality Reduction Method with Local and Global Regression Regularizations for Recognition
The insufficiency of labeled data is an important problem in image classification tasks such as face recognition. However, unlabeled data are abundant in real-world applications. Therefore, semisupervised learning methods, which incorporate a few labeled data and a large number of unlabeled data into learning, have received increasing attention in the field of face recognition. In recent years, graph-based semisupervised learning has become a popular topic in this area. In this chapter, we present a new graph-based semisupervised learning method for face recognition. The presented method is based on local and global regression regularizations. The local regression regularization adopts a set of local classification functions to preserve both local discriminative and geometrical information, as well as to reduce the bias of outliers and handle imbalanced data, while the global regression regularization preserves the global discriminative information and calculates the projection matrix for out-of-sample extrapolation. Extensive simulations on synthetic and real-world datasets verify the effectiveness of the proposed method.
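The global regression step described above can be illustrated in isolation: a linear projection is fitted by ridge regression so that unseen samples can be mapped into the learned low-dimensional space. This is only a minimal sketch of the out-of-sample idea; the embedding targets, the local regression term, and the regularizer value `lam` are illustrative assumptions, not the chapter's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Global regression step only: fit a linear projection W mapping
# features X to (here, given) low-dimensional embeddings Y by ridge
# regression, so new samples can be projected without re-solving the
# graph problem ("out-of-sample extrapolation").
X = rng.normal(size=(100, 20))   # training features
Y = rng.normal(size=(100, 3))    # stand-in low-dimensional embedding
lam = 0.1                        # illustrative ridge regularizer
W = np.linalg.solve(X.T @ X + lam * np.eye(20), X.T @ Y)

x_new = rng.normal(size=20)
z = x_new @ W                    # embed an out-of-sample point
```

Because `W` is an explicit matrix, projecting a new face image is a single matrix-vector product, which is the practical payoff of the global term.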
Semisupervised Autoencoder for Sentiment Analysis
In this paper, we investigate the usage of autoencoders in modeling textual
data. Traditional autoencoders suffer from at least two problems: scalability
with the high dimensionality of the vocabulary, and dealing with
task-irrelevant words. We address these problems by introducing supervision via
the loss function of autoencoders. In particular, we first train a linear
classifier on the labeled data, then define a loss for the autoencoder with the
weights learned from the linear classifier. To reduce the bias brought by one
single classifier, we define a posterior probability distribution on the
weights of the classifier, and derive the marginalized loss of the autoencoder
with Laplace approximation. We show that our choice of loss function can be
rationalized from the perspective of Bregman Divergence, which justifies the
soundness of our model. We evaluate the effectiveness of our model on six
sentiment analysis datasets, and show that our model significantly outperforms
all the competing methods with respect to classification accuracy. We also show
that our model is able to take advantage of unlabeled data and achieves
improved performance. We further show that our model successfully learns
highly discriminative feature maps, which explains its superior performance. Comment: To appear in AAAI 201
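The two-step recipe (fit a linear classifier, then let its weights shape the autoencoder loss) can be sketched as follows. This is one simple reading of the idea, with reconstruction error weighted per word by the classifier weight magnitude so that task-relevant words dominate; the paper instead derives a marginalized Bregman-divergence loss, and all sizes and learning rates here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy bag-of-words data: n documents over a vocabulary of d words.
n, d, h = 200, 50, 16
X = rng.poisson(0.3, size=(n, d)).astype(float)
y = (X[:, :5].sum(axis=1) > X[:, 5:10].sum(axis=1)).astype(float)

# Step 1: fit a linear classifier on the labeled data (plain logistic
# regression via gradient descent; any linear classifier would do).
w = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y) / n

# Step 2: train a one-hidden-layer autoencoder whose reconstruction
# error is weighted per word by the classifier weight magnitude
# (hypothetical weighting standing in for the paper's marginalized loss).
imp = np.abs(w) / np.abs(w).sum()      # per-word importance
W1 = rng.normal(0, 0.1, (d, h))
W2 = rng.normal(0, 0.1, (h, d))
for _ in range(300):
    H = np.tanh(X @ W1)
    G = (H @ W2 - X) * imp             # importance-weighted error
    W2 -= 0.1 * H.T @ G / n
    W1 -= 0.1 * X.T @ ((G @ W2.T) * (1 - H ** 2)) / n

loss = float((((np.tanh(X @ W1) @ W2) - X) ** 2 * imp).sum() / n)
```

Step 2 can also be run on unlabeled documents, since only the fixed weights from Step 1, not labels, enter the loss.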
Iterative Nearest Neighborhood Oversampling in Semisupervised Learning from Imbalanced Data
Transductive graph-based semi-supervised learning methods usually build an
undirected graph utilizing both labeled and unlabeled samples as vertices.
Those methods propagate label information of labeled samples to neighbors
through their edges in order to get the predicted labels of unlabeled samples.
Most popular semi-supervised learning approaches are sensitive to the initial
label distribution in imbalanced labeled datasets. The class boundary will
be severely skewed by the majority classes in an imbalanced classification. In
this paper, we propose a simple and effective approach to alleviate the
unfavorable influence of the imbalance problem by iteratively selecting a few
unlabeled samples and adding them to the minority classes to form a balanced
labeled dataset for subsequent learning. Experiments on UCI datasets and the
MNIST handwritten digits dataset show that the proposed approach outperforms
other existing state-of-the-art methods.
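The iterative selection loop can be sketched directly: repeatedly pseudo-label the unlabeled point nearest to the current minority set until the class counts balance. The exact selection rule used in the paper may differ; picking the single nearest neighbor per iteration is an assumption made for illustration.

```python
import numpy as np

def iterative_nn_oversample(X, labeled, init_labels, minority, target):
    """Grow the minority class to `target` labeled samples by repeatedly
    assigning the minority label to the unlabeled point nearest to the
    current minority set (illustrative reading of the approach)."""
    labels = dict(zip(labeled, init_labels))
    unlabeled = [i for i in range(len(X)) if i not in labels]
    while sum(1 for v in labels.values() if v == minority) < target:
        minority_pts = X[[i for i, v in labels.items() if v == minority]]
        # squared distance from each unlabeled point to its nearest minority point
        d = ((X[unlabeled][:, None, :] - minority_pts[None, :, :]) ** 2).sum(-1).min(1)
        pick = unlabeled.pop(int(np.argmin(d)))
        labels[pick] = minority        # pseudo-label and add to the labeled set
    return labels

# Two Gaussian blobs; class 1 starts with a single labeled sample.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels = iterative_nn_oversample(X, [0, 1, 2, 3, 50], [0, 0, 0, 0, 1], 1, 4)
```

After the loop, both classes have four labeled samples, so a downstream graph-based propagator starts from a balanced seed set instead of a skewed one.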
Deep domain adaptation by weighted entropy minimization for the classification of aerial images
Fully convolutional neural networks (FCN) are successfully used for the automated pixel-wise classification of aerial images and possibly additional data. However, they require many labelled training samples to perform well. One approach addressing this issue is semi-supervised domain adaptation (SSDA). Here, labelled training samples from a source domain and unlabelled samples from a target domain are used jointly to obtain a target domain classifier, without requiring any labelled samples from the target domain. In this paper, a two-step approach for SSDA is proposed. The first step corresponds to a supervised training on the source domain, making use of strong data augmentation to increase the initial performance on the target domain. Secondly, the model is adapted by entropy minimization using a novel weighting strategy. The approach is evaluated on the basis of five domains, corresponding to five cities. Several training variants and adaptation scenarios are tested, indicating that proper data augmentation can already improve the initial target domain performance significantly, resulting in an average overall accuracy of 77.5%. The weighted entropy minimization improves the overall accuracy on the target domains in 19 out of 20 scenarios, on average by 1.8%. In all experiments a novel FCN architecture is used that yields results comparable to those of the best-performing models on the ISPRS labelling challenge while having an order of magnitude fewer parameters than commonly used FCNs. © 2020 Copernicus GmbH. All rights reserved.
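The adaptation objective in the second step can be sketched as a weighted entropy loss over the network's softmax outputs on unlabelled target pixels. The paper's novel weighting strategy is not reproduced here; uniform weights are a stand-in, and minimizing this quantity is what pushes the target-domain decision boundary toward low-density regions.

```python
import numpy as np

def weighted_entropy(probs, weights):
    """Mean weighted prediction entropy over unlabelled target samples.
    probs: (N, C) softmax outputs; weights: (N,) per-sample weights
    (uniform here; the paper uses a dedicated weighting strategy)."""
    eps = 1e-12                                  # numerical guard for log(0)
    ent = -(probs * np.log(probs + eps)).sum(axis=1)
    return float((weights * ent).sum() / weights.sum())

# Confident predictions carry low entropy, uncertain ones high entropy.
p = np.array([[0.90, 0.05, 0.05],
              [0.34, 0.33, 0.33]])
w = np.ones(len(p))
loss = weighted_entropy(p, w)
```

A per-sample weight lets the loss down-weight pixels whose predictions should not be sharpened, which is the role the paper's weighting strategy plays.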
cGAN-Based High Dimensional IMU Sensor Data Generation for Therapeutic Activities
Human activity recognition is a core technology for applications such as
rehabilitation, ambient health monitoring, and human-computer interactions.
Wearable devices, particularly IMU sensors, can help us collect rich features
of human movements that can be leveraged in activity recognition. Developing a
robust classifier for activity recognition has always been of interest to
researchers. One major problem is that there is usually a deficit of training
data for some activities, making it difficult and sometimes impossible to
develop a classifier. In this work, a novel GAN network called TheraGAN was
developed to generate realistic IMU signals associated with a particular
activity. The generated signals correspond to a 6-channel IMU, i.e., angular
velocities and linear accelerations. Also, by introducing simple activities, which are
meaningful subparts of a complex full-length activity, the generation process
was facilitated for any activity with arbitrary length. To evaluate the
generated signals, besides perceptual similarity metrics, they were applied
along with real data to improve the accuracy of classifiers. The results show
that the maximum increase in the F1-score belongs to the LSTM classifier, with a
13.27% rise when generated data were added. This demonstrates the validity of the
generated data, as well as TheraGAN as a tool to build more robust classifiers
in the case of imbalanced data.
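The conditional-generation interface can be sketched as a forward pass that maps noise plus a one-hot activity label to a multichannel sequence. The weights below are untrained and random, and all layer sizes are illustrative, not TheraGAN's; the sketch only shows the input/output contract of a cGAN generator for 6-channel IMU signals.

```python
import numpy as np

rng = np.random.default_rng(2)

def generate_imu(label, n_labels, seq_len=64, latent_dim=8, channels=6):
    """Toy conditional-generator forward pass: noise + one-hot activity
    label -> (seq_len, 6) IMU-like signal (3 angular-velocity and
    3 linear-acceleration channels). Random untrained weights."""
    z = rng.normal(size=latent_dim)              # latent noise vector
    cond = np.zeros(n_labels)
    cond[label] = 1.0                            # activity condition
    x = np.concatenate([z, cond])
    W1 = rng.normal(0, 0.1, (x.size, 32))
    W2 = rng.normal(0, 0.1, (32, seq_len * channels))
    h = np.tanh(x @ W1)
    return (h @ W2).reshape(seq_len, channels)

sig = generate_imu(label=2, n_labels=5)
```

Because the activity label is an input, fixed-length segments for different simple activities can be generated independently and concatenated, which is how arbitrary-length composite activities become reachable.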
A comprehensive review of 3D convolutional neural network-based classification techniques of diseased and defective crops using non-UAV-based hyperspectral images
Hyperspectral imaging (HSI) is a non-destructive and contactless technology
that provides valuable information about the structure and composition of an
object. It can capture detailed information about the chemical and physical
properties of agricultural crops. Due to its wide spectral range, compared with
multispectral- or RGB-based imaging methods, HSI can be a more effective tool
for monitoring crop health and productivity. With the advent of this imaging
tool in agrotechnology, researchers can more accurately address issues related
to the detection of diseased and defective crops in the agriculture industry.
This enables the implementation of the most suitable and accurate farming solutions, such
as irrigation and fertilization, before crops enter a damaged and
difficult-to-recover phase of growth in the field. While HSI provides valuable
insights into the object under investigation, the limited number of HSI
datasets for crop evaluation presently poses a bottleneck. Dealing with the
curse of dimensionality presents another challenge due to the abundance of
spectral and spatial information in each hyperspectral cube. State-of-the-art
methods based on 1D- and 2D-CNNs struggle to efficiently extract spectral and
spatial information. On the other hand, 3D-CNN-based models have shown
significant promise in achieving better classification and detection results by
leveraging spectral and spatial features simultaneously. Despite the apparent
benefits of 3D-CNN-based models, their usage for classification purposes in
this area of research has remained limited. This paper seeks to address this
gap by reviewing 3D-CNN-based architectures and the typical deep learning
pipeline, including preprocessing and visualization of results, for the
classification of hyperspectral images of diseased and defective crops.
Furthermore, we discuss open research areas and challenges when utilizing
3D-CNNs with HSI data.
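The core operation that lets 3D-CNNs exploit spectral and spatial features simultaneously is a 3D convolution over the hyperspectral cube, sliding a kernel along the band axis as well as the two spatial axes. The naive loop version below is for clarity only (real pipelines use a deep-learning framework's optimized 3D convolution), and the cube and kernel sizes are illustrative.

```python
import numpy as np

def conv3d(cube, kernel):
    """Valid-mode 3D cross-correlation (the CNN 'convolution') of a
    hyperspectral cube (bands, height, width) with a single kernel,
    mixing a spectral and spatial neighborhood in one step."""
    B, H, W = cube.shape
    kb, kh, kw = kernel.shape
    out = np.zeros((B - kb + 1, H - kh + 1, W - kw + 1))
    for b in range(out.shape[0]):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[b, i, j] = (cube[b:b + kb, i:i + kh, j:j + kw] * kernel).sum()
    return out

cube = np.ones((10, 8, 8))        # 10 spectral bands, 8x8 pixels
k = np.ones((3, 3, 3)) / 27.0     # averaging kernel over a 3x3x3 window
feat = conv3d(cube, k)            # output shape (8, 6, 6)
```

In contrast, a 2D-CNN would treat the band axis as fixed input channels, so only 3D kernels learn local spectral patterns, which is the advantage the review attributes to 3D-CNN-based models.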