Classification Confidence Estimation with Test-Time Data-Augmentation
Machine learning plays an increasingly significant role in many aspects of
our lives (including medicine, transportation, security, justice and other
domains), making the potential consequences of false predictions increasingly
devastating. These consequences may be mitigated if we can automatically flag
such false predictions and potentially assign them to alternative, more
reliable mechanisms that are possibly more costly and involve human attention.
This suggests the task of detecting errors, which we tackle in this paper for
the case of visual classification. To this end, we propose a novel approach for
classification confidence estimation. We apply a set of semantics-preserving
image transformations to the input image, and show how the resulting image sets
can be used to estimate confidence in the classifier's prediction. We
demonstrate the potential of our approach by extensively evaluating it on a
wide variety of classifier architectures and datasets, including
ResNext/ImageNet, achieving state-of-the-art performance. This paper
constitutes a significant revision of our earlier work in this direction (Bahat
& Shakhnarovich, 2018).
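A minimal sketch of the test-time-augmentation idea above: confidence is taken as the agreement rate of predictions over semantics-preserving transformed copies of the input. The `classify` and transform callables are illustrative stand-ins, not the paper's models or transformation set.

```python
def tta_confidence(classify, image, transforms):
    """Estimate confidence as the agreement rate of predictions across
    semantics-preserving transforms of the input (test-time augmentation)."""
    preds = [classify(image)] + [classify(t(image)) for t in transforms]
    top = max(set(preds), key=preds.count)  # majority-vote label
    return top, preds.count(top) / len(preds)
```

A prediction whose label flips under small transforms receives a low score and can be flagged for a more reliable (e.g., human) fallback.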
Data-Efficient Mutual Information Neural Estimator
Measuring Mutual Information (MI) between high-dimensional, continuous,
random variables from observed samples has wide theoretical and practical
applications. Recent work, MINE (Belghazi et al. 2018), focused on estimating
tight variational lower bounds of MI using neural networks, but assumed
unlimited supply of samples to prevent overfitting. In real-world applications,
data is not always available in surplus. In this work, we focus on improving
data efficiency and propose a Data-Efficient MINE Estimator (DEMINE) by
developing a relaxed predictive MI lower bound that can be estimated with
orders-of-magnitude higher data efficiency. The predictive MI lower bound also
enables us to develop a new meta-learning approach using task augmentation,
Meta-DEMINE, to improve generalization of the network and further boost
estimation accuracy empirically. With improved data efficiency, our estimators
enable statistical testing of dependency at practical dataset sizes. We
demonstrate the effectiveness of our estimators on synthetic benchmarks and
real-world fMRI data, with an application to inter-subject correlation analysis.
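The variational bound that MINE-style estimators build on can be written down directly. The following is a generic sketch of the Donsker-Varadhan lower bound, not the paper's relaxed predictive bound; the critic `T` is a hypothetical callable standing in for the neural network.

```python
import math

def dv_lower_bound(T, joint_samples, marginal_samples):
    """Donsker-Varadhan lower bound on mutual information, the quantity
    MINE-style estimators maximize over a critic T:
        I(X;Y) >= E_p(x,y)[T(x,y)] - log E_p(x)p(y)[exp(T(x,y))]
    joint_samples come from p(x,y); marginal_samples from p(x)p(y)."""
    e_joint = sum(T(x, y) for x, y in joint_samples) / len(joint_samples)
    e_marg = sum(math.exp(T(x, y)) for x, y in marginal_samples) / len(marginal_samples)
    return e_joint - math.log(e_marg)
```

With finite samples, maximizing this bound over an expressive critic overfits, which is the failure mode that motivates the relaxed bound and meta-learning in the abstract.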
Machine Learning for recognition of minerals from multispectral data
Machine Learning (ML) has found several applications in spectroscopy,
including being used to recognise minerals and estimate elemental composition.
In this work, we present novel methods for automatic mineral identification
based on combining data from different spectroscopic methods. We evaluate
combining data from three spectroscopic methods: vibrational Raman scattering,
reflective Visible-Near Infrared (VNIR), and Laser-Induced Breakdown
Spectroscopy (LIBS). These methods were paired into Raman + VNIR, Raman + LIBS
and VNIR + LIBS, and different data-fusion methods were applied to each pair to
classify minerals. The methods presented here are shown to outperform the use
of a single data source by a significant margin. Additionally, we present a
Deep Learning algorithm for mineral classification from Raman spectra that
outperforms previous state-of-the-art methods. Our approach was tested on
various open-access experimental Raman (RRUFF) and VNIR (USGS, Relab,
ECOSTRESS), as well as synthetic LIBS (NIST) spectral libraries. Our
cross-validation tests show that multi-method spectroscopy paired with ML paves
the way towards rapid and accurate characterization of rocks and minerals.
Comment: 11 pages
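Low-level fusion of two spectroscopic modalities can be as simple as concatenating per-method normalized spectra into a single feature vector for a downstream classifier. A hedged sketch; the peak normalization here is illustrative, not the paper's preprocessing:

```python
def normalize(spectrum):
    """Scale a spectrum to its peak intensity so modalities with very
    different dynamic ranges contribute comparably."""
    peak = max(spectrum)
    return [v / peak for v in spectrum] if peak else list(spectrum)

def early_fuse(raman, libs):
    """Low-level (early) fusion: concatenate per-method normalized
    spectra into one feature vector for a downstream classifier."""
    return normalize(raman) + normalize(libs)
```

Higher-level fusion alternatives (e.g., combining per-method classifier outputs) trade this simplicity for robustness when one modality is noisy.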
Distance-based Confidence Score for Neural Network Classifiers
The reliable measurement of confidence in classifiers' predictions is very
important for many applications and is, therefore, an important part of
classifier design. Yet, although deep learning has received tremendous
attention in recent years, not much progress has been made in quantifying the
prediction confidence of neural network classifiers. Bayesian models offer a
mathematically grounded framework to reason about model uncertainty, but
usually come with prohibitive computational costs. In this paper we propose a
simple, scalable method to achieve a reliable confidence score, based on the
data embedding derived from the penultimate layer of the network. We
investigate two ways to achieve desirable embeddings, by using either a
distance-based loss or Adversarial Training. We then test the benefits of our
method when used for classification error prediction, weighting an ensemble of
classifiers, and novelty detection. In all tasks we show significant
improvement over traditional, commonly used confidence scores.
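A minimal sketch of a distance-based confidence score over penultimate-layer embeddings, assuming Euclidean distance to per-class centroids; the paper's exact score and embedding losses may differ.

```python
import math

def centroid(points):
    """Mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def distance_confidence(embedding, class_embeddings):
    """Score a test embedding by its distance to the nearest class
    centroid in the penultimate-layer space; higher (less negative)
    means more confident. class_embeddings: {label: [vectors]}."""
    dists = {
        label: math.dist(embedding, centroid(pts))
        for label, pts in class_embeddings.items()
    }
    label = min(dists, key=dists.get)
    return label, -dists[label]
```

The distance-based loss or adversarial training mentioned in the abstract serves to make these class clusters tight enough for such a score to be discriminative.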
Pose Estimation for Non-Cooperative Spacecraft Rendezvous Using Convolutional Neural Networks
On-board estimation of the pose of an uncooperative target spacecraft is an
essential task for future on-orbit servicing and close-proximity formation
flying missions. However, two issues hinder reliable on-board monocular vision
based pose estimation: robustness to illumination conditions due to a lack of
reliable visual features and scarcity of image datasets required for training
and benchmarking. To address these two issues, this work details the design and
validation of a monocular vision based pose determination architecture for
spaceborne applications. The primary contribution to the state-of-the-art of
this work is the introduction of a novel pose determination method based on
Convolutional Neural Networks (CNN) to provide an initial guess of the pose in
real-time on-board. The method involves discretizing the pose space and
training the CNN with images corresponding to the resulting pose labels. Since
reliable training of the CNN requires massive image datasets and computational
resources, the parameters of the CNN must be determined prior to the mission
with synthetic imagery. Moreover, reliable training of the CNN requires
datasets that appropriately account for noise, color, and illumination
characteristics expected in orbit. Therefore, the secondary contribution of
this work is the introduction of an image synthesis pipeline, which is tailored
to generate high fidelity images of any spacecraft 3D model. The proposed
technique is scalable to spacecraft of different structural and physical
properties as well as robust to the dynamic illumination conditions of space.
Through metrics measuring classification and pose accuracy, it is shown that
the presented architecture has desirable robustness and scalability properties.
Comment: Presented at the 2018 IEEE Aerospace Conference, Big Sky, MT
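The pose-space discretization step can be illustrated with a toy two-angle version: each attitude is mapped to a bin index, and the CNN is trained to predict that index as a class label. The 30-degree bin size and (yaw, pitch) parameterization are assumptions for illustration, not the paper's values.

```python
def pose_to_label(yaw, pitch, step=30):
    """Map a continuous attitude (yaw in [0, 360), pitch in [-90, 90),
    both in degrees) to a discrete class index for CNN training."""
    yaw_bin = int(yaw % 360) // step
    pitch_bin = int((pitch + 90) % 180) // step
    bins_per_row = 360 // step
    return pitch_bin * bins_per_row + yaw_bin
```

The predicted bin then serves only as an initial pose guess; a finer estimation stage refines it, which is why a coarse grid is acceptable here.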
Full Workspace Generation of Serial-link Manipulators by Deep Learning based Jacobian Estimation
Apart from solving complicated problems that require a certain level of
intelligence, fine-tuned deep neural networks can also create fast algorithms
for slow, numerical tasks. In this paper, we introduce an improved version of
[1]'s work, a fast, deep-learning framework capable of generating the full
workspace of serial-link manipulators. The architecture consists of two neural
networks: an estimation net that approximates the manipulator Jacobian, and a
confidence net that measures the confidence of the approximation. We also
introduce M3 (Manipulability Maps of Manipulators), a MATLAB robotics library
based on [2] (RTB); the datasets it generates are used in this work.
Results show that the neural networks are not only significantly faster than
numerical inverse kinematics but also offer superior accuracy compared to
other machine-learning alternatives. Implementations of the algorithm (based
on Keras [3]), including a benchmark evaluation script, are available at
https://github.com/liaopeiyuan/Jacobian-Estimation . The M3 Library APIs and
datasets are also available at https://github.com/liaopeiyuan/M3 .
Comment: 10 pages, 12 figures
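For context, the slow numerical baseline that the estimation net learns to replace is the Jacobian of the forward-kinematics map, classically obtained by finite differences. A self-contained sketch; the `fk` signature is an illustrative assumption:

```python
def numerical_jacobian(fk, q, eps=1e-6):
    """Finite-difference Jacobian of a forward-kinematics map
    fk: R^n -> R^m at joint configuration q. This per-query numerical
    work is what a trained estimation net amortizes into one forward pass."""
    base = fk(q)
    cols = []
    for i in range(len(q)):
        qp = list(q)
        qp[i] += eps
        cols.append([(a - b) / eps for a, b in zip(fk(qp), base)])
    # transpose so rows index task-space outputs, columns index joints
    return [list(row) for row in zip(*cols)]
```

The companion confidence net in the abstract addresses exactly the failure mode this baseline lacks: knowing when the learned approximation should not be trusted.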
Towards Robust Human Activity Recognition from RGB Video Stream with Limited Labeled Data
Human activity recognition based on video streams has received considerable
attention in recent years. Due to the lack of depth information, RGB video based
activity recognition performs poorly compared to RGB-D video based solutions.
On the other hand, acquiring depth information, inertial data, etc. is costly
and requires special equipment, whereas RGB video streams are available from
ordinary
cameras. Hence, our goal is to investigate whether similar or even higher
accuracy can be achieved with RGB-only modality. In this regard, we propose a
novel framework that couples skeleton data extracted from RGB video and deep
Bidirectional Long Short Term Memory (BLSTM) model for activity recognition. A
big challenge of training such a deep network is the limited training data, and
relying on the RGB-only stream significantly exacerbates the difficulty. We
therefore propose a set of algorithmic techniques to train this model
effectively, e.g., data augmentation, dynamic frame dropout and gradient
injection. The experiments demonstrate that our RGB-only solution surpasses the
state-of-the-art approaches that all exploit RGB-D video streams by a notable
margin. This makes our solution widely deployable with ordinary cameras.
Comment: To appear in ICMLA 201
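One of the listed training techniques, frame dropout, can be sketched as random removal of frames from a skeleton sequence so the model sees varied temporal sampling. The uniform drop probability and keep-at-least-one rule are assumptions for illustration; the paper's "dynamic" variant is not specified here.

```python
import random

def frame_dropout(frames, drop_prob, rng=None):
    """Data-augmentation sketch: independently drop each frame of a
    sequence with probability drop_prob, preserving temporal order.
    Always keeps at least one frame so the sequence stays non-empty."""
    rng = rng or random.Random()
    kept = [f for f in frames if rng.random() >= drop_prob]
    return kept if kept else [frames[0]]
```

Applied at training time, each epoch presents a differently subsampled sequence, which helps a deep BLSTM generalize from limited labeled data.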
Partial Face Detection in the Mobile Domain
Generic face detection algorithms do not perform well in the mobile domain
due to significant presence of occluded and partially visible faces. One
promising technique to handle the challenge of partial faces is to design face
detectors based on facial segments. In this paper two different approaches of
facial segment-based face detection are discussed, namely, proposal-based
detection and detection by end-to-end regression. Methods that follow the first
approach rely on generating face proposals that contain facial segment
information. The three detectors following this approach, namely Facial
Segment-based Face Detector (FSFD), SegFace and DeepSegFace, discussed in this
paper, perform binary classification on each proposal based on features learned
from facial segments. The process of proposal generation, however, needs to be
handled separately, which can be very time consuming, and is not truly
necessary given the nature of the active authentication problem. Hence a novel
algorithm, Deep Regression-based User Image Detector (DRUID) is proposed, which
shifts from the classification to the regression paradigm, thus obviating the
need for proposal generation. DRUID has a unique network architecture with
customized loss functions, is trained using a relatively small amount of data
by utilizing a novel data augmentation scheme and is fast since it outputs the
bounding boxes of a face and its segments in a single pass. Being robust to
occlusion by design, the facial segment-based face detection methods,
especially DRUID, show superior performance over other state-of-the-art face
detectors in terms of precision-recall and ROC curves on two mobile face
datasets.
Comment: 18 pages, 22 figures, 3 tables, submitted to IEEE Transactions on Image Processing
Optical Neural Networks
We develop a novel optical neural network (ONN) framework which introduces a
degree of scalar invariance to image classification estimation. Taking a hint
from the human eye, which has higher resolution near the center of the retina,
images are broken out into multiple levels of varying zoom based on a focal
point. Each level is passed through an identical convolutional neural network
(CNN) in a Siamese fashion, and the results are recombined to produce a high
accuracy estimate of the object class. ONNs act as a wrapper around existing
CNNs, and can thus be applied to many existing algorithms to produce notable
accuracy improvements without having to change the underlying architecture.
Comment: Submitted to NIPS 201
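The multi-level zoom decomposition can be sketched as computing crop boxes of halving size around a focal point, clamped to the image bounds, each of which would then feed the shared (Siamese) CNN. The box convention and halving schedule are illustrative assumptions.

```python
def zoom_levels(image_size, focal, num_levels):
    """Crop windows of halving size centred on a focal point, clamped
    so each window stays inside the image. Returns (left, top, right,
    bottom) boxes, widest first; each box is one input to the shared CNN."""
    w, h = image_size
    fx, fy = focal
    boxes = []
    cw, ch = w, h
    for _ in range(num_levels):
        left = min(max(fx - cw // 2, 0), w - cw)
        top = min(max(fy - ch // 2, 0), h - ch)
        boxes.append((left, top, left + cw, top + ch))
        cw, ch = cw // 2, ch // 2
    return boxes
```

Because the same CNN weights process every level, the wrapper adds no new trainable parameters beyond the recombination stage.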
A Less Biased Evaluation of Out-of-distribution Sample Detectors
In the real world, a learning system could receive an input that is unlike
anything it has seen during training. Unfortunately, out-of-distribution
samples can lead to unpredictable behaviour. We need to know whether any given
input belongs to the population distribution of the training/evaluation data to
prevent unpredictable behaviour in deployed systems. A recent surge of interest
in this problem has led to the development of sophisticated techniques in the
deep learning literature. However, due to the absence of a standard problem
definition or an exhaustive evaluation, it is not evident if we can rely on
these methods. What makes this problem different from a typical supervised
learning setting is that the distribution of outliers used in training may not
be the same as the distribution of outliers encountered in the application.
Classical approaches that learn inliers vs. outliers with only two datasets can
yield optimistic results. We introduce OD-test, a three-dataset evaluation
scheme as a more reliable strategy to assess progress on this problem. We
present an exhaustive evaluation of a broad set of methods from related areas
on image classification tasks. Contrary to the existing results, we show that
for realistic applications with high-dimensional images, the previous
techniques have low accuracy and are not reliable in practice.
Comment: To appear in BMVC 2019; v2 is more compact, with more results
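The three-dataset idea can be made concrete: tune the detector's threshold against one outlier dataset, then report accuracy against a different one, so tuning and evaluation outliers never share a distribution. This is a simplified sketch of the protocol, with a scalar `score` function (lower = more inlier-like) assumed.

```python
def evaluate_od(score, inliers, val_out, test_out):
    """OD-test-style evaluation sketch: choose a detection threshold on
    validation outliers (val_out), then score against held-out outliers
    from a different dataset (test_out), avoiding the optimism of
    two-dataset inlier-vs-outlier setups."""
    def acc(thr, outs):
        ok = sum(score(x) <= thr for x in inliers)   # inliers accepted
        ok += sum(score(x) > thr for x in outs)      # outliers rejected
        return ok / (len(inliers) + len(outs))
    candidates = sorted(score(x) for x in inliers + val_out)
    thr = max(candidates, key=lambda t: acc(t, val_out))
    return acc(thr, test_out)
```

A detector that merely memorizes the validation outliers' distribution scores well at threshold-selection time but poorly on `test_out`, which is the gap this scheme exposes.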