Efficient learning of neighbor representations for boundary trees and forests
We introduce a semiparametric approach to neighbor-based classification. We
build on the recently proposed Boundary Trees algorithm by Mathy et al. (2015),
which enables fast neighbor-based classification, regression and retrieval in
large datasets. While boundary trees use a Euclidean measure of similarity,
the Differentiable Boundary Tree algorithm by Zoran et al. (2017) was introduced
to learn low-dimensional representations of complex input data, on which
semantic similarity can be calculated to train boundary trees. As its authors
point out, the differentiable boundary tree approach has a few
limitations that prevent it from scaling to large datasets. In this paper, we
introduce Differentiable Boundary Sets, an algorithm that overcomes the
computational issues of the differentiable boundary tree scheme and also
improves its classification accuracy and data representability. Our algorithm
is efficiently implementable with existing tools and offers a significant
reduction in training time. We test and compare the algorithms on the
well-known MNIST handwritten digits dataset and the newer Fashion-MNIST dataset by
Xiao et al. (2017).
Comment: 9 pages, 2 figure
Optimal statistical inference in the presence of systematic uncertainties using neural network optimization based on binned Poisson likelihoods with nuisance parameters
Data analysis in science, e.g., high-energy particle physics, is often
subject to an intractable likelihood if the observables and observations span a
high-dimensional input space. Typically the problem is solved by reducing the
dimensionality using feature engineering and histograms, where the latter
technique allows the likelihood to be built from Poisson statistics. However, in
the presence of systematic uncertainties represented by nuisance parameters in
the likelihood, the optimal dimensionality reduction with a minimal loss of
information about the parameters of interest is not known. This work presents a
novel strategy to construct the dimensionality reduction with neural networks
for feature engineering and a differential formulation of histograms so that
the full workflow can be optimized with the result of the statistical
inference, e.g., the variance of a parameter of interest, as the objective. We
discuss how this approach results in an estimate of the parameters of interest
that is close to optimal, and we demonstrate the applicability of the technique
with a simple example based on pseudo-experiments and a more complex example
from high-energy particle physics.
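To make the ingredients of this abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a sigmoid-based smooth histogram as one possible differentiable stand-in for hard binning, and a binned Poisson negative log-likelihood with a signal strength `mu` and a single Gaussian-constrained nuisance parameter `theta`. The function names, the linear per-bin `shift` model for the systematic variation, and all numbers are hypothetical.

```python
import math


def soft_histogram(values, edges, width=0.05):
    """Smooth bin counts: each value contributes a sigmoid-shaped weight per bin.

    Assumption: a sigmoid window is used here as one possible smooth
    replacement for hard bin edges; smaller `width` approaches a hard histogram.
    """
    def sigmoid(z):
        # tanh form avoids overflow for large |z|
        return 0.5 * (1.0 + math.tanh(0.5 * z))

    counts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        counts.append(sum(sigmoid((x - lo) / width) - sigmoid((x - hi) / width)
                          for x in values))
    return counts


def binned_poisson_nll(observed, signal, background, shift, mu, theta):
    """Negative log-likelihood of binned counts with one nuisance parameter.

    Expected count per bin (assumed model): mu * s + b + theta * d,
    where d is the background shift for a one-sigma systematic variation.
    """
    nll = 0.5 * theta ** 2  # Gaussian constraint term on the nuisance parameter
    for n, s, b, d in zip(observed, signal, background, shift):
        lam = max(mu * s + b + theta * d, 1e-9)  # keep the expectation positive
        nll += lam - n * math.log(lam) + math.lgamma(n + 1.0)
    return nll


# Hypothetical two-bin example: data equal to the nominal expectation.
signal = [5.0, 10.0]
background = [50.0, 20.0]
shift = [5.0, 2.0]
observed = [s + b for s, b in zip(signal, background)]

for mu in (0.5, 1.0, 1.5):
    print(mu, binned_poisson_nll(observed, signal, background, shift, mu, 0.0))
```

In a full workflow of the kind the abstract describes, the bin contents would come from a neural network output passed through a smooth histogram, so that a quantity derived from this likelihood could serve as the training objective. That step is omitted here.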
Pathologies of Neural Models Make Interpretations Difficult
One way to interpret neural model predictions is to highlight the most
important input features, for example with a heatmap visualization over the words
in an input sentence. In existing interpretation methods for NLP, a word's
importance is determined either by input perturbation (measuring the decrease
in model confidence when that word is removed) or by the gradient with respect
to that word. To understand the limitations of these methods, we use input
reduction, which iteratively removes the least important word from the input.
This exposes pathological behaviors of neural models: the remaining words
appear nonsensical to humans and are not the ones determined as important by
interpretation methods. As we confirm with human experiments, the reduced
examples lack information to support the prediction of any label, but models
still make the same predictions with high confidence. To explain these
counterintuitive results, we draw connections to adversarial examples and
confidence calibration: pathological behaviors reveal difficulties in
interpreting neural models trained with maximum likelihood. To mitigate their
deficiencies, we fine-tune the models by encouraging high entropy outputs on
reduced examples. Fine-tuned models become more interpretable under input
reduction without accuracy loss on regular examples.
Comment: EMNLP 2018 camera read
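The input reduction loop described in this abstract can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' code: importance is measured by leave-one-out confidence drop (the perturbation variant mentioned above), removal stops once no word can be dropped without changing the predicted label, and `toy_predict` with its word weights is a hypothetical stand-in for a neural model.

```python
import math


def toy_predict(words):
    """Hypothetical bag-of-words classifier returning (label, confidence)."""
    weights = {"great": 2.0, "terrible": -2.0, "movie": 0.1}
    score = sum(weights.get(w, 0.0) for w in words)
    label = "pos" if score >= 0 else "neg"
    confidence = 1.0 / (1.0 + math.exp(-abs(score)))
    return label, confidence


def input_reduction(words, predict):
    """Iteratively remove the least important word while the label is unchanged."""
    original_label, _ = predict(words)
    current = list(words)
    while len(current) > 1:
        _, confidence = predict(current)
        best = None  # (importance, reduced word list)
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]
            label, cand_confidence = predict(candidate)
            if label != original_label:
                continue  # removing this word would change the prediction
            # Importance: drop in confidence when the word is removed
            importance = confidence - cand_confidence
            if best is None or importance < best[0]:
                best = (importance, candidate)
        if best is None:
            break  # every removal flips the label, so stop here
        current = best[1]
    return current


print(input_reduction(["the", "movie", "was", "great"], toy_predict))
```

With a real neural model in place of `toy_predict`, the reduced input is where the pathologies reported in the abstract would show up: the surviving words may look nonsensical while the confidence stays high.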