56,602 research outputs found
The Limitations of Optimization from Samples
In this paper we consider the following question: can we optimize objective
functions from the training data we use to learn them? We formalize this
question through a novel framework we call optimization from samples (OPS). In
OPS, we are given sampled values of a function drawn from some distribution and
the objective is to optimize the function under some constraint.
While there are interesting classes of functions that can be optimized from
samples, our main result is an impossibility. We show that there are classes of
functions which are statistically learnable and optimizable, but for which no
reasonable approximation for optimization from samples is achievable. In
particular, our main result shows that there is no constant factor
approximation for maximizing coverage functions under a cardinality constraint
using polynomially-many samples drawn from any distribution.
We also show tight approximation guarantees for maximization under a
cardinality constraint of several interesting classes of functions including
unit-demand, additive, and general monotone submodular functions, as well as a
constant factor approximation for monotone submodular functions with bounded
curvature
Bayesian Conditional Density Filtering
We propose a Conditional Density Filtering (C-DF) algorithm for efficient
online Bayesian inference. C-DF adapts MCMC sampling to the online setting,
sampling from approximations to conditional posterior distributions obtained by
propagating surrogate conditional sufficient statistics (a function of data and
parameter estimates) as new data arrive. These quantities eliminate the need to
store or process the entire dataset simultaneously and offer a number of
desirable features. Often, these include a reduction in memory requirements and
runtime and improved mixing, along with state-of-the-art parameter inference
and prediction. These improvements are demonstrated through several
illustrative examples including an application to high dimensional compressed
regression. Finally, we show that C-DF samples converge to the target posterior
distribution asymptotically as sampling proceeds and more data arrives.Comment: 41 pages, 7 figures, 12 table
Dropout Distillation for Efficiently Estimating Model Confidence
We propose an efficient way to output better calibrated uncertainty scores
from neural networks. The Distilled Dropout Network (DDN) makes standard
(non-Bayesian) neural networks more introspective by adding a new training loss
which prevents them from being overconfident. Our method is more efficient than
Bayesian neural networks or model ensembles which, despite providing more
reliable uncertainty scores, are more cumbersome to train and slower to test.
We evaluate DDN on the the task of image classification on the CIFAR-10 dataset
and show that our calibration results are competitive even when compared to 100
Monte Carlo samples from a dropout network while they also increase the
classification accuracy. We also propose better calibration within the state of
the art Faster R-CNN object detection framework and show, using the COCO
dataset, that DDN helps train better calibrated object detectors
A role for the developing lexicon in phonetic category acquisition
Infants segment words from fluent speech during the same period when they are learning phonetic categories, yet accounts of phonetic category acquisition typically ignore information about the words in which sounds appear. We use a Bayesian model to illustrate how feedback from segmented words might constrain phonetic category learning by providing information about which sounds occur together in words. Simulations demonstrate that word-level information can successfully disambiguate overlapping English vowel categories. Learning patterns in the model are shown to parallel human behavior from artificial language learning tasks. These findings point to a central role for the developing lexicon in phonetic category acquisition and provide a framework for incorporating top-down constraints into models of category learning
- …