13,964 research outputs found
Generalization Error in Deep Learning
Deep learning models have lately shown great performance in various fields
such as computer vision, speech recognition, speech translation, and natural
language processing. However, alongside their state-of-the-art performance, it
is still generally unclear what is the source of their generalization ability.
Thus, an important question is what makes deep neural networks able to
generalize well from the training set to new data. In this article, we provide
an overview of the existing theory and bounds for the characterization of the
generalization error of deep neural networks, combining both classical and more
recent theoretical and empirical results
Low-Cost Learning via Active Data Procurement
We design mechanisms for online procurement of data held by strategic agents
for machine learning tasks. The challenge is to use past data to actively price
future data and give learning guarantees even when an agent's cost for
revealing her data may depend arbitrarily on the data itself. We achieve this
goal by showing how to convert a large class of no-regret algorithms into
online posted-price and learning mechanisms. Our results in a sense parallel
classic sample complexity guarantees, but with the key resource being money
rather than quantity of data: With a budget constraint , we give robust risk
(predictive error) bounds on the order of . Because we use an
active approach, we can often guarantee to do significantly better by
leveraging correlations between costs and data.
Our algorithms and analysis go through a model of no-regret learning with
arriving pairs (cost, data) and a budget constraint of . Our regret bounds
for this model are on the order of and we give lower bounds on the
same order.Comment: Full version of EC 2015 paper. Color recommended for figures but
nonessential. 36 pages, of which 12 appendi
Robust Interactive Learning
In this paper we propose and study a generalization of the standard
active-learning model where a more general type of query, class conditional
query, is allowed. Such queries have been quite useful in applications, but
have been lacking theoretical understanding. In this work, we characterize the
power of such queries under two well-known noise models. We give nearly tight
upper and lower bounds on the number of queries needed to learn both for the
general agnostic setting and for the bounded noise model. We further show that
our methods can be made adaptive to the (unknown) noise rate, with only
negligible loss in query complexity
- …