10,892 research outputs found
Wireless Data Acquisition for Edge Learning: Data-Importance Aware Retransmission
By deploying machine-learning algorithms at the network edge, edge learning
can leverage the enormous real-time data generated by billions of mobile
devices to train AI models, which enable intelligent mobile applications. In
this emerging research area, one key direction is to efficiently utilize radio
resources for wireless data acquisition to minimize the latency of executing a
learning task at an edge server. Along this direction, we consider the specific
problem of retransmission decision in each communication round to ensure both
reliability and quantity of those training data for accelerating model
convergence. To solve the problem, a new retransmission protocol called
data-importance aware automatic-repeat-request (importance ARQ) is proposed.
Unlike the classic ARQ focusing merely on reliability, importance ARQ
selectively retransmits a data sample based on its uncertainty which helps
learning and can be measured using the model under training. Underpinning the
proposed protocol is a derived elegant communication-learning relation between
two corresponding metrics, i.e., signal-to-noise ratio (SNR) and data
uncertainty. This relation facilitates the design of a simple threshold based
policy for importance ARQ. The policy is first derived based on the classic
classifier model of support vector machine (SVM), where the uncertainty of a
data sample is measured by its distance to the decision boundary. The policy is
then extended to the more complex model of convolutional neural networks (CNN)
where data uncertainty is measured by entropy. Extensive experiments have been
conducted for both the SVM and CNN using real datasets with balanced and
imbalanced distributions. Experimental results demonstrate that importance ARQ
effectively copes with channel fading and noise in wireless data acquisition to
achieve faster model convergence than the conventional channel-aware ARQ.Comment: This is an updated version: 1) extension to general classifiers; 2)
consideration of imbalanced classification in the experiments. Submitted to
IEEE Journal for possible publicatio
Generalization error for multi-class margin classification
In this article, we study rates of convergence of the generalization error of
multi-class margin classifiers. In particular, we develop an upper bound theory
quantifying the generalization error of various large margin classifiers. The
theory permits a treatment of general margin losses, convex or nonconvex, in
presence or absence of a dominating class. Three main results are established.
First, for any fixed margin loss, there may be a trade-off between the ideal
and actual generalization performances with respect to the choice of the class
of candidate decision functions, which is governed by the trade-off between the
approximation and estimation errors. In fact, different margin losses lead to
different ideal or actual performances in specific cases. Second, we
demonstrate, in a problem of linear learning, that the convergence rate can be
arbitrarily fast in the sample size depending on the joint distribution of
the input/output pair. This goes beyond the anticipated rate .
Third, we establish rates of convergence of several margin classifiers in
feature selection with the number of candidate variables allowed to greatly
exceed the sample size but no faster than .Comment: Published at http://dx.doi.org/10.1214/07-EJS069 in the Electronic
Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Surrogate Functions for Maximizing Precision at the Top
The problem of maximizing precision at the top of a ranked list, often dubbed
Precision@k (prec@k), finds relevance in myriad learning applications such as
ranking, multi-label classification, and learning with severe label imbalance.
However, despite its popularity, there exist significant gaps in our
understanding of this problem and its associated performance measure.
The most notable of these is the lack of a convex upper bounding surrogate
for prec@k. We also lack scalable perceptron and stochastic gradient descent
algorithms for optimizing this performance measure. In this paper we make key
contributions in these directions. At the heart of our results is a family of
truly upper bounding surrogates for prec@k. These surrogates are motivated in a
principled manner and enjoy attractive properties such as consistency to prec@k
under various natural margin/noise conditions.
These surrogates are then used to design a class of novel perceptron
algorithms for optimizing prec@k with provable mistake bounds. We also devise
scalable stochastic gradient descent style methods for this problem with
provable convergence bounds. Our proofs rely on novel uniform convergence
bounds which require an in-depth analysis of the structural properties of
prec@k and its surrogates. We conclude with experimental results comparing our
algorithms with state-of-the-art cutting plane and stochastic gradient
algorithms for maximizing [email protected]: To appear in the the proceedings of the 32nd International Conference
on Machine Learning (ICML 2015
- β¦