Progressive Stochastic Binarization of Deep Networks
A plethora of recent research has focused on improving the memory footprint
and inference speed of deep networks by reducing the complexity of (i)
numerical representations (for example, by deterministic or stochastic
quantization) and (ii) arithmetic operations (for example, by binarization of
weights).
We propose a stochastic binarization scheme for deep networks that allows for
efficient inference on hardware by restricting itself to additions of small
integers and fixed shifts. Unlike previous approaches, the underlying
randomized approximation is progressive, thus permitting an adaptive control of
the accuracy of each operation at run-time. In a low-precision setting, we
match the accuracy of previous binarized approaches. Our representation is
unbiased: it approaches continuous computation with increasing sample size. In
a high-precision regime, the computational costs are competitive with previous
quantization schemes. Progressive stochastic binarization also permits
localized, dynamic accuracy control within a single network, thereby providing
a new tool for adaptively focusing computational attention.
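The key property described above is that the randomized approximation is unbiased and progressive: averaging more stochastic samples drives the result toward the continuous value. The abstract does not give the actual hardware-level scheme (additions of small integers and fixed shifts), so the sketch below only illustrates the unbiased-estimator idea with plain Bernoulli rounding; the function name and the restriction to values in [0, 1] are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_binarize(x, n_samples):
    """Illustrative unbiased stochastic binarization of values in [0, 1].

    Each sample rounds every element to 0 or 1 with probability equal
    to the element's value, so E[bit] = x and the sample mean converges
    to x as n_samples grows -- the 'progressive' property that lets
    accuracy be traded for cost at run time.
    """
    # Draw Bernoulli samples: P(bit = 1) = x, independently per sample.
    bits = rng.random((n_samples,) + x.shape) < x
    return bits.mean(axis=0)

x = np.array([0.2, 0.5, 0.9])
coarse = stochastic_binarize(x, 4)     # few samples: cheap, noisy
fine = stochastic_binarize(x, 4096)    # many samples: approaches x
```

A run-time accuracy controller could simply choose `n_samples` per layer, spending more samples where the network needs precision, which is the "computational attention" idea the abstract refers to.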
We evaluate our method on networks of various architectures, already
pretrained on ImageNet. With representational costs comparable to previous
schemes, we obtain accuracies close to the original floating point
implementation. This includes pruned networks, with the known exception of
certain types of separable convolutions. By focusing computational attention
using progressive sampling, we further reduce inference costs on ImageNet by
up to 33% (before network pruning).
A meta-learning recommender system for hyperparameter tuning: predicting when tuning improves SVM classifiers
For many machine learning algorithms, predictive performance is critically
affected by the hyperparameter values used to train them. However, tuning these
hyperparameters can come at a high computational cost, especially on larger
datasets, while the tuned settings do not always significantly outperform the
default values. This paper proposes a recommender system based on meta-learning
to identify exactly when it is better to use default values and when to tune
hyperparameters for each new dataset. In addition, an in-depth analysis is
performed to understand what the meta-learners take into account when making
their decisions, providing useful insights. An extensive analysis of different categories of
meta-features, meta-learners, and setups across 156 datasets is performed.
Results show that it is possible to accurately predict when tuning will
significantly improve the performance of the induced models. The proposed
system reduces the time spent on optimization processes, without reducing the
predictive performance of the induced models (when compared with the ones
obtained using tuned hyperparameters). We also explain the decision-making
process of the meta-learners in terms of linear separability-based hypotheses.
Although this analysis is focused on the tuning of Support Vector Machines, it
can also be applied to other algorithms, as shown in experiments performed with
decision trees.

Comment: 49 pages, 11 figures
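The recommender described above is, at its core, a classifier trained on meta-features of past datasets to predict whether hyperparameter tuning will significantly beat default values. The sketch below is a toy version of that idea using a 1-nearest-neighbour meta-learner; the meta-features, their values, and the labels are invented for illustration and are not from the paper's 156-dataset study.

```python
import numpy as np

# Hypothetical meta-dataset: each row holds meta-features of a past
# dataset (n_samples, n_features, class entropy), and each label records
# whether tuning the SVM significantly outperformed default values there.
meta_features = np.array([
    [1000,  10, 0.90],
    [ 200,  50, 0.30],
    [5000,   5, 0.99],
    [ 150, 100, 0.20],
], dtype=float)
tuning_helped = np.array([0, 1, 0, 1])  # 1 = tune, 0 = keep defaults

def recommend(new_meta, k=1):
    """1-nearest-neighbour meta-learner: recommend tuning iff the
    majority of the k most similar past datasets benefited from it."""
    # Standardize meta-features so distances are scale-comparable.
    mu, sd = meta_features.mean(axis=0), meta_features.std(axis=0)
    z = (meta_features - mu) / sd
    q = (np.asarray(new_meta, dtype=float) - mu) / sd
    dists = np.linalg.norm(z - q, axis=1)
    nearest = np.argsort(dists)[:k]
    return int(tuning_helped[nearest].mean() >= 0.5)

# Small, high-dimensional dataset resembling rows where tuning helped:
decision = recommend([180, 80, 0.25])   # -> 1 (tune)
```

Only when the meta-learner predicts a significant gain is the costly optimization run, which is how the system saves tuning time without hurting predictive performance.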