20,876 research outputs found
A Generalization of the Doubling Construction for Sums of Squares Identities
The doubling construction is a fast and important way to generate new
solutions to the Hurwitz problem on sums of squares identities from any known
ones. In this short note, we generalize the doubling construction and obtain
from any given admissible triple $[r,s,n]$ a series of new ones
$[r+\rho(2^{m-1}), 2^m s, 2^m n]$ for all positive integers $m$, where $\rho$ is the
Hurwitz-Radon function.
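The arithmetic behind this construction can be sketched in a few lines. This is only an illustrative sketch, assuming the standard definition of the Hurwitz-Radon function, $\rho(n) = 8a + 2^b$ for $n = 2^{4a+b} \cdot (\text{odd})$ with $0 \le b \le 3$, and assuming the new triples take the form $[r+\rho(2^{m-1}), 2^m s, 2^m n]$ for a starting triple $[r,s,n]$. The function names are hypothetical and do not come from the paper.

```python
def hurwitz_radon(n: int) -> int:
    """Hurwitz-Radon function: rho(n) = 8a + 2^b for n = 2^(4a+b) * odd, 0 <= b <= 3."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    exponent = 0
    while n % 2 == 0:  # strip factors of two, counting them
        n //= 2
        exponent += 1
    a, b = divmod(exponent, 4)
    return 8 * a + 2 ** b


def generalized_doubling(r: int, s: int, n: int, m: int) -> tuple:
    """Assumed form of the m-th new triple obtained from the triple [r, s, n]."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return (r + hurwitz_radon(2 ** (m - 1)), 2 ** m * s, 2 ** m * n)


if __name__ == "__main__":
    # Starting from the trivial triple [1, 1, 1], list the first few new triples.
    for m in range(1, 5):
        print(m, generalized_doubling(1, 1, 1, m))
```

Under these assumptions, the case m = 1 reduces to the classical doubling step from $[r,s,n]$ to $[r+1, 2s, 2n]$, since $\rho(1) = 1$.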
Frugal Optimization for Cost-related Hyperparameters
The increasing demand for democratizing machine learning algorithms calls for
hyperparameter optimization (HPO) solutions at low cost. Many machine learning
algorithms have hyperparameters which can cause a large variation in the
training cost. But this effect is largely ignored in existing HPO methods,
which are unable to properly control cost during the optimization process.
To address this problem, we develop a new cost-frugal HPO solution. The core of
our solution is a simple but new randomized direct-search method, for which we
prove a convergence rate of $O(\frac{\sqrt{d}}{\sqrt{K}})$ and an
$O(d\epsilon^{-2})$-approximation guarantee on the total cost. We provide
strong empirical results in comparison with state-of-the-art HPO methods on
large AutoML benchmarks.
Comment: 29 pages (including supplementary appendix)
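For readers unfamiliar with the term, a generic randomized direct-search loop can be sketched as follows. This is not the authors' algorithm, whose details are not given in this abstract. It is a textbook-style sketch that assumes a fixed step size, a uniformly random search direction, and a hypothetical objective `f` standing in for validation loss. All names are illustrative.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere via normalized Gaussians."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def randomized_direct_search(f, x0, step=0.5, iters=200, seed=0):
    """Minimize f using only function evaluations (no gradients).

    Each iteration samples a random direction u, tries x + step*u, and if that
    does not improve f, tries x - step*u. The point moves only on improvement.
    """
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iters):
        u = random_unit_vector(len(x), rng)
        for sign in (1.0, -1.0):
            cand = [xi + sign * step * ui for xi, ui in zip(x, u)]
            fc = f(cand)
            if fc < fx:  # accept the first improving move
                x, fx = cand, fc
                break
    return x, fx


if __name__ == "__main__":
    # Toy objective standing in for a validation loss over two hyperparameters.
    loss = lambda p: sum(v * v for v in p)
    best_x, best_f = randomized_direct_search(loss, [3.0, -2.0])
    print(best_x, best_f)
```

Because a move is accepted only when it lowers the objective, the returned value is never worse than the starting point. A cost-aware variant would also have to account for how expensive each evaluation is, which this sketch ignores.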
How to Fine-Tune BERT for Text Classification?
Language model pre-training has proven to be useful in learning universal
language representations. As a state-of-the-art pre-trained language
model, BERT (Bidirectional Encoder Representations from Transformers) has
achieved amazing results in many language understanding tasks. In this paper,
we conduct exhaustive experiments to investigate different fine-tuning methods
of BERT on the text classification task and provide a general solution for BERT
fine-tuning. Finally, the proposed solution obtains new state-of-the-art
results on eight widely-studied text classification datasets.