    A Generalization of the Doubling Construction for Sums of Squares Identities

    The doubling construction is a fast and important way to generate new solutions to the Hurwitz problem on sums of squares identities from any known ones. In this short note, we generalize the doubling construction and obtain from any given admissible triple $[r,s,n]$ a series of new ones $[r+\rho(2^{m-1}),2^ms,2^mn]$ for all positive integers $m$, where $\rho$ is the Hurwitz-Radon function.
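As a small worked illustration of the triple formula in the abstract above, the sketch below computes $[r+\rho(2^{m-1}),2^ms,2^mn]$ from a given triple. It assumes the standard definition of the Hurwitz-Radon function (for $n = 2^{4a+b}\cdot\text{odd}$ with $0 \le b \le 3$, $\rho(n) = 8a + 2^b$). The function names `hurwitz_radon` and `generalized_doubling` are hypothetical and not taken from the paper, and the code only evaluates the formula; it does not verify that the input triple is admissible.

```python
def hurwitz_radon(n: int) -> int:
    """Hurwitz-Radon function, assuming the standard definition:
    write n = 2**(4a + b) * (odd number) with 0 <= b <= 3, then rho(n) = 8a + 2**b."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    k = 0  # exponent of 2 in n
    while n % 2 == 0:
        n //= 2
        k += 1
    a, b = divmod(k, 4)
    return 8 * a + 2 ** b


def generalized_doubling(r: int, s: int, n: int, m: int) -> tuple[int, int, int]:
    """Evaluate [r + rho(2**(m-1)), 2**m * s, 2**m * n] for a positive integer m.
    The input [r, s, n] is assumed (not checked) to be admissible."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    return (r + hurwitz_radon(2 ** (m - 1)), 2 ** m * s, 2 ** m * n)


if __name__ == "__main__":
    # Start from the trivial triple [1, 1, 1] and list the first few derived triples.
    for m in range(1, 5):
        print(m, generalized_doubling(1, 1, 1, m))
```

For $m = 1$ the formula gives $[r+1, 2s, 2n]$, since $\rho(1) = 1$; this is the classical doubling step that the note generalizes.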

    Frugal Optimization for Cost-related Hyperparameters

    The increasing demand for democratizing machine learning algorithms calls for hyperparameter optimization (HPO) solutions at low cost. Many machine learning algorithms have hyperparameters which can cause a large variation in the training cost. But this effect is largely ignored in existing HPO methods, which are incapable of properly controlling cost during the optimization process. To address this problem, we develop a new cost-frugal HPO solution. The core of our solution is a simple but new randomized direct-search method, for which we prove a convergence rate of $O(\frac{\sqrt{d}}{\sqrt{K}})$ and an $O(d\epsilon^{-2})$-approximation guarantee on the total cost. We provide strong empirical results in comparison with state-of-the-art HPO methods on large AutoML benchmarks. Comment: 29 pages (including supplementary appendix)

    How to Fine-Tune BERT for Text Classification?

    Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.