
    Precise Learning Curves and Higher-Order Scaling Limits for Dot Product Kernel Regression

    As modern machine learning models continue to advance the computational frontier, it has become increasingly important to develop precise estimates for expected performance improvements under different model and data scaling regimes. Currently, theoretical understanding of the learning curves that characterize how the prediction error depends on the number of samples is restricted to either large-sample asymptotics ($m \to \infty$) or, for certain simple data distributions, to the high-dimensional asymptotics in which the number of samples scales linearly with the dimension ($m \propto d$). There is a wide gulf between these two regimes, including all higher-order scaling relations $m \propto d^r$, which are the subject of the present paper. We focus on the problem of kernel ridge regression for dot-product kernels and present precise formulas for the test error, bias, and variance, for data drawn uniformly from the sphere in the $r$th-order asymptotic scaling regime $m \to \infty$ with $m/d^r$ held constant. We observe a peak in the learning curve whenever $m \approx d^r/r!$ for any integer $r$, leading to multiple sample-wise descent and nontrivial behavior at multiple scales.

    Comment: 32 pages; 4 + 3 figures
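
To give a sense of the scales involved, the short Python sketch below evaluates the peak locations $m \approx d^r/r!$ quoted in the abstract for the first few integer orders $r$. The function name `predicted_peak_locations` and the choice $d = 50$ are illustrative assumptions, not taken from the paper, and the abstract's relation is approximate, so the values are only rough indications of where peaks might appear.

```python
from math import factorial


def predicted_peak_locations(d, max_order):
    """Return {r: d**r / r!} for r = 1..max_order.

    Hypothetical helper: it only evaluates the approximate relation
    m ~ d^r / r! stated in the abstract; it does not compute a learning curve.
    """
    return {r: d**r / factorial(r) for r in range(1, max_order + 1)}


# Illustrative dimension (an assumption, not a value from the paper).
peaks = predicted_peak_locations(d=50, max_order=3)
for r, m in peaks.items():
    print(f"order r={r}: peak expected near m ~ {m:.0f} samples")
```

Successive peaks are separated by a factor of roughly $d/(r+1)$, which is why the abstract describes nontrivial behavior at multiple scales.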