3 research outputs found
Learning curves of generic features maps for realistic datasets with a teacher-student model
Teacher-student models provide a framework in which the typical-case
performance of high-dimensional supervised learning can be described in closed
form. The assumptions of Gaussian i.i.d. input data underlying the canonical
teacher-student model may, however, be perceived as too restrictive to capture
the behaviour of realistic data sets. In this paper, we introduce a Gaussian
covariate generalisation of the model where the teacher and student can act on
different spaces, generated with fixed, but generic feature maps. While still
solvable in a closed form, this generalization is able to capture the learning
curves for a broad range of realistic data sets, thus redeeming the potential
of the teacher-student framework. Our contribution is then two-fold: First, we
prove a rigorous formula for the asymptotic training loss and generalisation
error. Second, we present a number of situations where the learning curve of
the model captures the one of a realistic data set learned with kernel
regression and classification, with out-of-the-box feature maps such as random
projections or scattering transforms, or with pre-learned ones - such as the
features learned by training multi-layer neural networks. We discuss both the
power and the limitations of the framework.Comment: v3: NeurIPS camera-read
PCA Initialization for Approximate Message Passing in Rotationally Invariant Models
We study the problem of estimating a rank- signal in the presence of
rotationally invariant noise-a class of perturbations more general than
Gaussian noise. Principal Component Analysis (PCA) provides a natural
estimator, and sharp results on its performance have been obtained in the
high-dimensional regime. Recently, an Approximate Message Passing (AMP)
algorithm has been proposed as an alternative estimator with the potential to
improve the accuracy of PCA. However, the existing analysis of AMP requires an
initialization that is both correlated with the signal and independent of the
noise, which is often unrealistic in practice. In this work, we combine the two
methods, and propose to initialize AMP with PCA. Our main result is a rigorous
asymptotic characterization of the performance of this estimator. Both the AMP
algorithm and its analysis differ from those previously derived in the Gaussian
setting: at every iteration, our AMP algorithm requires a specific term to
account for PCA initialization, while in the Gaussian case, PCA initialization
affects only the first iteration of AMP. The proof is based on a two-phase
artificial AMP that first approximates the PCA estimator and then mimics the
true AMP. Our numerical simulations show an excellent agreement between AMP
results and theoretical predictions, and suggest an interesting open direction
on achieving Bayes-optimal performance.Comment: 72 pages, 2 figures, appeared in Neural Information Processing
Systems (NeurIPS), 202