
    Optimal Rates for Random Fourier Features

    Kernel methods represent one of the most powerful tools in machine learning to tackle problems expressed in terms of function values and derivatives due to their capability to represent and model complex relations. While these methods show good versatility, they are computationally intensive and have poor scalability to large data as they require operations on Gram matrices. In order to mitigate this serious computational limitation, recently randomized constructions have been proposed in the literature, which allow the application of fast linear algorithms. Random Fourier features (RFF) are among the most popular and widely applied constructions: they provide an easily computable, low-dimensional feature representation for shift-invariant kernels. Despite the popularity of RFFs, very little is understood theoretically about their approximation quality. In this paper, we provide a detailed finite-sample theoretical analysis about the approximation quality of RFFs by (i) establishing optimal (in terms of the RFF dimension, and growing set size) performance guarantees in uniform norm, and (ii) presenting guarantees in $L^r$ ($1\le r<\infty$) norms. We also propose an RFF approximation to derivatives of a kernel with a theoretical study on its approximation quality. Comment: To appear at NIPS-201
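    To make the construction in the abstract above concrete, here is a minimal sketch of a random Fourier feature map. It assumes a Gaussian kernel with bandwidth `sigma`, which is only one example of a shift-invariant kernel and is not specified by the abstract. The names `make_rff`, `gaussian_kernel`, and `dot` are hypothetical helpers introduced for illustration.

```python
import math
import random


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def gaussian_kernel(x, y, sigma=1.0):
    """Exact kernel value exp(-||x - y||^2 / (2 sigma^2)), used as the reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def make_rff(dim, num_features, sigma=1.0, seed=0):
    """Return a map z such that dot(z(x), z(y)) approximates gaussian_kernel(x, y).

    Assumption: Gaussian kernel, whose spectral measure is N(0, sigma^-2 I).
    """
    rng = random.Random(seed)
    # One random frequency vector per feature, drawn from the spectral measure.
    omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
              for _ in range(num_features)]
    # One random phase per feature, uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def z(x):
        return [scale * math.cos(dot(w, x) + b)
                for w, b in zip(omegas, phases)]

    return z


if __name__ == "__main__":
    z = make_rff(dim=2, num_features=2000)
    x, y = [0.3, -0.1], [0.5, 0.4]
    print("approximate:", dot(z(x), z(y)))
    print("exact:      ", gaussian_kernel(x, y))
```

    Once inputs are mapped through `z`, any linear algorithm can be applied to the features without forming a Gram matrix. The approximation error shrinks as `num_features` grows; the rate at which it does so is the subject of the paper, and this sketch makes no claim about it.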

    Encoding and processing of sensory information in neuronal spike trains

    Recently, a statistical signal-processing technique has allowed the information carried by single spike trains of sensory neurons on time-varying stimuli to be characterized quantitatively in a variety of preparations. In weakly electric fish, its application to first-order sensory neurons encoding electric field amplitude (P-receptor afferents) showed that they convey accurate information on temporal modulations in a behaviorally relevant frequency range (<80 Hz). At the next stage of the electrosensory pathway (the electrosensory lateral line lobe, ELL), the information sampled by first-order neurons is used to extract upstrokes and downstrokes in the amplitude modulation waveform. By using signal-detection techniques, we determined that these temporal features are explicitly represented by short spike bursts of second-order neurons (ELL pyramidal cells). Our results suggest that the biophysical mechanism underlying this computation is of dendritic origin. We also investigated the accuracy with which upstrokes and downstrokes are encoded across two of the three somatotopic body maps of the ELL (centromedial and lateral). Pyramidal cells of the centromedial map, in particular I-cells, encode up- and downstrokes more reliably than those of the lateral map. This result correlates well with the significance of these temporal features for a particular behavior (the jamming avoidance response) as assessed by lesion experiments of the centromedial map.

    Learning with SGD and Random Features

    Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and used to define approximate kernel methods. The considered estimator is not explicitly penalized/constrained and regularization is implicit. Indeed, our study highlights how different parameters, such as number of features, iterations, step-size and mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite sample bounds, under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.
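    The kind of estimator described in the abstract above can be sketched as mini-batch SGD on a squared loss over random features, with no penalty term. This is an illustrative sketch under stated assumptions, not the authors' exact algorithm: the squared loss, the Gaussian-kernel random Fourier features, the constant step size, and the names `make_feature_map`, `fit_sgd_rf`, and `mse` are all choices made here for the example.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def make_feature_map(dim, num_features, sigma, rng):
    """Random Fourier features for a Gaussian kernel (an assumed choice)."""
    omegas = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
              for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)
    return lambda x: [scale * math.cos(dot(w, x) + b)
                      for w, b in zip(omegas, phases)]


def fit_sgd_rf(xs, ys, num_features=50, sigma=1.0, step_size=0.5,
               batch_size=8, num_iters=500, seed=0):
    """Mini-batch SGD on the squared loss over random features.

    There is no explicit regularizer: the number of features, the number
    of iterations, the step size, and the batch size are the only knobs.
    """
    rng = random.Random(seed)
    z = make_feature_map(len(xs[0]), num_features, sigma, rng)
    feats = [z(x) for x in xs]      # featurize the training set once
    w = [0.0] * num_features        # start from the zero predictor
    for _ in range(num_iters):
        batch = rng.sample(range(len(xs)), batch_size)
        grad = [0.0] * num_features
        for i in batch:
            residual = dot(w, feats[i]) - ys[i]
            for j, f in enumerate(feats[i]):
                grad[j] += residual * f / batch_size
        w = [w_j - step_size * g_j for w_j, g_j in zip(w, grad)]
    return lambda x: dot(w, z(x))


def mse(predict, xs, ys):
    """Mean squared error of a predictor on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


if __name__ == "__main__":
    xs = [[-3.0 + 6.0 * i / 99] for i in range(100)]
    ys = [math.sin(x[0]) for x in xs]
    f = fit_sgd_rf(xs, ys)
    print("training MSE:", mse(f, xs, ys))
```

    Because nothing in the update penalizes the weights, stopping earlier or using fewer features acts as the only form of regularization in this sketch, which mirrors the implicit regularization the abstract refers to.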