79 research outputs found

    Une brève histoire de l'apprentissage (A brief history of learning)


    Fast rates in statistical and online learning

    The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning: a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most of these conditions are special cases of a single, unifying condition that comes in two forms: the central condition for 'proper' learning algorithms that always output a hypothesis in the given model, and stochastic mixability for online algorithms that may make predictions outside of the model. We show that under surprisingly weak assumptions both conditions are, in a certain sense, equivalent. The central condition has a re-interpretation in terms of convexity of a set of pseudoprobabilities, linking it to density estimation under misspecification. For bounded losses, we show how the central condition enables a direct proof of fast rates and we prove its equivalence to the Bernstein condition, itself a generalization of the Tsybakov margin condition, both of which have played a central role in obtaining fast rates in statistical learning. Yet, while the Bernstein condition is two-sided, the central condition is one-sided, making it more suitable to deal with unbounded losses. In its stochastic mixability form, our condition generalizes both a stochastic exp-concavity condition identified by Juditsky, Rigollet and Tsybakov and Vovk's notion of mixability. Our unifying conditions thus provide a substantial step towards a characterization of fast rates in statistical learning, similar to how classical mixability characterizes constant regret in the sequential prediction with expert advice setting. (Comment: 69 pages, 3 figures)
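
    For orientation, the two conditions the abstract compares are commonly written as follows; this is a sketch in standard notation (loss $\ell_f$ of a hypothesis $f$, comparator $f^*$, model $\mathcal{F}$), not a verbatim quotation of the paper's definitions.

```latex
% Sketch in standard notation; not the paper's verbatim statements.
% Bernstein condition (two-sided): for some B > 0 and \beta \in (0,1],
\mathbb{E}\!\left[(\ell_f - \ell_{f^*})^2\right] \;\le\; B \left(\mathbb{E}\!\left[\ell_f - \ell_{f^*}\right]\right)^{\beta}
\quad \text{for all } f \in \mathcal{F}.
% \eta-central condition (one-sided): for some \eta > 0 and some f^* \in \mathcal{F},
\mathbb{E}\!\left[e^{-\eta\,(\ell_f - \ell_{f^*})}\right] \;\le\; 1
\quad \text{for all } f \in \mathcal{F}.
```

    The "one-sided" versus "two-sided" contrast mentioned in the abstract is visible here: the Bernstein condition controls the squared (hence two-sided) excess loss, whereas the central condition only constrains the lower tail of the excess loss, which is what makes it compatible with losses that are unbounded above.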

    Analysis of CART and Random Forest on Statistics Student Status at Universitas Terbuka

    CART and Random Forest are the machine learning methods at the core of this research: CART is used to identify indicators of student status, and Random Forest is used to improve classification accuracy. Based on the CART results, three parameters affect student status, namely the year of initial registration, the number of rolls, and the number of credits. Based on the classification accuracy results, Random Forest improves accuracy on the student status data, outperforming CART by 1.44% on the training data and by 2.24% on the testing data.
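
    The kind of comparison described above can be sketched with scikit-learn as shown below. This is a generic illustration, not the authors' pipeline; the file name and the column names (registration_year, rolls, credits, status) are hypothetical placeholders for the Universitas Terbuka student data.

```python
# Sketch: compare CART (a single decision tree) with a Random Forest on a
# student-status table. File and column names are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("student_status.csv")                 # hypothetical file
X = df[["registration_year", "rolls", "credits"]]      # indicators named in the abstract
y = df["status"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

cart = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)

for name, model in [("CART", cart), ("Random Forest", rf)]:
    print(name,
          "train acc:", accuracy_score(y_train, model.predict(X_train)),
          "test acc:", accuracy_score(y_test, model.predict(X_test)))

# CART additionally exposes which variables drive the splits, which is how
# status indicators (registration year, rolls, credits) can be read off.
print(dict(zip(X.columns, cart.feature_importances_)))
```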

    An $\{l_1, l_2, l_{\infty}\}$-Regularization Approach to High-Dimensional Errors-in-Variables Models

    Several new estimation methods have recently been proposed for the linear regression model with observation error in the design. Different assumptions on the data generating process have motivated different estimators and analyses. In particular, the literature has considered (1) observation errors in the design uniformly bounded by some $\bar\delta$, and (2) zero-mean independent observation errors. Under the first assumption, the rates of convergence of the proposed estimators depend explicitly on $\bar\delta$, while the second assumption has been applied when an estimator for the second moment of the observational error is available. This work proposes and studies two new estimators which, compared to other procedures for regression models with errors in the design, exploit an additional $l_{\infty}$-norm regularization. The first estimator is applicable when both (1) and (2) hold but does not require an estimator for the second moment of the observational error. The second estimator is applicable under (2) and requires an estimator for the second moment of the observation error. Importantly, we impose no assumption on the accuracy of this pilot estimator, in contrast to previously known procedures. As in the recent proposals, we allow the number of covariates to be much larger than the sample size. We establish the rates of convergence of the estimators and compare them with the bounds obtained for related estimators in the literature. These comparisons provide interesting insights into the interplay between the assumptions and the achievable rates of convergence.
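
    For reference, the errors-in-variables setting discussed above is usually written as below; the notation is generic and illustrative rather than the paper's own.

```latex
% Generic high-dimensional errors-in-variables linear model (illustrative notation):
% the true design X is unobserved; only a corrupted version Z is available.
y = X\theta^{*} + \xi, \qquad Z = X + W,
% assumption (1): design errors uniformly bounded, e.g. entrywise, by \bar\delta,
\|W\|_{\infty} \le \bar{\delta};
% assumption (2): zero-mean independent design errors,
\mathbb{E}[W] = 0.
```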

    Efficiency of the averaged rank-based estimator for first order Sobol index inference

    Among the many estimators of first order Sobol indices that have been proposed in the literature, the so-called rank-based estimator is arguably the simplest to implement. This estimator can be viewed as the empirical autocorrelation of the response variable sample obtained upon reordering the data by increasing values of the inputs. This simple idea can be extended to higher lags of autocorrelation, thus providing several competing estimators of the same parameter. We show that these estimators can be combined in a simple manner to achieve the theoretical variance efficiency bound asymptotically.
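
    Read literally, the lag-1 version of this estimator can be sketched in a few lines of NumPy, as below. This is an illustration of the reordering idea from the abstract, not the authors' code, and the higher-lag and averaged variants studied in the paper are not shown.

```python
# Sketch of the (lag-1) rank-based estimator of a first-order Sobol index:
# reorder the outputs by increasing values of one input, then take the
# empirical lag autocovariance of the reordered outputs, normalized by the
# output variance. Illustrative reading of the abstract, not the paper's code.
import numpy as np

def rank_sobol_first_order(x, y, lag=1):
    """Rank-based estimate of the first-order Sobol index of input x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    y_sorted = y[np.argsort(x)]          # reorder outputs by increasing input
    n = len(y_sorted)
    mean_y = y.mean()
    autocov = np.mean(y_sorted[: n - lag] * y_sorted[lag:]) - mean_y**2
    variance = np.mean(y**2) - mean_y**2
    return autocov / variance

# Toy check: for Y = X1 + 0.5*X2 with independent uniform inputs,
# the first-order indices are 0.8 for X1 and 0.2 for X2.
rng = np.random.default_rng(0)
x1, x2 = rng.uniform(size=100_000), rng.uniform(size=100_000)
y = x1 + 0.5 * x2
print(rank_sobol_first_order(x1, y), rank_sobol_first_order(x2, y))
```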

    A Trichotomy for Transductive Online Learning

    We present new upper and lower bounds on the number of learner mistakes in the `transductive' online learning setting of Ben-David, Kushilevitz and Mansour (1997). This setting is similar to standard online learning, except that the adversary fixes a sequence of instances $x_1,\dots,x_n$ to be labeled at the start of the game, and this sequence is known to the learner. Qualitatively, we prove a trichotomy, stating that the minimal number of mistakes made by the learner as $n$ grows can take only one of precisely three possible values: $n$, $\Theta(\log n)$, or $\Theta(1)$. Furthermore, this behavior is determined by a combination of the VC dimension and the Littlestone dimension. Quantitatively, we show a variety of bounds relating the number of mistakes to well-known combinatorial dimensions. In particular, we improve the known lower bound on the constant in the $\Theta(1)$ case from $\Omega(\sqrt{\log d})$ to $\Omega(\log d)$, where $d$ is the Littlestone dimension. Finally, we extend our results to cover multiclass classification and the agnostic setting.
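
    As a sketch of the setting, formalized here from the description above (the paper's exact definitions may differ in detail), the quantity being bounded is the minimax number of mistakes over a known instance sequence.

```latex
% Transductive online learning, realizable case (illustrative formalization):
% the adversary fixes x_1, \dots, x_n and reveals them to the learner up front;
% in round t the learner predicts \hat{y}_t for x_t and then observes the label.
M(\mathcal{H}, n)
  \;=\;
  \min_{\text{learner}} \;
  \max_{x_1, \dots, x_n} \;
  \max_{h \in \mathcal{H}} \;
  \sum_{t=1}^{n} \mathbf{1}\{\hat{y}_t \neq h(x_t)\}
% The trichotomy states that, as n grows, M(\mathcal{H}, n) takes one of the
% values n, \Theta(\log n), or \Theta(1), governed by the VC and Littlestone dimensions.
```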