10,535 research outputs found

    Automatic Debiased Machine Learning of Causal and Structural Effects

    Many causal and structural effects depend on regressions. Examples include average treatment effects, policy effects, average derivatives, regression decompositions, economic average equivalent variation, and parameters of economic structural models. The regressions may be high dimensional. Plugging machine learners into identifying equations can lead to poor inference due to bias and/or model selection. This paper gives automatic debiasing for estimating equations and valid asymptotic inference for the estimators of effects of interest. The debiasing is automatic in that its construction uses the identifying equations without the full form of the bias correction and is performed by machine learning. Novel results include convergence rates for Lasso and Dantzig learners of the bias correction, primitive conditions for asymptotic inference for important examples, and general conditions for GMM. A variety of regression learners and identifying equations are covered. Automatic debiased machine learning (Auto-DML) is applied to estimating the average treatment effect on the treated for the NSW job training data and to estimating demand elasticities from Nielsen scanner data while allowing preferences to be correlated with prices and income.

    Numerical analysis of least squares and perceptron learning for classification problems

    This work presents a study of regularized and non-regularized versions of perceptron learning and least squares algorithms for classification problems. Fréchet derivatives for the regularized least squares and perceptron learning algorithms are derived. Different Tikhonov regularization techniques for choosing the regularization parameter are discussed. Decision boundaries obtained by the non-regularized algorithms to classify simulated and experimental data sets are analyzed.

    General nonexact oracle inequalities for classes with a subexponential envelope

    We show that empirical risk minimization procedures and regularized empirical risk minimization procedures satisfy nonexact oracle inequalities in an unbounded framework, under the assumption that the class has a subexponential envelope function. The main novelty, in addition to the boundedness-assumption-free setup, is that those inequalities can yield fast rates even in situations in which exact oracle inequalities only hold with slower rates. We apply these results to show that procedures based on $\ell_1$ and nuclear norm regularization functions satisfy oracle inequalities with a residual term that decreases like $1/n$ for every $L_q$-loss function ($q\geq 2$), while only assuming that the tail behavior of the input and output variables is well behaved. In particular, no RIP type of assumption or "incoherence condition" is needed to obtain fast residual terms in those setups. We also apply these results to the problems of convex aggregation and model selection. Comment: Published at http://dx.doi.org/10.1214/11-AOS965 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Regularized brain reading with shrinkage and smoothing

    Functional neuroimaging measures how the brain responds to complex stimuli. However, sample sizes are modest, noise is substantial, and stimuli are high dimensional. Hence, direct estimates are inherently imprecise and call for regularization. We compare a suite of approaches which regularize via shrinkage: ridge regression, the elastic net (a generalization of ridge regression and the lasso), and a hierarchical Bayesian model based on small area estimation (SAE). We contrast regularization with spatial smoothing and combinations of smoothing and shrinkage. All methods are tested on functional magnetic resonance imaging (fMRI) data from multiple subjects participating in two different experiments related to reading, for both predicting neural response to stimuli and decoding stimuli from responses. Interestingly, when the regularization parameters are chosen by cross-validation independently for every voxel, low/high regularization is chosen in voxels where the classification accuracy is high/low, indicating that the regularization intensity is a good tool for identification of relevant voxels for the cognitive task. Surprisingly, all the regularization methods work about equally well, suggesting that beating basic smoothing and shrinkage will take not only clever methods, but also careful modeling. Comment: Published at http://dx.doi.org/10.1214/15-AOAS837 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org