Automatic Debiased Machine Learning of Causal and Structural Effects
Many causal and structural effects depend on regressions. Examples include
average treatment effects, policy effects, average derivatives, regression
decompositions, economic average equivalent variation, and parameters of
economic structural models. The regressions may be high dimensional. Plugging
machine learners into identifying equations can lead to poor inference due to
bias and/or model selection. This paper gives automatic debiasing for
estimating equations and valid asymptotic inference for the estimators of
effects of interest. The debiasing is automatic in that its construction uses
the identifying equations without the full form of the bias correction and is
performed by machine learning. Novel results include convergence rates for
Lasso and Dantzig learners of the bias correction, primitive conditions for
asymptotic inference for important examples, and general conditions for GMM. A
variety of regression learners and identifying equations are covered. Automatic
debiased machine learning (Auto-DML) is applied to estimating the average
treatment effect on the treated for the NSW job training data and to estimating
demand elasticities from Nielsen scanner data while allowing preferences to be
correlated with prices and income.
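For intuition, here is a minimal sketch of the automatic-debiasing idea for one leading example, the average treatment effect: learn the regression gamma over a dictionary, learn the Riesz representer directly from the identifying moment, and average the debiased scores. The simulated data, the dictionary, and the ridge-penalized representer fit are illustrative assumptions standing in for the paper's Lasso/Dantzig learners and cross-fitting; this is a sketch of the recipe, not the authors' implementation.

```python
# Hedged sketch of automatic debiasing for the average treatment effect (ATE).
# The moment is m(w, gamma) = gamma(1, x) - gamma(0, x); the Riesz representer
# is learned from that moment alone, without deriving its closed form.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n, p = 2000, 10
X = rng.normal(size=(n, p))
propensity = 1.0 / (1.0 + np.exp(-X[:, 0]))
D = rng.binomial(1, propensity)
Y = 2.0 * D + X[:, 0] + rng.normal(size=n)  # true ATE = 2

def dictionary(d, X):
    """Illustrative basis b(d, x): intercept, treatment, covariates, interactions."""
    d = np.asarray(d, float).reshape(-1, 1)
    return np.hstack([np.ones((X.shape[0], 1)), d, X, d * X])

# Regression learner gamma(d, x): Lasso on the dictionary.
B = dictionary(D, X)
gamma = LassoCV(cv=5).fit(B, Y)

# Riesz representer: minimize rho'G rho - 2 rho'M with a small ridge penalty,
# where G = E_n[b b'] and M = E_n[b(1, x) - b(0, x)] comes from the moment.
B1, B0 = dictionary(np.ones(n), X), dictionary(np.zeros(n), X)
G = B.T @ B / n
M = (B1 - B0).mean(axis=0)
lam = 1e-3  # illustrative penalty level
rho = np.linalg.solve(G + lam * np.eye(G.shape[0]), M)
alpha = B @ rho

# Debiased score: plug-in moment plus bias correction alpha * residual.
scores = (gamma.predict(B1) - gamma.predict(B0)) + alpha * (Y - gamma.predict(B))
theta = scores.mean()
se = scores.std(ddof=1) / np.sqrt(n)
print(f"ATE estimate {theta:.3f} +/- {1.96 * se:.3f}")
```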
Numerical analysis of least squares and perceptron learning for classification problems
This work presents a study of regularized and non-regularized versions of
perceptron learning and least squares algorithms for classification problems.
Fréchet derivatives for the regularized least squares and perceptron learning
algorithms are derived. Different Tikhonov regularization techniques for
choosing the regularization parameter are discussed. Decision boundaries
obtained by the non-regularized algorithms for classifying simulated and
experimental data sets are analyzed.
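To make the comparison concrete, here is a small sketch of the two learners the abstract studies on assumed simulated two-class data: Tikhonov-regularized least squares, via the closed-form solution w = (A'A + lambda*I)^(-1) A'y, and a plain perceptron. The fixed penalty level stands in for the parameter-selection techniques the paper discusses.

```python
# Hedged sketch: Tikhonov-regularized least squares vs. perceptron learning
# on a simulated two-class problem with labels in {-1, +1}.
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = np.vstack([rng.normal([-1.0, -1.0], 0.7, (n // 2, 2)),
               rng.normal([1.0, 1.0], 0.7, (n // 2, 2))])
y = np.hstack([-np.ones(n // 2), np.ones(n // 2)])
A = np.hstack([X, np.ones((n, 1))])  # design matrix with a bias column

# Regularized least squares: Tikhonov solution w = (A'A + lam*I)^(-1) A'y.
lam = 0.1
w_ls = np.linalg.solve(A.T @ A + lam * np.eye(3), A.T @ y)

# Perceptron: cycle through the data, updating only on misclassified points.
w_p, eta = np.zeros(3), 0.1
for _ in range(100):
    for a, t in zip(A, y):
        if t * (a @ w_p) <= 0:
            w_p += eta * t * a

# Report accuracy; the decision boundary is w[0]*x1 + w[1]*x2 + w[2] = 0.
for name, w in [("regularized LS", w_ls), ("perceptron", w_p)]:
    acc = np.mean(np.sign(A @ w) == y)
    print(f"{name}: accuracy {acc:.2f}, weights {np.round(w, 2)}")
```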
General nonexact oracle inequalities for classes with a subexponential envelope
We show that empirical risk minimization procedures and regularized empirical
risk minimization procedures satisfy nonexact oracle inequalities in an
unbounded framework, under the assumption that the class has a subexponential
envelope function. The main novelty, beyond the boundedness-assumption-free
setup, is that these inequalities can yield fast rates even in situations in
which exact oracle inequalities hold only with slower rates. We apply these
results to show that procedures based on $\ell_1$ and nuclear norm
regularization functions satisfy oracle inequalities with a residual term that
decreases like $1/n$ for every $L_q$-loss function ($q \ge 2$), while only
assuming that the tail behavior of the input and output variables is well
behaved. In particular, no RIP-type assumption or "incoherence condition"
is needed to obtain fast residual terms in those setups. We also apply these
results to the problems of convex aggregation and model selection.
Comment: Published at http://dx.doi.org/10.1214/11-AOS965 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org/).
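For readers unfamiliar with the terminology, the distinction the abstract draws can be written schematically (a generic form, not a result quoted from the paper): an exact oracle inequality bounds the estimator's risk by the best risk in the class plus a residual, while a nonexact one allows a multiplicative slack that can buy a faster residual.

```latex
% Schematic forms: \hat f is the (regularized) empirical risk minimizer over
% the class F, R is the risk, and r_n(\delta) is the residual term.
\begin{align*}
  \text{exact:}    \quad & R(\hat f) \le \inf_{f \in F} R(f) + r_n(\delta), \\
  \text{nonexact:} \quad & R(\hat f) \le (1 + \varepsilon)\inf_{f \in F} R(f)
                           + C_\varepsilon \, r_n(\delta),
\end{align*}
% each holding with probability at least 1 - \delta. The abstract's point is
% that allowing \varepsilon > 0 can shrink r_n from order 1/\sqrt{n} to 1/n.
```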
Regularized brain reading with shrinkage and smoothing
Functional neuroimaging measures how the brain responds to complex stimuli.
However, sample sizes are modest, noise is substantial, and stimuli are high
dimensional. Hence, direct estimates are inherently imprecise and call for
regularization. We compare a suite of approaches that regularize via
shrinkage: ridge regression, the elastic net (a generalization of ridge
regression and the lasso), and a hierarchical Bayesian model based on small
area estimation (SAE). We contrast regularization with spatial smoothing and
combinations of smoothing and shrinkage. All methods are tested on functional
magnetic resonance imaging (fMRI) data from multiple subjects participating in
two different experiments related to reading, for both predicting neural
response to stimuli and decoding stimuli from responses. Interestingly, when
the regularization parameters are chosen by cross-validation independently for
every voxel, light regularization is chosen in voxels where the
classification accuracy is high and heavy regularization where it is low,
indicating that the regularization intensity is a useful tool for
identifying voxels relevant to the
cognitive task. Surprisingly, all the regularization methods work about equally
well, suggesting that beating basic smoothing and shrinkage will take not only
clever methods, but also careful modeling.Comment: Published at http://dx.doi.org/10.1214/15-AOAS837 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
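A minimal sketch of the per-voxel shrinkage comparison described above, with synthetic stimulus/response data standing in for the fMRI recordings: ridge and elastic net encoding models are fit voxel by voxel, and the regularization strength is chosen by cross-validation independently for each voxel. The SAE hierarchical model, spatial smoothing, and the decoding direction are omitted for brevity.

```python
# Hedged sketch: per-voxel shrinkage with cross-validated regularization.
# Synthetic data: sparse linear responses of a few voxels to stimulus features.
import numpy as np
from sklearn.linear_model import RidgeCV, ElasticNetCV

rng = np.random.default_rng(2)
n_stimuli, n_features, n_voxels = 120, 50, 5
S = rng.normal(size=(n_stimuli, n_features))  # stimulus feature matrix
W = rng.normal(size=(n_features, n_voxels)) * (rng.random((n_features, n_voxels)) < 0.2)
R = S @ W + rng.normal(scale=2.0, size=(n_stimuli, n_voxels))  # noisy responses

# Fit ridge and elastic net per voxel, each choosing its own penalty by CV.
for v in range(n_voxels):
    ridge = RidgeCV(alphas=np.logspace(-2, 4, 20)).fit(S, R[:, v])
    enet = ElasticNetCV(cv=5, max_iter=5000).fit(S, R[:, v])
    print(f"voxel {v}: ridge alpha={ridge.alpha_:.2g}, "
          f"enet alpha={enet.alpha_:.2g}, "
          f"in-sample R2 (ridge)={ridge.score(S, R[:, v]):.2f}")
```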