Breaking the curse of dimensionality in regression
Models with many signals, i.e., high-dimensional models, often impose
structure on the signal strengths. The common assumption is that only a few
signals are strong and that the rest are zero or (collectively) close to zero.
However, such a requirement might not be valid in many real-life applications.
In this article, we are interested in conducting large-scale inference in
models that might have signals of mixed strengths. The key challenge is that
the signals that are not under testing might be collectively non-negligible
(although individually small) and cannot be accurately learned. This article
develops a new class of tests that arise from a moment-matching formulation. A
virtue of these moment-matching statistics is their ability to borrow strength
across features, adapt to the sparsity size, and adjust for testing a growing
number of hypotheses. The GRoup-level Inference of Parameter (GRIP) test
harvests effective sparsity structures together with the hypothesis formulation
to yield an efficient multiple testing procedure. Simulated data show that
GRIP's error control is far better than that of alternative methods. We develop
a minimax theory demonstrating the optimality of GRIP for a broad range of
models, including those in which the model is a mixture of sparse and
high-dimensional dense signals.
Comment: 51 pages
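To make the setting concrete, here is a minimal, illustrative sketch of a generic score-type group test in a sparse linear model. It is not the GRIP statistic defined in the paper; the model, the tested group, and the Lasso tuning parameter are placeholder assumptions.

```python
# Illustrative sketch only: a generic score-type group test in a sparse linear
# model, NOT the GRIP statistic from the paper. The model, the group G, and
# the Lasso tuning parameter are placeholder assumptions.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 500
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0                              # a few strong signals outside the group
y = X @ beta + rng.standard_normal(n)

G = np.arange(490, 500)                     # hypothetical group under test
notG = np.setdiff1d(np.arange(p), G)

# Learn the nuisance part of the signal on the complement of the group.
nuisance = Lasso(alpha=0.1).fit(X[:, notG], y)
resid = y - nuisance.predict(X[:, notG])

# Aggregate the group-wise score components; large values indicate signal in G
# beyond what the nuisance fit explains (informally, compare against a
# chi-square reference with |G| degrees of freedom).
scores = X[:, G].T @ resid / np.sqrt(n)
print("group score statistic:", float(np.sum(scores ** 2)))
```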
Post-Selection Inference for Generalized Linear Models with Many Controls
This paper considers generalized linear models in the presence of many
controls. We lay out a general methodology to estimate an effect of interest
based on the construction of an instrument that immunizes against model
selection mistakes, and we apply it to the case of the logistic binary choice
model. More specifically, we propose new methods for estimating and
constructing confidence regions for a regression parameter of primary
interest, namely the coefficient on the regressor of interest, such as a
treatment or policy variable. These methods allow one to estimate this
parameter at the root-n rate even when the total number of other regressors,
called controls, potentially exceeds the sample size, by using sparsity
assumptions. The sparsity assumption means that there is a small subset of
controls that suffices to accurately approximate the nuisance part of the
regression function.
Importantly, the estimators and the resulting confidence regions are valid
uniformly over sparse models satisfying appropriate sparsity and other
technical conditions. These procedures do not rely on traditional consistent
model selection arguments for their validity. In fact, they are robust with
respect to moderate model selection mistakes in variable selection. Under
suitable conditions, the estimators are semi-parametrically efficient in the
sense of attaining the semi-parametric efficiency bounds for the class of
models in this paper.
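As a rough illustration of the general idea of guarding an estimate against selection mistakes, the sketch below implements a double-selection style procedure for a logistic model with many controls. This is a common variant from this literature, not necessarily the exact instrument construction of the paper, and the simulated data and tuning constants are assumptions.

```python
# Minimal sketch of a double-selection style procedure for a logistic model
# with many controls. It illustrates the general idea of robustness to model
# selection mistakes; it is not the paper's exact instrument construction, and
# the simulated data and tuning constants are assumptions.
import numpy as np
import statsmodels.api as sm
from sklearn.linear_model import Lasso, LogisticRegression

rng = np.random.default_rng(1)
n, p = 500, 200
X = rng.standard_normal((n, p))                                  # controls
d = (X[:, 0] + rng.standard_normal(n) > 0).astype(float)        # treatment
logits = 0.5 * d + X[:, 0] - 0.5 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

# Step 1: select controls that predict the outcome (l1-penalized logistic fit).
sel_y = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
# Step 2: select controls that predict the treatment (Lasso fit).
sel_d = Lasso(alpha=0.05).fit(X, d)
keep = np.union1d(np.nonzero(sel_y.coef_.ravel())[0], np.nonzero(sel_d.coef_)[0])

# Step 3: unpenalized logistic fit of y on the treatment plus the union of
# selected controls; report the treatment coefficient and its standard error.
Z = sm.add_constant(np.column_stack([d, X[:, keep]]))
fit = sm.Logit(y, Z).fit(disp=0)
print("treatment estimate:", fit.params[1], "s.e.:", fit.bse[1])
```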
Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives
Over the past few years, adversarial training has become an extremely active
research topic and has been successfully applied to various Artificial
Intelligence (AI) domains. Because adversarial training is a potentially
crucial technique for the development of the next generation of emotional AI
systems, we herein provide a comprehensive overview of its application to
affective
computing and sentiment analysis. Various representative adversarial training
algorithms are explained and discussed, with a focus on tackling the diverse
challenges associated with emotional AI systems. Further, we highlight a range
of potential future research directions. We expect that this overview will help
facilitate the development of adversarial training for affective computing and
sentiment analysis in both the academic and industrial communities.
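As a concrete reference point, the following is a minimal sketch of one FGSM-style adversarial training step applied to embedded text, in the spirit of the approaches surveyed in this area; the model, optimizer, embedding batch, and epsilon are placeholders, and this code is not drawn from the paper itself.

```python
# Minimal sketch of one FGSM-style adversarial training step on embedded text.
# `model`, `optimizer`, `emb` (a batch of embedded inputs), `labels`, and
# `epsilon` are placeholders, not artifacts of the paper.
import torch.nn.functional as F

def adversarial_step(model, optimizer, emb, labels, epsilon=0.1):
    """One update on clean plus adversarially perturbed embeddings."""
    emb = emb.clone().detach().requires_grad_(True)
    F.cross_entropy(model(emb), labels).backward()
    # Perturb the embeddings in the direction that most increases the loss.
    emb_adv = (emb + epsilon * emb.grad.sign()).detach()

    optimizer.zero_grad()
    clean_loss = F.cross_entropy(model(emb.detach()), labels)
    adv_loss = F.cross_entropy(model(emb_adv), labels)
    (clean_loss + adv_loss).backward()
    optimizer.step()
    return clean_loss.item(), adv_loss.item()
```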
Combinatorial Penalties: Which structures are preserved by convex relaxations?
We consider the homogeneous and the non-homogeneous convex relaxations for
combinatorial penalty functions defined on support sets. Our study identifies
key differences in the tightness of the resulting relaxations through the
notion of the lower combinatorial envelope of a set-function along with new
necessary conditions for support identification. We then propose a general
adaptive estimator for convex monotone regularizers, and derive new sufficient
conditions for support recovery in the asymptotic setting.
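For orientation, a textbook special case (stated here for illustration, not taken from the paper) is the cardinality set-function, whose associated penalty has the l1 norm as its convex envelope over the unit l-infinity ball:

```latex
% Classical special case, stated for orientation only: for the cardinality
% set-function F(S) = |S|, the penalty F(supp(w)) = ||w||_0 has the l1 norm
% as its convex envelope over the unit l-infinity ball.
\[
  \|w\|_1 \;=\; \operatorname{conv}\!\bigl(\|w\|_0\bigr)
  \quad\text{on } \{\, w : \|w\|_\infty \le 1 \,\},
  \qquad
  \|w\|_1 \;\le\; \|w\|_0 \ \text{ whenever } \|w\|_\infty \le 1 .
\]
```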
Fast global convergence of gradient methods for high-dimensional statistical recovery
Many statistical M-estimators are based on convex optimization problems
formed by the combination of a data-dependent loss function with a norm-based
regularizer. We analyze the convergence rates of projected gradient and
composite gradient methods for solving such problems, working within a
high-dimensional framework that allows the data dimension to grow with (and
possibly exceed) the sample size. This high-dimensional
structure precludes the usual global assumptions---namely, strong convexity and
smoothness conditions---that underlie much of classical optimization analysis.
We define appropriately restricted versions of these conditions, and show that
they are satisfied with high probability for various statistical models. Under
these conditions, our theory guarantees that projected gradient descent has a
globally geometric rate of convergence up to the statistical precision of the
model, meaning the typical distance between the true unknown parameter and an
optimal solution of the convex program. This result is substantially
sharper than previous convergence results, which yielded sublinear convergence,
or linear convergence only up to the noise level. Our analysis applies to a
wide range of M-estimators and statistical models, including sparse linear
regression using the Lasso (l1-regularized regression); group Lasso for block
sparsity; log-linear models with regularization; low-rank matrix recovery using
nuclear norm regularization; and matrix decomposition. Overall, our analysis
reveals interesting connections between statistical precision and computational
efficiency in high-dimensional estimation.
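As one concrete instance of the composite gradient method for such M-estimators, here is a minimal Lasso sketch (iterative soft-thresholding); the step size, regularization level, and simulated data are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch of the composite (proximal) gradient method for the Lasso,
# one instance of the M-estimators covered by this kind of analysis; the step
# size, regularization level, and simulated data are illustrative assumptions.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def composite_gradient_lasso(X, y, lam, n_iters=500):
    n, p = X.shape
    beta = np.zeros(p)
    step = n / np.linalg.norm(X, 2) ** 2      # 1 / Lipschitz constant of the smooth part
    for _ in range(n_iters):
        grad = X.T @ (X @ beta - y) / n       # gradient of (1/2n) * ||y - X beta||^2
        beta = soft_threshold(beta - step * grad, step * lam)
    return beta

rng = np.random.default_rng(2)
n, p = 100, 300
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)
print(composite_gradient_lasso(X, y, lam=0.1)[:6].round(3))
```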
Estimating individual treatment effects under unobserved confounding using binary instruments
Estimating individual treatment effects (ITEs) from observational data is
relevant in many fields such as personalized medicine. However, in practice,
the treatment assignment is usually confounded by unobserved variables and thus
introduces bias. A remedy to remove the bias is the use of instrumental
variables (IVs). Such settings are widespread in medicine (e.g., trials where
compliance is used as a binary IV). In this paper, we propose a novel, multiply
robust machine learning framework, called MRIV, for estimating ITEs using
binary IVs, thereby yielding an unbiased ITE estimator. Unlike previous work
for binary IVs, our framework estimates the ITE directly via a pseudo-outcome
regression. (1) We provide a theoretical analysis where we show that
our framework yields multiply robust convergence rates: our ITE estimator
achieves fast convergence even if several nuisance estimators converge slowly.
(2) We further show that our framework asymptotically outperforms
state-of-the-art plug-in IV methods for ITE estimation. (3) We build upon our
theoretical results and propose a tailored deep neural network architecture
called MRIV-Net for ITE estimation using binary IVs. Across various
computational experiments, we demonstrate empirically that our MRIV-Net
achieves state-of-the-art performance. To the best of our knowledge, MRIV is
the first machine learning framework for estimating ITEs in the binary IV
setting that is shown to be multiply robust.
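For context, the following sketch implements the kind of plug-in IV baseline the abstract compares against: a conditional Wald estimator that takes the ratio of instrument-arm differences in outcome and treatment regressions. It is not the MRIV pseudo-outcome construction, and the data-generating process and learners below are assumptions made for illustration.

```python
# Sketch of a plug-in IV ("conditional Wald") estimator of the ITE with a
# binary instrument, the kind of baseline the abstract compares against; it is
# NOT the MRIV pseudo-outcome construction, and the data-generating process
# and learners below are assumptions made for illustration.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 2000
X = rng.standard_normal((n, 3))                  # observed covariates
Z = rng.integers(0, 2, n)                        # binary instrument
U = rng.standard_normal(n)                       # unobserved confounder
A = (0.8 * Z + 0.5 * U + rng.standard_normal(n) > 0).astype(float)  # treatment
tau = 1.0 + X[:, 0]                              # true individual effect
Y = tau * A + U + rng.standard_normal(n)

def fit_on_arm(target, mask):
    return GradientBoostingRegressor().fit(X[mask], target[mask])

# Outcome and treatment regressions within each instrument arm.
mu_y1, mu_y0 = fit_on_arm(Y, Z == 1), fit_on_arm(Y, Z == 0)
mu_a1, mu_a0 = fit_on_arm(A, Z == 1), fit_on_arm(A, Z == 0)

# Plug-in conditional Wald ratio, with the denominator bounded away from zero.
num = mu_y1.predict(X) - mu_y0.predict(X)
den = np.clip(mu_a1.predict(X) - mu_a0.predict(X), 0.05, None)
ite_hat = num / den
print("mean absolute error:", float(np.mean(np.abs(ite_hat - tau))))
```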
High-dimensional semi-supervised learning: in search for optimal inference of the mean
We provide a high-dimensional semi-supervised inference framework focused on
the mean and variance of the response. Our data comprise an extensive set of
observations of the covariate vectors and a much smaller set of labeled
observations for which we observe both the response and the covariates. We
allow the dimension of the covariates to be much larger than the sample size
and impose only weak conditions on the statistical form of the data. We
provide new estimators of the mean and variance of the response that extend
some of the recent results presented in low-dimensional models. In particular,
we do not always require consistent estimation of the functional form of the
data. Together with estimates of the population mean and variance, we provide
their asymptotic distributions and confidence intervals, and we showcase gains
in efficiency compared to the sample mean and variance. With minor
modifications, our procedure is then applied to inference about average
treatment effects. We also investigate the robustness of estimation and
coverage, and we showcase the widespread applicability and generality of the
proposed method.
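To anchor the setting, here is a minimal sketch of a generic semi-supervised mean estimator of the kind this line of work refines: a regression fit on the small labeled sample, averaged over the large unlabeled sample, plus a residual correction. The learner and simulated data are placeholder assumptions; the paper's estimators and conditions are more general.

```python
# Minimal sketch of a generic semi-supervised mean estimator: regress the
# response on the covariates in the small labeled sample, average the
# predictions over the large unlabeled sample, and correct with the labeled
# residuals. The learner and simulated data are placeholder assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(4)
n_lab, n_unlab, p = 100, 10_000, 50
X_lab = rng.standard_normal((n_lab, p))
X_unlab = rng.standard_normal((n_unlab, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y_lab = X_lab @ beta + rng.standard_normal(n_lab)

f = Ridge(alpha=1.0).fit(X_lab, y_lab)

# Projection term on the unlabeled covariates plus a residual bias correction.
mu_ss = f.predict(X_unlab).mean() + (y_lab - f.predict(X_lab)).mean()
print(f"semi-supervised: {mu_ss:.3f}   labeled-only mean: {y_lab.mean():.3f}")
```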