Asymptotic inference for high-dimensional data
In this paper, we study inference for high-dimensional data characterized by
small sample sizes relative to the dimension of the data. In particular, we
provide an infinite-dimensional framework to study statistical models that
involve situations in which (i) the number of parameters increases with the
sample size (that is, allowed to be random) and (ii) there is a possibility of
missing data. Under a variety of tail conditions on the components of the data,
we provide precise conditions for the joint consistency of the estimators of
the mean. In the process, we clarify and improve some of the recent consistency
results that appeared in the literature. An important aspect of the work
presented is the development of asymptotic normality results for these models.
As a consequence, we construct different test statistics for one-sample and
two-sample problems concerning the mean vector and obtain their asymptotic
distributions as a corollary of the infinite-dimensional results. Finally, we
use these theoretical results to develop an asymptotically justifiable
methodology for data analyses. Simulation results presented here describe
situations where the methodology can be successfully applied. They also
evaluate its robustness under a variety of conditions, some of which are
substantially different from the technical conditions. Comparisons to other
methods used in the literature are provided. Analyses of real-life data are also
included.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) by
the Institute of Mathematical Statistics (http://www.imstat.org) at
http://dx.doi.org/10.1214/09-AOS718
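A small simulation may help make the joint-consistency claim in the abstract above concrete. The sketch below is not taken from the paper: it assumes independent standard Gaussian components, a true mean of zero, a dimension that grows as p = 2n, and the sup norm as the measure of joint error. All names (`sup_norm_error`, `rng`, `errors`) are hypothetical. Under these assumptions the largest coordinate-wise error of the sample mean is expected to shrink roughly like sqrt(2 log p / n), even though p exceeds n.

```python
import random


def sup_norm_error(n, p, rng):
    """Largest absolute coordinate error of the sample mean of n draws from N(0, I_p)."""
    worst = 0.0
    for _ in range(p):
        # Each coordinate is simulated independently; the true mean is 0.
        mean_j = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        worst = max(worst, abs(mean_j))
    return worst


rng = random.Random(1)
sample_sizes = [25, 100, 400]
# Assumed growth rate for illustration only: dimension p = 2 * n.
errors = [sup_norm_error(n, 2 * n, rng) for n in sample_sizes]
for n, err in zip(sample_sizes, errors):
    print(n, 2 * n, round(err, 3))
```

Heavier tails or faster growth of p would need the kind of tail conditions the abstract refers to; this sketch only covers the Gaussian case.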
High Dimensional Data Enrichment: Interpretable, Fast, and Data-Efficient
A high-dimensional structured data-enriched model describes groups of
observations by shared and per-group individual parameters, each with its own
structure such as sparsity or group sparsity. In this paper, we consider the
general form of data enrichment where data comes in a fixed but arbitrary
number of groups G. Any convex function, e.g., norms, can characterize the
structure of both shared and individual parameters. We propose an estimator for
the high-dimensional data-enriched model and provide conditions under which it
consistently estimates both shared and individual parameters. We also delineate
the sample complexity of the estimator and present a high-probability
non-asymptotic bound on the estimation error of all parameters. Interestingly, the sample
complexity of our estimator translates to conditions on both per-group sample
sizes and the total number of samples. We propose an iterative estimation
algorithm with a linear convergence rate and supplement our theoretical analysis
with synthetic and real experimental results. In particular, we show the
predictive power of the data-enriched model, along with its interpretable
results, in anticancer drug sensitivity analysis.
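The data-enriched model described in the abstract above can be sketched in a few lines, assuming a linear model in which group g has coefficient vector shared + indiv[g]. This is a minimal illustration, not the paper's estimator or algorithm: it swaps in an l1-penalized least-squares objective solved by plain proximal gradient descent (soft-thresholding), and every function name, penalty weight, step size, and simulated dataset below is a hypothetical choice made for the example.

```python
import random


def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def gram_stats(X, y, n_total):
    """Return (X^T X / n_total, X^T y / n_total) for one group as plain lists."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) / n_total for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) / n_total for i in range(p)]
    return A, b


def fit_data_enriched(groups, lam_shared, lam_group, step=0.3, iters=2000):
    """l1-penalized least squares where group g uses coefficient shared + indiv[g]."""
    p = len(groups[0][0][0])
    n_total = sum(len(y) for _, y in groups)
    stats = [gram_stats(X, y, n_total) for X, y in groups]
    shared = [0.0] * p
    indiv = [[0.0] * p for _ in groups]
    for _ in range(iters):
        grads = []
        for (A, b), d in zip(stats, indiv):
            coef = [s + dj for s, dj in zip(shared, d)]
            # Gradient of this group's squared-error term w.r.t. its coefficient.
            grads.append([sum(A[i][j] * coef[j] for j in range(p)) - b[i]
                          for i in range(p)])
        # The shared parameter appears in every group, so its gradient is the sum.
        grad_shared = [sum(g[i] for g in grads) for i in range(p)]
        shared = [soft_threshold(s - step * gs, step * lam_shared)
                  for s, gs in zip(shared, grad_shared)]
        indiv = [[soft_threshold(dj - step * gj, step * lam_group)
                  for dj, gj in zip(d, g)]
                 for d, g in zip(indiv, grads)]
    return shared, indiv


def simulate_groups(shared_true, indiv_true, n_per_group, noise_sd, rng):
    """Draw Gaussian designs and responses y = x . (shared + indiv_g) + noise."""
    groups = []
    for d in indiv_true:
        coef = [s + dj for s, dj in zip(shared_true, d)]
        X = [[rng.gauss(0.0, 1.0) for _ in coef] for _ in range(n_per_group)]
        y = [sum(x * c for x, c in zip(row, coef)) + rng.gauss(0.0, noise_sd)
             for row in X]
        groups.append((X, y))
    return groups


# Hypothetical synthetic example: G = 2 groups, one shared and one individual effect each.
rng = random.Random(0)
shared_true = [2.0, 0.0, 0.0, 0.0, 0.0]
indiv_true = [[0.0, 1.0, 0.0, 0.0, 0.0], [0.0, 0.0, -1.0, 0.0, 0.0]]
groups = simulate_groups(shared_true, indiv_true, n_per_group=100, noise_sd=0.1, rng=rng)
shared_hat, indiv_hat = fit_data_enriched(groups, lam_shared=0.06, lam_group=0.04)
print("shared:", [round(v, 2) for v in shared_hat])
for g, d in enumerate(indiv_hat):
    print("group", g, [round(v, 2) for v in d])
```

Without a structural penalty the split between shared and individual parameters is not identifiable, since a constant can be moved from one to the other. In this sketch the l1 penalties resolve that ambiguity; the abstract allows any convex function, such as a norm, in that role.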
- …
