
    Estimating Random Variables from Random Sparse Observations

    Let X_1, ..., X_n be a collection of iid discrete random variables, and Y_1, ..., Y_m a set of noisy observations of these variables. Assume each observation Y_a to be a random function of a random subset of the X_i's, and consider the conditional distribution of X_i given the observations, namely \mu_i(x_i) \equiv \mathbb{P}\{X_i = x_i | Y\} (the a posteriori probability). We establish a general relation between the distribution of \mu_i and the fixed points of the associated density evolution operator. This relation holds asymptotically in the large-system limit, provided the average number of variables an observation depends on is bounded. We discuss the relevance of our result to a number of applications, ranging from sparse graph codes to multi-user detection to group testing. Comment: 22 pages, 1 eps figure; invited paper for European Transactions on Telecommunications
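    As a toy illustration of the objects involved (not the paper's method), the following Python sketch draws binary variables, observes them through noisy pairwise XOR measurements, and computes the posterior marginals \mu_i(x_i) = P(X_i = x_i | Y) by brute-force enumeration; the observation model and the noise level eps are illustrative assumptions.

```python
# Brute-force posterior marginals for a small sparse observation model.
# Assumed setup: X_i iid uniform on {0,1}; each Y_a is the XOR of a pair
# of variables, flipped with probability eps (illustrative choice only).
import itertools
import numpy as np

n = 6                                              # variables X_1..X_n
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]   # sparse subsets observed
eps = 0.1                                          # assumed noise level

def likelihood(x, y):
    """P(Y = y | X = x) under the toy XOR-with-flips channel."""
    out = 1.0
    for (i, j), y_a in zip(edges, y):
        out *= (1 - eps) if (x[i] ^ x[j]) == y_a else eps
    return out

rng = np.random.default_rng(0)
x_true = rng.integers(0, 2, n)
y = [int(x_true[i] ^ x_true[j]) ^ int(rng.random() < eps) for i, j in edges]

# mu[i] = (P(X_i = 0 | Y), P(X_i = 1 | Y)), summing over all 2^n configs.
mu = np.zeros((n, 2))
for x in itertools.product([0, 1], repeat=n):
    w = likelihood(x, y)
    for i in range(n):
        mu[i, x[i]] += w
mu /= mu.sum(axis=1, keepdims=True)
```

    In the paper's regime, n grows large while each Y_a depends on a bounded number of variables, and the empirical distribution of these \mu_i is characterised through the fixed points of density evolution.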

    Penalized Likelihood Methods for Estimation of Sparse High Dimensional Directed Acyclic Graphs

    Directed acyclic graphs (DAGs) are commonly used to represent causal relationships among random variables in graphical models. Applications of these models arise in the study of physical as well as biological systems, where directed edges between nodes represent the influence of components of the system on each other. The general problem of estimating DAGs from observed data is computationally NP-hard; moreover, two directed graphs may be observationally equivalent. When the nodes exhibit a natural ordering, the problem of estimating directed graphs reduces to the problem of estimating the structure of the network. In this paper, we propose a penalized likelihood approach that directly estimates the adjacency matrix of DAGs. Both lasso and adaptive lasso penalties are considered, and an efficient algorithm is proposed for estimation of high-dimensional DAGs. We study variable selection consistency of the two penalties when the number of variables grows to infinity with the sample size. We show that although the lasso can only consistently estimate the true network under stringent assumptions, the adaptive lasso achieves this task under mild regularity conditions. The performance of the proposed methods is compared to alternative methods in simulated as well as real data examples. Comment: 19 pages, 8 figures
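    Under a known ordering, the reduction is easy to sketch: edges can only point from earlier to later nodes, so each node may be lasso-regressed on its predecessors and nonzero coefficients read off as directed edges. The sketch below uses a plain lasso via scikit-learn and is only an illustration of this idea, not the authors' algorithm; an adaptive-lasso variant would reweight the penalty by inverse initial coefficient magnitudes.

```python
# Sketch: DAG estimation under a known node ordering via node-wise lasso.
import numpy as np
from sklearn.linear_model import Lasso

def estimate_dag(X, alpha=0.05):
    """X: n x p data, columns already in the natural (causal) ordering.
    Returns a strictly upper-triangular adjacency matrix A,
    where A[i, j] != 0 encodes the edge i -> j (acyclic by construction)."""
    n, p = X.shape
    A = np.zeros((p, p))
    for j in range(1, p):
        model = Lasso(alpha=alpha, fit_intercept=False)
        model.fit(X[:, :j], X[:, j])    # regress node j on its predecessors
        A[:j, j] = model.coef_
    return A

# Toy example with the true chain 0 -> 1 -> 2.
rng = np.random.default_rng(1)
n = 500
x0 = rng.standard_normal(n)
x1 = 0.8 * x0 + 0.3 * rng.standard_normal(n)
x2 = -0.6 * x1 + 0.3 * rng.standard_normal(n)
A_hat = estimate_dag(np.column_stack([x0, x1, x2]))
```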

    Fast and efficient algorithms for sparse semiparametric bi-functional regression

    A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables on a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other enters linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures exploit the functional origin of the linear covariates. Finite-sample experiments demonstrate the scope of application of both algorithms: the first is a fast algorithm that avoids the significant computational cost of standard variable selection methods for this model without loss in predictive ability, and the second completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrates the applicability of the presented methodology from a predictive perspective, in terms of both the interpretability of outputs and low computational cost. Comment: 33 pages, 6 figures, 10 tables
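    As a rough illustration of the linear, variable-selection part of the model only (the single-index component and the paper's two algorithms are not reproduced), the sketch below penalises the discretised values of a functional covariate with a cross-validated lasso; the toy curves and all parameter choices are assumptions made for illustration.

```python
# Sketch: sparse selection over the discretised values of a functional covariate.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p = 200, 100                           # n curves, each observed on p grid points
t = np.linspace(0.0, 1.0, p)
# Toy functional covariate: random multiples of a smooth curve plus noise.
Z = rng.standard_normal((n, 1)) * np.sin(2 * np.pi * t) \
    + 0.2 * rng.standard_normal((n, p))
beta = np.zeros(p)
beta[40:45] = 1.0                         # only a few discretisation points matter
y = Z @ beta + 0.5 * rng.standard_normal(n)

fit = LassoCV(cv=5).fit(Z, y)
selected = np.flatnonzero(fit.coef_)      # indices of selected grid points
```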

    Most Likely Separation of Intensity and Warping Effects in Image Registration

    This paper introduces a class of mixed-effects models for the joint modeling of spatially correlated intensity variation and warp variation in 2D images. Intensity variation and warp variation are modeled as random effects, resulting in a nonlinear mixed-effects model that enables simultaneous estimation of the template and model parameters by optimization of the likelihood function. We propose an algorithm for fitting the model that alternates between estimation of variance parameters and image registration. This approach avoids the potential bias in the template estimate that arises when registration is treated as a preprocessing step. We apply the model to datasets of facial images and 2D brain magnetic resonance images to illustrate the simultaneous estimation and prediction of intensity and warp effects.
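    The alternation at the heart of the fitting algorithm can be sketched generically: register each image to the current template, then update the template from the aligned images, and repeat. The Python below restricts registration to integer translations and omits the variance-parameter and random-effects machinery entirely, so it is a schematic of the alternation only, not the authors' nonlinear mixed-effects fit.

```python
# Schematic alternation: registration step <-> template update step.
import numpy as np

def register(img, template, max_shift=3):
    """Best integer translation of img onto template (SSD criterion)."""
    best, best_s = np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            ssd = np.sum((np.roll(img, (dy, dx), axis=(0, 1)) - template) ** 2)
            if ssd < best:
                best, best_s = ssd, (dy, dx)
    return best_s

def fit_template(images, n_iter=5):
    template = images.mean(axis=0)                  # initial template estimate
    for _ in range(n_iter):
        shifts = [register(im, template) for im in images]
        aligned = [np.roll(im, s, axis=(0, 1)) for im, s in zip(images, shifts)]
        template = np.mean(aligned, axis=0)         # template update
    return template, shifts

# Toy usage: shifted, noisy copies of one 32 x 32 image.
rng = np.random.default_rng(3)
base = rng.random((32, 32))
images = np.stack([np.roll(base, (s, -s), axis=(0, 1))
                   + 0.05 * rng.standard_normal((32, 32)) for s in (-1, 0, 1)])
template, shifts = fit_template(images)
```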

    The Dantzig selector: Statistical estimation when p is much larger than n

    In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = X\beta + z, where \beta \in \mathbf{R}^p is a parameter vector of interest, X is a data matrix with possibly far fewer rows than columns, n \ll p, and the z_i's are i.i.d. N(0, \sigma^2). Is it possible to estimate \beta reliably based on the noisy data y? To estimate \beta, we introduce a new estimator--we call it the Dantzig selector--which is a solution to the \ell_1-regularization problem \min_{\tilde{\beta} \in \mathbf{R}^p} \|\tilde{\beta}\|_{\ell_1} subject to \|X^* r\|_{\ell_\infty} \leq (1 + t^{-1}) \sqrt{2 \log p} \cdot \sigma, where r is the residual vector y - X\tilde{\beta} and t is a positive scalar. We show that if X obeys a uniform uncertainty principle (with unit-normed columns) and if the true parameter vector \beta is sufficiently sparse (which here roughly guarantees that the model is identifiable), then with very large probability, \|\hat{\beta} - \beta\|_{\ell_2}^2 \leq C^2 \cdot 2 \log p \cdot (\sigma^2 + \sum_i \min(\beta_i^2, \sigma^2)). Our results are nonasymptotic and we give values for the constant C. Even though n may be much smaller than p, our estimator achieves a loss within a logarithmic factor of the ideal mean squared error one would achieve with an oracle supplying perfect information about which coordinates are nonzero and which are above the noise level. In multivariate regression and from a model selection viewpoint, our result says that it is possible to nearly select the best subset of variables by solving a very simple convex program, which, in fact, can easily be recast as a convenient linear program (LP). Comment: This paper is discussed in [arXiv:0803.3124], [arXiv:0803.3126], [arXiv:0803.3127], [arXiv:0803.3130], [arXiv:0803.3134], [arXiv:0803.3135]; rejoinder in [arXiv:0803.3136]. Published at http://dx.doi.org/10.1214/009053606000001523 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
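    Since the Dantzig selector is itself a linear program, it can be sketched directly: introduce auxiliary variables u >= |beta| and minimise their sum subject to the residual-correlation constraint. The code below is a minimal generic LP formulation via scipy.optimize.linprog, not the authors' implementation, and the threshold lam fixes the t-dependent factor at an arbitrary illustrative value.

```python
# Sketch: the Dantzig selector as an LP in the variables (beta, u).
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    """Solve min ||beta||_1 subject to ||X^T (y - X beta)||_inf <= lam."""
    n, p = X.shape
    G = X.T @ X
    c = np.concatenate([np.zeros(p), np.ones(p)])   # objective: sum of u
    I, Z = np.eye(p), np.zeros((p, p))
    # Rows:  beta - u <= 0,  -beta - u <= 0,
    #        G beta <= lam + X^T y,  -G beta <= lam - X^T y.
    A = np.block([[I, -I], [-I, -I], [G, Z], [-G, Z]])
    b = np.concatenate([np.zeros(2 * p), lam + X.T @ y, lam - X.T @ y])
    res = linprog(c, A_ub=A, b_ub=b, bounds=(None, None), method="highs")
    return res.x[:p]

# Toy usage with a sparse truth and roughly unit-normed columns.
rng = np.random.default_rng(4)
n, p, sigma = 50, 200, 0.5
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
beta[:5] = 3.0
y = X @ beta + sigma * rng.standard_normal(n)
lam = 2.0 * np.sqrt(2 * np.log(p)) * sigma   # assumed (1 + 1/t) factor of 2
beta_hat = dantzig_selector(X, y, lam)
```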