
    Tail bounds for all eigenvalues of a sum of random matrices

    This work introduces the minimax Laplace transform method, a modification of the cumulant-based matrix Laplace transform method developed in "User-friendly tail bounds for sums of random matrices" (arXiv:1004.4389v6) that yields both upper and lower bounds on each eigenvalue of a sum of random self-adjoint matrices. This machinery is used to derive eigenvalue analogues of the classical Chernoff, Bennett, and Bernstein bounds. Two examples demonstrate the efficacy of the minimax Laplace transform. The first concerns the effects of column sparsification on the spectrum of a matrix with orthonormal rows; here, the behavior of the singular values can be described in terms of coherence-like quantities. The second addresses relative accuracy in the estimation of the eigenvalues of the covariance matrix C of a random process. Standard results on the convergence of sample covariance matrices bound the number of samples needed for relative accuracy in the spectral norm, but these results guarantee relative accuracy only for the estimate of the maximum eigenvalue. The minimax Laplace transform argument establishes that, if the lowest eigenvalues decay sufficiently fast, then on the order of (K^2 r log p)/eps^2 samples, where K is the condition number of an optimal rank-r approximation to C, suffice to ensure that the dominant r eigenvalues of the covariance matrix of an N(0, C) random vector are estimated to within a factor of 1 ± eps with high probability.

    Comment: 20 pages, 1 figure; see also arXiv:1004.4389v
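    As a rough numerical illustration of the covariance result (a minimal sketch under assumed values, not the paper's construction; the spectrum of C, the constant in the sample-size rule, and all parameter choices below are placeholders), one can draw on the order of (K^2 r log p)/eps^2 samples from N(0, C) and check the relative error of the top-r sample eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
p, r, eps = 100, 5, 0.2

# Covariance with r dominant eigenvalues and a fast-decaying tail (assumed spectrum).
eigs = np.concatenate([np.linspace(10.0, 5.0, r), 0.01 / np.arange(1, p - r + 1)])
Q, _ = np.linalg.qr(rng.standard_normal((p, p)))
C = Q @ np.diag(eigs) @ Q.T

K = eigs[0] / eigs[r - 1]                          # condition number of the rank-r part
n = int(np.ceil(K**2 * r * np.log(p) / eps**2))    # sample-size rule (constant = 1 assumed)

X = rng.multivariate_normal(np.zeros(p), C, size=n)
sample_eigs = np.linalg.eigvalsh(X.T @ X / n)[::-1]

rel_err = np.abs(sample_eigs[:r] - eigs[:r]) / eigs[:r]
print(n, rel_err.max())  # max relative error over the dominant r eigenvalues
```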

    Convex optimization methods for graphs and statistical modeling

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 209-220).

    An outstanding challenge in many problems throughout science and engineering is to succinctly characterize the relationships among a large number of interacting entities. Models based on graphs form one major thrust of this thesis, as graphs often provide a concise representation of the interactions among a large set of variables. A second major emphasis of this thesis is classes of structured models that satisfy certain algebraic constraints. The common theme underlying these approaches is the development of computational methods based on convex optimization, which are in turn useful in a broad array of problems in signal processing and machine learning. The specific contributions are as follows:

    -- We propose a convex optimization method for decomposing the sum of a sparse matrix and a low-rank matrix into the individual components. Based on new rank-sparsity uncertainty principles, we give conditions under which the convex program exactly recovers the underlying components.

    -- Building on the previous point, we describe a convex optimization approach to latent-variable Gaussian graphical model selection. We provide theoretical guarantees of the statistical consistency of this convex program in the high-dimensional scaling regime in which the number of latent/observed variables grows with the number of samples of the observed variables. The algebraic varieties of sparse and low-rank matrices play a prominent role in this analysis.

    -- We present a general convex optimization formulation for linear inverse problems, in which we have limited measurements in the form of linear functionals of a signal or model of interest. When the underlying models have algebraic structure, the resulting convex programs can be solved exactly or approximately via semidefinite programming. We provide sharp estimates (based on computing certain Gaussian statistics related to the underlying model geometry) of the number of generic linear measurements required for exact and robust recovery in a variety of settings.

    -- We present convex graph invariants: invariants of a graph that are convex functions of the underlying adjacency matrix. Graph invariants characterize structural properties of a graph that do not depend on the labeling of the nodes; convex graph invariants constitute an important subclass, and they provide a systematic and unified computational framework based on convex optimization for solving a number of interesting graph problems.

    We emphasize a unified view of the underlying convex geometry common to these different frameworks. We describe applications of these methods to problems in financial modeling and network analysis, and conclude with a discussion of directions for future research.

    by Venkat Chandrasekaran. Ph.D.
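    The first contribution (decomposing a matrix into sparse and low-rank parts) can be prototyped directly as a convex program. Below is a minimal sketch using the cvxpy modeling library; the trade-off weight lam is a common heuristic and the synthetic data are placeholders, not the thesis's setup:

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(1)
n = 30
L_true = rng.standard_normal((n, 2)) @ rng.standard_normal((2, n))   # rank-2 component
S_true = rng.standard_normal((n, n)) * (rng.random((n, n)) < 0.05)   # sparse component
M = L_true + S_true

L = cp.Variable((n, n))
S = cp.Variable((n, n))
lam = 1.0 / np.sqrt(n)  # heuristic trade-off weight (assumption)

# Nuclear norm as the low-rank surrogate, entrywise l1 norm as the sparsity surrogate.
prob = cp.Problem(cp.Minimize(cp.normNuc(L) + lam * cp.sum(cp.abs(S))),
                  [L + S == M])
prob.solve()
print(np.linalg.norm(L.value - L_true) / np.linalg.norm(L_true))  # relative error
```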

    Compressive sensing for signal ensembles

    Compressive sensing (CS) is a new approach to simultaneous sensing and compression that enables a potentially large reduction in the sampling and computation costs for acquisition of signals having a sparse or compressible representation in some basis. The CS literature has focused almost exclusively on problems involving single signals in one or two dimensions. However, many important applications involve distributed networks or arrays of sensors. In other applications, the signal is inherently multidimensional and sensed progressively along a subset of its dimensions; examples include hyperspectral imaging and video acquisition. Initial work proposed joint sparsity models for signal ensembles that exploit both intra- and inter-signal correlation structures; such models enable a reduction in the total number of compressive measurements required by CS through the use of specially tailored recovery algorithms. This thesis reviews several models for sparsity and compressibility of signal ensembles and multidimensional signals and proposes practical CS measurement schemes for these settings. For joint sparsity models, we evaluate the minimum number of measurements required under a recovery algorithm with combinatorial complexity. We also propose a framework for CS that uses a union-of-subspaces signal model. This framework leverages the structure present in certain sparse signals and can exploit both intra- and inter-signal correlations in signal ensembles. We formulate signal recovery algorithms that employ these new models to enable a reduction in the number of measurements required. Additionally, we propose the use of Kronecker product matrices as sparsity or compressibility bases for signal ensembles and multidimensional signals, jointly modeling all types of correlation present in the signal when each type can be expressed using sparsity. We compare the performance of standard global measurement ensembles, which act on all of the signal samples; partitioned measurements, which act on a partition of the signal, with a given measurement depending only on one piece of the signal; and Kronecker product measurements, which can be implemented in distributed measurement settings. The Kronecker product formulation in the sparsity and measurement settings enables the derivation of analytical bounds for transform coding compression of signal ensembles and multidimensional signals. We also provide new theoretical results for the performance of CS recovery when Kronecker product matrices are used, which in turn motivate new design criteria for distributed CS measurement schemes.
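    One reason Kronecker product measurements suit distributed settings is that the Kronecker matrix never needs to be formed explicitly: applying kron(A, B) to a vectorized 2-D signal is equivalent to applying A and B separately along each dimension. A small numpy check of this identity (the matrices below are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))   # measurement operator along dimension 1
B = rng.standard_normal((5, 6))   # measurement operator along dimension 2
X = rng.standard_normal((4, 6))   # 2-D signal

lhs = np.kron(A, B) @ X.ravel()   # explicit Kronecker measurement
rhs = (A @ X @ B.T).ravel()       # separable, per-dimension application
assert np.allclose(lhs, rhs)
```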

    Quantum Nescimus: Improving the characterization of quantum systems from limited information

    We are currently approaching the point where quantum systems with 15 or more qubits will be controllable with high levels of coherence over long timescales. One of the fundamental problems that has been identified is that, as the number of qubits increases to these levels, there is currently no clear way to efficiently use the information that can be obtained from such a system to make diagnostic inferences and to enable improvements in the underlying quantum gates. Even with systems of only a few qubits, the exponential scaling in resources required by techniques such as quantum tomography or gate-set tomography will render these techniques impractical. Randomized benchmarking (RB) is a technique that will scale in a practical way with these increased system sizes. Although RB provides only a partial characterization of the quantum system, recent advances in the protocol and the interpretation of the results of such experiments confirm that the information obtained is helpful in improving the control and verification of such processes. This thesis examines and extends the techniques of RB, including practical analysis of systems affected by low-frequency noise, extending techniques to allow the anisotropy of noise to be isolated, and showing how additional gates required for universal computation can be added to the protocol and thus benchmarked. Finally, it begins to explore the use of machine learning to aid in the ability to characterize, verify and validate noise in such systems, demonstrating by way of example how machine learning can be used to explore the edge between quantum non-locality and realism.
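    In its standard zeroth-order form, RB analysis reduces to fitting an exponential decay of average sequence fidelity against sequence length m, F(m) = A p^m + B, and converting the decay parameter p into an average error rate. A minimal sketch of that fit (the fidelity data below are invented for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def rb_decay(m, A, B, p):
    # Zeroth-order randomized benchmarking model: F(m) = A * p**m + B
    return A * p**m + B

# Hypothetical average sequence fidelities at each sequence length.
m = np.array([2, 4, 8, 16, 32, 64, 128])
F = np.array([0.98, 0.96, 0.93, 0.87, 0.78, 0.64, 0.47])

(A, B, p), _ = curve_fit(rb_decay, m, F, p0=[0.5, 0.5, 0.99])
r = (1 - p) / 2  # average error per gate for a single qubit (d = 2): r = (d-1)(1-p)/d
print(f"decay p = {p:.4f}, error per gate r = {r:.5f}")
```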

    Spectral methods and computational trade-offs in high-dimensional statistical inference

    Spectral methods have become increasingly popular in designing fast algorithms for modern high-dimensional datasets. This thesis looks at several problems in which spectral methods play a central role. In some cases, we also show that such procedures have essentially the best performance among all randomised polynomial-time algorithms, by exhibiting statistical and computational trade-offs in those problems.

    In the first chapter, we prove a useful variant of the well-known Davis-Kahan theorem, a spectral perturbation result that allows us to bound the distance between population eigenspaces and their sample versions. We then propose a semi-definite programming algorithm for the sparse principal component analysis (PCA) problem, and analyse its theoretical performance using the perturbation bounds derived earlier. It turns out that the parameter regime in which our estimator is consistent is strictly smaller than the consistency regime of a minimax optimal (yet computationally intractable) estimator. We show, through reduction from a well-known hard problem in computational complexity theory, that the difference in consistency regimes is unavoidable for any randomised polynomial-time estimator, revealing subtle statistical and computational trade-offs in this problem.

    Such computational trade-offs also exist in the problem of restricted isometry certification. Certifiers for restricted isometry properties can be used to construct design matrices for sparse linear regression problems. As in the sparse PCA problem, we show that there is an intrinsic gap between the class of matrices certifiable using unrestricted algorithms and using polynomial-time algorithms.

    Finally, we consider the problem of high-dimensional changepoint estimation, where we estimate the time of change in the mean of a high-dimensional time series with piecewise constant mean structure. Motivated by real-world applications, we assume that changes occur only in a sparse subset of all coordinates. We apply a variant of the semi-definite programming algorithm from the sparse PCA chapter to aggregate the signals across different coordinates in a near-optimal way, so as to estimate the changepoint location as accurately as possible. Our statistical procedure shows superior performance compared to existing methods for this problem.

    St John's College and Cambridge Overseas Trust
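    The semi-definite programming relaxation for sparse PCA mentioned above can be written down compactly. A minimal cvxpy sketch of the generic relaxation (the penalty lam and the data are placeholders, and this is the textbook SDP form rather than necessarily the exact estimator analysed in the thesis):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(2)
p = 20
Z = rng.standard_normal((200, p))
Sigma = Z.T @ Z / 200      # sample covariance (placeholder data)
lam = 0.1                  # sparsity penalty (assumption)

X = cp.Variable((p, p), PSD=True)
# Maximize explained variance, penalizing the entrywise l1 norm to promote sparsity.
prob = cp.Problem(cp.Maximize(cp.trace(Sigma @ X) - lam * cp.sum(cp.abs(X))),
                  [cp.trace(X) == 1])
prob.solve()

# The leading eigenvector of the optimal X estimates the sparse principal component.
w = np.linalg.eigh(X.value)[1][:, -1]
```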

    Link Prediction and Denoising in Networks

    Network data represent connections between units of interest, but are often noisy and/or include missing values. This thesis focuses on denoising network data by inferring the underlying network structure from an observed noisy realization. The observed network data can be viewed as a single random realization of an unobserved latent structure, and our general approach to estimating this latent structure is based on factorizing it into a product of interpretable components, with structural assumptions on the components determined by the nature of the problem.

    We first study the problem of predicting links when edge features are available, or node features that can be converted into edge features. We propose a regression-type model to combine information from network structure and edge features. We show that estimating parameters in this model is straightforward and that the estimator enjoys excellent theoretical performance guarantees.

    Another direction we study is predicting links in time-stamped dynamic networks. A common approach to modeling networks observed over time is aggregating the networks into a few snapshots, which reduces computational complexity but also loses information. We address this limitation through a dynamic network model based on tensor factorization, which simultaneously captures time trends and the graph structure of dynamic networks without aggregating over time. We develop an efficient algorithm to fit this model and demonstrate that the method performs well numerically.

    The last contribution of this thesis is link prediction for ego-networks. Ego-networks are constructed by recording all friends of a particular user, or several users, and are widely used in survey-based social data collection. There are many methods for filling in missing data in a matrix when entries are missing independently at random, but here it is more appropriate to assume that whole rows of the matrix are missing (corresponding to users), whereas other rows are observed completely. We develop an approach to estimating missing links in this scenario via subspace estimation, exploiting potential low-rank structure common in networks. We obtain theoretical bounds on the estimator's performance and demonstrate that it significantly outperforms many widely used benchmarks in both simulated and real networks.

    Ph.D. Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/138596/1/yjwu_1.pd
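    A generic instance of the low-rank device running through this thesis is spectral denoising of a symmetric adjacency matrix: estimate the latent structure from its top eigencomponents. A minimal numpy sketch (this illustrates the general idea, not the thesis's specific ego-network estimator):

```python
import numpy as np

def spectral_denoise(A, r):
    """Keep the top-r eigencomponents (by magnitude) of a symmetric
    matrix as an estimate of the underlying low-rank structure."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:r]
    return (vecs[:, top] * vals[top]) @ vecs[:, top].T

# Noisy realization of a rank-2 latent structure (illustrative data).
rng = np.random.default_rng(3)
n = 100
U = rng.standard_normal((n, 2))
P = U @ U.T                            # latent low-rank structure
A = P + rng.standard_normal((n, n))
A = (A + A.T) / 2                      # symmetrize the noise
P_hat = spectral_denoise(A, 2)
print(np.linalg.norm(P_hat - P) / np.linalg.norm(P))  # relative estimation error
```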