
    On the sub-Gaussianity of the Beta and Dirichlet distributions

    We obtain the optimal proxy variance for the sub-Gaussianity of the Beta distribution, thus proving upper bounds recently conjectured by Elder (2016). We provide different proof techniques for the symmetric (around its mean) case and the asymmetric case. The technique in the latter case relies on studying the ordinary differential equation satisfied by the Beta moment-generating function, known as the confluent hypergeometric function. As a consequence, we derive the optimal proxy variance for the Dirichlet distribution, which is apparently a novel result. We also provide a new proof of the optimal proxy variance for the Bernoulli distribution, and discuss in this context the relation of the proxy variance to log-Sobolev inequalities and transport inequalities. Comment: 13 pages, 2 figures
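    The sub-Gaussian property above can be checked numerically. The sketch below assumes the symmetric-case reading of this line of work, namely that for a symmetric Beta(a, a) the optimal proxy variance coincides with the distribution's variance, so the moment-generating-function bound E[exp(λ(X − μ))] ≤ exp(λ²σ²/2) should hold with σ² equal to the variance. The parameter a and the grid of λ values are arbitrary illustrative choices, and this is a numerical sanity check, not a proof.

```python
import numpy as np
from math import gamma

# Symmetric Beta(a, a): check the sub-Gaussian MGF bound using a proxy
# variance equal to the distribution's variance (assumed optimal in the
# symmetric case). Parameters are illustrative.
a = 2.0
mean = 0.5
var = 1.0 / (4.0 * (2.0 * a + 1.0))  # variance of Beta(a, a)

x = np.linspace(0.0, 1.0, 200001)
pdf = x ** (a - 1) * (1 - x) ** (a - 1) * gamma(2 * a) / gamma(a) ** 2
dx = x[1] - x[0]

def mgf(lam):
    """Trapezoidal estimate of E[exp(lam * (X - mean))]."""
    f = np.exp(lam * (x - mean)) * pdf
    return (f[:-1] + f[1:]).sum() * dx / 2.0

for lam in np.linspace(-20.0, 20.0, 41):
    bound = np.exp(lam ** 2 * var / 2.0)  # sub-Gaussian MGF bound
    assert mgf(lam) <= bound * (1.0 + 1e-6), lam
print("MGF bound holds on the tested grid")
```

    If the proxy variance in the bound is replaced by anything smaller than `var`, the assertion fails for small λ, which is what optimality of the proxy variance means.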

    On the Accuracy of Hotelling-Type Asymmetric Tensor Deflation: A Random Tensor Analysis

    This work introduces an asymptotic study of Hotelling-type tensor deflation in the presence of noise, in the regime of large tensor dimensions. Specifically, we consider a low-rank asymmetric tensor model of the form $\sum_{i=1}^r \beta_i \mathcal{A}_i + \mathcal{W}$, where $\beta_i \geq 0$, the $\mathcal{A}_i$'s are unit-norm rank-one tensors such that $|\langle \mathcal{A}_i, \mathcal{A}_j \rangle| \in [0, 1]$ for $i \neq j$, and $\mathcal{W}$ is an additive noise term. Assuming that the dominant components are successively estimated from the noisy observation and subsequently subtracted, we leverage recent advances in random tensor theory in the regime of asymptotically large tensor dimensions to analytically characterize the estimated singular values and the alignments between the estimated and true singular vectors at each step of the deflation procedure. Furthermore, this result can be used to construct estimators of the signal-to-noise ratios $\beta_i$ and of the alignments between the estimated and true rank-1 signal components. Comment: Accepted at IEEE CAMSAP 2023. See also companion paper arXiv:2304.10248 for the symmetric case. arXiv admin note: text overlap with arXiv:2211.0900
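    The deflation procedure the abstract analyzes can be sketched numerically: estimate the dominant rank-one component of a noisy order-3 tensor by tensor power iteration, subtract it, and repeat. This is a generic illustration, not the paper's estimator; the dimensions, signal strengths, noise level, and random seed below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 30, 2
betas = [5.0, 3.0]  # illustrative signal-to-noise ratios

def unit(v):
    return v / np.linalg.norm(v)

# Random unit-norm rank-one components A_i = a_i (x) b_i (x) c_i
comps = [[unit(rng.standard_normal(n)) for _ in range(3)] for _ in range(r)]
T = sum(b * np.einsum('i,j,k->ijk', *c) for b, c in zip(betas, comps))
T = T + 0.1 * rng.standard_normal((n, n, n))  # additive noise W

def dominant_rank_one(T, iters=100):
    """Tensor power iteration, initialized from the mode-1 unfolding."""
    u = unit(np.linalg.svd(T.reshape(T.shape[0], -1))[0][:, 0])
    M = np.einsum('ijk,i->jk', T, u)
    U2, _, Vt = np.linalg.svd(M)
    v, w = U2[:, 0], Vt[0]
    for _ in range(iters):  # alternating rank-one updates
        u = unit(np.einsum('ijk,j,k->i', T, v, w))
        v = unit(np.einsum('ijk,i,k->j', T, u, w))
        w = unit(np.einsum('ijk,i,j->k', T, u, v))
    beta = float(np.einsum('ijk,i,j,k->', T, u, v, w))
    return beta, u, v, w

# Hotelling-type deflation: estimate, subtract, repeat.
estimates = []
residual = T.copy()
for step in range(r):
    beta, u, v, w = dominant_rank_one(residual)
    estimates.append((beta, u, v, w))
    residual = residual - beta * np.einsum('i,j,k->ijk', u, v, w)
    print(f"step {step}: beta_hat = {beta:.3f}")
```

    In this well-separated regime the estimated singular values land near the true β_i; the paper's contribution is to characterize precisely how they deviate, and how the alignments degrade, as the noise strength grows.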

    Information-theoretic bounds and phase transitions in clustering, sparse PCA, and submatrix localization

    We study the problem of detecting a structured, low-rank signal matrix corrupted with additive Gaussian noise. This includes clustering in a Gaussian mixture model, sparse PCA, and submatrix localization. Each of these problems is conjectured to exhibit a sharp information-theoretic threshold, below which the signal is too weak for any algorithm to detect. We derive upper and lower bounds on these thresholds by applying the first and second moment methods to the likelihood ratio between these "planted models" and null models where the signal matrix is zero. Our bounds differ by at most a factor of root two when the rank is large (in the clustering and submatrix localization problems, when the number of clusters or blocks is large) or the signal matrix is very sparse. Moreover, our upper bounds show that for each of these problems there is a significant regime where reliable detection is information-theoretically possible but where known algorithms such as PCA fail completely, since the spectrum of the observed matrix is uninformative. This regime is analogous to the conjectured "hard but detectable" regime for community detection in sparse graphs. Comment: For sparse PCA and submatrix localization, we determine the information-theoretic threshold exactly in the limit where the number of blocks is large or the signal matrix is very sparse, based on a conditional second moment method, closing the factor-of-root-two gap in the first version
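    The regime where "the spectrum of the observed matrix is uninformative" can be illustrated with the classic rank-one spiked Wigner model (a standard toy setting, not this paper's exact model): for Y = λ·xxᵀ + W/√n with ‖x‖ = 1 and W a symmetric Gaussian matrix, the top eigenvalue of Y separates from the bulk edge (≈ 2) only when λ > 1, so below that threshold PCA sees nothing. Dimensions and λ values below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

def top_eig(lam):
    """Top eigenvalue of the spiked Wigner matrix lam*x x^T + W/sqrt(n)."""
    x = unitv = rng.standard_normal(n)
    x = x / np.linalg.norm(x)
    G = rng.standard_normal((n, n))
    W = (G + G.T) / np.sqrt(2.0)  # GOE-like symmetric noise
    Y = lam * np.outer(x, x) + W / np.sqrt(n)
    return np.linalg.eigvalsh(Y)[-1]

e_hi = top_eig(3.0)  # above the spectral threshold: eigenvalue pops out
e_lo = top_eig(0.5)  # below it: eigenvalue sticks to the bulk edge
print("lam=3.0 top eigenvalue:", e_hi)
print("lam=0.5 top eigenvalue:", e_lo)
```

    Above the threshold the top eigenvalue concentrates near λ + 1/λ; below it, the spectrum looks exactly like pure noise even though, per the abstract, detection can still be information-theoretically possible there.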