Search CORE

12,935 research outputs found

Structure estimation for discrete graphical models: Generalized covariance matrices and their inverses

Author: Loh Po-Ling
Wainwright Martin J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2013
Field of study

We investigate the relationship between the structure of a discrete graphical model and the support of the inverse of a generalized covariance matrix. We show that for certain graph structures, the support of the inverse covariance matrix of indicator variables on the vertices of a graph reflects the conditional independence structure of the graph. Our work extends results that have previously been established only in the context of multivariate Gaussian graphical models, thereby addressing an open question about the significance of the inverse covariance matrix of a non-Gaussian distribution. The proof exploits a combination of ideas from the geometry of exponential families, junction tree theory and convex analysis. These population-level results have various consequences for graph selection methods, both known and novel, including a novel method for structure estimation for missing or corrupted observations. We provide nonasymptotic guarantees for such methods and illustrate the sharpness of these predictions via simulations.Comment: Published in at http://dx.doi.org/10.1214/13-AOS1162 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

arXiv.org e-Print Archive

The Road From Classical to Quantum Codes: A Hashing Bound Approaching Design Procedure

Author: Alanis Dimitrios
Babar Zunaira
Botsinis Panagiotis
Hanzo Lajos
Ng Soon Xin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

Powerful Quantum Error Correction Codes (QECCs) are required for stabilizing and protecting fragile qubits against the undesirable effects of quantum decoherence. Similar to classical codes, hashing bound approaching QECCs may be designed by exploiting a concatenated code structure, which invokes iterative decoding. Therefore, in this paper we provide an extensive step-by-step tutorial for designing EXtrinsic Information Transfer (EXIT) chart aided concatenated quantum codes based on the underlying quantum-to-classical isomorphism. These design lessons are then exemplified in the context of our proposed Quantum Irregular Convolutional Code (QIRCC), which constitutes the outer component of a concatenated quantum code. The proposed QIRCC can be dynamically adapted to match any given inner code using EXIT charts, hence achieving a performance close to the hashing bound. It is demonstrated that our QIRCC-based optimized design is capable of operating within 0.4 dB of the noise limit

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs

Author: Korenblum Daniel
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2018
Field of study

Laplacian mixture models identify overlapping regions of influence in unlabeled graph and network data in a scalable and computationally efficient way, yielding useful low-dimensional representations. By combining Laplacian eigenspace and finite mixture modeling methods, they provide probabilistic or fuzzy dimensionality reductions or domain decompositions for a variety of input data types, including mixture distributions, feature vectors, and graphs or networks. Provable optimal recovery using the algorithm is analytically shown for a nontrivial class of cluster graphs. Heuristic approximations for scalable high-performance implementations are described and empirically tested. Connections to PageRank and community detection in network analysis demonstrate the wide applicability of this approach. The origins of fuzzy spectral methods, beginning with generalized heat or diffusion equations in physics, are reviewed and summarized. Comparisons to other dimensionality reduction and clustering methods for challenging unsupervised machine learning problems are also discussed.Comment: 13 figures, 35 reference

arXiv.org e-Print Archive

Directory of Open Access Journals

Haplotype Assembly: An Information Theoretic View

Author: Si Hongbo
Vikalo Haris
Vishwanath Sriram
Publication venue
Publication date: 11/05/2014
Field of study

This paper studies the haplotype assembly problem from an information theoretic perspective. A haplotype is a sequence of nucleotide bases on a chromosome, often conveniently represented by a binary string, that differ from the bases in the corresponding positions on the other chromosome in a homologous pair. Information about the order of bases in a genome is readily inferred using short reads provided by high-throughput DNA sequencing technologies. In this paper, the recovery of the target pair of haplotype sequences using short reads is rephrased as a joint source-channel coding problem. Two messages, representing haplotypes and chromosome memberships of reads, are encoded and transmitted over a channel with erasures and errors, where the channel model reflects salient features of high-throughput sequencing. The focus of this paper is on the required number of reads for reliable haplotype reconstruction, and both the necessary and sufficient conditions are presented with order-wise optimal bounds.Comment: 30 pages, 5 figures, 1 tabel, journa

arXiv.org e-Print Archive

Rank Minimization over Finite Fields: Fundamental Limits and Coding-Theoretic Interpretations

Author: Laura Balzano
Stark C. Draper
Student Member
Vincent Y. F. Tan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2011
Field of study

This paper establishes information-theoretic limits in estimating a finite field low-rank matrix given random linear measurements of it. These linear measurements are obtained by taking inner products of the low-rank matrix with random sensing matrices. Necessary and sufficient conditions on the number of measurements required are provided. It is shown that these conditions are sharp and the minimum-rank decoder is asymptotically optimal. The reliability function of this decoder is also derived by appealing to de Caen's lower bound on the probability of a union. The sufficient condition also holds when the sensing matrices are sparse - a scenario that may be amenable to efficient decoding. More precisely, it is shown that if the n\times n-sensing matrices contain, on average, \Omega(nlog n) entries, the number of measurements required is the same as that when the sensing matrices are dense and contain entries drawn uniformly at random from the field. Analogies are drawn between the above results and rank-metric codes in the coding theory literature. In fact, we are also strongly motivated by understanding when minimum rank distance decoding of random rank-metric codes succeeds. To this end, we derive distance properties of equiprobable and sparse rank-metric codes. These distance properties provide a precise geometric interpretation of the fact that the sparse ensemble requires as few measurements as the dense one. Finally, we provide a non-exhaustive procedure to search for the unknown low-rank matrix.Comment: Accepted to the IEEE Transactions on Information Theory; Presented at IEEE International Symposium on Information Theory (ISIT) 201

arXiv.org e-Print Archive

CiteSeerX

Robust Recovery of Subspace Structures by Low-Rank Representation

Author: Lin Zhouchen
Liu Guangcan
Ma Yi
Sun Ju
Yan Shuicheng
Yu Yong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/05/2012
Field of study

In this work we address the subspace recovery problem. Given a set of data samples (vectors) approximately drawn from a union of multiple subspaces, our goal is to segment the samples into their respective subspaces and correct the possible errors as well. To this end, we propose a novel method termed Low-Rank Representation (LRR), which seeks the lowest-rank representation among all the candidates that can represent the data samples as linear combinations of the bases in a given dictionary. It is shown that LRR well solves the subspace recovery problem: when the data is clean, we prove that LRR exactly captures the true subspace structures; for the data contaminated by outliers, we prove that under certain conditions LRR can exactly recover the row space of the original data and detect the outlier as well; for the data corrupted by arbitrary errors, LRR can also approximately recover the row space with theoretical guarantees. Since the subspace membership is provably determined by the row space, these further imply that LRR can perform robust subspace segmentation and error correction, in an efficient way.Comment: IEEE Trans. Pattern Analysis and Machine Intelligenc

arXiv.org e-Print Archive

CiteSeerX

Noise-Resilient Group Testing: Limitations and Constructions

Author: A. Bonis De
A. Dyachkov
A. Macula
A. Ta-Shma
A.G. D’yachkov
B. Chlebus
D. Eppstein
D.-Z. Du
D.Z. Du
E. Knill
L. Trevisan
R. Raz
Ruszinkó
T. Berger
W. Kautz
Y. Cheng
Z. Füredi
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009
Field of study

We study combinatorial group testing schemes for learning

d

-sparse Boolean vectors using highly unreliable disjunctive measurements. We consider an adversarial noise model that only limits the number of false observations, and show that any noise-resilient scheme in this model can only approximately reconstruct the sparse vector. On the positive side, we take this barrier to our advantage and show that approximate reconstruction (within a satisfactory degree of approximation) allows us to break the information theoretic lower bound of

\tilde{\Omega}(d^2 \log n)

that is known for exact reconstruction of

d

-sparse vectors of length

n

via non-adaptive measurements, by a multiplicative factor

\tilde{\Omega}(d)

. Specifically, we give simple randomized constructions of non-adaptive measurement schemes, with

m=O(d \log n)

measurements, that allow efficient reconstruction of

d

-sparse vectors up to

O(d)

false positives even in the presence of

\delta m

false positives and

O(m/d)

false negatives within the measurement outcomes, for any constant

\delta < 1

. We show that, information theoretically, none of these parameters can be substantially improved without dramatically affecting the others. Furthermore, we obtain several explicit constructions, in particular one matching the randomized trade-off but using

m = O(d^{1+o(1)} \log n)

measurements. We also obtain explicit constructions that allow fast reconstruction in time \poly(m), which would be sublinear in

n

for sufficiently sparse vectors. The main tool used in our construction is the list-decoding view of randomness condensers and extractors.Comment: Full version. A preliminary summary of this work appears (under the same title) in proceedings of the 17th International Symposium on Fundamentals of Computation Theory (FCT 2009

arXiv.org e-Print Archive

CiteSeerX

Spiral - Imperial College Digital Repository