
    On the number of matrices and a random matrix with prescribed row and column sums and 0-1 entries

    We consider the set Sigma(R,C) of all m x n matrices having 0-1 entries and prescribed row sums R=(r_1, ..., r_m) and column sums C=(c_1, ..., c_n). We prove an asymptotic estimate for the cardinality |Sigma(R, C)| via the solution to a convex optimization problem. We show that if Sigma(R, C) is sufficiently large, then a random matrix D in Sigma(R, C), sampled from the uniform probability measure on Sigma(R, C), is with high probability close to a particular matrix Z=Z(R,C) that maximizes the sum of entropies of entries among all matrices with row sums R, column sums C and entries between 0 and 1. Similar results are obtained for 0-1 matrices with prescribed row and column sums and assigned zeros in some positions. Comment: 26 pages, proofs simplified, results strengthened.
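    The entropy-maximization step described above is a convex program, so for small margins it can be reproduced directly with an off-the-shelf solver. The sketch below is not the authors' code; it assumes cvxpy and hypothetical example margins, and computes the matrix Z(R,C) by maximizing the sum of binary entropies of the entries subject to the row and column sum constraints.

```python
# Sketch (not the authors' code): compute the entropy-maximizing matrix Z(R, C)
# by maximizing the sum of binary entropies H(z) = -z*log(z) - (1-z)*log(1-z)
# over matrices with the prescribed margins and entries relaxed to [0, 1].
import numpy as np
import cvxpy as cp

R = np.array([2, 1, 1])        # example row sums (hypothetical)
C = np.array([2, 1, 1])        # example column sums (hypothetical)
m, n = len(R), len(C)

Z = cp.Variable((m, n))
objective = cp.Maximize(cp.sum(cp.entr(Z) + cp.entr(1 - Z)))  # entr(x) = -x*log(x)
constraints = [cp.sum(Z, axis=1) == R,   # row sums
               cp.sum(Z, axis=0) == C,   # column sums
               Z >= 0, Z <= 1]
cp.Problem(objective, constraints).solve()

# Per the result above, a uniform random element of Sigma(R, C) concentrates
# near this "typical" matrix when the set is large.
print(Z.value)
```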

    Exact Enumeration and Sampling of Matrices with Specified Margins

    We describe a dynamic programming algorithm for exact counting and exact uniform sampling of matrices with specified row and column sums. The algorithm runs in polynomial time when the column sums are bounded. Both binary and non-negative integer matrices are handled. The method is distinguished by its applicability to non-regular margins, its tractability on large matrices, and the capacity for exact sampling.
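    The abstract outlines the idea at a high level only; a minimal illustration of row-by-row counting with memoization on the multiset of remaining column sums is sketched below. It is not the paper's algorithm verbatim, just the standard dynamic-programming construction it describes, and the helper name and test margins are made up for illustration.

```python
# Sketch: count 0-1 matrices with given row sums R and column sums C by a
# memoized recursion over rows. The state is the multiset of remaining column
# sums, which is what makes the memoization pay off.
from functools import lru_cache
from itertools import combinations

def count_binary_matrices(R, C):
    n = len(C)

    @lru_cache(maxsize=None)
    def rec(i, cols):                      # cols = remaining column sums (tuple)
        if i == len(R):
            return 1 if all(c == 0 for c in cols) else 0
        total = 0
        # place row i: choose which columns receive a 1
        candidates = [j for j in range(n) if cols[j] > 0]
        for ones in combinations(candidates, R[i]):
            new = list(cols)
            for j in ones:
                new[j] -= 1
            # sorting canonicalizes the state: the count depends only on the
            # multiset of remaining column sums, not on column order
            total += rec(i + 1, tuple(sorted(new, reverse=True)))
        return total

    return rec(0, tuple(sorted(C, reverse=True)))

print(count_binary_matrices((2, 2, 2), (2, 2, 2)))  # -> 6 (complements of 3x3 permutation matrices)
```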

    Exact sampling and counting for fixed-margin matrices

    The uniform distribution on matrices with specified row and column sums is often a natural choice of null model when testing for structure in two-way tables (binary or nonnegative integer). Due to the difficulty of sampling from this distribution, many approximate methods have been developed. We will show that by exploiting certain symmetries, exact sampling and counting are in fact possible in many nontrivial real-world cases. We illustrate with real datasets including ecological co-occurrence matrices and contingency tables. Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/13-AOS1131 by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1104.032
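    One way to see how exact counting yields exact sampling is the standard count-then-sample construction: draw each row with probability proportional to the number of ways the remaining rows can be completed. The sketch below assumes binary matrices and feasible margins, and uses a brute-force memoized counter rather than the symmetry-exploiting machinery the paper develops.

```python
# Sketch: exact uniform sampling via the count-then-sample construction that
# exact counting enables (not the paper's specific algorithm). Each row is
# drawn with probability proportional to how many completions it leaves.
import random
from functools import lru_cache
from itertools import combinations

def exact_uniform_sample(R, C):
    n = len(C)

    @lru_cache(maxsize=None)
    def count(i, cols):                    # completions for rows i.. given remaining column sums
        if i == len(R):
            return 1 if all(c == 0 for c in cols) else 0
        total = 0
        candidates = [j for j in range(n) if cols[j] > 0]
        for ones in combinations(candidates, R[i]):
            new = list(cols)
            for j in ones:
                new[j] -= 1
            total += count(i + 1, tuple(sorted(new, reverse=True)))
        return total

    cols, rows = list(C), []
    for i in range(len(R)):
        candidates = [j for j in range(n) if cols[j] > 0]
        options, weights = [], []
        for ones in combinations(candidates, R[i]):
            new = list(cols)
            for j in ones:
                new[j] -= 1
            w = count(i + 1, tuple(sorted(new, reverse=True)))
            if w > 0:
                options.append((set(ones), new))
                weights.append(w)
        chosen, cols = random.choices(options, weights=weights)[0]
        rows.append([1 if j in chosen else 0 for j in range(n)])
    return rows

# One exactly uniform draw from the set of 0-1 matrices with these margins.
print(exact_uniform_sample((2, 1, 1), (2, 1, 1)))
```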

    Asymptotic enumeration of correlation-immune boolean functions

    A boolean function of n boolean variables is correlation-immune of order k if the function value is uncorrelated with the values of any k of the arguments. Such functions are of considerable interest due to their cryptographic properties, and are also related to the orthogonal arrays of statistics and the balanced hypercube colourings of combinatorics. The weight of a boolean function is the number of argument values that produce a function value of 1. If this is exactly half the argument values, that is, 2^{n-1} values, a correlation-immune function is called resilient. An asymptotic estimate of the number N(n,k) of n-variable correlation-immune boolean functions of order k was obtained in 1992 by Denisov for constant k. Denisov repudiated that estimate in 2000, but we will show that the repudiation was a mistake. The main contribution of this paper is an asymptotic estimate of N(n,k) which holds if k increases with n within generous limits and specialises to functions with a given weight, including the resilient functions. In the case of k=1, our estimates are valid for all weights. Comment: 18 pages.
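    For concreteness, correlation immunity of order k can be checked by brute force through the Walsh transform (the Xiao-Massey criterion): every Walsh coefficient at a nonzero mask of Hamming weight at most k must vanish. The sketch below is an illustration on a hypothetical function, not code from the paper, and is exponential in n.

```python
# Sketch: brute-force check of correlation immunity of order k via the
# Xiao-Massey criterion: the Walsh coefficient of f must be zero at every
# nonzero mask of Hamming weight at most k. Illustration only.
from itertools import combinations

def is_correlation_immune(f, n, k):
    """f: callable taking an n-bit tuple and returning 0 or 1."""
    xs = [tuple((x >> i) & 1 for i in range(n)) for x in range(2 ** n)]
    for wt in range(1, k + 1):
        for support in combinations(range(n), wt):     # masks of weight wt
            walsh = sum((-1) ** (f(x) ^ (sum(x[i] for i in support) & 1)) for x in xs)
            if walsh != 0:
                return False
    return True

# Example: the XOR of all n variables is correlation-immune of order n-1,
# and since its weight is 2^{n-1} it is resilient.
n = 4
xor_all = lambda x: sum(x) & 1
print(is_correlation_immune(xor_all, n, n - 1))  # True
```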

    LEDAcrypt: QC-LDPC Code-Based Cryptosystems with Bounded Decryption Failure Rate

    We consider the QC-LDPC code-based cryptosystems named LEDAcrypt, which are under consideration by NIST for the second round of the post-quantum cryptography standardization initiative. LEDAcrypt is the result of the merger of the key encapsulation mechanism LEDAkem and the public-key cryptosystem LEDApkc, which were submitted to the first round of the same competition. We provide a detailed quantification of the quantum and classical computational efforts needed to foil the cryptographic guarantees of these systems. To this end, we take into account the best known attacks that can be mounted against them employing both classical and quantum computers, and compare their computational complexities with the ones required to break AES, consistently with the NIST requirements. Taking the original LEDAkem and LEDApkc parameters as a reference, we introduce an algorithmic optimization procedure to design new sets of parameters for LEDAcrypt. These novel sets match the security levels in the NIST call and make the C reference implementation of the systems exhibit significantly improved figures of merit, in terms of both running times and key sizes. As a further contribution, we develop a theoretical characterization of the decryption failure rate (DFR) of LEDAcrypt cryptosystems, which allows new instances of the systems with guaranteed low DFR to be designed. Such a characterization is crucial to withstand recent attacks exploiting the reactions of the legitimate recipient upon decrypting multiple ciphertexts with the same private key, and consequently it ensures a lifecycle of the corresponding key pairs that is sufficient for the wide majority of practical purposes.

    New methods for fixed-margin binary matrix sampling, Fréchet covariance, and MANOVA tests for random objects in multiple metric spaces

    Many approaches to the analysis of network data essentially view the data as Euclidean and apply standard multivariate techniques. In this dissertation, we refrain from this approach, exploring two alternate approaches to the analysis of networks and other structured data.

    The first approach seeks to determine how unique an observed simple, directed network is by comparing it to like networks which share its degree distribution. Generating networks for comparison requires sampling from the space of all binary matrices with the prescribed row and column margins, since enumeration of all such matrices is often infeasible for even moderately sized networks with 20-50 nodes. We propose two new sampling methods for this problem. First, we extend two Markov chain Monte Carlo methods to sample from the space non-uniformly, allowing flexibility in the case that some networks are more likely than others. We show that non-uniform sampling could impede the MCMC process, but in certain special cases is still valid. Critically, we illustrate the differential conclusions that could be drawn from uniform vs. non-uniform sampling. Second, we develop a generalized divide-and-conquer approach which recursively divides matrices into smaller subproblems which are much easier to count and sample. Each division step reveals interesting mathematics involving the enumeration of integer partitions and points in convex lattice polytopes.

    The second broad approach we explore is comparing random objects in metric spaces lacking a coordinate system. Traditional definitions of the mean and variance no longer apply, and standard statistical tests have needed reconceptualization in terms of only distances in the metric space. We consider the multivariate setting where random objects exist in multiple metric spaces, which can be thought of as distinct views of the random object. We define the notion of Fréchet covariance to measure dependence between two metric spaces, and establish consistency for the sample estimator. We then propose several tests for differences in means and covariance matrices among two or more groups in multiple metric spaces, and compare their performance on scenarios involving random probability distributions and networks with node covariates.
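    As background for the first approach, the basic uniform-target MCMC that the dissertation's non-uniform samplers extend is the checkerboard-swap chain on binary matrices with fixed margins. A minimal sketch follows; the toy starting matrix and step count are made up, and the dissertation's non-uniform weighting and divide-and-conquer method are not reproduced here.

```python
# Sketch: the standard checkerboard-swap Markov chain on 0-1 matrices with
# fixed row and column sums (uniform target only). Each accepted move flips a
# 2x2 "checkerboard" submatrix, which preserves every row and column sum.
import random
import numpy as np

def swap_chain(A, steps, seed=0):
    rng = random.Random(seed)
    A = np.array(A, dtype=int)
    m, n = A.shape
    for _ in range(steps):
        i, j = rng.sample(range(m), 2)          # two distinct rows
        k, l = rng.sample(range(n), 2)          # two distinct columns
        sub = A[np.ix_([i, j], [k, l])]
        # swap only if the submatrix is [[1,0],[0,1]] or [[0,1],[1,0]]
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            A[i, k], A[i, l] = A[i, l], A[i, k]
            A[j, k], A[j, l] = A[j, l], A[j, k]
    return A

start = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]       # any matrix with the target margins
print(swap_chain(start, steps=1000))            # approximately uniform after burn-in
```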

    The Trapping Redundancy of Linear Block Codes

    We generalize the notion of the stopping redundancy in order to study the smallest size of a trapping set in Tanner graphs of linear block codes. In this context, we introduce the notion of the trapping redundancy of a code, which quantifies the relationship between the number of redundant rows in any parity-check matrix of a given code and the size of its smallest trapping set. Trapping sets with certain parameter sizes are known to cause error-floors in the performance curves of iterative belief propagation decoders, and it is therefore important to identify decoding matrices that avoid such sets. Bounds on the trapping redundancy are obtained using probabilistic and constructive methods, and the analysis covers both general and elementary trapping sets. Numerical values for these bounds are computed for the [2640,1320] Margulis code and the class of projective geometry codes, and compared with some new code-specific trapping set size estimates. Comment: 12 pages, 4 tables, 1 figure, accepted for publication in IEEE Transactions on Information Theory
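    For reference, an (a, b) trapping set is a set of a variable nodes whose induced subgraph in the Tanner graph contains exactly b odd-degree check nodes; elementary trapping sets further restrict the induced check degrees to 1 or 2. The helper below classifies a candidate set for a toy parity-check matrix; it is an illustration, not the paper's code, and exhaustive search over subsets is only practical for small codes.

```python
# Sketch: classify a candidate (a, b) trapping set from a parity-check matrix H.
# a = number of chosen variable nodes (columns); b = number of check nodes
# connected an odd number of times to the chosen columns.
import numpy as np

def trapping_set_profile(H, variable_nodes):
    """Return (a, b) for the given set of variable-node (column) indices."""
    H = np.array(H, dtype=int)
    cols = sorted(set(variable_nodes))
    degrees = H[:, cols].sum(axis=1)        # check-node degrees in the induced subgraph
    b = int(np.sum(degrees % 2 == 1))       # checks with odd induced degree
    return len(cols), b

# Toy parity-check matrix (hypothetical, not the Margulis code from the paper).
H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]
print(trapping_set_profile(H, [0, 1]))      # -> (2, 2)
```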