
    Small Bias Requires Large Formulas

    A small-biased function is a randomized function whose distribution of truth-tables is small-biased. We demonstrate that known explicit lower bounds on (1) the size of general Boolean formulas, (2) the size of De Morgan formulas, and (3) correlation against small De Morgan formulas apply to small-biased functions. As a consequence, any strongly explicit small-biased generator is subject to the best-known explicit formula lower bounds in all these models. On the other hand, we give a construction of a small-biased function that is tight with respect to lower bound (1) for the relevant range of parameters. We interpret this construction as a natural-type barrier against substantially stronger lower bounds for general formulas.
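    As a concrete illustration of the small-bias definition (not the paper's construction), the bias of a sample space of n-bit strings is the largest advantage any nonempty parity test has over a fair coin. The sketch below computes it by brute force; the function name `bias` and the tiny examples are illustrative only.

```python
from itertools import product

def bias(sample_space, n):
    """Maximum bias over all nonempty parity tests.

    For every nonempty subset S of the n coordinates, compute
    |E[(-1)^{<S,x>}]| over x drawn uniformly from sample_space;
    the bias of the space is the maximum of these values.
    """
    worst = 0.0
    for s in product([0, 1], repeat=n):
        if not any(s):
            continue  # skip the trivial empty test
        total = sum((-1) ** sum(si & xi for si, xi in zip(s, x))
                    for x in sample_space)
        worst = max(worst, abs(total) / len(sample_space))
    return worst

# The full cube {0,1}^n is perfectly unbiased; a single point
# is maximally biased.
cube = list(product([0, 1], repeat=3))
print(bias(cube, 3))          # 0.0
print(bias([(0, 0, 0)], 3))   # 1.0
```

An ε-biased generator's output distribution keeps this quantity at most ε while using far fewer sample points than the full cube.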

    DNF Sparsification and a Faster Deterministic Counting Algorithm

    Given a DNF formula on $n$ variables, the two natural size measures are the number of terms, or size $s(f)$, and the maximum width of a term, $w(f)$. It is folklore that short DNF formulas can be made narrow. We prove a converse, showing that narrow formulas can be sparsified. More precisely, any width-$w$ DNF, irrespective of its size, can be $\epsilon$-approximated by a width-$w$ DNF with at most $(w\log(1/\epsilon))^{O(w)}$ terms. We combine our sparsification result with the work of Luby and Velickovic to give a faster deterministic algorithm for approximately counting the number of satisfying solutions to a DNF. Given a formula on $n$ variables with $\mathrm{poly}(n)$ terms, we give a deterministic $n^{\tilde{O}(\log\log n)}$ time algorithm that computes an additive $\epsilon$-approximation to the fraction of satisfying assignments of $f$ for $\epsilon = 1/\mathrm{poly}(\log n)$. The previous best result due to Luby and Velickovic from nearly two decades ago had a run-time of $n^{\exp(O(\sqrt{\log\log n}))}$.
    Comment: To appear in the IEEE Conference on Computational Complexity, 201
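    The quantity being approximated is the fraction of assignments satisfying a DNF. The sketch below pins it down by exhaustive enumeration, which takes $2^n$ time; the paper's contribution is computing an additive $\epsilon$-approximation deterministically in $n^{\tilde{O}(\log\log n)}$ time. The encoding of literals as signed integers is an assumption of this illustration, not the paper's.

```python
from itertools import product

def satisfying_fraction(dnf, n):
    """Exact fraction of assignments satisfying a DNF.

    dnf is a list of terms; each term is a list of literals,
    where literal +i means x_i and -i means NOT x_i (1-indexed).
    Brute force over all 2^n assignments -- exponential, but it
    defines the quantity the deterministic counting algorithm
    approximates to additive error epsilon.
    """
    def term_sat(term, assignment):
        return all(assignment[abs(l) - 1] == (l > 0) for l in term)

    count = sum(any(term_sat(t, a) for t in dnf)
                for a in product([False, True], repeat=n))
    return count / 2 ** n

# f = (x1 AND x2) OR (NOT x3): width 2, two terms, three variables
f = [[1, 2], [-3]]
print(satisfying_fraction(f, 3))  # 0.625
```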

    Alternative formulas for synthetic dual system estimation in the 2000 census

    The U.S. Census Bureau provides an estimate of the true population as a supplement to the basic census numbers. This estimate is constructed from data in a post-censal survey. The overall procedure is referred to as dual system estimation. Dual system estimation is designed to produce revised estimates at all levels of geography, via a synthetic estimation procedure. We design three alternative formulas for dual system estimation and investigate the differences in area estimates produced as a result of using those formulas. The primary target of this exercise is to better understand the nature of the homogeneity assumptions involved in dual system estimation and their consequences when used for the enumeration data that occurs in an actual large-scale application like the Census. (Assumptions of this nature are sometimes collectively referred to as the "synthetic assumption" for dual system estimation.) The specific focus of our study is the treatment of the category of census counts referred to as imputations in dual system estimation. Our results show the degree to which varying treatment of these imputation counts can result in differences in population estimates for local areas such as states or counties.
    Comment: Published at http://dx.doi.org/10.1214/193940307000000400 in the IMS Collections (http://www.imstat.org/publications/imscollections.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)
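    A minimal sketch of the idea behind dual system estimation, using the textbook Lincoln-Petersen capture-recapture formula rather than the Bureau's production formulas or the paper's three alternatives: under the homogeneity ("synthetic") assumption, the overlap between the census and the post-censal survey pins down the total.

```python
def dual_system_estimate(n_census, n_survey, n_matched):
    """Textbook dual-system (capture-recapture) estimate.

    Under the homogeneity assumption -- every person is equally
    likely to be counted by each system, independently -- the
    true population is estimated as
        N_hat = n_census * n_survey / n_matched.
    """
    return n_census * n_survey / n_matched

# 900 people counted by the census, 800 by the post-censal survey,
# 720 matched in both -> estimated true population of 1000
print(dual_system_estimate(900, 800, 720))  # 1000.0
```

The paper's point is precisely that when the homogeneity assumption is applied synthetically to small areas, and when imputation counts enter the formula in different ways, estimates like this one can shift noticeably.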

    Exact properties of Efron's biased coin randomization procedure

    Efron [Biometrika 58 (1971) 403--417] developed a restricted randomization procedure to promote balance between two treatment groups in a sequential clinical trial. He called this the biased coin design. He also introduced the concept of accidental bias, and investigated properties of the procedure with respect to accidental and selection bias, balance, and randomization-based inference using the steady-state properties of the induced Markov chain. In this paper we revisit this procedure, and derive closed-form expressions for the exact properties of the measures derived asymptotically in Efron's paper. In particular, we derive the exact distribution of the treatment imbalance and the variance-covariance matrix of the treatment assignments. These results have application in the design and analysis of clinical trials, by providing exact formulas to determine the role of the coin's bias probability in the context of selection and accidental bias, balancing properties and randomization-based inference.
    Comment: Published at http://dx.doi.org/10.1214/09-AOS758 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
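    The exact imbalance distribution the paper studies can be computed for small trials by dynamic programming over the induced Markov chain. The sketch below assumes Efron's classic bias of p = 2/3; it illustrates the object of study, not the paper's closed-form expressions.

```python
from fractions import Fraction

def imbalance_distribution(n, p=Fraction(2, 3)):
    """Exact distribution of the treatment imbalance D_n under
    Efron's biased coin design with bias probability p.

    D is (assignments to A) - (assignments to B).  When D = 0 the
    next patient receives A with probability 1/2; when D > 0 (A is
    ahead) the next patient receives B with probability p; when
    D < 0, A with probability p.  Exact rational arithmetic via
    dynamic programming over the induced Markov chain.
    """
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for d, pr in dist.items():
            if d == 0:
                p_a = Fraction(1, 2)
            elif d > 0:
                p_a = 1 - p
            else:
                p_a = p
            nxt[d + 1] = nxt.get(d + 1, Fraction(0)) + pr * p_a
            nxt[d - 1] = nxt.get(d - 1, Fraction(0)) + pr * (1 - p_a)
        dist = nxt
    return dist

# After 2 patients the imbalance is 0 with probability 2/3:
# whichever arm is ahead, the coin pulls the second assignment back.
d2 = imbalance_distribution(2)
print(d2[0])  # 2/3
```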

    Approximations of Shannon Mutual Information for Discrete Variables with Applications to Neural Population Coding

    Although Shannon mutual information has been widely used, its effective calculation is often difficult for many practical problems, including those in neural population coding. Asymptotic formulas based on Fisher information sometimes provide accurate approximations to the mutual information but this approach is restricted to continuous variables because the calculation of Fisher information requires derivatives with respect to the encoded variables. In this paper, we consider information-theoretic bounds and approximations of the mutual information based on Kullback--Leibler divergence and R\'{e}nyi divergence. We propose several information metrics to approximate Shannon mutual information in the context of neural population coding. While our asymptotic formulas all work for discrete variables, one of them has consistent performance and high accuracy regardless of whether the encoded variables are discrete or continuous. We performed numerical simulations and confirmed that our approximation formulas were highly accurate for approximating the mutual information between the stimuli and the responses of a large neural population. These approximation formulas may potentially bring convenience to the applications of information theory to many practical and theoretical problems.
    Comment: 31 pages, 6 figures
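    For reference, the quantity being approximated is the exact mutual information I(X;Y) = Σ p(x,y) log₂[p(x,y)/(p(x)p(y))], which is easy to evaluate from a small joint table but becomes infeasible when the response space of a large neural population must be summed over. A minimal sketch, not the paper's approximation formulas:

```python
from math import log2

def mutual_information(joint):
    """Exact Shannon mutual information I(X;Y) in bits, given the
    joint distribution as a dict {(x, y): probability}.

    Marginals are accumulated first, then
        I(X;Y) = sum_{x,y} p(x,y) log2( p(x,y) / (p(x) p(y)) ).
    Direct evaluation like this is what becomes intractable for
    large populations, motivating divergence-based approximations.
    """
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# A noiseless binary channel carries exactly 1 bit
noiseless = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(noiseless))  # 1.0

# Independent variables share 0 bits
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
print(mutual_information(independent))  # 0.0
```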

    Dark matter clustering: a simple renormalization group approach

    I compute a renormalization group (RG) improvement to the standard beyond-linear-order Eulerian perturbation theory (PT) calculation of the power spectrum of large-scale density fluctuations in the Universe. At z=0, for a power spectrum matching current observations, lowest order RGPT appears to be as accurate as one can test using existing numerical simulation-calibrated fitting formulas out to at least k ≈ 0.3 h/Mpc; although inaccuracy is guaranteed at some level by approximations in the calculation (which can be improved in the future). In contrast, standard PT breaks down virtually as soon as beyond-linear corrections become non-negligible, on scales even larger than k = 0.1 h/Mpc. This extension in range of validity could substantially enhance the usefulness of PT for interpreting baryonic acoustic oscillation surveys aimed at probing dark energy, for example. I show that the predicted power spectrum converges at high k to a power law with index given by the fixed-point solution of the RG equation. I discuss many possible future directions for this line of work. The basic calculation of this paper should be easily understandable without any prior knowledge of RG methods, while a rich background of mathematical physics literature exists for the interested reader.
    Comment: much expanded explanation of basic calculation