Small Bias Requires Large Formulas
A small-biased function is a randomized function whose distribution of truth-tables is small-biased. We demonstrate that known explicit lower bounds on (1) the size of general Boolean formulas, (2) the size of De Morgan formulas, and (3) correlation against small De Morgan formulas apply to small-biased functions. As a consequence, any strongly explicit small-biased generator is subject to the best-known explicit formula lower bounds in all these models.
On the other hand, we give a construction of a small-biased function that is tight with respect to lower bound (1) for the relevant range of parameters. We interpret this construction as a natural-type barrier against substantially stronger lower bounds for general formulas.
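To make the notion of small bias concrete, here is a brute-force sketch (illustrative, not from the paper): the bias of a multiset of n-bit strings is the largest advantage any nonzero parity test has over a fair coin. The function name and representation are assumptions for illustration.

```python
from itertools import product

def bias(sample_space, n):
    """Bias of a multiset of n-bit strings: the maximum, over all
    nonempty subsets S of coordinates, of |E[(-1)^<S, x>]|.
    A sample space is eps-biased when this value is at most eps."""
    worst = 0.0
    for S in product([0, 1], repeat=n):
        if not any(S):
            continue  # the empty parity is always 1; skip it
        e = sum((-1) ** sum(s & x for s, x in zip(S, xs))
                for xs in sample_space) / len(sample_space)
        worst = max(worst, abs(e))
    return worst

# The full cube {0,1}^3 is 0-biased: every nonzero parity is balanced.
cube = list(product([0, 1], repeat=3))
print(bias(cube, 3))  # 0.0
```

This exhaustive check takes 2^n parity tests, which is exactly why explicit small-biased generators with short seeds are interesting.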
DNF Sparsification and a Faster Deterministic Counting Algorithm
Given a DNF formula on n variables, the two natural size measures are the
number of terms or size s(f), and the maximum width of a term w(f). It is
folklore that short DNF formulas can be made narrow. We prove a converse,
showing that narrow formulas can be sparsified. More precisely, any width w
DNF, irrespective of its size, can be \epsilon-approximated by a width w DNF
with at most (w \log(1/\epsilon))^{O(w)} terms.
We combine our sparsification result with the work of Luby and Velickovic to
give a faster deterministic algorithm for approximately counting the number of
satisfying solutions to a DNF. Given a formula on n variables with poly(n)
terms, we give a deterministic n^{\tilde{O}(\log \log n)} time algorithm
that computes an additive \epsilon approximation to the fraction of
satisfying assignments of f for \epsilon = 1/\poly(\log n). The previous best
result, due to Luby and Velickovic from nearly two decades ago, had a run-time
of n^{\exp(O(\sqrt{\log \log n}))}.
Comment: To appear in the IEEE Conference on Computational Complexity, 2012
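The folklore direction (short implies narrow) mentioned above has a one-line proof that a small sketch makes concrete: dropping every term wider than log2(s/eps) changes the acceptance probability by at most eps, since each dropped term fires on at most a 2^{-w} fraction of inputs. The representation below (a term as a dict of required literals) is an assumption for illustration, not the paper's notation.

```python
import itertools
import math
import random

# A term is a dict {var: required_bit}; the DNF is an OR of such terms.
def eval_dnf(terms, x):
    return any(all(x[v] == b for v, b in t.items()) for t in terms)

def narrow(terms, eps):
    """Folklore (short => narrow): drop every term wider than
    w = ceil(log2(s/eps)).  Each dropped term is satisfied by at most
    a 2^-w fraction of inputs, so the acceptance probability moves by
    at most s * 2^-w <= eps."""
    s = len(terms)
    w = math.ceil(math.log2(s / eps))
    return [t for t in terms if len(t) <= w]

random.seed(0)
n, s, eps = 10, 8, 0.1
terms = [{v: random.randint(0, 1)
          for v in random.sample(range(n), random.randint(1, n))}
         for _ in range(s)]
g = narrow(terms, eps)
err = sum(eval_dnf(terms, x) != eval_dnf(g, x)
          for x in itertools.product([0, 1], repeat=n)) / 2 ** n
assert err <= eps
print(len(terms), len(g), err)
```

The paper's converse (narrow implies sparse) is the hard direction; no such one-line truncation argument applies there.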
Alternative formulas for synthetic dual system estimation in the 2000 census
The U.S. Census Bureau provides an estimate of the true population as a
supplement to the basic census numbers. This estimate is constructed from data
in a post-censal survey. The overall procedure is referred to as dual system
estimation. Dual system estimation is designed to produce revised estimates at
all levels of geography, via a synthetic estimation procedure. We design three
alternative formulas for dual system estimation and investigate the differences
in area estimates produced as a result of using those formulas. The primary
target of this exercise is to better understand the nature of the homogeneity
assumptions involved in dual system estimation and their consequences when used
for the enumeration data that occurs in an actual large scale application like
the Census. (Assumptions of this nature are sometimes collectively referred to
as the ``synthetic assumption'' for dual system estimation.) The specific focus
of our study is the treatment of the category of census counts referred to as
imputations in dual system estimation. Our results show the degree to which
varying treatment of these imputation counts can result in differences in
population estimates for local areas such as states or counties.
Comment: Published at http://dx.doi.org/10.1214/193940307000000400 in the IMS
Collections (http://www.imstat.org/publications/imscollections.htm) by the
Institute of Mathematical Statistics (http://www.imstat.org)
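The core of dual system estimation is a capture–recapture calculation; a minimal sketch of the classic Petersen-style estimator follows. The function name and the numbers are illustrative assumptions, not figures from the paper or from the 2000 census.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Classic Petersen-style dual system estimator:
    N_hat = (census_count * survey_count) / matched.
    Relies on a homogeneity ("synthetic") assumption: the chance of
    being counted in one system is unrelated to the other, uniformly
    across the area to which the estimate is applied."""
    return census_count * survey_count / matched

# Illustrative (made-up) numbers: 900 counted by the census in a block,
# 400 found by the post-censal survey, 360 matched to both lists.
print(dual_system_estimate(900, 400, 360))  # 1000.0
```

The paper's three alternative formulas differ in how counts such as imputations enter the numerators and the matched total, which is exactly where the homogeneity assumption bites.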
Exact properties of Efron's biased coin randomization procedure
Efron [Biometrika 58 (1971) 403--417] developed a restricted randomization
procedure to promote balance between two treatment groups in a sequential
clinical trial. He called this the biased coin design. He also introduced the
concept of accidental bias, and investigated properties of the procedure with
respect to both accidental and selection bias, balance, and randomization-based
inference using the steady-state properties of the induced Markov chain. In
this paper we revisit this procedure, and derive closed-form expressions for
the exact properties of the measures derived asymptotically in Efron's paper.
In particular, we derive the exact distribution of the treatment imbalance and
the variance-covariance matrix of the treatment assignments. These results have
application in the design and analysis of clinical trials, by providing exact
formulas to determine the role of the coin's bias probability in the context of
selection and accidental bias, balancing properties and randomization-based
inference.
Comment: Published at http://dx.doi.org/10.1214/09-AOS758 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
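The induced Markov chain on the imbalance is simple enough to compute exactly by dynamic programming; here is a small sketch of the exact imbalance distribution under Efron's rule (fair toss at balance, otherwise favor the under-represented arm with probability p). The default p = 2/3 is Efron's illustrative choice; the function name is an assumption.

```python
from fractions import Fraction

def imbalance_distribution(n, p=Fraction(2, 3)):
    """Exact distribution of the signed imbalance D_n = (#A - #B)
    under Efron's biased coin: toss fair when D = 0, otherwise assign
    the under-represented treatment with probability p."""
    dist = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for d, pr in dist.items():
            if d == 0:
                up, down = Fraction(1, 2), Fraction(1, 2)
            elif d > 0:
                up, down = 1 - p, p   # A leads, so favor B
            else:
                up, down = p, 1 - p   # B leads, so favor A
            nxt[d + 1] = nxt.get(d + 1, Fraction(0)) + pr * up
            nxt[d - 1] = nxt.get(d - 1, Fraction(0)) + pr * down
        dist = nxt
    return dist

d = imbalance_distribution(4)
print(d[0])  # probability of exact balance after 4 assignments: 16/27
```

Exact rational arithmetic avoids the asymptotic steady-state approximation and is the kind of closed-form quantity the paper derives analytically.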
Approximations of Shannon Mutual Information for Discrete Variables with Applications to Neural Population Coding
Although Shannon mutual information has been widely used, its effective
calculation is often difficult for many practical problems, including those in
neural population coding. Asymptotic formulas based on Fisher information
sometimes provide accurate approximations to the mutual information but this
approach is restricted to continuous variables because the calculation of
Fisher information requires derivatives with respect to the encoded variables.
In this paper, we consider information-theoretic bounds and approximations of
the mutual information based on Kullback--Leibler divergence and R\'{e}nyi
divergence. We propose several information metrics to approximate Shannon
mutual information in the context of neural population coding. While our
asymptotic formulas all work for discrete variables, one of them has consistent
performance and high accuracy regardless of whether the encoded variables are
discrete or continuous. We performed numerical simulations and confirmed that
our approximation formulas were highly accurate for approximating the mutual
information between the stimuli and the responses of a large neural population.
These approximation formulas may potentially bring convenience to the
applications of information theory to many practical and theoretical problems.
Comment: 31 pages, 6 figures
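For small discrete alphabets the quantity being approximated can be computed directly from the joint distribution; the sketch below (illustrative, not the paper's method) shows the exact sum that becomes intractable for large neural populations, motivating the paper's approximations.

```python
import math

def mutual_information(joint):
    """Shannon mutual information I(X;Y) in bits from a joint pmf
    given as a dict {(x, y): p}.  Exact for small discrete alphabets;
    the cost grows with the alphabet sizes, which for a population of
    N binary neurons means 2^N response patterns."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

# A noiseless binary channel carries exactly one bit.
joint = {(0, 0): 0.5, (1, 1): 0.5}
print(mutual_information(joint))  # 1.0
```

Note this direct sum needs the full joint pmf; the Fisher-information route avoids that but requires derivatives in the encoded variable, which is the continuity restriction the paper works around.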
Dark matter clustering: a simple renormalization group approach
I compute a renormalization group (RG) improvement to the standard
beyond-linear-order Eulerian perturbation theory (PT) calculation of the power
spectrum of large-scale density fluctuations in the Universe. At z=0, for a
power spectrum matching current observations, lowest order RGPT appears to be
as accurate as one can test using existing numerical simulation-calibrated
fitting formulas out to at least k~=0.3 h/Mpc; although inaccuracy is
guaranteed at some level by approximations in the calculation (which can be
improved in the future). In contrast, standard PT breaks down virtually as soon
as beyond-linear corrections become non-negligible, on scales even larger than
k=0.1 h/Mpc. This extension in range of validity could substantially enhance
the usefulness of PT for interpreting baryonic acoustic oscillation surveys
aimed at probing dark energy, for example. I show that the predicted power
spectrum converges at high k to a power law with index given by the fixed-point
solution of the RG equation. I discuss many possible future directions for this
line of work. The basic calculation of this paper should be easily
understandable without any prior knowledge of RG methods, while a rich
background of mathematical physics literature exists for the interested reader.
Comment: much expanded explanation of basic calculation