
    The Generalized Asymptotic Equipartition Property: Necessary and Sufficient Conditions

    Suppose a string X_1^n = (X_1, X_2, ..., X_n), generated by a memoryless source (X_n)_{n≥1} with distribution P, is to be compressed with distortion no greater than D ≥ 0, using a memoryless random codebook with distribution Q. The compression performance is determined by the "generalized asymptotic equipartition property" (AEP), which states that the probability of finding a D-close match between X_1^n and any given codeword Y_1^n is approximately 2^{-n R(P,Q,D)}, where the rate function R(P,Q,D) can be expressed as an infimum of relative entropies. The main purpose here is to remove various restrictive assumptions on the validity of this result that have appeared in the recent literature. Necessary and sufficient conditions for the generalized AEP are provided in the general setting of abstract alphabets and unbounded distortion measures. All possible distortion levels D ≥ 0 are considered; the source (X_n)_{n≥1} can be stationary and ergodic; and the codebook distribution can have memory. Moreover, the behavior of the matching probability is precisely characterized, even when the generalized AEP is not valid. Natural characterizations of the rate function R(P,Q,D) are established under equally general conditions. Comment: 19 pages
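The exponential decay of the matching probability can be previewed numerically in a toy instance. The sketch below is only illustrative: the Bernoulli source, Hamming distortion, and the values of n, D, p, and q are arbitrary choices, not taken from the paper. It estimates the D-close match probability by Monte Carlo and converts it to an empirical per-symbol rate, which should approach R(P,Q,D) when the generalized AEP holds.

```python
import numpy as np

rng = np.random.default_rng(0)
n, D = 20, 0.1            # block length and distortion level (toy choices)
p, q = 0.5, 0.5           # Bernoulli parameters of source P and codebook Q

x = rng.random(n) < p               # one source string X_1^n
trials = 200_000
y = rng.random((trials, n)) < q     # independent codewords Y_1^n ~ Q

# D-close match under Hamming distortion: fraction of mismatched symbols <= D
match = (x != y).mean(axis=1) <= D
prob = match.mean()

# If the generalized AEP holds, prob is roughly 2^(-n R(P,Q,D)), so the
# empirical rate below should be near R(P,Q,D) for large n.
rate = -np.log2(max(prob, 1.0 / trials)) / n
print(f"match probability {prob:.2e}, empirical rate {rate:.3f}")
```

For this symmetric Bernoulli example the empirical rate sits close to the classical rate-distortion value 1 - h(D), up to finite-n effects.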

    Conservative Hypothesis Tests and Confidence Intervals using Importance Sampling

    Importance sampling is a common technique for Monte Carlo approximation, including Monte Carlo approximation of p-values. Here it is shown that a simple correction of the usual importance sampling p-values creates valid p-values, meaning that a hypothesis test created by rejecting the null when the p-value is <= alpha will also have a type I error rate <= alpha. This correction uses the importance weight of the original observation, which gives valuable diagnostic information under the null hypothesis. Using the corrected p-values can be crucial for multiple testing and also in problems where evaluating the accuracy of importance sampling approximations is difficult. Inverting the corrected p-values provides a useful way to create Monte Carlo confidence intervals that maintain the nominal significance level and use only a single Monte Carlo sample. Several applications are described, including accelerated multiple testing for a large neurophysiological dataset and exact conditional inference for a logistic regression model with nuisance parameters. Comment: 26 pages, 3 figures, 3 tables [significant rewrite of version 1, including additional examples, title change
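A minimal sketch of the idea, under stated assumptions: the null, the statistic, and the proposal below are illustrative choices, and the corrected formula is one reading of the abstract (the weight of the original observation enters the estimate); the exact form and its validity proof are in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup (illustrative, not from the paper): under H0, X ~ N(0, 1);
# test statistic T(x) = x; proposal Q = N(2, 1) oversamples the upper tail.
def weight(x):
    """Importance weight dP/dQ for P = N(0,1), Q = N(2,1)."""
    return np.exp(-0.5 * x**2 + 0.5 * (x - 2.0)**2)

x_obs = 2.5                    # the original observation
N = 100_000
y = rng.normal(2.0, 1.0, N)    # Monte Carlo draws from the proposal Q

# Corrected p-value: the importance weight of the original observation is
# added to the weighted exceedance count (hypothetical form, hedged above).
p_corrected = (weight(x_obs) + np.sum(weight(y) * (y >= x_obs))) / (N + 1)
print(f"corrected p-value: {p_corrected:.4f}")
```

The extra weight term can only increase the p-value, which is the direction needed for a conservative (valid) test.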

    Exact Enumeration and Sampling of Matrices with Specified Margins

    We describe a dynamic programming algorithm for exact counting and exact uniform sampling of matrices with specified row and column sums. The algorithm runs in polynomial time when the column sums are bounded. Both binary and non-negative integer matrices are handled. The method is distinguished by its applicability to non-regular margins, tractability on large matrices, and capacity for exact sampling.
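To make the counting problem concrete, here is a small illustrative recursion over rows (not the paper's polynomial-time dynamic program): the state is the tuple of remaining column sums, sorted so that permutation-equivalent states share cache entries.

```python
from functools import lru_cache
from itertools import combinations

def count_binary_matrices(row_sums, col_sums):
    """Count 0-1 matrices with the given row and column sums.

    Illustrative brute-force recursion; the count depends only on the
    multiset of remaining column sums, so sorting the state is valid
    and shrinks the memo table.
    """
    cols = len(col_sums)

    @lru_cache(maxsize=None)
    def rec(rows, remaining):
        if not rows:                      # all rows placed
            return int(all(c == 0 for c in remaining))
        r, rest = rows[0], rows[1:]
        total = 0
        for pick in combinations(range(cols), r):   # columns getting a 1
            if all(remaining[j] > 0 for j in pick):
                nxt = list(remaining)
                for j in pick:
                    nxt[j] -= 1
                total += rec(rest, tuple(sorted(nxt)))
        return total

    return rec(tuple(row_sums), tuple(sorted(col_sums)))

print(count_binary_matrices([1, 1, 1], [1, 1, 1]))  # permutation matrices: 6
```

This explicit enumeration is exponential in general; the point of the paper's algorithm is to achieve the same exact counts in polynomial time when the column sums are bounded.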

    Exact sampling and counting for fixed-margin matrices

    The uniform distribution on matrices with specified row and column sums is often a natural choice of null model when testing for structure in two-way tables (binary or nonnegative integer). Due to the difficulty of sampling from this distribution, many approximate methods have been developed. We will show that by exploiting certain symmetries, exact sampling and counting is in fact possible in many nontrivial real-world cases. We illustrate with real datasets including ecological co-occurrence matrices and contingency tables. Comment: Published at http://dx.doi.org/10.1214/13-AOS1131 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1104.032
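Exact counting immediately yields exact uniform sampling via the standard sequential scheme: choose each row with probability proportional to the exact number of ways to complete the remaining rows. The sketch below uses brute-force counting for the completions (the paper instead exploits symmetries and a polynomial-time count), so it is only a toy illustration of the principle.

```python
import random
from functools import lru_cache
from itertools import combinations

def sample_binary_matrix(row_sums, col_sums, rng=None):
    """Draw one uniformly random 0-1 matrix with the given margins.

    Sequential sampling: each row is chosen with probability
    proportional to the exact number of completions (counted here by
    brute force, unlike the paper's polynomial-time approach).
    """
    rng = rng or random.Random(0)
    cols = len(col_sums)

    def decrement(remaining, pick):
        nxt = list(remaining)
        for j in pick:
            nxt[j] -= 1
        return tuple(nxt)

    @lru_cache(maxsize=None)
    def count(rows, remaining):
        if not rows:
            return int(all(c == 0 for c in remaining))
        r, rest = rows[0], rows[1:]
        return sum(count(rest, decrement(remaining, pick))
                   for pick in combinations(range(cols), r)
                   if all(remaining[j] > 0 for j in pick))

    matrix, rows, remaining = [], tuple(row_sums), tuple(col_sums)
    while rows:
        r, rest = rows[0], rows[1:]
        picks = [p for p in combinations(range(cols), r)
                 if all(remaining[j] > 0 for j in p)]
        weights = [count(rest, decrement(remaining, p)) for p in picks]
        pick = rng.choices(picks, weights=weights)[0]
        matrix.append([int(j in pick) for j in range(cols)])
        rows, remaining = rest, decrement(remaining, pick)
    return matrix

print(sample_binary_matrix([2, 1], [1, 1, 1]))
```

Because each row is weighted by its exact completion count, every matrix with the required margins is returned with equal probability, with no Markov chain mixing to worry about.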

    Inconsistency of Pitman-Yor process mixtures for the number of components

    Full text link
    In many applications, a finite mixture is a natural model, but it can be difficult to choose an appropriate number of components. To circumvent this choice, investigators are increasingly turning to Dirichlet process mixtures (DPMs), and Pitman-Yor process mixtures (PYMs), more generally. While these models may be well-suited for Bayesian density estimation, many investigators are using them for inferences about the number of components, by considering the posterior on the number of components represented in the observed data. We show that this posterior is not consistent: on data from a finite mixture, it does not concentrate at the true number of components. This result applies to a large class of nonparametric mixtures, including DPMs and PYMs, over a wide variety of families of component distributions, including essentially all discrete families, as well as continuous exponential families satisfying mild regularity conditions (such as multivariate Gaussians). Comment: This is a general treatment of the problem discussed in our related article, "A simple example of Dirichlet process mixture inconsistency for the number of components", Miller and Harrison (2013) arXiv:1301.270
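Some intuition for why the represented-component count is a delicate quantity can be had from the prior alone (this is only a heuristic, not the paper's posterior argument): under a Dirichlet process, the number of clusters represented among n observations keeps growing, roughly like alpha * log(n), rather than settling at a fixed value. A minimal simulation of the Chinese restaurant process, the partition prior induced by a DP:

```python
import random

def crp_num_clusters(n, alpha, rng):
    """Number of distinct clusters among n customers in a Chinese
    restaurant process with concentration alpha. Which existing cluster
    a customer joins does not change the count, so it suffices to track
    new-cluster events, which occur with probability alpha / (alpha + i)."""
    k = 0
    for i in range(n):
        if rng.random() < alpha / (alpha + i):  # customer opens a new table
            k += 1
    return k

rng = random.Random(0)
means = []
for n in (100, 1000, 10000):
    ks = [crp_num_clusters(n, 1.0, rng) for _ in range(200)]
    means.append(sum(ks) / len(ks))
print(means)  # the average cluster count keeps growing with n (roughly log n)
```

The prior's tendency to keep creating small extra clusters is consistent with the paper's result that the posterior on the number of represented components does not concentrate at the true value.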

    On the insufficiency of laterality-based accounts of face perception and corresponding visual field asymmetries

    It has been known for nearly a century that the left half of a face is better recognized than the right half (Wolff, 1933). This left half-face advantage is commonly thought to reflect a combination of right hemisphere (RH) superiority for face recognition and a contralateral hemifield-hemisphere correspondence between the RH and the left visual field (LVF). The purpose of this set of experiments was to determine whether RH superiority for faces and contralateral hemifield-hemisphere correspondence are sufficient to explain the LVF half-face advantage. We set out four aims to accomplish this: (1) use behavioral and fMRI methods to demonstrate the LVF half-face advantage and identify its neural basis in ventral occipital-temporal cortex (VOTC); (2) use behavioral methods to show that RH superiority is insufficient to explain the LVF half-face advantage; (3) use behavioral methods to show that we perceive only one half of a face at a time; and (4), albeit not initially proposed, use methods developed to accomplish aims 1-3 to distinguish retinotopic face representation from face-centered representation.

    In our first set of experiments (behavioral and fMRI), we identified for the first time a neural LVF half-face bias in RH face-selective cortex. We also found that the neural LVF bias in the right FFA underlies the relationship between FFA laterality and the LVF half-face advantage. This revealed an explicit neural mechanism for the commonly assumed basis of the LVF advantage for centrally viewed faces. In our next set of experiments (behavioral), we addressed the second aim and found that the LVF half-face advantage is contingent upon the simultaneous presence of both an upright LVF and an upright RVF half-face, and does not reflect inherently superior processing of LVF over RVF half-face information. This challenged the sufficiency of the mechanism we discovered in Aim 1 as an explanation of the LVF half-face advantage.

    In our next set of behavioral experiments (which addressed our third aim), we found that half-face identities compete for limited processing resources, and only one identity can be processed at a time. Furthermore, we found that this does not apply to faces in which the half-face identities are similar enough to be perceived as a normal (i.e., non-chimeric) face. In our final set of experiments (behavioral), we addressed our additional Aim 4 and found that the LVF half-face advantage occurs regardless of the location of the face in the visual field. This suggests that faces are represented to some degree in an object-centered reference frame, and that the LVF half-face bias reflects a bias to the left half of a face, rather than a retinotopic bias to the left half of visual space.