
    Asymptotic Mutual Information for the Two-Groups Stochastic Block Model

    We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph $G$ from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime there exists a procedure that estimates the vertex labels better than random guessing.
    Comment: 41 pages, 3 pdf figures
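
    The generative process described in this abstract (random labels first, then label-dependent edges) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sample_two_group_sbm` and the parameters `p_in`, `p_out`, and `seed` are assumptions introduced here, and the labels are taken to be uniform on {-1, +1}.

```python
import random

def sample_two_group_sbm(n, p_in, p_out, seed=0):
    """Sample a symmetric two-group stochastic block model (illustrative sketch).

    p_in  : assumed edge probability for endpoints with equal labels
    p_out : assumed edge probability for endpoints with different labels
    """
    rng = random.Random(seed)
    # Step 1: assign each vertex a label uniformly at random from {-1, +1}.
    labels = [rng.choice((-1, 1)) for _ in range(n)]
    # Step 2: connect each unordered pair independently, with a probability
    # that depends only on whether the two labels agree.
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if labels[i] == labels[j] else p_out
            if rng.random() < p:
                edges.add((i, j))
    return labels, edges

labels, edges = sample_two_group_sbm(n=200, p_in=0.10, p_out=0.02)
within = sum(1 for i, j in edges if labels[i] == labels[j])
print(len(edges), within)  # total edges, and edges inside a group
```

    When `p_in` and `p_out` are close, the edges carry little information about the labels, which is the regime the abstract describes as below the critical point.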

    Inferring Rankings Using Constrained Sensing

    We consider the problem of recovering a function over the space of permutations (or, the symmetric group) over $n$ elements from given partial information; the partial information we consider is related to the group theoretic Fourier Transform of the function. This problem naturally arises in several settings such as ranked elections, multi-object tracking, ranking systems, and recommendation systems. Inspired by the work of Donoho and Stark in the context of discrete-time functions, we focus on non-negative functions with a sparse support (support size $\ll$ domain size). Our recovery method is based on finding the sparsest solution (through $\ell_0$ optimization) that is consistent with the available information. As the main result, we derive sufficient conditions for functions that can be recovered exactly from partial information through $\ell_0$ optimization. Under a natural random model for the generation of functions, we quantify the recoverability conditions by deriving bounds on the sparsity (support size) for which the function satisfies the sufficient conditions with a high probability as $n \to \infty$. $\ell_0$ optimization is computationally hard. Therefore, the popular compressive sensing literature considers solving the convex relaxation, $\ell_1$ optimization, to find the sparsest solution. However, we show that $\ell_1$ optimization fails to recover a function (even with constant sparsity) generated using the random model with a high probability as $n \to \infty$. In order to overcome this problem, we propose a novel iterative algorithm for the recovery of functions that satisfy the sufficient conditions. Finally, using an Information Theoretic framework, we study necessary conditions for exact recovery to be possible.
    Comment: 19 pages

    Boolean Compressed Sensing and Noisy Group Testing

    The fundamental task of group testing is to recover a small distinguished subset of items from a large population while efficiently reducing the total number of tests (measurements). The key contribution of this paper is in adopting a new information-theoretic perspective on group testing problems. We formulate the group testing problem as a channel coding/decoding problem and derive a single-letter characterization for the total number of tests used to identify the defective set. Although the focus of this paper is primarily on group testing, our main result is generally applicable to other compressive sensing models. The single letter characterization is shown to be order-wise tight for many interesting noisy group testing scenarios. Specifically, we consider an additive Bernoulli($q$) noise model where we show that, for $N$ items and $K$ defectives, the number of tests $T$ is $O(\frac{K\log N}{1-q})$ for arbitrarily small average error probability and $O(\frac{K^2\log N}{1-q})$ for a worst case error criterion. We also consider dilution effects whereby a defective item in a positive pool might get diluted with probability $u$ and potentially missed. In this case, it is shown that $T$ is $O(\frac{K\log N}{(1-u)^2})$ and $O(\frac{K^2\log N}{(1-u)^2})$ for the average and the worst case error criteria, respectively. Furthermore, our bounds allow us to verify existing known bounds for noiseless group testing including the deterministic noise-free case and approximate reconstruction with bounded distortion. Our proof of achievability is based on random coding and the analysis of a Maximum Likelihood Detector, and our information theoretic lower bound is based on Fano's inequality.
    Comment: In this revision: reorganized the paper, added citations to related work, and fixed some bugs
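
    To make the additive Bernoulli($q$) noise model concrete, here is a small simulation sketch. It is not the paper's method: the abstract analyzes a Maximum Likelihood Detector, whereas this example swaps in a much simpler elimination decoder, and the random pooling design, function names, and parameter values are all assumptions made for illustration. Because additive noise can only turn a negative outcome positive, a negative test still certifies that its whole pool is defect-free, so the elimination decoder never discards a true defective.

```python
import random

def run_noisy_group_tests(n_items, defectives, n_tests, pool_prob, q, seed=0):
    """Simulate pooled tests under additive Bernoulli(q) noise (false positives only)."""
    rng = random.Random(seed)
    pools, outcomes = [], []
    for _ in range(n_tests):
        # Assumed design: each item joins the pool independently with prob. pool_prob.
        pool = {i for i in range(n_items) if rng.random() < pool_prob}
        clean = any(i in defectives for i in pool)  # noiseless outcome: OR over the pool
        noise = rng.random() < q                    # additive noise bit
        pools.append(pool)
        outcomes.append(clean or noise)
    return pools, outcomes

def decode_by_elimination(n_items, pools, outcomes):
    """Keep every item that never appeared in a negative test."""
    candidates = set(range(n_items))
    for pool, positive in zip(pools, outcomes):
        if not positive:
            candidates -= pool
    return candidates

defectives = {3, 17, 42}
pools, outcomes = run_noisy_group_tests(
    n_items=100, defectives=defectives, n_tests=300, pool_prob=1 / 3, q=0.1
)
estimate = decode_by_elimination(100, pools, outcomes)
print(sorted(estimate))  # always a superset of the defective set
```

    Raising `q` makes negative tests rarer, so more tests are needed before the candidate set shrinks to the defectives, which is consistent in spirit with the $1/(1-q)$ factor in the bounds quoted above.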

    Quality-based Multimodal Classification Using Tree-Structured Sparsity

    Recent studies have demonstrated advantages of information fusion based on sparsity models for multimodal classification. Among several sparsity models, tree-structured sparsity provides a flexible framework for extraction of cross-correlated information from different sources and for enforcing group sparsity at multiple granularities. However, the existing algorithm only solves an approximated version of the cost functional and the resulting solution is not necessarily sparse at group levels. This paper reformulates the tree-structured sparse model for the multimodal classification task. An accelerated proximal algorithm is proposed to solve the optimization problem, which is an efficient tool for feature-level fusion among either homogeneous or heterogeneous sources of information. In addition, a (fuzzy-set-theoretic) possibilistic scheme is proposed to weight the available modalities, based on their respective reliability, in a joint optimization problem for finding the sparsity codes. This approach provides a general framework for quality-based fusion that offers added robustness to several sparsity-based multimodal classification algorithms. To demonstrate their efficacy, the proposed methods are evaluated on three different applications: multiview face recognition, multimodal face recognition, and target classification.
    Comment: To Appear in 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2014)
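
    As background for the group-level sparsity mentioned in this abstract, the sketch below shows block soft-thresholding, the standard proximal operator of a single-group penalty $\lambda \|v\|_2$. It is only a building block commonly used in proximal methods for group sparsity; the abstract does not give the paper's actual update rule, so the function name `group_soft_threshold` and the example values are assumptions.

```python
import math

def group_soft_threshold(v, lam):
    """Proximal operator of lam * ||v||_2 for one group of coefficients.

    The whole group is set to zero when its Euclidean norm is at most lam;
    otherwise it is shrunk toward zero by a common factor.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= lam:
        return [0.0] * len(v)
    scale = 1.0 - lam / norm
    return [scale * x for x in v]

strong = group_soft_threshold([3.0, 4.0], lam=1.0)  # norm 5: group is kept, shrunk
weak = group_soft_threshold([0.3, 0.4], lam=1.0)    # norm 0.5: group is zeroed out
print(strong, weak)
```

    The all-or-nothing behaviour on weak groups is what yields solutions that are exactly sparse at the group level.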

    Access Fees in Politics

    This paper develops a game-theoretic model of lobbying in which a politician sells access to interest groups. The politician sets an access fee, or the minimum contribution necessary to secure access, and an interest group that pays this fee can share verifiable evidence in favor of its preferred policy. The more the politician knows about interest group evidence, the better able he is to identify and implement the welfare-maximizing policy. In equilibrium, a wealthy interest group must pay more for access than an otherwise similar poor group; and a group involved with an important issue must pay less than an otherwise similar group involved with a less-important issue. The politician sets higher-than-optimal access fees in order to increase contributions. A contribution limit can improve constituent welfare by lowering the price of access, which tends to result in a more-informed politician. However, a limit can also decrease the range of issues for which the politician is willing to sell access, thereby reducing politician information and constituent welfare. Although the optimal limit is binding for some issues, it is never optimal to ban contributions.
    Keywords: Lobbying, campaign contributions, contribution limits, political access, hard information, evidence disclosure

    An information theoretic approach to ecological inference in presence of spatial heterogeneity and dependence

    This paper introduces Information Theoretic-based methods for estimating a target variable in a set of small geographical areas, by exploring spatially heterogeneous relationships at the disaggregate level. Controlling for spatial effects means introducing models whereby the assumption is that values in adjacent geographic locations are linked to each other by means of some form of underlying spatial relationship. This method offers a flexible framework for modeling the underlying variation in sub-group indicators, by addressing the spatial dependency problem. A basic ecological inference problem, which allows for spatial heterogeneity and dependence, is presented with the aim of first estimating the model at the aggregate level, and then of employing the estimated coefficients to obtain the sub-group level indicators. The Information Theoretic-based formulations could be a useful means of including spatial and inter-temporal features in analyses of micro-level behavior, and of providing an effective, flexible way of reconciling micro and macro data. A unique optimum solution may be obtained even if there are more parameters to be estimated than available moment conditions and the problem is ill-posed. Additional non-sample information from theory and/or empirical evidence can be introduced in the form of known probabilities by means of the cross-entropy formalism. Consistent estimates in small samples can be computed in the presence of incomplete micro-level data as well as in the presence of problems of collinearity and endogeneity in the individual local models, without imposing strong distributional assumptions.
    Keywords: Generalized Cross Entropy Estimation, Ecological Inference, Spatial Heterogeneity

    Group-theoretic models of the inversion process in bacterial genomes

    The variation in genome arrangements among bacterial taxa is largely due to the process of inversion. Recent studies indicate that not all inversions are equally probable, suggesting, for instance, that shorter inversions are more frequent than longer, and those that move the terminus of replication are less probable than those that do not. Current methods for establishing the inversion distance between two bacterial genomes are unable to incorporate such information. In this paper we suggest a group-theoretic framework that in principle can take these constraints into account. In particular, we show that by lifting the problem from circular permutations to the affine symmetric group, the inversion distance can be found in polynomial time for a model in which inversions are restricted to acting on two regions. This requires the proof of new results in group theory, and suggests a vein of new combinatorial problems concerning permutation groups on which group theorists will be needed to collaborate with biologists. We apply the new method to inferring distances and phylogenies for published Yersinia pestis data.
    Comment: 19 pages, 7 figures, in press, Journal of Mathematical Biology
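
    For readers unfamiliar with inversions as operations on a circular arrangement, the sketch below applies a single inversion to a small example genome. The representation is an assumption made for illustration and is not taken from the paper: regions are signed integers (the sign standing for orientation), the list is read circularly, and an inversion reverses a run of consecutive regions and flips their signs. An inversion of length 2 corresponds to the restricted case of acting on two regions.

```python
def invert(genome, start, length):
    """Invert `length` consecutive regions of a circular genome, beginning at `start`.

    The affected regions are reversed in order and their orientations (signs)
    are flipped; indices wrap around the end of the list.
    """
    n = len(genome)
    idx = [(start + k) % n for k in range(length)]
    flipped = [-genome[i] for i in reversed(idx)]
    result = list(genome)
    for i, g in zip(idx, flipped):
        result[i] = g
    return result

genome = [1, 2, 3, 4, 5]
print(invert(genome, 1, 2))  # [1, -3, -2, 4, 5]
print(invert(genome, 4, 2))  # [-5, 2, 3, 4, -1], wrapping past the end
```

    Applying the same inversion twice restores the original arrangement, which reflects the fact that each inversion is its own inverse as a group element.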

    A Cognitive-Motivational Model of Group Member Decision Satisfaction

    A theoretic model of group member decision satisfaction based on a cognitive-motivational view of information-processing in inferential contexts is presented. Unlike normative-rational theorists, we acknowledge that information-processing is biased by the decision-maker's motivations, which are assumed to derive from situation-specific goals. Information processing is assumed to be more extensive when judgmental accuracy is the salient goal and less extensive when other goals (e.g., self-esteem) are relatively more salient. The model analyzes the implications of this view for the relationship between confidence and satisfaction. Research propositions are advanced.