3 research outputs found

    Towards Information Theory-Based Discovery of Equivariances

    Get PDF
    © 2023 H. Charvin, N. Catenacci Volpi & D. Polani. The presence of symmetries imposes a stringent set of constraints on a system. This constrained structure allows intelligent agents interacting with such a system to drastically improve the efficiency of learning and generalization, through the internalisation of the system’s symmetries into their information-processing. In parallel, principled models of complexity-constrained learning and behaviour make increasing use of information-theoretic methods. Here, we wish to marry these two perspectives and understand whether and in which form the information-theoretic lens can “see” the effect of symmetries of a system. For this purpose, we propose a novel variant of the Information Bottleneck principle, which has served as a productive basis for many principled studies of learning and information-constrained adaptive behaviour. We show (in the discrete case) that our approach formalises a certain duality between symmetry and information parsimony: namely, channel equivariances can be characterised by the optimal mutual information-preserving joint compression of the channel’s input and output. This information-theoretic treatment furthermore suggests a principled notion of “soft” equivariance, whose “coarseness” is measured by the amount of input-output mutual information preserved by the corresponding optimal compression. This new notion offers a bridge between the field of bounded rationality and the study of symmetries in neural representations. The framework may also allow (exact and soft) equivariances to be automatically discovered. Peer reviewed.
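
    For orientation, the classical Information Bottleneck objective that this variant builds on is the following trade-off (the standard formulation, given here only as background; the paper’s joint-compression variant modifies it):

        \min_{p(t \mid x)} \; I(X;T) \;-\; \beta \, I(T;Y)

    where $X$ is the input, $Y$ the relevance variable, $T$ the compressed representation, and the multiplier $\beta \ge 0$ balances the complexity cost $I(X;T)$ against the preserved relevant information $I(T;Y)$. In the variant sketched above, the compression is instead applied jointly to the channel’s input and output, and what must be preserved is their mutual information.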

    Exact and Soft Successive Refinement of the Information Bottleneck

    Get PDF
    © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY), https://creativecommons.org/licenses/by/4.0/. The information bottleneck (IB) framework formalises the essential requirement for efficient information processing systems to achieve an optimal balance between the complexity of their representation and the amount of information extracted about relevant features. However, since the representation complexity affordable by real-world systems may vary in time, the processing cost of updating the representations should also be taken into account. A crucial question is thus the extent to which adaptive systems can leverage the information content of already existing IB-optimal representations for producing new ones, which target the same relevant features but at a different granularity. We investigate the information-theoretic optimal limits of this process by studying and extending, within the IB framework, the notion of successive refinement, which describes the ideal situation where no information needs to be discarded for adapting an IB-optimal representation’s granularity. Thanks in particular to a new geometric characterisation, we analytically derive the successive refinability of some specific IB problems (for binary variables, for jointly Gaussian variables, and for the relevancy variable being a deterministic function of the source variable), and provide a linear-programming-based tool to numerically investigate, in the discrete case, the successive refinement of the IB. We then soften this notion into a quantification of the loss of information optimality induced by several-stage processing through an existing measure of unique information. Simple numerical experiments suggest that this quantity is typically low, though not entirely negligible. These results could have important implications for (i) the structure and efficiency of incremental learning in biological and artificial agents, (ii) the comparison of IB-optimal observation channels in statistical decision problems, and (iii) the IB theory of deep neural networks. Peer reviewed.
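
    As a rough formal anchor for the refinement notion (a standard way to state the nesting requirement in the IB setting, not necessarily the paper’s exact definition): representations $T_1$ (coarse) and $T_2$ (fine) of a source $X$ with relevance variable $Y$ form a successive refinement when they satisfy the Markov chain

        $Y \to X \to T_2 \to T_1$

    with each $T_i$ IB-optimal at its own complexity level $I(X;T_i)$. Successive refinability then means the coarse representation arises by further processing the fine one without leaving the optimal complexity-relevance trade-off curve, i.e. no information has to be discarded when the granularity changes.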

    Successive Refinement and Coarsening of the Information Bottleneck

    Get PDF
    We study two central aspects of information processing in cognitive systems: one is the ability to incorporate fresh information into already learnt models; the other is the “trickling” of information through the many layers of a cognitive processing pipeline. We investigate the extent to which these specific structures of cognitive processing impact their information-theoretic optimal limits. To do so, we present mathematical characterisations and low-dimensional numerical examples, which explore formal properties of the Information Bottleneck method: namely, how it relates to successive refinement and successive coarsening of information. Peer reviewed.
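
    To make the kind of low-dimensional numerical exploration mentioned above concrete, here is a minimal sketch of the classical self-consistent (Blahut-Arimoto-style) iteration for the discrete Information Bottleneck. This is generic background code, not the authors’ tooling; the function name, the toy joint distribution, and all parameter values are illustrative assumptions.

        import numpy as np

        def ib_solve(pxy, n_t, beta, iters=500, seed=0):
            """Self-consistent iteration for the discrete Information Bottleneck:
            minimise I(X;T) - beta * I(T;Y) over encoders p(t|x)."""
            rng = np.random.default_rng(seed)
            px = pxy.sum(axis=1)                      # source marginal p(x)
            py_x = pxy / px[:, None]                  # channel p(y|x)
            qt_x = rng.random((len(px), n_t))         # random initial encoder p(t|x)
            qt_x /= qt_x.sum(axis=1, keepdims=True)
            for _ in range(iters):
                qt = px @ qt_x                        # representation marginal p(t)
                # decoder p(y|t) = sum_x p(x) p(t|x) p(y|x) / p(t)
                qy_t = (qt_x * px[:, None]).T @ py_x / (qt[:, None] + 1e-12)
                # D_KL(p(y|x) || p(y|t)) for every pair (x, t)
                log_ratio = np.log(py_x[:, None, :] + 1e-12) - np.log(qy_t[None, :, :] + 1e-12)
                dkl = (py_x[:, None, :] * log_ratio).sum(axis=2)
                # self-consistent update: p(t|x) proportional to p(t) * exp(-beta * KL)
                qt_x = qt[None, :] * np.exp(-beta * dkl)
                qt_x /= qt_x.sum(axis=1, keepdims=True)
            return qt_x

        # Toy joint distribution over a 3-valued X and a 2-valued Y (illustrative).
        pxy = np.array([[0.30, 0.05],
                        [0.05, 0.30],
                        [0.15, 0.15]])
        encoder = ib_solve(pxy, n_t=2, beta=5.0)
        print(np.round(encoder, 3))   # near-deterministic clustering of X at this beta

    Sweeping beta traces out the complexity-relevance trade-off curve along which questions of refinement and coarsening, as studied above, are posed.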