
    Bottleneck Problems: Information and Estimation-Theoretic View

    Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems which have found applications in machine learning, the design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound for the auxiliary variable in IB, making its computation more tractable for discrete random variables. In the second part, we introduce a general family of optimization problems, termed "bottleneck problems", by replacing the mutual information in IB and PF with other notions of mutual information, namely f-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. Although the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelopes of certain functions. By applying this technique to the binary case, we derive closed-form expressions for several bottleneck problems.
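    For reference, the two problems this abstract unifies admit a compact standard statement; the sketch below assumes the usual setup of a hidden variable Y, an observation X, and a representation T obeying the Markov chain Y - X - T (notation assumed here, not quoted from the paper).

```latex
% Information bottleneck: the most informative summary T of X about Y,
% under a complexity budget R on how much T retains about X.
\mathrm{IB}(R) = \sup_{P_{T \mid X} \,:\, I(X;T) \le R} I(Y;T)

% Privacy funnel: a disclosure T that must stay useful about X
% while leaking as little as possible about the sensitive Y.
\mathrm{PF}(R) = \inf_{P_{T \mid X} \,:\, I(X;T) \ge R} I(Y;T)
```

    The bottleneck problems of the second part replace the mutual-information terms above with f-information or Arimoto's mutual information; as the abstract notes, the resulting non-convex problems can still be evaluated through lower convex or upper concave envelopes of suitable functions, which is what yields the binary closed forms.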

    Collaborative Information Bottleneck

    This paper investigates a multi-terminal source coding problem under a logarithmic-loss fidelity which does not necessarily lead to an additive distortion measure. The problem is motivated by an extension of the information bottleneck method to a multi-source scenario where several encoders have to cooperatively build rate-limited descriptions of their sources in order to maximize information with respect to other unobserved (hidden) sources. More precisely, we study fundamental information-theoretic limits of the so-called (i) two-way collaborative information bottleneck (TW-CIB) and (ii) collaborative distributed information bottleneck (CDIB) problems. The TW-CIB problem consists of two distant encoders that separately observe marginal (dependent) components X1 and X2 and can cooperate through multiple exchanges of limited information with the aim of extracting information about hidden variables (Y1, Y2), which can be arbitrarily dependent on (X1, X2). In CDIB, on the other hand, there are two cooperating encoders which separately observe X1 and X2 and a third node which can listen to the exchanges between the two encoders in order to obtain information about a hidden variable Y. The relevance (figure of merit) is measured in terms of a normalized (per-sample) multi-letter mutual information metric (log-loss fidelity), and an interesting tradeoff arises by constraining the complexity of the descriptions, measured in terms of the rates needed for the exchanges between the encoders and decoders involved. Inner and outer bounds on the complexity-relevance region of these problems are derived, from which optimality is characterized for several cases of interest. Our resulting theoretical complexity-relevance regions are finally evaluated for binary symmetric and Gaussian statistical models, showing theoretical tradeoffs between the complexity-constrained descriptions and their relevance with respect to the hidden variables.
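    The log-loss fidelity referenced here has a standard operational reading that makes the per-sample mutual-information relevance concrete; a minimal sketch, assuming a description W available to the decoder and a soft (distribution-valued) reconstruction q (this notation is illustrative, not the paper's):

```latex
% Logarithmic loss: the decoder outputs a distribution q over the hidden
% sequence and pays the normalized log-likelihood cost of the truth.
d(y^n, q) = \tfrac{1}{n} \log \tfrac{1}{q(y^n)}

% The best soft reconstruction given W is the posterior, so the minimal
% expected log-loss is a normalized conditional entropy:
\min_{q(\cdot \mid w)} \mathbb{E}\big[ d(Y^n, q(\cdot \mid W)) \big]
  = \tfrac{1}{n} H(Y^n \mid W)

% Constraining log-loss is therefore equivalent to lower-bounding the
% per-sample relevance used as the figure of merit:
\Delta = \tfrac{1}{n} I(Y^n ; W)
       = \tfrac{1}{n} H(Y^n) - \tfrac{1}{n} H(Y^n \mid W)
```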
