Bottleneck Problems: An Information and Estimation-Theoretic View
Information bottleneck (IB) and privacy funnel (PF) are two closely related optimization problems that have found applications in machine learning, the design of privacy algorithms, capacity problems (e.g., Mrs. Gerber's Lemma), and strong data processing inequalities, among others. In this work, we first investigate the functional properties of IB and PF through a unified theoretical framework. We then connect them to three information-theoretic coding problems, namely hypothesis testing against independence, noisy source coding, and dependence dilution. Leveraging these connections, we prove a new cardinality bound for the auxiliary variable in IB, making its computation more tractable for discrete random variables.
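To make the discrete IB problem concrete, here is a minimal sketch of the classical self-consistent iterative solver (in the style of Tishby et al.'s alternating updates), not the cardinality-bound result described above; the function name, the toy joint distribution, and all parameter choices are our own illustration:

```python
import numpy as np

def information_bottleneck(p_xy, n_t, beta, n_iter=300, seed=0):
    """Iterative (Blahut-Arimoto-style) solver for the IB Lagrangian
    over encoders q(t|x), for a discrete joint pmf p_xy of shape (|X|, |Y|)."""
    rng = np.random.default_rng(seed)
    p_x = p_xy.sum(axis=1)                       # marginal p(x)
    p_y_given_x = p_xy / p_x[:, None]            # conditional p(y|x)
    q = rng.random((p_xy.shape[0], n_t))         # random initial encoder q(t|x)
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        p_t = p_x @ q                            # cluster marginal p(t)
        # Bayes-optimal decoder p(y|t)
        p_y_given_t = (q * p_x[:, None]).T @ p_y_given_x / (p_t[:, None] + 1e-12)
        # KL divergence D(p(y|x) || p(y|t)) for every pair (x, t)
        kl = np.zeros_like(q)
        for x in range(q.shape[0]):
            for t in range(n_t):
                p, r = p_y_given_x[x], p_y_given_t[t]
                kl[x, t] = np.sum(np.where(p > 0,
                                           p * np.log((p + 1e-12) / (r + 1e-12)),
                                           0.0))
        # self-consistent encoder update: q(t|x) proportional to p(t) exp(-beta * KL)
        q = (p_t[None, :] + 1e-12) * np.exp(-beta * kl)
        q /= q.sum(axis=1, keepdims=True)
    return q
```

For large beta the returned encoder approaches a deterministic clustering of X; for small beta it collapses toward a single cluster, tracing out the IB tradeoff.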
In the second part, we introduce a general family of optimization problems, termed bottleneck problems, by replacing mutual information in IB and PF with other notions of mutual information, namely f-information and Arimoto's mutual information. We then argue that, unlike IB and PF, these problems lead to easily interpretable guarantees in a variety of inference tasks with statistical constraints on accuracy and privacy. Although the underlying optimization problems are non-convex, we develop a technique to evaluate bottleneck problems in closed form by equivalently expressing them in terms of the lower convex or upper concave envelope of certain functions. Applying this technique to the binary case, we derive closed-form expressions for several bottleneck problems.
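The envelope step can be illustrated numerically. Below is a grid-based upper concave envelope via a monotone-chain hull scan; this is our own sketch of the generic operation, not the paper's closed-form derivation (the lower convex envelope is the same computation applied to -f):

```python
import numpy as np

def upper_concave_envelope(xs, fs):
    """Upper concave envelope of the points (xs[i], fs[i]) on an increasing
    grid, computed with an upper-convex-hull (monotone chain) scan; O(n)."""
    xs, fs = np.asarray(xs, float), np.asarray(fs, float)
    hull = []                                    # indices of envelope vertices
    for i in range(len(xs)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            # pop i1 while the triple (i0, i1, i) turns left or is collinear,
            # i.e. i1 lies on or below the chord from i0 to i
            cross = ((xs[i1] - xs[i0]) * (fs[i] - fs[i0])
                     - (fs[i1] - fs[i0]) * (xs[i] - xs[i0]))
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    # interpolate the piecewise-linear envelope back onto the grid
    return np.interp(xs, xs[hull], fs[hull])
```

For a convex function such as (x - 1/2)^2 on [0, 1], the envelope is the chord joining the endpoints (the constant 1/4), while a concave function is its own upper concave envelope.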
Collaborative information bottleneck
This paper investigates a multi-terminal source coding problem under a logarithmic loss fidelity which does not necessarily lead to an additive distortion measure. The problem is motivated by an extension of the Information Bottleneck method to a multi-source scenario where several encoders have to build cooperatively rate-limited descriptions of their sources in order to maximize information with respect to other unobserved (hidden) sources. More precisely, we study fundamental information-theoretic limits of the so-called: (i) Two-way Collaborative Information Bottleneck (TW-CIB) and (ii) the Collaborative Distributed Information Bottleneck (CDIB) problems. The TW-CIB problem consists of two distant encoders that separately observe marginal (dependent) components X1 and X2 and can cooperate through multiple exchanges of limited information with the aim of extracting information about hidden variables (Y1, Y2), which can be arbitrarily dependent on (X1, X2). On the other hand, in CDIB there are two cooperating encoders which separately observe X1 and X2 and a third node which can listen to the exchanges between the two encoders in order to obtain information about a hidden variable Y. The relevance (figure-of-merit) is measured in terms of a normalized (per-sample) multi-letter mutual information metric (log-loss fidelity), and an interesting tradeoff arises by constraining the complexity of descriptions, measured in terms of the rates needed for the exchanges between the encoders and decoders involved. Inner and outer bounds to the complexity-relevance region of these problems are derived, from which optimality is characterized for several cases of interest. Our resulting theoretical complexity-relevance regions are finally evaluated for binary symmetric and Gaussian statistical models, showing theoretical tradeoffs between the complexity-constrained descriptions and their relevance with respect to the hidden variables.
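For context on the binary symmetric evaluation, the classical single-encoder relevance-complexity curve for a doubly symmetric binary source follows from Mrs. Gerber's Lemma: at description rate R the achievable relevance is 1 - h(p * h^{-1}(1 - R)), where h is binary entropy and a * b = a(1-b) + b(1-a) is binary convolution. The sketch below is our own illustration of that single-encoder curve, not the TW-CIB/CDIB regions derived in the paper:

```python
import numpy as np

def h(x):
    """Binary entropy in bits."""
    x = np.clip(x, 1e-12, 1 - 1e-12)
    return -x * np.log2(x) - (1 - x) * np.log2(1 - x)

def h_inv(v):
    """Inverse of h restricted to [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if h(mid) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relevance_bsc(rate, p):
    """Relevance I(U;Y) at complexity `rate` for a doubly symmetric binary
    source with crossover p: I(U;Y) = 1 - h(p * h^{-1}(1 - rate))."""
    a = h_inv(1.0 - rate)
    conv = p * (1 - a) + a * (1 - p)   # binary convolution p * a
    return 1.0 - h(conv)
```

At full rate (R = 1) the curve reaches 1 - h(p), the total dependence between X and Y; at R = 0 no relevance is extractable, and the curve increases monotonically in between.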
Authors: Matías Alejandro Vera (Universidad de Buenos Aires, Facultad de Ingeniería, Departamento de Electrónica, Argentina); Leonardo Javier Rey Vega (Consejo Nacional de Investigaciones Científicas y Técnicas, Centro de Simulación Computacional para Aplicaciones Tecnológicas, Argentina, and Universidad de Buenos Aires); Pablo Piantanida (Université Paris-Sud and Centre National de la Recherche Scientifique, France).