Caveats for information bottleneck in deterministic scenarios
Information bottleneck (IB) is a method for extracting information from one
random variable X that is relevant for predicting another random variable Y.
To do so, IB identifies an intermediate "bottleneck" variable T that has
low mutual information I(X;T) and high mutual information I(Y;T). The "IB
curve" characterizes the set of bottleneck variables that achieve maximal
I(Y;T) for a given I(X;T), and is typically explored by maximizing the "IB
Lagrangian", I(Y;T) - βI(X;T). In some cases, Y is a deterministic
function of X, including many classification problems in supervised learning
where the output class Y is a deterministic function of the input X. We
demonstrate three caveats when using IB in any situation where Y is a
deterministic function of X: (1) the IB curve cannot be recovered by
maximizing the IB Lagrangian for different values of β; (2) there are
"uninteresting" trivial solutions at all points of the IB curve; and (3) for
multi-layer classifiers that achieve low prediction error, different layers
cannot exhibit a strict trade-off between compression and prediction, contrary
to a recent proposal. We also show that when Y is a small perturbation away
from being a deterministic function of X, these three caveats arise in an
approximate way. To address problem (1), we propose a functional that, unlike
the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the
three caveats on the MNIST dataset.
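As a small illustration (not from the paper itself), the IB Lagrangian I(Y;T) - βI(X;T) can be evaluated directly for a toy deterministic labelling Y = f(X) with discrete variables; the mapping f, the candidate bottleneck g, and the value of β below are arbitrary choices for demonstration:

```python
import numpy as np

def mutual_info(p_joint):
    """Mutual information in bits from a joint distribution p(a, b)."""
    pa = p_joint.sum(axis=1, keepdims=True)
    pb = p_joint.sum(axis=0, keepdims=True)
    nz = p_joint > 0
    return float(np.sum(p_joint[nz] * np.log2(p_joint[nz] / (pa @ pb)[nz])))

# Toy deterministic setting: 4 equiprobable inputs mapped to 2 classes.
p_x = np.full(4, 0.25)
f = np.array([0, 0, 1, 1])            # deterministic labelling Y = f(X)

# Candidate bottleneck T = g(X): here T simply copies the class label.
g = f
p_xt = np.zeros((4, 2))
p_xt[np.arange(4), g] = p_x           # joint p(x, t)
p_yt = np.zeros((2, 2))
for x in range(4):
    p_yt[f[x], g[x]] += p_x[x]        # joint p(y, t)

# IB Lagrangian I(Y;T) - beta * I(X;T): this T attains I(Y;T) = H(Y) = 1 bit
# while compressing X down to I(X;T) = 1 bit.
beta = 0.5
ib_lagrangian = mutual_info(p_yt) - beta * mutual_info(p_xt)
print(ib_lagrangian)  # 1 - 0.5 * 1 = 0.5
```

Sweeping β here only ever selects the corner points of the IB curve, which is the essence of caveat (1) in the deterministic case.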
Evolution towards Smart Optical Networking: Where Artificial Intelligence (AI) meets the World of Photonics
Smart optical networks are the next evolution of programmable networking and
programmable automation of optical networks, with human-in-the-loop network
control and management. The paper discusses this evolution and the role of
Artificial Intelligence (AI).
High capacity associative memory with bipolar and binary, biased patterns
The high capacity associative memory model is interesting due to its significantly higher capacity compared with the standard Hopfield model. These networks can use either bipolar or binary patterns, which may also be biased. This paper investigates the performance of a high capacity associative memory model trained with biased patterns, using either bipolar or binary representations. Our results indicate that the binary network performs less well under low bias, but better in other situations, than the bipolar network.
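As a point of reference for the bipolar representation discussed above, a minimal associative-memory sketch is shown below. It uses one-shot Hebbian training (the standard Hopfield baseline; the high-capacity model instead trains each neuron iteratively, perceptron-style), and the network size, pattern count, and noise level are arbitrary choices for demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 64, 5                          # neurons, stored patterns

# Bipolar (+1/-1) unbiased random patterns; a bias would skew
# the probability of +1 away from 0.5.
patterns = rng.choice([-1, 1], size=(P, N))

# One-shot Hebbian weights (zero self-connections).
W = (patterns.T @ patterns) / N
np.fill_diagonal(W, 0.0)

def recall(state, steps=20):
    """Synchronous sign updates until a fixed point or the step limit."""
    for _ in range(steps):
        new = np.where(W @ state >= 0, 1, -1)
        if np.array_equal(new, state):
            break
        state = new
    return state

# Corrupt a stored pattern by flipping 5 bits, then try to recover it.
probe = patterns[0].copy()
flip = rng.choice(N, size=5, replace=False)
probe[flip] *= -1
recovered = recall(probe)
overlap = float(recovered @ patterns[0]) / N   # 1.0 means perfect recall
print(overlap)
```

At this low load (P/N ≈ 0.08) Hebbian recall typically succeeds; the paper's comparison concerns how capacity and retrieval degrade as bias and load grow, where the high-capacity training rule matters.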