The DLR Hierarchy of Approximate Inference
We propose a hierarchy for approximate inference based on the Dobrushin,
Lanford, Ruelle (DLR) equations. This hierarchy includes existing algorithms,
such as belief propagation, and also motivates novel algorithms such as
factorized neighbors (FN) algorithms and variants of mean field (MF)
algorithms. In particular, we show that extrema of the Bethe free energy
correspond to approximate solutions of the DLR equations. In addition, we
demonstrate a close connection between these approximate algorithms and Gibbs
sampling. Finally, we compare and contrast several of the algorithms in the DLR
hierarchy on spin-glass problems. The experiments show that algorithms higher
up in the hierarchy give more accurate results when they converge but tend to
be less stable.
Comment: Appears in Proceedings of the Twenty-First Conference on Uncertainty
in Artificial Intelligence (UAI2005)
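To make the mean-field end of this hierarchy concrete, here is a minimal sketch of naive mean-field inference for an Ising spin glass, the kind of model used in the experiments. The couplings J, fields h, inverse temperature beta, and the damping factor are illustrative choices, not values taken from the paper:

```python
import numpy as np

def naive_mean_field(J, h, beta=1.0, iters=200, damping=0.5):
    """Fixed-point iteration m_i = tanh(beta * (sum_j J_ij m_j + h_i)).

    J : (N, N) symmetric coupling matrix with zero diagonal
    h : (N,) external fields
    Returns approximate magnetizations m, one per spin.
    """
    m = np.zeros(len(h))
    for _ in range(iters):
        m_new = np.tanh(beta * (J @ m + h))
        m = damping * m + (1.0 - damping) * m_new  # damping aids convergence
    return m

# Toy +-J spin glass with random fields
rng = np.random.default_rng(0)
N = 20
J = rng.choice([-1.0, 1.0], size=(N, N)) / np.sqrt(N)
J = np.triu(J, 1)
J = J + J.T                                        # symmetric, zero diagonal
m = naive_mean_field(J, rng.normal(size=N), beta=0.5)
print(m[:5])
```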
Cryptography based on neural networks - analytical results
The mutual learning process between two parity feed-forward networks with
discrete and continuous weights is studied analytically, and we find that the
number of steps required to achieve full synchronization between the two
networks in the case of discrete weights is finite. The synchronization process
is shown to be non-self-averaging, and the analytical solution is based on
random auxiliary variables. The learning time of an attacker that tries to
imitate one of the networks is examined analytically and is found to be much
longer than the synchronization time. The analytical results are found to be in
agreement with simulations.
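A minimal simulation of the synchronization described above, assuming the standard tree-parity-machine setup: K hidden units, N inputs per unit, weights bounded in {-L, ..., L}, and one common Hebbian update variant applied only when the two outputs agree. The parameter values and the specific update rule are illustrative assumptions:

```python
import numpy as np

K, N, L = 3, 10, 3   # hidden units, inputs per unit, discrete weight bound

def tpm_output(w, x):
    """Hidden signs sigma_k = sign(w_k . x_k); output tau = prod sigma_k."""
    sigma = np.sign(np.sum(w * x, axis=1))
    sigma[sigma == 0] = -1                     # break ties deterministically
    return sigma, int(np.prod(sigma))

def hebbian_update(w, x, sigma, tau):
    """Move only the hidden units that agree with the common output, then clip."""
    for k in range(K):
        if sigma[k] == tau:
            w[k] = np.clip(w[k] + tau * x[k], -L, L)

rng = np.random.default_rng(1)
wA = rng.integers(-L, L + 1, size=(K, N)).astype(float)
wB = rng.integers(-L, L + 1, size=(K, N)).astype(float)

steps = 0
while not np.array_equal(wA, wB):
    x = rng.choice([-1.0, 1.0], size=(K, N))   # common public input
    sA, tA = tpm_output(wA, x)
    sB, tB = tpm_output(wB, x)
    if tA == tB:                               # update only on agreement
        hebbian_update(wA, x, sA, tA)
        hebbian_update(wB, x, sB, tB)
    steps += 1
print("synchronized after", steps, "exchanged inputs")
```

Because the weights are discrete and bounded, the synchronized state, once reached, is exact, which is what makes a finite synchronization time possible.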
Mutual learning in a tree parity machine and its application to cryptography
Mutual learning of a pair of tree parity machines with continuous and
discrete weight vectors is studied analytically. The analysis is based on a
mapping procedure that maps the mutual learning in tree parity machines onto
mutual learning in noisy perceptrons. The stationary solution of the mutual
learning in the case of continuous tree parity machines depends on the learning
rate, where a phase transition from partial to full synchronization is observed.
In the discrete case the learning process is based on a finite increment, and a
fully synchronized state is achieved in a finite number of steps. The
synchronization of discrete parity machines is introduced in order to construct
an ephemeral key-exchange protocol. The dynamic learning of a third tree parity
machine (an attacker) that tries to imitate one of the two machines while the
two still update their weight vectors is also analyzed. In particular, the
synchronization times of the naive attacker and of the flipping attacker recently
introduced in [1] are analyzed. All analytical results are found to be in good
agreement with simulation results.
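To illustrate why an eavesdropper lags behind, here is a compact sketch (same toy setup as the previous sketch) of a naive attacker that can only apply the parties' update when its own output already matches theirs. This simple rule is one plausible reading of the "naive attacker", not the paper's exact dynamics:

```python
import numpy as np

K, N, L = 3, 10, 3
rng = np.random.default_rng(2)

def out(w, x):
    s = np.sign((w * x).sum(axis=1))
    s[s == 0] = -1
    return s, int(np.prod(s))

def upd(w, x, s, t):
    for k in range(K):
        if s[k] == t:
            w[k] = np.clip(w[k] + t * x[k], -L, L)

wA, wB, wE = (rng.integers(-L, L + 1, size=(K, N)).astype(float)
              for _ in range(3))

for step in range(10000):
    x = rng.choice([-1.0, 1.0], size=(K, N))
    (sA, tA), (sB, tB), (sE, tE) = out(wA, x), out(wB, x), out(wE, x)
    if tA == tB:                    # parties update on agreement
        upd(wA, x, sA, tA)
        upd(wB, x, sB, tB)
        if tE == tA:                # attacker learns only when it already agrees
            upd(wE, x, sE, tE)
    if np.array_equal(wA, wB):
        break
print("parties synced:", np.array_equal(wA, wB),
      "| attacker synced:", np.array_equal(wE, wA))
```

The asymmetry is that the parties influence each other on every agreed step, while the attacker can only listen, which is the intuition behind its longer synchronization time.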
Training a perceptron in a discrete weight space
On-line and batch learning of a perceptron in a discrete weight space, where
each weight can take 2L+1 different values, are examined analytically and
numerically. The learning algorithm is based on the training of the continuous
perceptron and prediction following the clipped weights. The learning is
described by a new set of order parameters, composed of the overlaps between
the teacher and the continuous/clipped students. Different scenarios are
examined, among them on-line learning with discrete/continuous transfer
functions and off-line Hebb learning. The generalization error of the clipped
weights decays asymptotically as exp(−Kα) / exp(−e^(|λ|α)) in the case of on-line learning with binary/continuous activation
functions, respectively, where α is the number of examples divided by N,
the size of the input vector, and K is a positive constant that decays
linearly with 1/L. For finite N and L, a perfect agreement between the
discrete student and the teacher is obtained for α ∝ √(L ln(NL)). A crossover to the generalization error ∝ 1/α,
characterizing continuous weights with binary output, is obtained for synaptic
depth L > O(√N).
Comment: 10 pages, 5 figs., submitted to PR
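A toy sketch of the clipped-perceptron scheme: a continuous student is trained online (here with a simple Hebb step) while predictions follow its weights rounded to the 2L+1 discrete levels. The rescaling inside clip_weights is an assumed heuristic for illustration, not the paper's exact clipping rule:

```python
import numpy as np

N, L = 100, 3                       # input size, synaptic depth (2L+1 levels)
rng = np.random.default_rng(3)

w_teacher = rng.integers(-L, L + 1, size=N).astype(float)  # discrete teacher
w = np.zeros(N)                     # continuous student

def clip_weights(w):
    """Round the continuous student onto the 2L+1 levels -L..L.

    The rescaling below is an illustrative assumption: it matches the
    student's spread to that of weights drawn uniformly from -L..L.
    """
    target_std = np.sqrt(L * (L + 1) / 3.0)
    return np.clip(np.rint(w * target_std / (w.std() + 1e-12)), -L, L)

for _ in range(20000):
    x = rng.choice([-1.0, 1.0], size=N)
    label = np.sign(w_teacher @ x)
    w += label * x / np.sqrt(N)     # Hebb step on the continuous weights

# prediction follows the clipped weights; check agreement with the teacher
print("fraction of weights recovered:", np.mean(clip_weights(w) == w_teacher))
```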
Minimally Supervised Categorization of Text with Metadata
Document categorization, which aims to assign a topic label to each document,
plays a fundamental role in a wide variety of applications. Despite the success
of existing studies in conventional supervised document classification, they
are less concerned with two real problems: (1) the presence of metadata: in
many domains, text is accompanied by various additional information such as
authors and tags. Such metadata serve as compelling topic indicators and should
be leveraged into the categorization framework; (2) label scarcity: labeled
training samples are expensive to obtain in some cases, where categorization
needs to be performed using only a small set of annotated data. In recognition
of these two challenges, we propose MetaCat, a minimally supervised framework
to categorize text with metadata. Specifically, we develop a generative process
describing the relationships between words, documents, labels, and metadata.
Guided by the generative model, we embed text and metadata into the same
semantic space to encode heterogeneous signals. Then, based on the same
generative process, we synthesize training samples to address the bottleneck of
label scarcity. We conduct a thorough evaluation on a wide range of datasets.
Experimental results demonstrate the effectiveness of MetaCat over many competitive
baselines.
Comment: 10 pages; Accepted to SIGIR 2020; Some typos fixed
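A highly simplified sketch of the "same semantic space" idea: words and metadata tokens share one embedding table, documents are embedded by averaging their tokens, and labels are scored by similarity to seed documents. The token names, seed choices, and random (rather than learned) embeddings are all illustrative assumptions; this is not MetaCat's actual generative model:

```python
import numpy as np

rng = np.random.default_rng(4)
dim = 16

# Toy shared embedding table: words and metadata tokens (e.g. "author:alice",
# "tag:law") live in one semantic space, as the abstract describes.
vocab = ["neural", "network", "court", "ruling", "author:alice", "tag:law"]
emb = {tok: rng.normal(size=dim) for tok in vocab}  # learned in practice

def embed_doc(tokens):
    """Average word and metadata embeddings into one document vector."""
    return np.mean([emb[t] for t in tokens if t in emb], axis=0)

# Minimal supervision: one seed document per label
seeds = {
    "tech": embed_doc(["neural", "network"]),
    "legal": embed_doc(["court", "ruling", "tag:law"]),
}

def classify(tokens):
    d = embed_doc(tokens)
    # cosine similarity against each label's seed representation
    return max(seeds, key=lambda c: d @ seeds[c]
               / (np.linalg.norm(d) * np.linalg.norm(seeds[c])))

print(classify(["neural", "author:alice"]))
```

Synthesizing pseudo-labeled documents from a generative model, as the abstract describes, would then supply training data for a stronger classifier on top of this shared space.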
Multilayer neural networks with extensively many hidden units
The information processing abilities of a multilayer neural network with a
number of hidden units scaling as the input dimension are studied using
statistical mechanics methods. The mapping from the input layer to the hidden
units is performed by general symmetric Boolean functions, whereas the hidden
layer is connected to the output by either discrete or continuous couplings.
Introducing an overlap in the space of Boolean functions as the order parameter,
the storage capacity is found to scale with the logarithm of the number of
implementable Boolean functions. The generalization behaviour is smooth for
continuous couplings and shows a discontinuous transition to perfect
generalization for discrete ones.
Comment: 4 pages, 2 figures
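A toy forward pass for such an architecture, assuming (as an illustration; the paper's exact wiring may differ) a tree-like layout in which each hidden unit applies its own symmetric Boolean function, i.e., a lookup on the number of +1 inputs in its receptive field:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4            # inputs per hidden unit (assumed tree-like receptive fields)
K = 51           # number of hidden units; "extensive" means K grows with N
N = n * K        # total input dimension

# A symmetric Boolean function of n +-1 inputs depends only on how many of
# them are +1, so each unit is specified by a lookup table of length n + 1.
tables = rng.choice([-1, 1], size=(K, n + 1))
couplings = rng.choice([-1.0, 1.0], size=K)   # discrete hidden-to-output weights

def forward(x):
    blocks = x.reshape(K, n)                  # unit k sees its own input block
    counts = (blocks == 1).sum(axis=1)        # symmetry: only the count matters
    hidden = tables[np.arange(K), counts]     # per-unit table lookup
    return int(np.sign(couplings @ hidden))   # K odd, so the sum is never zero

x = rng.choice([-1, 1], size=N)
print(forward(x))
# Each unit can realize 2**(n+1) symmetric Boolean functions, so the logarithm
# of the number of table choices grows like K*(n+1), matching the abstract's
# capacity scale of log(number of implementable Boolean functions).
```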