Efficient Learning for Undirected Topic Models
The Replicated Softmax model, a well-known undirected topic model, is powerful in
extracting semantic representations of documents. Traditional learning
strategies such as Contrastive Divergence are very inefficient. This paper
provides a novel estimator based on Noise Contrastive Estimation to speed up
learning, extended to handle documents of varying lengths and weighted inputs.
Experiments on two benchmarks show that the new estimator achieves high
learning efficiency and accuracy on document retrieval and classification.
Comment: Accepted by ACL-IJCNLP 2015 as a short paper. 6 pages.
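For intuition only, the sketch below shows the standard noise-contrastive estimation objective for an unnormalized model, where learning reduces to logistic regression separating observed documents from noise samples; it does not include the paper's extension to varying lengths and weighted inputs, and all names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(log_p_model_data, log_p_noise_data,
             log_p_model_noise, log_p_noise_noise, k):
    """Noise-contrastive estimation objective (illustrative sketch).

    The unnormalized model is trained by discriminating observed documents
    from k samples drawn from a known noise distribution, avoiding the
    partition function that makes Contrastive Divergence slow.
    """
    # Log-odds that a sample came from the data rather than the noise distribution.
    logit_data = log_p_model_data - (np.log(k) + log_p_noise_data)
    logit_noise = log_p_model_noise - (np.log(k) + log_p_noise_noise)
    # Maximize the log-probability of the correct data-vs-noise labels.
    return -(np.mean(np.log(sigmoid(logit_data)))
             + k * np.mean(np.log(1.0 - sigmoid(logit_noise))))
```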
Prototypical Contrastive Learning of Unsupervised Representations
This paper presents Prototypical Contrastive Learning (PCL), an unsupervised
representation learning method that addresses the fundamental limitations of
instance-wise contrastive learning. PCL not only learns low-level features for
the task of instance discrimination, but more importantly, it implicitly
encodes semantic structures of the data into the learned embedding space.
Specifically, we introduce prototypes as latent variables to help find the
maximum-likelihood estimation of the network parameters in an
Expectation-Maximization framework. We iteratively perform E-step as finding
the distribution of prototypes via clustering and M-step as optimizing the
network via contrastive learning. We propose ProtoNCE loss, a generalized
version of the InfoNCE loss for contrastive learning, which encourages
representations to be closer to their assigned prototypes. PCL outperforms
state-of-the-art instance-wise contrastive learning methods on multiple
benchmarks with substantial improvement in low-resource transfer learning. Code
and pretrained models are available at https://github.com/salesforce/PCL
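As a minimal sketch of a prototype-based InfoNCE-style loss in the spirit of ProtoNCE, the snippet below assumes L2-normalized embeddings, cluster prototypes from the E-step, and hard assignments; it is illustrative and not the authors' released code (see the linked repository).

```python
import torch
import torch.nn.functional as F

def proto_nce(embeddings, prototypes, assignments, temperature=0.1):
    """Prototype-based InfoNCE-style loss (illustrative sketch).

    embeddings:  (N, D) L2-normalized sample representations
    prototypes:  (K, D) L2-normalized cluster centroids from the E-step
    assignments: (N,)   index of the prototype assigned to each sample
    """
    # Similarity of each embedding to every prototype, scaled by temperature.
    logits = embeddings @ prototypes.t() / temperature
    # Pull each embedding toward its assigned prototype, push away from the rest.
    return F.cross_entropy(logits, assignments)
```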
Contrastive learning and neural oscillations
The concept of Contrastive Learning (CL) is developed as a family of possible learning algorithms for neural networks. CL is an extension of Deterministic Boltzmann Machines to more general dynamical systems. During learning, the network oscillates between two phases: one with a teacher signal and one without. The weights are updated using a learning rule that corresponds to gradient descent on a contrast function measuring the discrepancy between the free network and the network with a teacher signal. The CL approach provides a general unified framework for developing new learning algorithms, and it shows that many different types of clamping and teacher signals are possible. Several examples are given, and an analysis of the landscape of the contrast function is proposed with some relevant predictions for the CL curves. An approach that may be suitable for collective analog implementations is described. Simulation results and possible extensions are briefly discussed, together with a new conjecture regarding the function of certain oscillations in the brain. In the appendix, we also examine two extensions of contrastive learning to time-dependent trajectories.
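A minimal sketch of the two-phase contrastive weight update described above is given below, assuming unit states collected after the network settles in each phase; the exact clamping scheme and contrast function vary across the algorithms in the CL family, so this is an illustration rather than the paper's definition.

```python
import numpy as np

def contrastive_update(W, s_clamped, s_free, lr=0.01):
    """Two-phase contrastive weight update (illustrative sketch).

    W:         (N, N) symmetric weight matrix
    s_clamped: (B, N) unit states settled with the teacher signal present
    s_free:    (B, N) unit states settled without the teacher signal
    The rule descends a contrast function measuring the discrepancy between
    the teacher-driven phase and the free-running phase.
    """
    B = s_clamped.shape[0]
    corr_clamped = s_clamped.T @ s_clamped / B   # co-activations with teacher
    corr_free = s_free.T @ s_free / B            # co-activations without teacher
    return W + lr * (corr_clamped - corr_free)
```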
Deep Metric Learning with Angular Loss
Modern image search systems require semantic understanding of images, and
a key yet under-addressed problem is to learn a good metric for measuring the
similarity between images. While deep metric learning has yielded impressive
performance gains by extracting high level abstractions from image data, a
proper objective loss function becomes the central issue to boost the
performance. In this paper, we propose a novel angular loss, which takes the
angle relationship into account, for learning a better similarity metric. Whereas
previous metric learning methods focus on optimizing the similarity
(contrastive loss) or relative similarity (triplet loss) of image pairs, our
proposed method aims at constraining the angle at the negative point of triplet
triangles. Several favorable properties are observed when compared with
conventional methods. First, scale invariance is introduced, improving the
robustness of the objective against feature variance. Second, a third-order
geometric constraint is inherently imposed, capturing more local structure of
triplet triangles than the contrastive or triplet loss. Third,
better convergence has been demonstrated by experiments on three publicly
available datasets.
Comment: International Conference on Computer Vision 2017.
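As a rough sketch consistent with the abstract's description (not necessarily the exact published objective), the angular loss can be written as a hinge on the angle at the negative point of the triplet triangle; the names and the default angle below are illustrative.

```python
import torch

def angular_loss(anchor, positive, negative, alpha_deg=45.0):
    """Angular loss on a triplet (illustrative sketch).

    Penalizes triplets where the squared anchor-positive distance exceeds
    4 * tan^2(alpha) times the squared distance from the negative point to
    the anchor-positive midpoint, i.e. where the angle at the negative
    point is larger than alpha.
    """
    alpha = torch.deg2rad(torch.tensor(alpha_deg))
    center = (anchor + positive) / 2.0             # midpoint of anchor and positive
    ap = (anchor - positive).pow(2).sum(dim=-1)    # squared anchor-positive distance
    nc = (negative - center).pow(2).sum(dim=-1)    # squared negative-to-midpoint distance
    return torch.relu(ap - 4.0 * torch.tan(alpha) ** 2 * nc).mean()
```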
Unsupervised learning with contrastive latent variable models
In unsupervised learning, dimensionality reduction is an important tool for
data exploration and visualization. Because these aims are typically
open-ended, it can be useful to frame the problem as looking for patterns that
are enriched in one dataset relative to another. These pairs of datasets occur
commonly, for instance a population of interest vs. control or signal vs.
signal-free recordings. However, there are few methods that work on sets of data
as opposed to data points or sequences. Here, we present a probabilistic model
for dimensionality reduction to discover signal that is enriched in the target
dataset relative to the background dataset. The data in these sets do not need
to be paired or grouped beyond set membership. By using a probabilistic model
where some structure is shared between the two datasets and some is unique to
the target dataset, we are able to recover interesting structure in the latent
space of the target dataset. The method also has the advantages of a
probabilistic model, namely that it allows for the incorporation of prior
information, handles missing data, and can be generalized to different
distributional assumptions. We describe several possible variations of the
model and demonstrate the application of the technique to de-noising, feature
selection, and subgroup discovery settings.
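A generative sketch of the shared-plus-target-specific structure described above is shown below, under an illustrative linear-Gaussian assumption; the paper's actual model and distributional choices may differ, and all names are hypothetical.

```python
import numpy as np

def sample_contrastive_lvm(n_target, n_background, S, W, noise_std=0.1):
    """Generative sketch of a contrastive latent variable model.

    S: (D, K_shared) loadings shared by both datasets
    W: (D, K_target) loadings unique to the target dataset
    Background samples use only the shared factors; target samples add
    target-specific factors, capturing structure enriched in the target
    dataset relative to the background.
    """
    d, k_shared = S.shape
    _, k_target = W.shape
    z_bg = np.random.randn(n_background, k_shared)
    x_bg = z_bg @ S.T + noise_std * np.random.randn(n_background, d)
    z_tg = np.random.randn(n_target, k_shared)
    t_tg = np.random.randn(n_target, k_target)
    x_tg = z_tg @ S.T + t_tg @ W.T + noise_std * np.random.randn(n_target, d)
    return x_tg, x_bg
```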
