A Multiplicative Model for Learning Distributed Text-Based Attribute Representations
In this paper we propose a general framework for learning distributed
representations of attributes: characteristics of text whose representations
can be jointly learned with word embeddings. Attributes can correspond to
document indicators (to learn sentence vectors), language indicators (to learn
distributed language representations), meta-data and side information (such as
the age, gender and industry of a blogger) or representations of authors. We
describe a third-order model where word context and attribute vectors interact
multiplicatively to predict the next word in a sequence. This leads to the
notion of conditional word similarity: how meanings of words change when
conditioned on different attributes. We perform several experimental tasks
including sentiment classification, cross-lingual document classification, and
blog authorship attribution. We also qualitatively evaluate conditional word
neighbours and attribute-conditioned text generation.
Comment: 11 pages. An earlier version was accepted to the ICML-2014 Workshop
on Knowledge-Powered Deep Learning for Text Minin
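
The multiplicative interaction described in this abstract can be illustrated with a small sketch. It assumes a factored three-way form in which the context and attribute vectors are each projected onto a shared set of factors, multiplied elementwise, and mapped to vocabulary logits. The names `w_fc`, `w_fx`, `w_fv`, the dimensions, and the random weights are all hypothetical and are not taken from the paper.

```python
import math
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def next_word_probs(context, attribute, w_fc, w_fx, w_fv, bias):
    """Softmax over the vocabulary from a factored three-way interaction."""
    context_factors = matvec(w_fc, context)      # one value per factor
    attribute_factors = matvec(w_fx, attribute)  # one value per factor
    # Multiplicative gating: the attribute rescales every context factor.
    gated = [c * a for c, a in zip(context_factors, attribute_factors)]
    logits = [
        sum(w_fv[f][v] * gated[f] for f in range(len(gated))) + bias[v]
        for v in range(len(bias))
    ]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)

def random_matrix(rows, cols):
    """Hypothetical untrained weights, used only to make the sketch runnable."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

n_factors, context_dim, attribute_dim, vocab_size = 6, 4, 3, 5
w_fc = random_matrix(n_factors, context_dim)
w_fx = random_matrix(n_factors, attribute_dim)
w_fv = random_matrix(n_factors, vocab_size)
bias = [0.0] * vocab_size

context = [0.5, -1.0, 0.25, 2.0]
# Same context, two different attribute vectors.
probs_a = next_word_probs(context, [1.0, 0.0, 0.0], w_fc, w_fx, w_fv, bias)
probs_b = next_word_probs(context, [0.0, 1.0, 0.0], w_fc, w_fx, w_fv, bias)
```

Because the attribute enters multiplicatively, the same context yields a different next-word distribution under each attribute, which is the intuition behind conditional word similarity.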
Manifold Relevance Determination
In this paper we present a fully Bayesian latent variable model which
exploits conditional nonlinear (in)-dependence structures to learn an efficient
latent representation. The latent space is factorized to represent shared and
private information from multiple views of the data. In contrast to previous
approaches, we introduce a relaxation to the discrete segmentation and allow
for a "softly" shared latent space. Further, Bayesian techniques allow us to
automatically estimate the dimensionality of the latent spaces. The model is
capable of capturing structure underlying extremely high dimensional spaces.
This is illustrated by modelling unprocessed images with tens of thousands of
pixels. This also allows us to directly generate novel images from the trained
model by sampling from the discovered latent spaces. We also demonstrate the
model by prediction of human pose in an ambiguous setting. Our Bayesian
framework allows us to perform disambiguation in a principled manner by
including latent space priors which incorporate the dynamic nature of the data.
Comment: ICML201
Relating visual and semantic image descriptors
This paper addresses the automatic analysis of visual content and extraction of metadata beyond pure visual descriptors. Two approaches are described: Automatic Image Annotation (AIA) and Confidence Clustering (CC). AIA attempts to automatically classify images based on two binary classifiers and is designed for the consumer electronics domain. In contrast, the CC approach does not attempt to assign a unique label to images but rather to organise the database based on concepts
Interactions In Space For Archaeological Models
In this article we examine a variety of quantitative models for describing
archaeological networks, with particular emphasis on the maritime networks of
the Aegean Middle Bronze Age. Specifically, we discriminate between those
gravitational networks that are most likely (maximum entropy) and most
efficient (best cost/benefit outcomes).
Comment: 21 pages, 6 figures, 2 tables. Contribution to special issue of
Advances in Complex Systems from the conference 'Cultural Evolution in
Spatially Structured Populations', UCL, London, September 2010. To appear in
Advances in Complex System
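
For readers unfamiliar with gravitational network models, the sketch below shows a generic origin-constrained gravity model of the kind usually derived from maximum entropy arguments. It is not taken from the article: the exponential distance deterrence, the parameter `beta`, and the three-site example are assumptions chosen for illustration.

```python
import math

def gravity_flows(outflow, attraction, dist, beta):
    """Origin-constrained gravity model.

    The flow from site i to site j is proportional to
    attraction[j] * exp(-beta * dist[i][j]), and each row is rescaled
    so that the flows leaving site i sum to outflow[i].
    """
    n = len(outflow)
    flows = []
    for i in range(n):
        weights = [
            0.0 if i == j else attraction[j] * math.exp(-beta * dist[i][j])
            for j in range(n)
        ]
        total = sum(weights)
        flows.append([outflow[i] * w / total for w in weights])
    return flows

# Three hypothetical sites on a line, at positions 0, 1 and 5.
positions = [0.0, 1.0, 5.0]
dist = [[abs(p - q) for q in positions] for p in positions]
outflow = [10.0, 20.0, 30.0]
attraction = [1.0, 1.0, 1.0]

flows = gravity_flows(outflow, attraction, dist, beta=0.5)
```

With equal attractions, the nearer of two destinations receives the larger share of each site's outflow, and a larger `beta` penalises long links more strongly.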
Generative Temporal Models with Spatial Memory for Partially Observed Environments
In model-based reinforcement learning, generative and temporal models of
environments can be leveraged to boost agent performance, either by tuning the
agent's representations during training or via use as part of an explicit
planning mechanism. However, their application in practice has been limited to
simplistic environments, due to the difficulty of training such models in
larger, potentially partially-observed and 3D environments. In this work we
introduce a novel action-conditioned generative model of such challenging
environments. The model features a non-parametric spatial memory system in
which we store learned, disentangled representations of the environment.
Low-dimensional spatial updates are computed using a state-space model that
makes use of knowledge on the prior dynamics of the moving agent, and
high-dimensional visual observations are modelled with a Variational
Auto-Encoder. The result is a scalable architecture capable of performing
coherent predictions over hundreds of time steps across a range of partially
observed 2D and 3D environments.
Comment: ICML 201
Replica theory for learning curves for Gaussian processes on random graphs
Statistical physics approaches can be used to derive accurate predictions for
the performance of inference methods learning from potentially noisy data, as
quantified by the learning curve defined as the average error versus number of
training examples. We analyse a challenging problem in the area of
non-parametric inference where an effectively infinite number of parameters has
to be learned, specifically Gaussian process regression. When the inputs are
vertices on a random graph and the outputs noisy function values, we show that
replica techniques can be used to obtain exact performance predictions in the
limit of large graphs. The covariance of the Gaussian process prior is defined
by a random walk kernel, the discrete analogue of squared exponential kernels
on continuous spaces. Conventionally this kernel is normalised only globally,
so that the prior variance can differ between vertices; as a more principled
alternative we consider local normalisation, where the prior variance is
uniform
- …
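
The difference between global and local normalisation of a random walk kernel can be made concrete with a small sketch. It assumes the commonly used form ((1 - 1/a) I + (1/a) D^{-1/2} A D^{-1/2})^p, where A is the adjacency matrix and D the diagonal degree matrix. The parameter names `a` and `p`, the star-graph example, and the requirement that no vertex is isolated are assumptions for illustration, not details from the abstract.

```python
import math

def random_walk_kernel(adj, a=2.0, p=2):
    """Unnormalised kernel ((1 - 1/a) I + (1/a) D^-1/2 A D^-1/2)^p.

    Assumes every vertex has at least one neighbour.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    step = [
        [
            (1.0 - 1.0 / a) * (i == j)
            + adj[i][j] / (a * math.sqrt(deg[i] * deg[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]
    kernel = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):  # repeated multiplication gives the p-th matrix power
        kernel = [
            [sum(kernel[i][k] * step[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return kernel

def normalise_globally(kernel):
    """Rescale so the prior variance averaged over vertices equals 1."""
    n = len(kernel)
    mean_var = sum(kernel[i][i] for i in range(n)) / n
    return [[value / mean_var for value in row] for row in kernel]

def normalise_locally(kernel):
    """Rescale so every vertex has prior variance exactly 1."""
    n = len(kernel)
    sd = [math.sqrt(kernel[i][i]) for i in range(n)]
    return [[kernel[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Star graph: vertex 0 is the centre, vertices 1 to 3 are leaves.
adj = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
kernel = random_walk_kernel(adj, a=2.0, p=2)
global_var = [row[i] for i, row in enumerate(normalise_globally(kernel))]
local_var = [row[i] for i, row in enumerate(normalise_locally(kernel))]
```

On this graph the globally normalised prior gives the centre a larger variance than the leaves, while the locally normalised prior assigns the same variance to every vertex.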