Dynamical properties of a randomly diluted neural network with variable activity
The subject of study is a neural network with binary neurons, randomly
diluted synapses and variable pattern activity. We look at the system with
parallel updating, using a probabilistic approach to solve the one-step dynamics
with one condensed pattern. We derive restrictions on the storage capacity and
the mutual information content occurring during the retrieval process. Special
focus is on the constraints on the threshold for optimal performance. We also
look at the effect of noisy updating, giving a dynamical version of the
critical temperature, the corresponding threshold and an approximation for the
time evolution for small temperatures. The description is applicable to the
whole retrieval process in the limit of strong dilution. The analysis is
carried out as exactly as possible and over the full parameter ranges,
generalizing some former results.
Comment: 15 pages, 5 figures, to be published in Journal of Physics
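The one-step parallel dynamics described above can be sketched in a small simulation. Everything here (network size, dilution level, threshold, Hebbian couplings) is a hypothetical toy setup, not the paper's actual model, which treats variable pattern activity probabilistically:

```python
import numpy as np

rng = np.random.default_rng(0)

N, P, c = 200, 5, 0.1          # neurons, stored patterns, dilution level
patterns = rng.choice([-1, 1], size=(P, N))

# Hebbian couplings, then random dilution: keep each synapse with prob c.
J = patterns.T @ patterns / N
J *= rng.random((N, N)) < c
np.fill_diagonal(J, 0.0)

def parallel_step(s, theta=0.0):
    """Synchronous update: every neuron fires on the same local field."""
    h = J @ s - theta
    return np.where(h >= 0, 1, -1)

# Start near pattern 0 (the condensed pattern) with 10% of bits flipped.
s = patterns[0].copy()
flip = rng.choice(N, size=N // 10, replace=False)
s[flip] *= -1
s = parallel_step(s)
overlap = float(s @ patterns[0]) / N   # retrieval quality in [-1, 1]
```

Varying `theta` in `parallel_step` is the knob the abstract's threshold analysis is about: the overlap after one step depends on where the threshold sits relative to the local-field distribution.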
Parameterized Neural Network Language Models for Information Retrieval
Information Retrieval (IR) models need to deal with two difficult issues,
vocabulary mismatch and term dependencies. Vocabulary mismatch corresponds to
the difficulty of retrieving relevant documents that do not contain exact query
terms but semantically related terms. Term dependencies refer to the need to
consider the relationships between the words of the query when estimating the
relevance of a document. A multitude of solutions has been proposed to solve
each of these two problems, but no principled model solves both. In parallel, in
the last few years, language models based on neural networks have been used to
cope with complex natural language processing tasks like emotion and paraphrase
detection. Although the distributed word representations they are built on
make them well suited to handle both term dependencies and vocabulary
mismatch, such models cannot be used readily in IR, where one language model
must be estimated per document (or query). This is both computationally
infeasible and prone to
over-fitting. Based on a recent work that proposed to learn a generic language
model that can be modified through a set of document-specific parameters, we
explore the use of new neural network models that are adapted to ad-hoc IR tasks.
Within the language model IR framework, we propose and study the use of a
generic language model as well as a document-specific language model. Both can
be used as a smoothing component, but the latter is more adapted to the
document at hand and has the potential of being used as a full document
language model. We experiment with such models and analyze their results on
TREC-1 to TREC-8 datasets.
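For illustration, the "smoothing component" role a language model plays in this framework can be sketched with classical query-likelihood scoring under Jelinek-Mercer smoothing. The documents, query, and interpolation weight below are toy assumptions; in the abstract's setting, a learned (generic or document-specific) neural language model would stand in for these count-based estimates:

```python
from collections import Counter
import math

# Toy corpus: two tiny "documents" as token lists (hypothetical).
docs = {
    "d1": "neural networks for retrieval".split(),
    "d2": "classical probabilistic retrieval models".split(),
}
collection = Counter(w for d in docs.values() for w in d)
coll_len = sum(collection.values())

def score(query, doc, lam=0.5):
    """log p(q|d): document model linearly interpolated with the collection model."""
    tf = Counter(doc)
    s = 0.0
    for w in query:
        p_doc = tf[w] / len(doc)              # maximum-likelihood document model
        p_coll = collection[w] / coll_len     # background (smoothing) model
        s += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return s

query = "neural retrieval".split()
ranked = sorted(docs, key=lambda d: score(query, docs[d]), reverse=True)
```

The background term keeps unseen query words from zeroing out a document's score, which is exactly the gap a richer document-specific language model could fill.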
Variational Deep Semantic Hashing for Text Documents
As the amount of textual data has been rapidly increasing over the past
decade, efficient similarity search methods have become a crucial component of
large-scale information retrieval systems. A popular strategy is to represent
original data samples by compact binary codes through hashing. A spectrum of
machine learning methods have been utilized, but they often lack the
expressiveness and flexibility needed to learn effective representations. The
recent advances of deep learning in a wide range of applications have
demonstrated its capability to learn robust and powerful feature
representations for complex data. In particular, deep generative models
naturally combine the expressiveness
of probabilistic generative models with the high capacity of deep neural
networks, which is very suitable for text modeling. However, little work has
leveraged the recent progress in deep learning for text hashing.
In this paper, we propose a series of novel deep document generative models
for text hashing. The first proposed model is unsupervised while the second one
is supervised by utilizing document labels/tags for hashing. The third model
further considers document-specific factors that affect the generation of
words. The probabilistic generative formulation of the proposed models provides
a principled framework for model extension, uncertainty estimation, simulation,
and interpretability. Based on variational inference and reparameterization,
the proposed models can be interpreted as encoder-decoder deep neural networks
and thus they are capable of learning complex nonlinear distributed
representations of the original documents. We conduct a comprehensive set of
experiments on four public testbeds. The experimental results have demonstrated
the effectiveness of the proposed supervised learning models for text hashing.
Comment: 11 pages, 4 figures
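The encoder-decoder interpretation via the reparameterization trick can be sketched as follows. This is a minimal, untrained NumPy illustration of the encoder half only: the weights are random placeholders, and the actual models are trained jointly with a decoder under a variational objective before codes are extracted:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, latent = 1000, 32   # toy vocabulary size and hash-code length

# Random placeholder weights; a trained model would learn these.
W_mu = rng.normal(scale=0.01, size=(vocab, latent))
W_logvar = rng.normal(scale=0.01, size=(vocab, latent))

def encode(bow):
    """Map a bag-of-words vector to a latent sample via reparameterization."""
    mu = bow @ W_mu
    logvar = bow @ W_logvar
    eps = rng.standard_normal(latent)
    return mu + np.exp(0.5 * logvar) * eps   # z = mu + sigma * eps

def hash_code(z, threshold=0.0):
    """Binarize the latent vector into a compact binary hash code."""
    return (z > threshold).astype(np.uint8)

doc = rng.poisson(0.05, size=vocab).astype(float)  # toy term counts
code = hash_code(encode(doc))
```

Because sampling is rewritten as a deterministic function of `(mu, logvar)` plus independent noise `eps`, gradients can flow through the encoder during training, which is what makes the variational formulation trainable as an ordinary deep network.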
Neural Networks for Information Retrieval
Machine learning plays a role in many aspects of modern IR systems, and deep
learning is applied in all of them. The fast pace of modern-day research has
given rise to many different approaches for many different IR problems. The
amount of information available can be overwhelming both for junior students
and for experienced researchers looking for new research topics and directions.
Additionally, it is interesting to see what key insights into IR problems the
new technologies are able to give us. The aim of this full-day tutorial is to
give a clear overview of current tried-and-trusted neural methods in IR and how
they benefit IR research. It covers key architectures, as well as the most
promising future directions.
Comment: Overview of full-day tutorial at SIGIR 201
Ice water path retrievals from Meteosat-9 using quantile regression neural networks
The relationship between geostationary radiances and ice water path (IWP) is
complex, and traditional retrieval approaches are not optimal. This work
applies machine learning to improve the IWP retrieval from Meteosat-9
observations, with a focus on low latitudes, training the models against
retrievals based on CloudSat. Advantages of machine learning include avoiding
explicit physical assumptions on the data, an efficient use of information
from all channels, and easily leveraging spatial information. Thermal
infrared (IR) retrievals are used as input to achieve a performance
independent of the solar angle. They are compared with retrievals including
solar reflectances as well as a subset of IR channels for compatibility with
historical sensors. The retrievals are accomplished with quantile regression
neural networks. This network type provides case-specific uncertainty
estimates, compatible with non-Gaussian errors, and is flexible enough to be
applied to different network architectures. Spatial information is
incorporated into the network through a convolutional neural network (CNN)
architecture. This choice outperforms architectures that only work pixelwise.
In fact, the CNN shows a good retrieval performance by using only IR
channels. This makes it possible to compute diurnal cycles, a problem that
CloudSat cannot resolve due to its limited temporal and spatial sampling.
These retrievals compare favourably with IWP retrievals in CLAAS, a dataset
based on a traditional approach. These results highlight the possibility of
overcoming limitations of physics-based approaches using machine learning
while providing efficient, probabilistic IWP retrieval methods. Moreover,
they suggest that this first work can be extended to higher latitudes and
that geostationary data can serve as a complement to the upcoming Ice Cloud
Imager mission, for example, to bridge the gap in temporal sampling with
respect to space-based radars.
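The quantile regression idea rests on the pinball (quantile) loss: a network trained to minimize it for quantile level tau outputs the tau-th conditional quantile rather than a single point estimate, which is what yields case-specific, non-Gaussian uncertainty. A minimal sketch on toy data (not actual IWP values):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Asymmetric loss whose minimizer is the tau-th conditional quantile."""
    err = y_true - y_pred
    return np.mean(np.maximum(tau * err, (tau - 1) * err))

# Minimizing over a constant prediction recovers the empirical quantile.
y = np.array([1.0, 2.0, 3.0, 4.0, 100.0])   # skewed toy sample
grid = np.linspace(0, 100, 2001)
best = grid[np.argmin([pinball_loss(y, g, 0.5) for g in grid])]
```

With tau = 0.5 the minimizer is the median, which is robust to the outlier; training one network head per tau (e.g. 0.05, 0.5, 0.95) gives a per-pixel predictive interval instead of a single retrieval.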