12,232 research outputs found
Deep Learning Relevance: Creating Relevant Information (as Opposed to Retrieving it)
What if Information Retrieval (IR) systems did not just retrieve relevant
information that is stored in their indices, but could also "understand" it and
synthesise it into a single document? We present a preliminary study that makes
a first step towards answering this question. Given a query, we train a
Recurrent Neural Network (RNN) on existing relevant information to that query.
We then use the RNN to "deep learn" a single, synthetic, and we assume,
relevant document for that query. We design a crowdsourcing experiment to
assess how relevant the "deep learned" document is, compared to existing
relevant documents. Users are shown a query and four wordclouds (of three
existing relevant documents and our deep learned synthetic document). The
synthetic document is ranked on average most relevant of all.Comment: Neu-IR '16 SIGIR Workshop on Neural Information Retrieval, July 21,
2016, Pisa, Ital
LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, so suggesting alterations to the query in order to guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus.Comment: Accepted to CVPR 201
Language Transfer of Audio Word2Vec: Learning Audio Segment Representations without Target Language Data
Audio Word2Vec offers vector representations of fixed dimensionality for
variable-length audio segments using Sequence-to-sequence Autoencoder (SA).
These vector representations are shown to describe the sequential phonetic
structures of the audio segments to a good degree, with real world applications
such as query-by-example Spoken Term Detection (STD). This paper examines the
capability of language transfer of Audio Word2Vec. We train SA from one
language (source language) and use it to extract the vector representation of
the audio segments of another language (target language). We found that SA can
still catch phonetic structure from the audio segments of the target language
if the source and target languages are similar. In query-by-example STD, we
obtain the vector representations from the SA learned from a large amount of
source language data, and found them surpass the representations from naive
encoder and SA directly learned from a small amount of target language data.
The result shows that it is possible to learn Audio Word2Vec model from
high-resource languages and use it on low-resource languages. This further
expands the usability of Audio Word2Vec.Comment: arXiv admin note: text overlap with arXiv:1603.0098
A Hierarchical Recurrent Encoder-Decoder For Generative Context-Aware Query Suggestion
Users may strive to formulate an adequate textual query for their information
need. Search engines assist the users by presenting query suggestions. To
preserve the original search intent, suggestions should be context-aware and
account for the previous queries issued by the user. Achieving context
awareness is challenging due to data sparsity. We present a probabilistic
suggestion model that is able to account for sequences of previous queries of
arbitrary lengths. Our novel hierarchical recurrent encoder-decoder
architecture allows the model to be sensitive to the order of queries in the
context while avoiding data sparsity. Additionally, our model can suggest for
rare, or long-tail, queries. The produced suggestions are synthetic and are
sampled one word at a time, using computationally cheap decoding techniques.
This is in contrast to current synthetic suggestion models relying upon machine
learning pipelines and hand-engineered feature sets. Results show that it
outperforms existing context-aware approaches in a next query prediction
setting. In addition to query suggestion, our model is general enough to be
used in a variety of other applications.Comment: To appear in Conference of Information Knowledge and Management
(CIKM) 201
- …