Multi-Task Learning for Email Search Ranking with Auxiliary Query Clustering
User information needs vary significantly across different tasks, and
therefore their queries will also differ considerably in their expressiveness
and semantics. Many approaches have been proposed to model such query diversity by
obtaining query types and building query-dependent ranking models. These
studies typically require either a labeled query dataset or clicks from
multiple users aggregated over the same document. These techniques, however,
are not applicable when manual query labeling is not viable, and aggregated
clicks are unavailable due to the private nature of the document collection,
e.g., in email search scenarios. In this paper, we study how to obtain query
types in an unsupervised fashion and how to incorporate this information into
query-dependent ranking models. We first develop a hierarchical clustering
algorithm based on truncated SVD and varimax rotation to obtain coarse-to-fine
query types. Then, we study three query-dependent ranking models, including two
neural models that leverage query type information as additional features, and
one novel multi-task neural model that views query type as the label for the
auxiliary query cluster prediction task. This multi-task model is trained to
simultaneously rank documents and predict query types. Our experiments on tens
of millions of real-world email search queries demonstrate that the proposed
multi-task model can significantly outperform the baseline neural ranking
models, which either do not incorporate query type information or simply
feed query type as an additional feature.
Comment: CIKM 201
Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization
This paper introduces Stochastic RAG--a novel approach for end-to-end
optimization of retrieval-augmented generation (RAG) models that relaxes the
simplifying assumptions of marginalization and document independence, made in
most prior work. Stochastic RAG casts the retrieval process in RAG as a
stochastic process of sampling without replacement. Through this formulation, we
employ straight-through Gumbel-top-k, which provides a differentiable
approximation for sampling without replacement and enables effective end-to-end
optimization for RAG. We conduct extensive experiments on seven diverse
datasets on a wide range of tasks, from open-domain question answering to fact
verification to slot-filling for relation extraction and to dialogue systems.
By applying this optimization method to a recent and effective RAG model, we
advance state-of-the-art results on six out of seven datasets.
Comment: To appear in the proceedings of SIGIR 202
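
The sampling step named in the abstract above can be illustrated with the standard Gumbel-top-k trick: perturb each retrieval score with independent Gumbel(0, 1) noise and keep the k largest, which draws k distinct items without replacement. The sketch below is a minimal standard-library illustration, not the authors' implementation. The function name `gumbel_top_k` and the toy scores are assumptions, and the straight-through gradient estimator is omitted because it needs an autodiff framework.

```python
import math
import random


def gumbel_top_k(scores, k, rng=None):
    """Sample k distinct indices by adding Gumbel(0, 1) noise to each score.

    Hypothetical helper for illustration only; `scores` stands in for
    retrieval scores (logits) over candidate documents.
    """
    rng = rng or random.Random()
    perturbed = []
    for index, score in enumerate(scores):
        # Clamp the uniform draw away from 0 and 1 so both logs are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((score + gumbel_noise, index))
    # The k largest perturbed scores form a sample without replacement.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


if __name__ == "__main__":
    toy_scores = [2.0, 0.5, -1.0, 1.5]  # assumed example values
    print(gumbel_top_k(toy_scores, k=2, rng=random.Random(0)))
```

Higher-scoring documents are selected more often, but every document keeps a nonzero chance of appearing, which is what makes the retrieval step stochastic rather than a fixed top-k cut.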
It's All Relative! -- A Synthetic Query Generation Approach for Improving Zero-Shot Relevance Prediction
Recent developments in large language models (LLMs) have shown promise in
their ability to generate synthetic query-document pairs by prompting with as
few as 8 demonstrations. This has enabled building better IR models, especially
for tasks with no training data readily available. Typically, such synthetic
query generation (QGen) approaches condition on an input context (e.g. a text
document) and generate a query relevant to that context, or condition the QGen
model additionally on the relevance label (e.g. relevant vs irrelevant) to
generate queries across relevance buckets. However, we find that such QGen
approaches are sub-optimal as they require the model to reason about the
desired label and the input from a handful of examples. In this work, we
propose to reduce this burden of LLMs by generating queries simultaneously for
different labels. We hypothesize that instead of asking the model to generate,
say, an irrelevant query given an input context, asking the model to generate
an irrelevant query relative to a relevant query is a much simpler task setup
for the model to reason about. Extensive experimentation across seven IR
datasets shows that synthetic queries generated in such a fashion translate to
better downstream performance, suggesting that the generated queries are
indeed of higher quality.
Comment: 18 page
Exact Eigenstates of Tight-Binding Hamiltonians on the Penrose Tiling
We investigate exact eigenstates of tight-binding models on the planar
rhombic Penrose tiling. We consider a vertex model with hopping along the edges
and the diagonals of the rhombi. For the wave functions, we employ an ansatz,
first introduced by Sutherland, which is based on the arrow decoration that
encodes the matching rules of the tiling. Exact eigenstates are constructed for
particular values of the hopping parameters and the eigenenergy. By a
generalized ansatz that exploits the inflation symmetry of the tiling, we show
that the corresponding eigenenergies are infinitely degenerate. Generalizations
and applications to other systems are outlined.
Comment: 24 pages, REVTeX, 13 PostScript figures include
Take One Step at a Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context Learning
In-Context Learning (ICL) is an emergent capability of Large Language Models
(LLMs). Only a few demonstrations enable LLMs to be used as black boxes for new
tasks. Previous studies have shown that using LLMs' outputs as labels is
effective in training models to select demonstrations. Such a label is expected
to estimate the utility of a demonstration in ICL; however, it has not been well
understood how different labeling strategies affect results on target tasks.
This paper presents an analysis of different utility functions by focusing on
LLMs' output probability given ground-truth output, and task-specific reward
given LLMs' prediction. Unlike previous work, we introduce a novel labeling
method, incremental utility, which estimates how much incremental knowledge is
brought into the LLMs by a demonstration. We conduct experiments with
instruction-tuned LLMs on binary/multi-class classification, segmentation, and
translation across Arabic, English, Finnish, Japanese, and Spanish. Our results
show that (1) the probability is effective when the probability values are
distributed across the whole value range (on the classification tasks), and (2)
the downstream metric is more robust when nuanced reward values are provided
with long outputs (on the segmentation and translation tasks). We then show
that the proposed incremental utility further helps ICL by contrasting how the
LLMs perform with and without the demonstrations.
Comment: Accepted as a long paper at NAACL 202
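
One plausible reading of the incremental utility described in the abstract above is the difference between a model's score with a demonstration in context and its score without one. The sketch below assumes that reading and is not the paper's exact formulation. `score_fn`, `toy_score`, and `rank_demonstrations` are hypothetical names, and `toy_score` is a stand-in for an LLM's output probability or task-specific reward.

```python
def incremental_utility(score_fn, demo, query, target):
    """Score gained by adding one demonstration to an empty context.

    `score_fn(demos, query, target)` is an assumed interface returning a
    number such as the probability of `target` or a task reward.
    """
    with_demo = score_fn([demo], query, target)
    without_demo = score_fn([], query, target)
    return with_demo - without_demo


def rank_demonstrations(score_fn, demos, query, target):
    """Order candidate demonstrations by decreasing incremental utility."""
    return sorted(
        demos,
        key=lambda demo: incremental_utility(score_fn, demo, query, target),
        reverse=True,
    )


def toy_score(demos, query, target):
    """Hypothetical scorer: 0.25 baseline, plus 0.5 if any demo label matches."""
    bonus = 0.5 if any(label == target for _, label in demos) else 0.0
    return 0.25 + bonus


if __name__ == "__main__":
    candidates = [("dull plot", "negative"), ("good movie", "positive")]
    print(rank_demonstrations(toy_score, candidates, "great film", "positive"))
```

Subtracting the no-demonstration score separates what the demonstration contributes from what the model already knows, so a demonstration that merely accompanies an easy query receives a utility near zero.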
