140 research outputs found

    Multi-Task Learning for Email Search Ranking with Auxiliary Query Clustering

    Full text link
    User information needs vary significantly across different tasks, and therefore their queries will also differ considerably in their expressiveness and semantics. Many studies have been proposed to model such query diversity by obtaining query types and building query-dependent ranking models. These studies typically require either a labeled query dataset or clicks from multiple users aggregated over the same document. These techniques, however, are not applicable when manual query labeling is not viable, and aggregated clicks are unavailable due to the private nature of the document collection, e.g., in email search scenarios. In this paper, we study how to obtain query type in an unsupervised fashion and how to incorporate this information into query-dependent ranking models. We first develop a hierarchical clustering algorithm based on truncated SVD and varimax rotation to obtain coarse-to-fine query types. Then, we study three query-dependent ranking models, including two neural models that leverage query type information as additional features, and one novel multi-task neural model that views query type as the label for the auxiliary query cluster prediction task. This multi-task model is trained to simultaneously rank documents and predict query types. Our experiments on tens of millions of real-world email search queries demonstrate that the proposed multi-task model can significantly outperform the baseline neural ranking models, which either do not incorporate query type information or just simply feed query type as an additional feature.Comment: CIKM 201

    Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

    Full text link
    This paper introduces Stochastic RAG--a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models that relaxes the simplifying assumptions of marginalization and document independence, made in most prior work. Stochastic RAG casts the retrieval process in RAG as a stochastic sampling without replacement process. Through this formulation, we employ straight-through Gumbel-top-k that provides a differentiable approximation for sampling without replacement and enables effective end-to-end optimization for RAG. We conduct extensive experiments on seven diverse datasets on a wide range of tasks, from open-domain question answering to fact verification to slot-filling for relation extraction and to dialogue systems. By applying this optimization method to a recent and effective RAG model, we advance state-of-the-art results on six out of seven datasets.Comment: To appear in the proceedings of SIGIR 202

    It's All Relative! -- A Synthetic Query Generation Approach for Improving Zero-Shot Relevance Prediction

    Full text link
    Recent developments in large language models (LLMs) have shown promise in their ability to generate synthetic query-document pairs by prompting with as few as 8 demonstrations. This has enabled building better IR models, especially for tasks with no training data readily available. Typically, such synthetic query generation (QGen) approaches condition on an input context (e.g. a text document) and generate a query relevant to that context, or condition the QGen model additionally on the relevance label (e.g. relevant vs irrelevant) to generate queries across relevance buckets. However, we find that such QGen approaches are sub-optimal as they require the model to reason about the desired label and the input from a handful of examples. In this work, we propose to reduce this burden of LLMs by generating queries simultaneously for different labels. We hypothesize that instead of asking the model to generate, say, an irrelevant query given an input context, asking the model to generate an irrelevant query relative to a relevant query is a much simpler task setup for the model to reason about. Extensive experimentation across seven IR datasets shows that synthetic queries generated in such a fashion translates to a better downstream performance, suggesting that the generated queries are indeed of higher quality.Comment: 18 page

    Exact Eigenstates of Tight-Binding Hamiltonians on the Penrose Tiling

    Full text link
    We investigate exact eigenstates of tight-binding models on the planar rhombic Penrose tiling. We consider a vertex model with hopping along the edges and the diagonals of the rhombi. For the wave functions, we employ an ansatz, first introduced by Sutherland, which is based on the arrow decoration that encodes the matching rules of the tiling. Exact eigenstates are constructed for particular values of the hopping parameters and the eigenenergy. By a generalized ansatz that exploits the inflation symmetry of the tiling, we show that the corresponding eigenenergies are infinitely degenerate. Generalizations and applications to other systems are outlined.Comment: 24 pages, REVTeX, 13 PostScript figures include

    Take One Step at a Time to Know Incremental Utility of Demonstration: An Analysis on Reranking for Few-Shot In-Context Learning

    Full text link
    In-Context Learning (ICL) is an emergent capability of Large Language Models (LLMs). Only a few demonstrations enable LLMs to be used as blackbox for new tasks. Previous studies have shown that using LLMs' outputs as labels is effective in training models to select demonstrations. Such a label is expected to estimate utility of a demonstration in ICL; however, it has not been well understood how different labeling strategies affect results on target tasks. This paper presents an analysis on different utility functions by focusing on LLMs' output probability given ground-truth output, and task-specific reward given LLMs' prediction. Unlike the previous work, we introduce a novel labeling method, incremental utility, which estimates how much incremental knowledge is brought into the LLMs by a demonstration. We conduct experiments with instruction-tuned LLMs on binary/multi-class classification, segmentation, and translation across Arabic, English, Finnish, Japanese, and Spanish. Our results show that (1) the probability is effective when the probability values are distributed across the whole value range (on the classification tasks), and (2) the downstream metric is more robust when nuanced reward values are provided with long outputs (on the segmentation and translation tasks). We then show that the proposed incremental utility further helps ICL by contrasting how the LLMs perform with and without the demonstrations.Comment: Accepted as a long paper at NAACL 202
    corecore