Reinforcement Learning for Generative AI: A Survey
Deep generative AI has long been an essential topic in the machine learning
community, with impact on application areas such as text generation and
computer vision. The dominant paradigm for training a generative model is
maximum likelihood estimation, which pushes the learner to capture and
approximate the target data distribution by decreasing the divergence between
the model distribution and the target distribution. This formulation
successfully establishes the objective of generative tasks, but it cannot
satisfy all the requirements a user might expect from a generative model.
Reinforcement learning, a competitive option for injecting new training
signals through additional objectives, has demonstrated its power and
flexibility in incorporating human inductive biases from multiple angles, such
as adversarial learning, hand-designed rules, and learned reward models, to
build performant models.
Consequently, reinforcement learning has become a trending research field and
has stretched the limits of generative AI in both model design and application.
It is therefore timely to summarize recent advances in a comprehensive review.
Although surveys of individual application areas have appeared recently, this
survey aims to provide a high-level review that spans a range of application
areas. We present a rigorous taxonomy of the field and cover a wide variety of
models and applications. Notably, we also survey the fast-developing area of
large language models. We conclude by discussing potential directions that
might address the limitations of current models and expand the frontiers of
generative AI.
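The reward-model signal mentioned in the abstract can be illustrated with a toy sketch; this is not code from the survey, and the vocabulary, reward rule, and policy shape are all illustrative assumptions. It shows REINFORCE updating a softmax policy against a hand-designed reward, one of the signal sources the survey discusses:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = ["a", "b", "c"]
logits = np.zeros(len(VOCAB))  # parameters of a unigram "generator" policy

def sample_token(logits, rng):
    # softmax over logits, then sample one token
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return rng.choice(len(VOCAB), p=p), p

def reward(token_id):
    # hand-designed rule standing in for a learned reward model:
    # reward the generator for emitting "b"
    return 1.0 if VOCAB[token_id] == "b" else 0.0

lr = 0.5
for _ in range(200):
    t, p = sample_token(logits, rng)
    # REINFORCE: grad of log pi(t) w.r.t. logits is one_hot(t) - p
    grad = -p
    grad[t] += 1.0
    logits += lr * reward(t) * grad

best = VOCAB[int(np.argmax(logits))]
print(best)  # the policy concentrates on the rewarded token
```

The same gradient estimator scales from this unigram toy to sequence models, where the reward scores a whole generated sample rather than a single token.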
Query Understanding in the Age of Large Language Models
Querying, conversing, and controlling search and information-seeking
interfaces using natural language are fast becoming ubiquitous with the rise
and adoption of large language models (LLMs). In this position paper, we
describe a generic framework for interactive query-rewriting using LLMs. Our
proposal aims to unfold new opportunities for improved and transparent intent
understanding while building high-performance retrieval systems with LLMs. A
key aspect of our framework is the rewriter's ability to fully specify the
machine intent, that is, the search engine's interpretation of the query, in
natural language that can be further refined, controlled, and edited before the
final retrieval phase. The ability to present, interact with, and reason over
the underlying machine intent in natural language has profound implications for
transparency and ranking performance, and marks a departure from the
traditional way in which supervised signals were collected for intent
understanding. We detail the concept, backed by initial
experiments, along with open questions for this interactive query understanding
framework.
Comment: Accepted to GENIR (SIGIR '23)
Large Language Model based Long-tail Query Rewriting in Taobao Search
In the realm of e-commerce search, the significance of semantic matching
cannot be overstated, as it directly impacts both user experience and company
revenue. Along this line, query rewriting, an important technique for bridging
the semantic gaps inherent in the semantic matching process, has attracted wide
attention from industry and academia. However, existing query rewriting
methods often struggle to optimize long-tail queries effectively and to
alleviate the "few-recall" phenomenon caused by the semantic gap. In this paper,
we present BEQUE, a comprehensive framework that Bridges the sEmantic gap for
long-tail QUEries. In detail, BEQUE comprises three stages: multi-instruction
supervised fine-tuning (SFT), offline feedback, and objective alignment. We
first construct a rewriting dataset based on rejection sampling and
auxiliary-task mixing to fine-tune our large language model (LLM) in a supervised
fashion. Subsequently, with the well-trained LLM, we employ beam search to
generate multiple candidate rewrites and feed them into Taobao's offline system
to obtain a partial order over them. Leveraging this partial order, we
introduce a contrastive learning method to highlight the distinctions between
rewrites, and align the model with the Taobao online objectives. Offline
experiments demonstrate the effectiveness of our method in bridging the
semantic gap. Online A/B tests reveal that our method significantly boosts
gross merchandise volume (GMV), the number of transactions (#Trans), and unique
visitors (UV) for long-tail queries. BEQUE has been deployed on Taobao, one of
the most popular online shopping platforms in China, since October 2023.
Comment: WWW Industry track, under review
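The objective-alignment stage can be sketched with a pairwise margin loss over the offline partial order; this is a hedged illustration, not BEQUE's actual training code, and the scores and preference pairs below are illustrative stand-ins:

```python
def pairwise_margin_loss(scores, preferred_pairs, margin=1.0):
    """Contrastive-style alignment sketch.

    scores: dict mapping each candidate rewrite to the model's score.
    preferred_pairs: list of (better, worse) rewrite pairs taken from
    the offline system's partial order.
    """
    loss = 0.0
    for better, worse in preferred_pairs:
        # penalize the model when the preferred rewrite does not beat
        # the dispreferred one by at least `margin`
        loss += max(0.0, margin - (scores[better] - scores[worse]))
    return loss / max(len(preferred_pairs), 1)

# illustrative candidate rewrites scored by a hypothetical model
scores = {"r1": 2.0, "r2": 1.5, "r3": -1.0}
pairs = [("r1", "r2"), ("r2", "r3")]  # offline partial order: r1 > r2 > r3
print(pairwise_margin_loss(scores, pairs))  # 0.25
```

Minimizing such a loss pushes the model to score preferred rewrites higher, aligning generation with the downstream objective rather than with likelihood alone.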
Context Aware Query Rewriting for Text Rankers using LLM
Query rewriting refers to an established family of approaches that are
applied to underspecified and ambiguous queries to overcome the vocabulary
mismatch problem in document ranking. Queries are typically rewritten during
query processing time for better query modelling for the downstream ranker.
With the advent of large-language models (LLMs), there have been initial
investigations into using generative approaches to generate pseudo documents to
tackle this inherent vocabulary gap. In this work, we analyze the utility of
LLMs for improved query rewriting for text ranking tasks. We find two inherent
limitations of using LLMs as query rewriters: concept drift
when using only queries as prompts and large inference costs during query
processing. We adopt a simple yet surprisingly effective approach called
context-aware query rewriting (CAR) to leverage the benefits of LLMs for query
understanding. First, we rewrite ambiguous training queries by context-aware
prompting of LLMs, using only relevant documents as context. Unlike existing
approaches, we apply LLM-based query rewriting only during the training phase:
a ranker is fine-tuned on the rewritten queries instead of the original
queries. In our extensive experiments, we find
that fine-tuning a ranker on rewritten queries yields significant improvements
of up to 33% on the passage ranking task and up to 28% on the document ranking
task compared to the baseline of using the original queries.
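The context-aware prompting step can be sketched as follows; the prompt wording and the `llm` callable are assumptions for illustration, not the paper's exact setup. Relevant documents are supplied alongside the query so the rewriter stays anchored to the query's intent instead of drifting:

```python
def rewrite_query(query, relevant_docs, llm):
    """Context-aware rewriting sketch in the spirit of CAR.

    llm: any callable str -> str (e.g. a thin wrapper around a hosted
    model). As in CAR, this is invoked only at training time.
    """
    # relevant documents become the grounding context for the rewrite
    context = "\n".join(f"- {d}" for d in relevant_docs)
    prompt = (
        "Rewrite the search query so it is unambiguous, using only "
        "the context below.\n"
        f"Context:\n{context}\n"
        f"Query: {query}\n"
        "Rewritten query:"
    )
    return llm(prompt)

# usage with a trivial stand-in "LLM" returning a fixed rewrite
fake_llm = lambda prompt: "jaguar car top speed"
print(rewrite_query("jaguar speed",
                    ["The Jaguar XJ220 reached 217 mph."],
                    fake_llm))
```

Because the LLM is consulted only while building training data, the fine-tuned ranker serves original queries at inference time with no added LLM cost.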
- …