RRescue: Ranking LLM Responses to Enhance Reasoning Over Context
Effectively using a given context is paramount for large language models. A
context window can include task specifications, retrieved documents, previous
conversations, and even model self-reflections, functioning similarly to
episodic memory. While efforts are being made to expand the context window,
studies indicate that LLMs do not use their context optimally for response
generation. In this paper, we present a novel approach to optimize LLMs using
ranking metrics, which teaches LLMs to rank a collection of
contextually-grounded candidate responses. Rather than a traditional full
ordering, we advocate for a partial ordering. This is because achieving
consensus on the perfect order for system responses can be challenging. Our
partial ordering is more robust, less sensitive to noise, and can be acquired
through human labelers, heuristic functions, or model distillation. We test our
system's improved contextual understanding using the latest benchmarks,
including a new multi-document question answering dataset. We conduct ablation
studies to understand crucial factors, such as how to gather candidate
responses, determine their most suitable order, and balance supervised
fine-tuning with ranking metrics. Our approach, named RRescue, suggests a
promising avenue for enhancing LLMs' contextual understanding via response
ranking.
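The idea of training on a partial rather than a full ordering can be illustrated with a pairwise loss that only penalizes the response pairs for which a preference is actually known. The abstract does not state the loss RRescue uses, so the hinge formulation, the function name `partial_order_hinge_loss`, and the `margin` parameter below are all assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple


def partial_order_hinge_loss(
    scores: Dict[str, float],
    preferred_pairs: List[Tuple[str, str]],
    margin: float = 1.0,
) -> float:
    """Average hinge loss over the known (better, worse) pairs only.

    Pairs that are absent from `preferred_pairs` are treated as
    incomparable and contribute nothing, which is what makes the
    ordering partial. This is a generic pairwise loss, not necessarily
    the one used in the paper.
    """
    if not preferred_pairs:
        return 0.0
    total = 0.0
    for better, worse in preferred_pairs:
        # Penalize the pair unless `better` outscores `worse` by `margin`.
        total += max(0.0, margin - (scores[better] - scores[worse]))
    return total / len(preferred_pairs)


# Hypothetical model scores for three candidate responses.
scores = {"a": 2.0, "b": 0.5, "c": 0.4}

# Only "a > b" and "a > c" are known; "b" vs "c" is left unordered.
satisfied_loss = partial_order_hinge_loss(scores, [("a", "b"), ("a", "c")])

# A violated preference ("b" should beat "a") is penalized.
violated_loss = partial_order_hinge_loss(scores, [("b", "a")])
```

Because the pair (b, c) never appears, noisy or contested judgments about near-equal responses cannot affect the loss, which matches the robustness argument made in the abstract.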
Regression Compatible Listwise Objectives for Calibrated Ranking
As Learning-to-Rank (LTR) approaches primarily seek to improve ranking
quality, their output scores are not scale-calibrated by design -- for example,
adding a constant to the score of each item on the list will not affect the
list ordering. This fundamentally limits LTR usage in score-sensitive
applications. Though a simple multi-objective approach that combines a
regression and a ranking objective can effectively learn scale-calibrated
scores, we argue that the two objectives can be inherently conflicting, which
makes the trade-off far from ideal for both of them. In this paper, we propose
a novel regression compatible ranking (RCR) approach to achieve a better
trade-off. The advantage of the proposed approach is that the regression and
ranking components are well aligned, which brings new opportunities for
harmonious regression and ranking. Theoretically, we show that the two
components share the same minimizer at global minima while the regression
component ensures scale calibration. Empirically, we show that the proposed
approach performs well on both regression and ranking metrics on several public
LTR datasets, and significantly improves the Pareto frontiers in the context of
multi-objective optimization. Furthermore, we evaluated the proposed approach
on YouTube Search and found that it not only improved the ranking quality of
the production pCTR model, but also brought gains to the click prediction
accuracy.
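The calibration problem described in this abstract can be checked numerically: a listwise softmax cross-entropy loss is unchanged when a constant is added to every score, while a pointwise sigmoid cross-entropy loss is not. The sketch below uses these two standard losses as stand-ins; the abstract does not specify which objectives RCR combines, so the function names and the toy scores are assumptions.

```python
import math
from typing import List


def softmax_cross_entropy(scores: List[float], labels: List[float]) -> float:
    """Listwise loss: cross-entropy between labels and softmax(scores)."""
    m = max(scores)
    # Log-sum-exp with the maximum subtracted for numerical stability.
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return -sum(y * (s - log_z) for s, y in zip(scores, labels))


def sigmoid_cross_entropy(scores: List[float], labels: List[float]) -> float:
    """Pointwise loss: sum of per-item logistic losses on raw scores."""
    return sum(
        max(s, 0.0) - s * y + math.log1p(math.exp(-abs(s)))
        for s, y in zip(scores, labels)
    )


def ranking(scores: List[float]) -> List[int]:
    """Item indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


# Toy list: one clicked item (label 1) and two unclicked items.
scores = [2.0, 0.5, -1.0]
labels = [1.0, 0.0, 0.0]
shifted = [s + 3.0 for s in scores]  # add the same constant to every score

listwise_gap = abs(
    softmax_cross_entropy(scores, labels) - softmax_cross_entropy(shifted, labels)
)
pointwise_gap = abs(
    sigmoid_cross_entropy(scores, labels) - sigmoid_cross_entropy(shifted, labels)
)
```

Here `listwise_gap` is zero up to floating-point error and the ranking is identical, whereas `pointwise_gap` is large: the ranking loss cannot pin down the score scale on its own, which is why a regression component is needed for calibrated outputs.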
Entity Linking for Queries by Searching Wikipedia Sentences
We present a simple yet effective approach for linking entities in queries.
The key idea is to search Wikipedia articles for sentences similar to a query
and directly use the human-annotated entities in the similar sentences as
candidate entities for the query. Then, we employ a rich set of features, such
as link-probability, context-matching, word embeddings, and relatedness among
candidate entities as well as their related entities, to rank the candidates
under a regression based framework. The advantages of our approach lie in two
aspects, which contribute to the ranking process and final linking result.
First, it can greatly reduce the number of candidate entities by filtering out
irrelevant entities with the words in the query. Second, we can obtain the
query sensitive prior probability in addition to the static link-probability
derived from all Wikipedia articles. We conduct experiments on two benchmark
datasets on entity linking for queries, namely the ERD14 dataset and the GERDAQ
dataset. Experimental results show that our method outperforms state-of-the-art
systems and yields 75.0% in F1 on the ERD14 dataset and 56.9% on the GERDAQ
dataset.
Unsupervised Context-Sensitive Spelling Correction of English and Dutch Clinical Free-Text with Word and Character N-Gram Embeddings
We present an unsupervised context-sensitive spelling correction method for
clinical free-text that uses word and character n-gram embeddings. Our method
generates misspelling replacement candidates and ranks them according to their
semantic fit, by calculating a weighted cosine similarity between the
vectorized representation of a candidate and the misspelling context. To tune
the parameters of this model, we generate self-induced spelling error corpora.
We perform our experiments for two languages. For English, we greatly
outperform off-the-shelf spelling correction tools on a manually annotated
MIMIC-III test set, and counter the frequency bias of a noisy channel model,
showing that neural embeddings can be successfully exploited to improve upon
the state-of-the-art. For Dutch, we also outperform an off-the-shelf spelling
correction tool on manually annotated clinical records from the Antwerp
University Hospital, but can offer no empirical evidence that our method
counters the frequency bias of a noisy channel model in this case as well.
However, both our context-sensitive model and our implementation of the noisy
channel model obtain high scores on the test set, establishing a
state-of-the-art for Dutch clinical spelling correction with the noisy channel
model.
Comment: Appears in volume 7 of the CLIN Journal,
http://www.clinjournal.org/biblio/volum
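The candidate-ranking step in this abstract (scoring each replacement candidate by a weighted cosine similarity to the misspelling's context) can be sketched as follows. The abstract does not give the weighting scheme or the embedding model, so the reciprocal-distance weights, the two-dimensional toy vectors, and the names `context_vector` and `rank_candidates` are assumptions. A plain lookup table also stands in for the word and character n-gram embeddings, which in the paper would be able to supply vectors for unseen words.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def context_vector(
    context: List[Tuple[str, int]], embeddings: Dict[str, Vector]
) -> List[float]:
    """Weighted sum of context word vectors.

    `context` holds (word, distance) pairs, where distance is the number
    of tokens between the word and the misspelling. Each word is weighted
    by 1 / distance (an assumed scheme, not taken from the paper).
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word, distance in context:
        if word not in embeddings:
            continue  # skip words without a vector in this toy lookup
        weight = 1.0 / distance
        for i, x in enumerate(embeddings[word]):
            total[i] += weight * x
    return total


def rank_candidates(
    candidates: List[str],
    context: List[Tuple[str, int]],
    embeddings: Dict[str, Vector],
) -> List[Tuple[str, float]]:
    """Candidates sorted by cosine similarity to the context vector."""
    ctx = context_vector(context, embeddings)
    scored = [(c, cosine(embeddings[c], ctx)) for c in candidates if c in embeddings]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-d embeddings; real ones would be learned from clinical text.
embeddings = {
    "patient": [1.0, 0.1],
    "has": [0.5, 0.5],
    "high": [0.8, 0.2],
    "fever": [0.9, 0.3],
    "favor": [0.1, 1.0],
}

# Misspelling "fevr" in "patient has high fevr": distances 3, 2, 1.
context = [("patient", 3), ("has", 2), ("high", 1)]
ranked = rank_candidates(["favor", "fever"], context, embeddings)
```

With these toy vectors, "fever" ranks above "favor" because its vector points in nearly the same direction as the weighted context, regardless of how frequent either word is in a corpus.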