Trial2Vec: Zero-Shot Clinical Trial Document Similarity Search using Self-Supervision
Clinical trials are essential for drug development but are extremely
expensive and time-consuming to conduct. It is beneficial to study similar
historical trials when designing a clinical trial. However, lengthy trial
documents and lack of labeled data make trial similarity search difficult. We
propose a zero-shot clinical trial retrieval method, Trial2Vec, which learns
through self-supervision without annotating similar clinical trials.
Specifically, the meta-structure of trial documents (e.g., title, eligibility
criteria, target disease) along with clinical knowledge (e.g., UMLS knowledge
base, https://www.nlm.nih.gov/research/umls/index.html) is leveraged to
automatically generate contrastive samples. Moreover, Trial2Vec encodes trial
documents with awareness of this meta-structure, producing compact embeddings
that aggregate multi-aspect information from the whole document. Visualizations
show that our method yields medically interpretable embeddings, and it achieves
a 15% average improvement in precision/recall over the best baselines for trial
retrieval, evaluated on 1,600 trial pairs that we labeled. In addition, we show
that the pre-trained embeddings benefit the downstream trial outcome prediction
task on over 240k trials. Software is available at
https://github.com/RyanWangZf/Trial2Vec.
Comment: Findings of EMNLP 202
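To make the self-supervision idea concrete, the following is a minimal sketch of how contrastive samples could be generated from a trial's meta-structure without any human labels. The Trial fields, the choice of positive view (title vs. target disease plus eligibility criteria), and the random-negative sampling are illustrative assumptions, not the exact construction used by Trial2Vec.

```python
# Sketch: build (anchor, positive, negative) triples from trial meta-structure.
# Positives pair two views of the same trial; negatives come from other trials.
from dataclasses import dataclass
import random

@dataclass
class Trial:
    nct_id: str
    title: str
    eligibility_criteria: str
    target_disease: str

def make_contrastive_pairs(trials, n_negatives=1, seed=0):
    """Generate contrastive triples with no human annotation of similarity."""
    rng = random.Random(seed)
    triples = []
    for t in trials:
        # Positive: two different views of the same trial.
        anchor = t.title
        positive = f"{t.target_disease}. {t.eligibility_criteria}"
        # Negative: the same view taken from a different, randomly sampled trial.
        for _ in range(n_negatives):
            other = rng.choice([o for o in trials if o.nct_id != t.nct_id])
            negative = f"{other.target_disease}. {other.eligibility_criteria}"
            triples.append((anchor, positive, negative))
    return triples

# Tiny illustrative usage with two hypothetical trials.
trials = [
    Trial("NCT0001", "A Phase 3 Trial of Drug X in Type 2 Diabetes",
          "Adults aged 18-75 with HbA1c >= 7%", "type 2 diabetes"),
    Trial("NCT0002", "Checkpoint Inhibitor Y for Advanced Melanoma",
          "Stage III/IV melanoma, ECOG 0-1", "melanoma"),
]
print(make_contrastive_pairs(trials)[0])
```

Triples like these could then feed a standard contrastive objective so that embeddings of views from the same trial are pulled together and those from different trials pushed apart.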
Contextualized Sequence Likelihood: Enhanced Confidence Scores for Natural Language Generation
The advent of large language models (LLMs) has dramatically advanced the
state-of-the-art in numerous natural language generation tasks. For LLMs to be
applied reliably, it is essential to have an accurate measure of their
confidence. Currently, the most commonly used confidence score function is the
likelihood of the generated sequence, which, however, conflates semantic and
syntactic components. For instance, in question-answering (QA) tasks, an
awkward phrasing of the correct answer might result in a lower probability
prediction. Additionally, different tokens should be weighted differently
depending on the context. In this work, we propose enhancing the predicted
sequence probability by assigning different weights to various tokens using
attention values elicited from the base LLM. By employing a validation set, we
can identify the relevant attention heads, thereby significantly improving the
reliability of the vanilla sequence probability confidence measure. We refer to
this new score as the Contextualized Sequence Likelihood (CSL). CSL is easy to
implement, fast to compute, and offers considerable potential for further
improvement with task-specific prompts. Across several QA datasets and a
diverse array of LLMs, CSL has demonstrated significantly higher reliability
than state-of-the-art baselines in predicting generation quality, as measured
by the AUROC or AUARC.
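As a rough illustration of the attention-weighted scoring described above, the sketch below replaces the uniform average of token log-probabilities with a weighted average whose weights come from attention values. The function name, the normalization, and the toy numbers are assumptions for illustration; in the paper the relevant attention heads are selected on a validation set, which this sketch does not show.

```python
import numpy as np

def contextualized_sequence_likelihood(token_logprobs, attention_weights):
    """
    Weighted average of per-token log-probabilities, with weights derived from
    attention values (e.g., attention mass assigned to each generated token by
    a chosen head). Higher scores indicate higher model confidence.
    """
    logp = np.asarray(token_logprobs, dtype=float)
    w = np.asarray(attention_weights, dtype=float)
    w = w / w.sum()                 # normalize weights over the sequence
    return float(np.dot(w, logp))   # attention-weighted log-likelihood

# Toy example: a low-probability token (e.g., awkward phrasing) that receives
# little attention drags the score down less than under the uniform average.
logps = [-0.1, -2.5, -0.2]          # per-token log-probabilities
attn  = [0.6, 0.1, 0.3]             # illustrative per-token attention mass
print(contextualized_sequence_likelihood(logps, attn))  # ~ -0.37 (weighted)
print(float(np.mean(logps)))                            # ~ -0.93 (vanilla)
```

The point of the contrast in the last two lines is that semantically unimportant tokens can be down-weighted, so the confidence score tracks answer quality rather than phrasing.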