71 research outputs found
Coloring clique hypergraphs
Let G = (V, E) be a simple graph. The clique hypergraph of G, denoted CH(G), has V as its vertex set and the maximal cliques of G as its hyperedges. Let S_k be a set of k colors. A map c : V → S_k is a proper k-coloring of CH(G) if every maximal clique of G with at least two vertices receives at least two distinct colors. Let W ⊆ V and let s ≥ 1. We say that G is (W, s)-extendible if any assignment on W with at most s colors can be extended to a proper s-coloring of CH(G). We prove that the clique hypergraphs of chordal and comparability graphs are bicolorable and that the clique hypergraphs of circular-arc graphs are 3-colorable. Our main result is a characterization of (W, 2)-extendibility for chordal graphs in the case |W| = 2.
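As an illustration of the definition above, here is a minimal Python sketch that checks whether a given vertex coloring is a proper k-coloring of CH(G), enumerating maximal cliques with networkx's find_cliques. The function name and the toy graph are my own constructions; nothing here is taken from the paper.

```python
# Sketch: verify a proper coloring of the clique hypergraph CH(G).
# Assumes networkx is available; nx.find_cliques(G) yields the maximal cliques of G,
# which are exactly the hyperedges of CH(G).
import networkx as nx

def is_proper_ch_coloring(G: nx.Graph, coloring: dict) -> bool:
    """True iff every maximal clique of G with at least two vertices
    receives at least two distinct colors under `coloring`."""
    for clique in nx.find_cliques(G):
        if len(clique) >= 2 and len({coloring[v] for v in clique}) < 2:
            return False
    return True

# Toy example: a triangle {1, 2, 3} with a pendant vertex 4; two colors suffice.
G = nx.Graph([(1, 2), (2, 3), (1, 3), (3, 4)])
print(is_proper_ch_coloring(G, {1: "a", 2: "a", 3: "b", 4: "a"}))  # True
```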
Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision
Deep learning has emerged as a versatile tool for a wide range of NLP tasks,
due to its superior capacity in representation learning. But its applicability
is limited by the reliance on annotated examples, which are difficult to
produce at scale. Indirect supervision has emerged as a promising direction to
address this bottleneck, either by introducing labeling functions to
automatically generate noisy examples from unlabeled text, or by imposing
constraints over interdependent label decisions. A plethora of methods have
been proposed, each with respective strengths and limitations. Probabilistic
logic offers a unifying language to represent indirect supervision, but
end-to-end modeling with probabilistic logic is often infeasible due to
intractable inference and learning. In this paper, we propose deep
probabilistic logic (DPL) as a general framework for indirect supervision, by
composing probabilistic logic with deep learning. DPL models label decisions as
latent variables, represents prior knowledge on their relations using weighted
first-order logical formulas, and alternates between learning a deep neural
network for the end task and refining uncertain formula weights for indirect
supervision, using variational EM. This framework subsumes prior indirect
supervision methods as special cases, and enables novel combination via
infusion of rich domain and linguistic knowledge. Experiments on biomedical
machine reading demonstrate the promise of this approach.
Comment: EMNLP 2018 final version
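To make the alternating scheme described above concrete, here is a minimal structural sketch of the variational-EM loop. All components (the inference routine, the network trainer, the weight updater) are hypothetical callables passed in by the caller; this is my illustration of the abstract's description, not the authors' implementation.

```python
# Sketch of DPL-style training as described in the abstract (not the authors' code).
# infer_soft_labels, train_network, and update_formula_weights are hypothetical
# stand-ins supplied by the caller.
def dpl_train(unlabeled_data, formulas, weights, network,
              infer_soft_labels, train_network, update_formula_weights,
              num_iters=10):
    for _ in range(num_iters):
        # E-step: posterior (soft labels) over the latent label decisions, combining
        # the weighted first-order formulas with the current network's predictions.
        soft_labels = infer_soft_labels(unlabeled_data, formulas, weights, network)
        # M-step, part 1: fit the deep network for the end task to the soft labels.
        network = train_network(network, unlabeled_data, soft_labels)
        # M-step, part 2: refine uncertain formula weights against the same posterior.
        weights = update_formula_weights(formulas, weights, soft_labels)
    return network, weights
```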
Context-faithful Prompting for Large Language Models
Large language models (LLMs) encode parametric knowledge about world facts
and have shown remarkable performance in knowledge-driven NLP tasks. However,
their reliance on parametric knowledge may cause them to overlook contextual
cues, leading to incorrect predictions in context-sensitive NLP tasks (e.g.,
knowledge acquisition tasks). In this paper, we seek to assess and enhance
LLMs' contextual faithfulness in two aspects: knowledge conflict and prediction
with abstention. We demonstrate that LLMs' faithfulness can be significantly
improved using carefully designed prompting strategies. In particular, we
identify opinion-based prompts and counterfactual demonstrations as the most
effective methods. Opinion-based prompts reframe the context as a narrator's
statement and inquire about the narrator's opinions, while counterfactual
demonstrations use instances containing false facts to improve faithfulness in
knowledge conflict situations. Neither technique requires additional training.
We conduct experiments on three datasets of two standard NLP tasks, machine
reading comprehension and relation extraction, and the results demonstrate
significant improvement in faithfulness to contexts. Code and data are released
at https://github.com/wzhouad/context-faithful-llm.
Comment: Accepted at EMNLP 2023 Findings. Code and data are released at https://github.com/wzhouad/context-faithful-llm.
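As a concrete illustration of the opinion-based reframing described above, here is a small Python sketch. The narrator name and the exact template wording are my own guess at the style of prompt the abstract describes, not necessarily the released templates.

```python
# Sketch: build an opinion-based prompt that attributes the context to a narrator
# and asks for the narrator's opinion, encouraging answers grounded in the context.
def opinion_based_prompt(context: str, question: str) -> str:
    return (
        f'Bob said, "{context}"\n'
        f"Q: {question.rstrip('?')} in Bob's opinion?\n"
        "A:"
    )

# The context may deliberately conflict with the model's parametric knowledge.
print(opinion_based_prompt(
    "Mount Everest is located in South America.",
    "On which continent is Mount Everest located?",
))
```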
BLIAM: Literature-based Data Synthesis for Synergistic Drug Combination Prediction
Language models pre-trained on scientific literature corpora have
substantially advanced scientific discovery by offering high-quality feature
representations for downstream applications. However, these features are often
not interpretable, and thus can reveal limited insights to domain experts.
Instead of obtaining features from language models, we propose BLIAM, a
literature-based data synthesis approach to directly generate training data
points that are interpretable and model-agnostic to downstream applications.
The key idea of BLIAM is to create prompts using existing training data and
then use these prompts to synthesize new data points. BLIAM performs these two
steps iteratively: new data points define more informative prompts, and new
prompts in turn synthesize more accurate data points. Notably,
literature-based data augmentation might introduce data leakage since labels of
test data points in downstream applications might have already been mentioned
in the language model corpus. To prevent such leakage, we introduce GDSC-combo,
a large-scale drug combination discovery dataset that was published after the
biomedical language model was trained. We found that BLIAM substantially
outperforms a non-augmented approach and manual prompting in this rigorous data
split setting. BLIAM can be further used to synthesize data points for novel
drugs and cell lines that were not even measured in biomedical experiments. In
addition to the promising prediction performance, the data points synthesized
by BLIAM are interpretable and model-agnostic, enabling in silico augmentation
for in vitro experiments.
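Below is a structural sketch of the iterative prompt-and-synthesize loop described above. The helper callables (prompt builder, generator, parser/filter) are hypothetical placeholders, not part of the released BLIAM code.

```python
# Sketch of a BLIAM-style loop (illustrative only): existing data points seed prompts,
# the language model synthesizes new points, and accepted points enrich the next
# round's prompts. build_prompts, generate, and parse_and_filter are hypothetical.
def bliam_style_synthesis(seed_data, build_prompts, generate, parse_and_filter,
                          num_rounds=3):
    data = list(seed_data)
    for _ in range(num_rounds):
        prompts = build_prompts(data)                  # prompts from current data points
        raw_outputs = [generate(p) for p in prompts]   # LLM-synthesized candidates
        data.extend(parse_and_filter(raw_outputs))     # keep well-formed, novel points
    return data
```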
Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision
Large language models (LLMs) have demonstrated remarkable capabilities out of
the box for a wide range of applications, yet accuracy remains a major growth
area, especially in mission-critical domains such as biomedicine. An effective
method to calibrate the confidence level on LLM responses is essential to
automatically detect errors and facilitate human-in-the-loop verification. An
important source of calibration signals stems from expert-stipulated
programmatic supervision, which is often available at low cost but has its own
limitations such as noise and coverage. In this paper, we introduce a Pareto
optimal self-supervision framework that can leverage available programmatic
supervision to systematically calibrate LLM responses by producing a risk score
for every response, without any additional manual efforts. This is accomplished
by learning a harmonizer model to align LLM output with other available
supervision sources, which would assign higher risk scores to more uncertain
LLM responses and facilitate error correction. Experiments on standard relation
extraction tasks in biomedical and general domains demonstrate the promise of
this approach, with our proposed risk scores highly correlated with the real
error rate of LLMs. For the most uncertain test instances, dynamic prompting
based on our proposed risk scores results in significant accuracy improvement
for off-the-shelf LLMs, boosting GPT-3 results past state-of-the-art (SOTA)
weak supervision and GPT-4 results past SOTA supervised results on challenging
evaluation datasets.
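To illustrate the mechanism described above, here is a minimal sketch of how a harmonizer could be fit against programmatic supervision and turned into per-response risk scores. train_harmonizer and harmonizer_prob are hypothetical callables, and the scoring rule shown (one minus the harmonizer's probability of the LLM answer) is my simplification of the framework, not the paper's exact formulation.

```python
# Sketch of Pareto-style self-supervised calibration (illustrative simplification).
def fit_harmonizer(instances, llm_answers, labeling_functions, train_harmonizer):
    # Weak labels from expert-stipulated programmatic supervision
    # (noisy and with limited coverage, as the abstract notes).
    weak_labels = [[lf(x) for lf in labeling_functions] for x in instances]
    # Hypothetical trainer: fit a model that best reconciles the LLM answers
    # with the weak labels.
    return train_harmonizer(instances, llm_answers, weak_labels)

def risk_scores(harmonizer_prob, instances, llm_answers):
    # harmonizer_prob(x, a): harmonizer's probability that answer a is right for x.
    # Higher score = more uncertain LLM response; such instances are candidates for
    # dynamic re-prompting or human-in-the-loop verification.
    return [1.0 - harmonizer_prob(x, a) for x, a in zip(instances, llm_answers)]
```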
- β¦