38 research outputs found
TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks
Language models (LMs) can solve tasks such as answering questions about
tables or images by writing programs. However, using primitive functions often
leads to verbose and error-prone programs, and higher-level functions require
expert design. To enable better solutions without human labor, we ask code LMs
to curate reusable high-level functions, and use them to write solutions. We
present TROVE, a training-free method that induces a verifiable and efficient
toolbox of functions by generating solutions while using, growing, and
periodically trimming the toolbox. On 11 datasets from math, table question answering, and
image reasoning tasks, TROVE consistently yields simpler solutions with higher
accuracy than baselines using CODELLAMA and previous methods using GPT, while
using 79-98% smaller toolboxes. TROVE further enables 31% faster and 13% more
accurate human verification than baselines. With the same pipeline, it creates
diverse functions for varied tasks and datasets, providing insights into their
individual characteristics.
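The grow-and-trim loop described above can be illustrated with a minimal sketch. Everything here (function names, the usage-count criterion, the toy helpers) is invented for illustration and is not the paper's actual API:

```python
# Toy sketch of periodically trimming an induced toolbox: keep only helper
# functions that solutions actually reused. Names and thresholds are invented.

def trim_toolbox(toolbox, usage_counts, min_uses=2):
    """Keep only functions reused at least `min_uses` times across solutions."""
    return {name: fn for name, fn in toolbox.items()
            if usage_counts.get(name, 0) >= min_uses}

# Toy toolbox: helpers a code LM might have induced for table tasks.
toolbox = {
    "col_sum": lambda rows, i: sum(r[i] for r in rows),
    "col_max": lambda rows, i: max(r[i] for r in rows),
    "unused_helper": lambda x: x,
}
usage = {"col_sum": 5, "col_max": 3, "unused_helper": 1}

trimmed = trim_toolbox(toolbox, usage)
# "unused_helper" is dropped; frequently reused helpers survive.
```

Trimming by reuse is one plausible reading of how a toolbox stays small; the paper's actual criteria may differ.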
API-Assisted Code Generation for Question Answering on Varied Table Structures
A persistent challenge in table question answering (TableQA) via executable
program generation has been adapting to varied table structures, which
typically require domain-specific logical forms. In response, this paper introduces a
unified TableQA framework that: (1) provides a unified representation for
structured tables as multi-index Pandas data frames, (2) uses Python as a
powerful querying language, and (3) uses few-shot prompting to translate NL
questions into Python programs, which are executable on Pandas data frames.
Furthermore, to answer complex relational questions with extended program
functionality and external knowledge, our framework allows customized APIs that
Python programs can call. We experiment with four TableQA datasets that involve
tables of different structures -- relational, multi-table, and hierarchical
matrix shapes -- and achieve notable improvements over past state-of-the-art
systems. In ablation studies, we (1) show benefits from our multi-index
representation and APIs over baselines that use only an LLM, and (2)
demonstrate that our approach is modular and can incorporate additional APIs.
Comment: EMNLP 2023 camera ready, 13 pages, 11 figures
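The core representational idea (a hierarchical table as a multi-index Pandas data frame, queried with plain Python) can be sketched as follows. The table, column names, and question are invented for illustration:

```python
# Minimal sketch of representing a hierarchical table as a multi-index
# Pandas DataFrame and answering a question with an executable Python
# program, in the spirit of the framework above. Data is invented.
import pandas as pd

columns = pd.MultiIndex.from_tuples([
    ("population", "2010"), ("population", "2020"),
    ("area", "km2"),
])
df = pd.DataFrame(
    [[100, 120, 50], [200, 180, 80]],
    index=["CityA", "CityB"],
    columns=columns,
)

# NL question: "Which cities grew in population between 2010 and 2020?"
# A few-shot-prompted LM would emit a short program like this one:
grew = df[df[("population", "2020")] > df[("population", "2010")]].index.tolist()
```

The multi-index lets one program pattern address relational, multi-table, and matrix-shaped tables through the same column-tuple addressing.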
NLQxform: A Language Model-based Question to SPARQL Transformer
In recent years, scholarly data has grown dramatically in terms of both scale
and complexity. It becomes increasingly challenging to retrieve information
from scholarly knowledge graphs that include large-scale heterogeneous
relationships, such as authorship, affiliation, and citation, between various
types of entities, e.g., scholars, papers, and organizations. As part of the
Scholarly QALD Challenge, this paper presents a question-answering (QA) system
called NLQxform, which provides an easy-to-use natural language interface to
facilitate accessing scholarly knowledge graphs. NLQxform allows users to
express their complex query intentions in natural language questions. A
transformer-based language model, i.e., BART, is employed to translate
questions into standard SPARQL queries, which can be evaluated to retrieve the
required information. According to the public leaderboard of the Scholarly QALD
Challenge at ISWC 2023 (Task 1: DBLP-QUAD - Knowledge Graph Question Answering
over DBLP), NLQxform achieved an F1 score of 0.85 and ranked first on the QA
task, demonstrating the competitiveness of the system.
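To make the NL-to-SPARQL translation concrete, here is a toy, rule-based stand-in for the learned BART translator. The question template, the rule, and the predicate URIs (assumed to resemble the dblp RDF schema) are all illustrative, not NLQxform's actual output:

```python
# Invented rule-based stand-in for NLQxform's learned BART translator,
# showing the kind of SPARQL a question is mapped to. Predicates are
# assumed to resemble the dblp RDF schema and may not match it exactly.

def toy_translate(question):
    if question.startswith("Who wrote"):
        title = question.split("'")[1]
        return (
            "SELECT ?author WHERE { "
            f'?paper <https://dblp.org/rdf/schema#title> "{title}" . '
            "?paper <https://dblp.org/rdf/schema#authoredBy> ?author }"
        )
    raise ValueError("unsupported question shape")

query = toy_translate("Who wrote the paper 'Attention Is All You Need'?")
```

The real system learns this mapping end to end, so it generalizes far beyond a single template; evaluating the emitted query against the knowledge graph retrieves the answer.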
Learning to Filter Context for Retrieval-Augmented Generation
On-the-fly retrieval of relevant knowledge has proven an essential element of
reliable systems for tasks such as open-domain question answering and fact
verification. However, because retrieval systems are not perfect, generation
models are required to generate outputs given partially or entirely irrelevant
passages. This can cause over- or under-reliance on context, and result in
problems in the generated output such as hallucinations. To alleviate these
problems, we propose FILCO, a method that improves the quality of the context
provided to the generator by (1) identifying useful context based on lexical
and information-theoretic approaches, and (2) training context filtering models
that can filter retrieved contexts at test time. We experiment on six
knowledge-intensive tasks with FLAN-T5 and LLaMa2, and demonstrate that our
method outperforms existing approaches on extractive question answering (QA),
complex multi-hop and long-form QA, fact verification, and dialog generation
tasks. FILCO effectively improves the quality of context, whether or not it
supports the canonical output.
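The first ingredient above, identifying useful context lexically, can be sketched with a simple unigram-overlap filter. The passages, the answer, and the 0.5 threshold are invented for illustration; FILCO combines lexical with information-theoretic cues and trains a learned filter on top:

```python
# Minimal sketch of lexical-overlap context filtering: keep a retrieved
# passage only if it covers enough of the target output's tokens.
# Passages, answer, and threshold are illustrative.
import re

def tokens(text):
    return re.findall(r"\w+", text.lower())

def unigram_overlap(passage, answer):
    """Fraction of answer tokens that also appear in the passage."""
    p, a = set(tokens(passage)), tokens(answer)
    return sum(t in p for t in a) / max(len(a), 1)

passages = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
]
answer = "the capital of France is Paris"
kept = [p for p in passages if unigram_overlap(p, answer) >= 0.5]
# Only the first passage survives; the irrelevant one is filtered out.
```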
K-BERT: Enabling Language Representation with Knowledge Graph
Pre-trained language representation models, such as BERT, capture a general
language representation from large-scale corpora, but lack domain-specific
knowledge. When reading a domain text, experts make inferences with relevant
knowledge. For machines to achieve this capability, we propose a
knowledge-enabled language representation model (K-BERT) with knowledge graphs
(KGs), in which triples are injected into the sentences as domain knowledge.
However, incorporating too much knowledge can divert a sentence from its
correct meaning, an issue called knowledge noise (KN). To overcome KN,
K-BERT introduces soft-position embedding and a visible matrix to limit the
impact of injected knowledge. Because K-BERT can load the parameters of a
pre-trained BERT model, it can inject domain knowledge simply by being
equipped with a KG, with no pre-training of its own. Our investigation reveals
promising results in twelve NLP tasks. Especially in domain-specific tasks
(including finance, law, and medicine), K-BERT significantly outperforms BERT,
which demonstrates that K-BERT is an excellent choice for knowledge-driven
problems that require expert knowledge.
Comment: 8 pages
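The visible-matrix idea can be sketched as follows: tokens injected from a KG triple attend only within their own branch and to the entity they attach to, so they cannot perturb the rest of the sentence. The sentence, the triple, and the token-level (rather than subword-level) layout are simplifications for illustration:

```python
# Simplified sketch of building a K-BERT-style visible matrix: knowledge
# tokens from an injected triple are visible only inside their own branch.
import numpy as np

sentence = ["Tim", "Cook", "visited", "Beijing"]
# KG triple (Cook, CEO_of, Apple) injected after the entity at index 1.
injected = {1: ["CEO_of", "Apple"]}

tokens, branch = [], []  # branch[j]: position of the token that j attaches to
for i, tok in enumerate(sentence):
    pos = len(tokens)
    tokens.append(tok)
    branch.append(pos)          # sentence tokens anchor themselves
    for k in injected.get(i, []):
        tokens.append(k)
        branch.append(pos)      # knowledge tokens anchor to their entity

n = len(tokens)
is_sentence = [branch[j] == j for j in range(n)]
visible = np.zeros((n, n), dtype=bool)
for a in range(n):
    for b in range(n):
        if is_sentence[a] and is_sentence[b]:
            visible[a, b] = True                    # sentence sees sentence
        else:
            visible[a, b] = branch[a] == branch[b]  # knowledge stays in branch
```

Here "CEO_of" and "Apple" see "Cook" (and each other) but are invisible to "visited" and "Beijing", which limits knowledge noise.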
Improving Factuality of Abstractive Summarization via Contrastive Reward Learning
Modern abstractive summarization models often generate summaries that contain
hallucinated or contradictory information. In this paper, we propose a simple
but effective contrastive learning framework that incorporates recent
developments in reward learning and factuality metrics. Empirical studies
demonstrate that the proposed framework enables summarization models to learn
from feedback of factuality metrics using contrastive reward learning, leading
to more factual summaries by human evaluations. This suggests that further
advances in learning and evaluation algorithms can feed directly into providing
more factual summaries.
Comment: TrustNLP @ ACL 2023
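A contrastive reward objective of the kind described above can be sketched as a pairwise ranking loss: candidates scored higher by a factuality metric should receive higher model log-probability. The function, margin, and numbers below are illustrative, not the paper's exact formulation:

```python
# Illustrative pairwise contrastive reward loss: penalize the model when a
# less factual candidate summary outscores a more factual one.

def pairwise_contrastive_loss(log_probs, rewards, margin=0.0):
    """Sum of hinge losses over candidate pairs ordered by reward."""
    loss = 0.0
    for i in range(len(rewards)):
        for j in range(len(rewards)):
            if rewards[i] > rewards[j]:
                loss += max(0.0, margin - (log_probs[i] - log_probs[j]))
    return loss

# Two candidate summaries; rewards come from a factuality metric.
# Correctly ordered (more factual candidate has higher log-prob): no loss.
ok = pairwise_contrastive_loss([-1.0, -2.0], [0.9, 0.2], margin=0.5)
# Mis-ordered: the hinge activates.
bad = pairwise_contrastive_loss([-2.0, -1.0], [0.9, 0.2], margin=0.5)
```

Training against such a signal is what lets the summarizer "learn from feedback of factuality metrics" without per-token factuality labels.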
SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs
In this work, we introduce Semantic Pyramid AutoEncoder (SPAE) for enabling
frozen LLMs to perform both understanding and generation tasks involving
non-linguistic modalities such as images or videos. SPAE converts between raw
pixels and interpretable lexical tokens (or words) extracted from the LLM's
vocabulary. The resulting tokens capture both the semantic meaning and the
fine-grained details needed for visual reconstruction, effectively translating
the visual content into a language comprehensible to the LLM, and empowering it
to perform a wide array of multimodal tasks. Our approach is validated through
in-context learning experiments with frozen PaLM 2 and GPT 3.5 on a diverse set
of image understanding and generation tasks. Our method marks the first
successful attempt to enable a frozen LLM to generate image content while
surpassing state-of-the-art performance in image understanding tasks, under the
same setting, by over 25%.
Comment: NeurIPS 2023 spotlight
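The core quantization step, mapping a continuous visual feature to the nearest token embedding in a frozen vocabulary, can be sketched in miniature. The two-word "vocabulary" and its tiny embeddings are invented; SPAE operates over a real LLM vocabulary and a pyramid of token layers:

```python
# Minimal sketch of SPAE's core idea: quantize a visual feature to the
# nearest embedding in a (frozen) lexical vocabulary. Embeddings invented.
import numpy as np

vocab = {
    "red":  np.array([1.0, 0.0]),
    "blue": np.array([0.0, 1.0]),
}

def quantize(feature):
    """Return the vocabulary word whose embedding is nearest the feature."""
    return min(vocab, key=lambda w: np.linalg.norm(vocab[w] - feature))

token = quantize(np.array([0.9, 0.1]))  # a reddish feature maps to "red"
```

Because the resulting tokens are real words, a frozen LLM can consume and emit them in-context without any fine-tuning.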
Retinal microvasculature features in patients with migraine: a systematic review and meta-analysis
Background: Migraine is a central nervous system disorder involving neuronal and vascular factors. The brain has a close anatomical relationship with retinal vessels and similar regulatory processes, and the retinal vascular system is the only in vivo vasculature that can be directly visualized; optical coherence tomography angiography (OCTA) is an advanced retinal vascular imaging technique. In this study, OCTA was used to study retinal vascular density (VD) and the foveal avascular zone (FAZ) in migraine patients, providing a theoretical basis for its use as a candidate for rapid, non-invasive diagnosis of migraine.

Methods: Published studies comparing retinal microvascular profiles between migraine patients and healthy controls were obtained by a comprehensive search of electronic databases. Nine studies were finally included, covering 775 eyes (migraine group: 444 eyes; control group: 331 eyes). Pooled effect sizes were presented as standardized mean differences (SMDs) with 95% confidence intervals (CIs). Statistical analysis was performed using Review Manager software (version 5.30).

Results: The combined results revealed that the superficial and deep macular whole enface VD (MWEVD) (superficial VD: SMD = −0.30, P = 0.0001; deep VD: SMD = −0.61, P = 0.02), superficial foveal VD (FVD) (SMD = −0.42, P = 0.03), deep parafoveal VD (PFVD) (SMD = −0.31, P = 0.002), and peripapillary VD (PVD) (SMD = −0.49, P = 0.002) were significantly reduced in migraine patients compared with healthy controls. However, the FAZ area was significantly larger in migraine patients (SMD = 0.56, P < 0.0001).

Conclusion: Migraine patients are prone to retinal microcirculation disorders, such as decreased vessel density and an enlarged avascular area in the fovea. This provides a theoretical basis for OCTA as a candidate for rapid, non-invasive diagnosis of migraine.
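The standardized mean difference used as the effect size above can be made concrete with a short sketch. This computes Cohen's d with a pooled standard deviation (meta-analyses often apply the Hedges' g small-sample correction on top); all numbers are invented, not values from the included studies:

```python
# Sketch of the standardized mean difference (Cohen's d, pooled SD), the
# effect size pooled in the meta-analysis above. Numbers are invented.
import math

def standardized_mean_difference(m1, sd1, n1, m2, sd2, n2):
    """(mean1 - mean2) divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    )
    return (m1 - m2) / pooled_sd

# Toy groups: means 10 vs 8, both SD = 2, n = 3 per group.
smd = standardized_mean_difference(10, 2, 3, 8, 2, 3)  # -> 1.0
```

A negative SMD (as reported for the VD outcomes) means the migraine group's mean is below the control group's in pooled-SD units; a positive SMD (as for FAZ area) means it is above.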