Large Language Models are few(1)-shot Table Reasoners
Recent literature has shown that large language models (LLMs) are generally
excellent few-shot reasoners on text reasoning tasks. However, the
capability of LLMs on table reasoning tasks has yet to be explored. In this
paper, we aim to understand how well LLMs can perform table-related tasks
with few-shot in-context learning. Specifically, we evaluated LLMs on popular
table QA and fact verification datasets like WikiTableQuestion, FetaQA,
TabFact, and FEVEROUS and found that LLMs are competent at complex reasoning
over table structures, though these models are not pre-trained on any table
corpus. When combined with "chain of thoughts" prompting, LLMs can achieve very
strong performance with only a 1-shot demonstration, even on par with some SoTA
models. We show that LLMs are even more competent at generating comprehensive
long-form answers on FetaQA than tuned T5-large. We further manually studied
the reasoning chains elicited from LLMs and found that these reasoning chains
are highly consistent with the underlying semantic form. We believe that LLMs
can serve as a simple yet generic baseline for future research. The code and
data are released at https://github.com/wenhuchen/TableCoT.
Comment: Accepted to Findings of EACL 2023
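The 1-shot setup above can be sketched as plain prompt construction: serialize the table to text, prepend a single chain-of-thought demonstration, then append the test question. The table format, demonstration, and helper names below are illustrative assumptions, not the exact TableCoT format.

```python
# Minimal sketch of 1-shot chain-of-thought prompting for table QA.
# The serialization scheme and the demonstration are made up for illustration.

def linearize_table(header, rows):
    """Flatten a table into a pipe-separated string an LLM can read."""
    lines = [" | ".join(header)]
    lines += [" | ".join(str(c) for c in row) for row in rows]
    return "\n".join(lines)

# The single (1-shot) demonstration: table, question, reasoning chain, answer.
DEMO = (
    "Table:\ncity | population\nBerlin | 3,600,000\nMunich | 1,500,000\n"
    "Question: Which city is larger?\n"
    "Let's think step by step. Berlin has 3,600,000 people and Munich has "
    "1,500,000, and 3,600,000 > 1,500,000, so the answer is Berlin.\n"
    "Answer: Berlin\n\n"
)

def build_cot_prompt(header, rows, question):
    """Prepend the demonstration, then the test table and question."""
    table_text = linearize_table(header, rows)
    return (DEMO + "Table:\n" + table_text + "\nQuestion: " + question +
            "\nLet's think step by step.")

prompt = build_cot_prompt(["player", "goals"], [["Ann", 7], ["Bea", 4]],
                          "Who scored more goals?")
```

The trailing "Let's think step by step." nudges the model to emit a reasoning chain before its answer, mirroring the demonstration.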
XL-NBT: A Cross-lingual Neural Belief Tracking Framework
Task-oriented dialog systems are becoming pervasive, and many companies
heavily rely on them to complement human agents for customer service in call
centers. With globalization, the need for providing cross-lingual customer
support becomes more urgent than ever. However, cross-lingual support poses
great challenges: it requires a large amount of additional annotated data from
native speakers. In order to bypass the expensive human annotation and achieve
the first step towards the ultimate goal of building a universal dialog system,
we set out to build a cross-lingual state tracking framework. Specifically, we
assume that there exists a source language with dialog belief tracking
annotations while the target languages have no annotated dialog data of any
form. Then, we pre-train a state tracker for the source language as a teacher,
which is able to exploit easy-to-access parallel data. We then distill the
teacher's knowledge and transfer it to student state trackers in the target languages. We
specifically discuss two types of common parallel resources: bilingual corpus
and bilingual dictionary, and design different transfer learning strategies
accordingly. Experimentally, we successfully use the English state tracker as
the teacher to transfer its knowledge to both Italian and German trackers and
achieve promising results.
Comment: 13 pages, 5 figures, 3 tables, accepted to the EMNLP 2018 conference
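The teacher-student transfer above can be illustrated as soft-label distillation over parallel utterances: the source-language teacher produces a distribution over slot values for an English utterance, and the target-language student is trained to match it on the aligned translation. The toy score dictionaries below stand in for real neural trackers; this is a sketch of the distillation objective, not the XL-NBT model.

```python
import math

# Toy sketch of teacher-to-student distillation across languages.
# The "models" here are just score dictionaries over slot values;
# only the soft-label transfer idea is real.

def softmax(scores):
    """Turn raw scores into a probability distribution over slot values."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    z = sum(exps.values())
    return {k: e / z for k, e in exps.items()}

def distillation_loss(teacher_scores, student_scores):
    """Cross-entropy between teacher soft labels and student distribution."""
    p = softmax(teacher_scores)   # teacher's belief on the English utterance
    q = softmax(student_scores)   # student's belief on the parallel utterance
    return -sum(p[k] * math.log(q[k]) for k in p)

# The English teacher is confident the price slot is "cheap"; a student
# aligned with it incurs a lower loss than one that disagrees.
teacher = {"cheap": 3.0, "moderate": 0.5, "expensive": 0.1}
aligned_student = {"cheap": 2.5, "moderate": 0.4, "expensive": 0.2}
bad_student = {"cheap": 0.1, "moderate": 2.0, "expensive": 2.0}
```

Minimizing this loss on parallel data pushes the student's belief state toward the teacher's without any target-language annotation, which is the core of the bilingual-corpus transfer strategy.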
Video Captioning via Hierarchical Reinforcement Learning
Video captioning is the task of automatically generating a textual
description of the actions in a video. Although previous work (e.g.,
sequence-to-sequence models) has shown promising results in abstracting a coarse
description of a short video, it is still very challenging to caption a video
containing multiple fine-grained actions with a detailed description. This
paper aims to address the challenge by proposing a novel hierarchical
reinforcement learning framework for video captioning, where a high-level
Manager module learns to design sub-goals and a low-level Worker module
recognizes the primitive actions to fulfill the sub-goal. With this
compositional framework to reinforce video captioning at different levels, our
approach significantly outperforms all the baseline methods on a newly
introduced large-scale dataset for fine-grained video captioning. Furthermore,
our non-ensemble model has already achieved the state-of-the-art results on the
widely-used MSR-VTT dataset.
Comment: CVPR 2018, with supplementary material
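The Manager/Worker decomposition can be sketched as two nested policies: the Manager selects a sub-goal for each caption segment, and the Worker emits primitive tokens to realize it before control returns to the Manager. The sub-goal inventory and word choices below are invented for illustration; the paper's modules are learned with reinforcement learning, not hard-coded.

```python
# Toy sketch of hierarchical caption generation: a high-level Manager
# picks sub-goals, a low-level Worker emits the words for each one.
# Both policies are hard-coded stand-ins for learned RL policies.

SUBGOAL_WORDS = {
    "enter": ["a", "man", "walks", "in"],
    "action": ["and", "pours", "a", "drink"],
}

def manager_policy(step):
    """High-level policy: choose which sub-goal to pursue next."""
    return ["enter", "action"][step]

def worker_policy(subgoal):
    """Low-level policy: emit primitive tokens that fulfill the sub-goal."""
    return SUBGOAL_WORDS[subgoal]

def generate_caption(num_segments=2):
    tokens = []
    for step in range(num_segments):
        goal = manager_policy(step)          # Manager sets a sub-goal
        tokens.extend(worker_policy(goal))   # Worker realizes it word by word
    return " ".join(tokens)

print(generate_caption())  # → "a man walks in and pours a drink"
```

In the learned version, the Manager is rewarded for the quality of whole segments while the Worker is rewarded per token, which is what lets the framework compose fine-grained actions into one detailed description.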
Augmenting Black-box LLMs with Medical Textbooks for Clinical Question Answering
Large-scale language models (LLMs), such as ChatGPT, are capable of
generating human-like responses for various downstream tasks, such as
task-oriented dialogues and question answering. However, applying LLMs to
medical domains remains challenging due to their inability to leverage
domain-specific knowledge. In this study, we present Large-scale Language
Models Augmented with Medical Textbooks (LLM-AMT), a system built around
authoritative medical textbooks that enhances an LLM's proficiency in the
specialized domain through three plug-and-play modules: a Hybrid Textbook
Retriever, a Query Augmenter, and an LLM Reader. Experimental evaluation on
three open-domain medical question-answering
tasks reveals a substantial enhancement in both the professionalism and
accuracy of the LLM responses when utilizing LLM-AMT, exhibiting an improvement
ranging from 11.4% to 13.2%. We found that medical textbooks, despite forming
a retrieval corpus 100 times smaller, serve as a more valuable external
knowledge source than Wikipedia in the medical domain. Our experiments show
that textbook augmentation yields a performance improvement ranging from
9.7% to 12.2% over Wikipedia augmentation.
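The retrieve-then-read pattern behind this augmentation can be sketched in a few lines: score textbook passages against the query, then place the top passage into the reader's prompt as context. The keyword-overlap scorer and two-sentence corpus below are toy stand-ins for the paper's Hybrid Textbook Retriever and its actual textbook collection.

```python
# Minimal sketch of retrieve-then-read over a textbook corpus.
# The corpus and the overlap-based scoring are illustrative only.

TEXTBOOK = [
    "Metformin is a first-line drug for type 2 diabetes.",
    "Aspirin inhibits platelet aggregation.",
]

def keyword_score(query, passage):
    """Count shared lowercase tokens between query and passage."""
    q = set(query.lower().split())
    p = set(passage.lower().strip(".").split())
    return len(q & p)

def retrieve(query, corpus, k=1):
    """Return the k passages with the highest overlap score."""
    return sorted(corpus, key=lambda p: keyword_score(query, p), reverse=True)[:k]

def build_reader_prompt(query, corpus):
    """Stuff the retrieved passage(s) into the LLM reader's prompt."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_reader_prompt("first-line drug for type 2 diabetes", TEXTBOOK)
```

A production retriever would combine dense and lexical scoring (hence "hybrid") and the Query Augmenter would rewrite the question before retrieval, but the prompt-assembly step keeps this same shape.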