PESCO: Prompt-enhanced Self Contrastive Learning for Zero-shot Text Classification
We present PESCO, a novel contrastive learning framework that substantially
improves the performance of zero-shot text classification. We formulate text
classification as a neural text matching problem where each document is treated
as a query, and the system learns the mapping from each query to the relevant
class labels by (1) adding prompts to enhance label matching, and (2) using
retrieved labels to enrich the training set in a self-training loop of
contrastive learning. PESCO achieves state-of-the-art performance on four
benchmark text classification datasets. On DBpedia, we achieve 98.5% accuracy
without any labeled data, which is close to the fully-supervised result.
Extensive experiments and analyses show all the components of PESCO are
necessary for improving the performance of zero-shot text classification.
Comment: accepted by ACL 202
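The text-matching formulation described above can be sketched in a few lines: the document is embedded as a query, each candidate label is wrapped in a prompt before embedding, and the highest-scoring label wins. The hashed bag-of-words encoder and the prompt template below are illustrative stand-ins, not PESCO's actual components (which use a pretrained sentence encoder and learned contrastive training).

```python
import numpy as np

def encode(text: str, dim: int = 256) -> np.ndarray:
    """Stand-in sentence encoder: hashed bag-of-words, L2-normalized.
    A real system would use a pretrained neural sentence encoder."""
    v = np.zeros(dim)
    for tok in text.lower().split():
        v[hash(tok) % dim] += 1.0
    n = np.linalg.norm(v)
    return v / n if n else v

def classify(doc: str, labels: list[str],
             prompt: str = "This text is about {}.") -> str:
    """Zero-shot classification as text matching: the document is the
    query, and each label is wrapped in a prompt before matching."""
    q = encode(doc)
    scores = [float(q @ encode(prompt.format(l))) for l in labels]
    return labels[int(np.argmax(scores))]

result = classify("New species of bacteria discovered in soil samples",
                  ["science", "sports", "politics"])
```

In PESCO itself, the retrieved top-scoring labels are additionally fed back as pseudo-labels for self-training, which this sketch omits.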
A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest
Large Language Models (LLMs), despite their great power in language
generation, often encounter challenges when dealing with intricate and
knowledge-demanding queries in specific domains. This paper introduces a novel
approach to enhance LLMs by effectively extracting the relevant knowledge from
domain-specific textual sources, and the adaptive training of a chatbot with
domain-specific inquiries. Our two-step approach starts from training a
knowledge miner, namely LLMiner, which autonomously extracts Question-Answer
pairs from relevant documents through a chain-of-thought reasoning process.
Subsequently, we blend the mined QA pairs with a conversational dataset to
fine-tune the LLM as a chatbot, thereby enriching its domain-specific expertise
and conversational capabilities. We also developed a new evaluation benchmark
which comprises four domain-specific text corpora and associated human-crafted
QA pairs for testing. Our model shows remarkable performance improvement over
generally aligned LLM and surpasses domain-adapted models directly fine-tuned
on domain corpus. In particular, LLMiner achieves this with minimal human
intervention, requiring only 600 seed instances, thereby providing a pathway
towards self-improvement of LLMs through model-synthesized training data.
Comment: Work in progress
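The second step of the approach above, blending mined QA pairs with a conversational dataset before fine-tuning, can be sketched as follows. The chat message schema and function names here are assumptions for illustration, not the paper's actual data format.

```python
import random

def to_chat_example(question: str, answer: str) -> dict:
    # Format a mined QA pair as a chat-style fine-tuning example.
    # This message schema is an assumption, not from the paper.
    return {"messages": [{"role": "user", "content": question},
                         {"role": "assistant", "content": answer}]}

def blend_datasets(mined_qa, conversational, seed=0):
    # Mix mined domain QA pairs into a general conversational dataset
    # and shuffle, mirroring the paper's second (fine-tuning) step.
    data = [to_chat_example(q, a) for q, a in mined_qa] + list(conversational)
    random.Random(seed).shuffle(data)
    return data
```

Shuffling the combined pool is one simple way to interleave domain knowledge with conversational ability during fine-tuning; the paper's actual mixing strategy may differ.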
Whole genome sequencing of foodborne pathogens and global data sharing development
With the rapid development of molecular typing techniques for monitoring foodborne pathogens and investigating outbreaks, whole genome sequencing (WGS) is steadily revealing its importance. In the context of the globalization of food trade, it is urgent to establish the links between foodborne pathogens and human exposure in detail in order to accurately monitor and reduce their occurrence. In this respect, the accuracy of WGS is significantly better than that of prior analysis tools. In this paper, we take Listeria monocytogenes as an example to describe the monitoring of foodborne pathogens and the investigation of infection outbreaks, emphasizing the value of WGS in the trace-back of foodborne diseases. We summarize the technologies for data generation and analysis, highlight the practical progress of WGS in foodborne pathogen typing worldwide, and discuss the challenges ahead.
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
Preference modeling techniques, such as direct preference optimization (DPO),
have proven effective in enhancing the generalization abilities of large
language models (LLMs). However, in tasks involving video
instruction-following, providing
informative feedback, especially for detecting hallucinations in generated
responses, remains a significant challenge. Previous studies have explored
using large multimodal models (LMMs) as reward models to guide preference
modeling, but their ability to accurately assess the factuality of generated
responses compared to corresponding videos has not been conclusively
established. This paper introduces a novel framework that utilizes detailed
video captions as a proxy for video content, enabling language models to
incorporate this information as supporting evidence for scoring video Question
Answering (QA) predictions. Our approach demonstrates robust alignment with
the OpenAI GPT-4V model's reward mechanism, which directly takes video frames as
input. Furthermore, we show that applying this tailored reward through DPO
significantly improves the performance of video LMMs on video QA tasks.
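The DPO objective that the tailored reward feeds into can be sketched for a single preference pair. This is the standard DPO loss on sequence log-probabilities under the policy and a frozen reference model; it is a minimal scalar illustration, not the paper's training code.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    # Standard DPO objective for one preference pair:
    #   -log sigmoid(beta * [(log pi(y_w) - log ref(y_w))
    #                        - (log pi(y_l) - log ref(y_l))])
    # where y_w is the preferred (e.g. less hallucinated) response.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy matches the reference, the margin is zero and the loss is log 2; raising the policy's log-probability on the preferred response lowers the loss, which is what drives the improvement on video QA.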
Gadolinium‐Doped Iron Oxide Nanoprobe as Multifunctional Bioimaging Agent and Drug Delivery System
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Humans are social beings; we pursue social goals in our daily interactions,
which is a crucial aspect of social intelligence. Yet, AI systems' abilities in
this realm remain elusive. We present SOTOPIA, an open-ended environment to
simulate complex social interactions between artificial agents and evaluate
their social intelligence. In our environment, agents role-play and interact
under a wide variety of scenarios; they coordinate, collaborate, exchange, and
compete with each other to achieve complex social goals. We simulate the
role-play interaction between LLM-based agents and humans within this task
space and evaluate their performance with a holistic evaluation framework
called SOTOPIA-Eval. With SOTOPIA, we find significant differences between
these models in terms of their social intelligence, and we identify a subset of
SOTOPIA scenarios, SOTOPIA-hard, that is generally challenging for all models.
We find that on this subset, GPT-4 achieves a significantly lower goal
completion rate than humans and struggles to exhibit social commonsense
reasoning and strategic communication skills. These findings demonstrate
SOTOPIA's promise as a general platform for research on evaluating and
improving social intelligence in artificial agents.
Comment: Preprint, 43 pages. The first two authors contributed equally
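The idea of a hard subset, scenarios that remain challenging for every evaluated model, can be sketched with a simple selection rule. The threshold criterion below is an assumed simplification for illustration; SOTOPIA-hard's actual selection procedure is defined in the paper.

```python
def hard_subset(scores: dict, threshold: float) -> list:
    # Select scenarios on which every model scores below a threshold,
    # a simplified stand-in for identifying a "hard" subset such as
    # SOTOPIA-hard. `scores` maps scenario -> {model: goal completion}.
    return sorted(scenario for scenario, per_model in scores.items()
                  if all(v < threshold for v in per_model.values()))
```

A scenario where even the strongest model scores poorly is retained, while any scenario solved by at least one model is dropped.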