7 research outputs found

Teaching Text-to-Image Models to Communicate in Dialog

Author: Feng Jiazhan
Lai Yuxuan
Shen Xingyu
Sun Xiaowen
Wang Yuxuan
Zhao Dongyan
Publication date: 07/02/2024

A picture is worth a thousand words; thus, it is crucial for conversational agents to understand, perceive, and effectively respond with pictures. However, we find that directly employing conventional image generation techniques is inadequate for conversational agents to produce image responses effectively. In this paper, we focus on the innovative dialog-to-image generation task, where the model synthesizes a high-resolution image aligned with the given dialog context as a response. To tackle this problem, we design a tailored fine-tuning approach on top of state-of-the-art text-to-image generation models to fully exploit the structural and semantic features in dialog context during image generation. Concretely, we linearize the dialog context with specific indicators to maintain the dialog structure, and employ in-domain data to alleviate the style mismatch between dialog-to-image and conventional image generation tasks. Empirical results on PhotoChat and MMDialog Corpus show that our approach brings consistent and remarkable improvement with 3 state-of-the-art pre-trained text-to-image generation backbones.

Comment: Work in progress

arXiv.org e-Print Archive

A Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language Models

Author: An Zhenwei
Cao Hejing
Chen Liwei
Feng Jiazhan
Xu Kun
Zhao Dongyan
Publication date: 13/11/2023

While large language models exhibit remarkable performance in the Question Answering task, they are susceptible to hallucinations. Challenges arise when these models grapple with understanding multi-hop relations in complex questions or lack the necessary knowledge for a comprehensive response. To address this issue, we introduce the "Decompose-and-Query" framework (D&Q). This framework guides the model to think and utilize external knowledge similar to ReAct, while also restricting its thinking to reliable information, effectively mitigating the risk of hallucinations. Experiments confirm the effectiveness of D&Q: On our ChitChatQA dataset, D&Q does not lose to ChatGPT in 67% of cases; on the HotPotQA question-only setting, D&Q achieved an F1 score of 59.6%. Our code is available at https://github.com/alkaidpku/DQ-ToolQA

arXiv.org e-Print Archive

Language Models can be Logical Solvers

Author: Chen Weizhu
Feng Jiazhan
Hao Junheng
Sharma Hiteshi
Shen Yelong
Xu Ruochen
Zhao Dongyan
Publication date: 10/11/2023

Logical reasoning is a fundamental aspect of human intelligence and a key component of tasks like problem-solving and decision-making. Recent advancements have enabled Large Language Models (LLMs) to potentially exhibit reasoning capabilities, but complex logical reasoning remains a challenge. The state-of-the-art solver-augmented language models use LLMs to parse natural language logical questions into symbolic representations first and then adopt external logical solvers to take in the symbolic representations and output the answers. Despite their impressive performance, any parsing errors will inevitably result in the failure of the execution of the external logical solver and no answer to the logical questions. In this paper, we introduce LoGiPT, a novel language model that directly emulates the reasoning processes of logical solvers and bypasses parsing errors by learning to adhere strictly to solver syntax and grammar. LoGiPT is fine-tuned on a newly constructed instruction-tuning dataset derived from revealing and refining the invisible reasoning process of deductive solvers. Experimental results on two public deductive reasoning datasets demonstrate that LoGiPT outperforms state-of-the-art solver-augmented LMs and few-shot prompting methods on competitive LLMs like ChatGPT or GPT-4.

Comment: Preprint

arXiv.org e-Print Archive

WizardLM: Empowering Large Language Models to Follow Complex Instructions

Author: Feng Jiazhan
Geng Xiubo
Jiang Daxin
Sun Qingfeng
Tao Chongyang
Xu Can
Zhao Pu
Zheng Kai
Publication date: 24/04/2023

Training large language models (LLMs) with open-domain instruction-following data brings colossal success. However, manually creating such instruction data is very time-consuming and labor-intensive. Moreover, humans may struggle to produce high-complexity instructions. In this paper, we show an avenue for creating large amounts of instruction data with varying levels of complexity using an LLM instead of humans. Starting with an initial set of instructions, we use our proposed Evol-Instruct to rewrite them step by step into more complex instructions. Then, we mix all generated instruction data to fine-tune LLaMA. We call the resulting model WizardLM. Human evaluations on a complexity-balanced test bed show that instructions from Evol-Instruct are superior to human-created ones. By analyzing the human evaluation results of the high-complexity part, we demonstrate that outputs from our WizardLM model are preferred to outputs from OpenAI ChatGPT. Even though WizardLM still lags behind ChatGPT in some aspects, our findings suggest that fine-tuning with AI-evolved instructions is a promising direction for enhancing large language models. Our code and generated data are public at https://github.com/nlpxucan/WizardLM

Comment: large language model, instruction fine-tun

arXiv.org e-Print Archive

Synergistic Interplay between Search and Large Language Models for Information Retrieval

Author: Feng Jiazhan
Geng Xiubo
Jiang Daxin
Long Guodong
Shen Tao
Tao Chongyang
Xu Can
Zhao Dongyan
Publication date: 12/12/2023

Information retrieval (IR) plays a crucial role in locating relevant resources from vast amounts of data, and its applications have evolved from traditional knowledge bases to modern retrieval models (RMs). The emergence of large language models (LLMs) has further revolutionized the IR field by enabling users to interact with search systems in natural languages. In this paper, we explore the advantages and disadvantages of LLMs and RMs, highlighting their respective strengths in understanding user-issued queries and retrieving up-to-date information. To leverage the benefits of both paradigms while circumventing their limitations, we propose InteR, a novel framework that facilitates information refinement through synergy between RMs and LLMs. InteR allows RMs to expand knowledge in queries using LLM-generated knowledge collections and enables LLMs to enhance prompt formulation using retrieved documents. This iterative refinement process augments the inputs of RMs and LLMs, leading to more accurate retrieval. Experiments on large-scale retrieval benchmarks involving web search and low-resource retrieval tasks demonstrate that InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods, even those using relevance judgment. Source code is available at https://github.com/Cyril-JZ/InteR

Comment: Pre-print. Work in progress

arXiv.org e-Print Archive

Topological defect and sp3/sp2 carbon interface derived from ZIF-8 with linker vacancies for oxygen reduction reaction

Author: Bin Feng
Chen Fuming
Cheong Weng-Chon (Max)
Dinh Duc Anh
Fan Xi
Gao Haixing
Huang Aijian
Hui Kwun Nam
Ip Weng-Fai (Andy)
Li Jiazhan
Ma Junguo
San Hui Kwan
Wang Kaixi
Wang Shuo
Xu Huifang
Publication venue: Elsevier BV
Publication date: 19/10/2022

Defects in nanocarbon materials can trigger their intriguing electrochemical properties and potential applications, but their synthesis is challenging. Herein, we report the synthesis of ultrathin nitrogen-doped carbon nanosheets with intrinsic defects through the pyrolysis of ZIF-8 with linker vacancies. The as-synthesized electrocatalyst exhibits excellent oxygen reduction reaction (ORR) activity with an onset potential and half-wave potential of 1.05 and 0.873 V vs. RHE, respectively, outperforming the reported metal-free ORR electrocatalysts. It also shows performance comparable to commercial Pt/C in a zinc–air battery, with a power density of 154.4 mW cm⁻². Characterization and DFT calculation results suggest that the adjacent sp3-carbon in the carbon pentagon can significantly strengthen the adsorption and activation of oxygen molecules on sp2-carbon; hence the potential-determining step is altered and the ORR overpotential is lowered. This work highlights a promising green synthesis strategy of MOF-derived metal-free nanocarbon materials for wide application in advanced energy technologies.

University of East Anglia digital repository

Institute of Mechanics, Chinese Academy of Sciences