196 research outputs found

Explanation Regeneration via Information Bottleneck

Author: Bi Wei
Kong Lingpeng
Li Qintong
Wu Zhiyong
Publication date: 11/07/2023

Explaining the black-box predictions of NLP models naturally and accurately is an important open problem in natural language generation. These free-text explanations are expected to contain sufficient and carefully selected evidence to form supportive arguments for predictions. Due to the superior generative capacity of large pretrained language models, recent work built on prompt engineering enables explanation generation without specific training. However, explanations generated through single-pass prompting often lack sufficiency and conciseness. To address this problem, we develop an information bottleneck method, EIB, to produce refined explanations that are sufficient and concise. Our approach regenerates the free-text explanation by polishing the single-pass output from the pretrained language model while retaining the information that supports the contents being explained. Experiments on two out-of-domain tasks verify the effectiveness of EIB through automatic evaluation and thoroughly conducted human evaluation.

Comment: Accepted in ACL2023 Finding

arXiv.org e-Print Archive

DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models

Author: Feng Jiangtao
Gong Shansan
Kong Lingpeng
Li Mukai
Wu Zhiyong
Publication date: 14/02/2023

Recently, diffusion models have emerged as a new paradigm for generative models. Despite their success in domains using continuous signals such as vision and audio, adapting diffusion models to natural language is under-explored due to the discrete nature of text, especially for conditional generation. We tackle this challenge by proposing DiffuSeq: a diffusion model designed for sequence-to-sequence (Seq2Seq) text generation tasks. Upon extensive evaluation over a wide range of Seq2Seq tasks, we find that DiffuSeq achieves comparable or even better performance than six established baselines, including a state-of-the-art model that is based on pre-trained language models. Apart from quality, an intriguing property of DiffuSeq is its high diversity during generation, which is desired in many Seq2Seq tasks. We further include a theoretical analysis revealing the connection between DiffuSeq and autoregressive/non-autoregressive models. Bringing together theoretical analysis and empirical evidence, we demonstrate the great potential of diffusion models in complex conditional language generation tasks. Code is available at https://github.com/Shark-NLP/DiffuSeq

Comment: ICLR 2023 camera read

arXiv.org e-Print Archive

Compositional Exemplars for In-context Learning

Author: Feng Jiangtao
Kong Lingpeng
Wu Zhiyong
Ye Jiacheng
Yu Tao
Publication date: 20/06/2023

Large pretrained language models (LMs) have shown impressive In-Context Learning (ICL) ability, where the model learns to do an unseen task via a prompt consisting of input-output examples as the demonstration, without any parameter updates. The performance of ICL is largely determined by the quality of the selected in-context examples. However, previous selection methods are mostly based on simple heuristics, leading to sub-optimal performance. In this work, we formulate in-context example selection as a subset selection problem. We propose CEIL (Compositional Exemplars for In-context Learning), which is instantiated by Determinantal Point Processes (DPPs) to model the interaction between the given input and in-context examples, and optimized through a carefully designed contrastive learning objective to obtain preference from LMs. We validate CEIL on 12 classification and generation datasets from 7 distinct NLP tasks, including sentiment analysis, paraphrase detection, natural language inference, commonsense reasoning, open-domain question answering, code generation, and semantic parsing. Extensive experiments demonstrate not only the state-of-the-art performance but also the transferability and compositionality of CEIL, shedding new light on effective and efficient in-context learning. Our code is released at https://github.com/HKUNLP/icl-ceil.

Comment: Accepted in ICML 202

arXiv.org e-Print Archive

Regeneration under crisis - research on the renewal and evolution of the forms of future urban residential communities

Author: Gao Hao
Kong Zhiyong
Sun Haoyu
Wang Fang
Zhang Qian
Zhang Xinyue
Publication venue: ISUF 2020 Virtual Conference Proceedings
Publication date: 23/02/2021

University of Utah E Publications

Unsupervised Explanation Generation via Correct Instantiations

Author: Chen Jiangjie
Cheng Sijie
Kong Lingpeng
Li Zhixing
Liu Yang
Wu Zhiyong
Publication date: 20/11/2022

While large pre-trained language models (PLMs) have shown great skill at solving discriminative tasks, a significant gap remains when compared with humans on explanation-related tasks. Among them, explaining why a statement is wrong (e.g., against commonsense) is incredibly challenging. The major difficulty is finding the conflict point, where the statement contradicts our real world. This paper proposes Neon, a two-phase, unsupervised explanation generation framework. Neon first generates corrected instantiations of the statement (phase I), then uses them to prompt large PLMs to find the conflict point and complete the explanation (phase II). We conduct extensive experiments on two standard explanation benchmarks, i.e., ComVE and e-SNLI. According to both automatic and human evaluations, Neon outperforms baselines, even those with human-annotated instantiations. In addition to explaining a negative prediction, we further demonstrate that Neon remains effective when generalizing to different scenarios.

Comment: Accepted to AAAI-2

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Tin-graphene tubes as anodes for lithium-ion batteries with high volumetric and gravimetric energy densities.

Author: Kong Dejia
Li Fan
Li Jinlai
Lu Yunfeng
Mo Runwei
Peng Yiting
Tan Xinyi
Tao Ran
Wang Chongmin
Wang Xiang
Wang Zhiyong
Xu Bin
Xu Jinhui
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Limited by the size of microelectronics, as well as the space available in electric vehicles, there is tremendous demand for lithium-ion batteries with high volumetric energy densities. Current lithium-ion batteries, however, adopt graphite-based anodes with low tap density and gravimetric capacity, resulting in poor volumetric performance. Here, by encapsulating nanoparticles of metallic tin in mechanically robust graphene tubes, we show tin anodes with high volumetric and gravimetric capacities, high rate performance, and long cycling life. Paired with a commercial cathode material, LiNi0.6Mn0.2Co0.2O2, full cells exhibit gravimetric and volumetric energy densities of 590 W h kg⁻¹ and 1,252 W h L⁻¹, respectively, the latter of which is double that of a cell based on graphite anodes. This work provides an effective route towards lithium-ion batteries with high energy density for a broad range of applications.

eScholarship - University of California

Preparation of alginate coated chitosan microparticles for vaccine delivery

Author: Guo Gang
Kong XiangYe
Li XingYi
Qian ZhiYong
Shi Shuai
Wei YuQuan
Zheng XiuLing
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

OS-Copilot: Towards Generalist Computer Agents with Self-Improvement

Author: Ding Zichen
Han Chengcheng
Kong Lingpeng
Liu Zhoumianze
Weng Zhenmin
Wu Zhiyong
Yao Shunyu
Yu Tao
Publication date: 15/02/2024

Autonomous interaction with the computer has been a longstanding challenge with great potential, and the recent proliferation of large language models (LLMs) has markedly accelerated progress in building digital agents. However, most of these agents are designed to interact with a narrow domain, such as a specific piece of software or a website. This narrow focus constrains their applicability for general computer tasks. To this end, we introduce OS-Copilot, a framework to build generalist agents capable of interfacing with comprehensive elements in an operating system (OS), including the web, code terminals, files, multimedia, and various third-party applications. We use OS-Copilot to create FRIDAY, a self-improving embodied agent for automating general computer tasks. On GAIA, a general AI assistants benchmark, FRIDAY outperforms previous methods by 35%, showcasing strong generalization to unseen applications via accumulated skills from previous tasks. We also present numerical and quantitative evidence that FRIDAY learns to control and self-improve on Excel and PowerPoint with minimal supervision. Our OS-Copilot framework and empirical findings provide infrastructure and insights for future research toward more capable and general-purpose computer agents.

Comment: Project page: https://os-copilot.github.i

arXiv.org e-Print Archive

Chitosan-Alginate Sponge: Preparation and Application in Curcumin Delivery for Dermal Wound Healing in Rat

Author: Dai Mei
Guo Gang
Kong XiangYe
Li XingYi
Luo Feng
Qian Zhiyong
Wei Yu Quan
Xu Xu
Zhao Xia
Zheng XiuLing
Publication venue: Hindawi Publishing Corporation
Publication date: 01/01/2009

A biodegradable sponge, composed of chitosan (CS) and sodium alginate (SA), was successfully obtained in this work. The sponge was ethereal and pliable. The chemical structure and morphology of the sponges were characterized by FTIR and SEM. The swelling ability, in vitro drug release and degradation behaviors, and an in vivo animal test were employed to confirm the applicability of this sponge as a wound dressing material. As the chitosan content in the sponge decreased, the swelling ability decreased. All types of sponge exhibited biodegradable properties. The release of curcumin from the sponges could be controlled by the degree of crosslinking. Curcumin could be released from the sponges over an extended period of up to 20 days. An in vivo animal test using SD rats showed that the sponge had a better effect than cotton gauze, and that adding curcumin to the sponge enhanced the therapeutic healing effect.

Crossref

Directory of Open Access Journals

PubMed Central

Precipitable water vapour retrieval from GPS precise point positioning and NCEP CFSv2 dataset during typhoon events

Author: Hancock Craig
Kong Yang
Ligt Huib
Quaye-Ballard Jonathan
Shi Hongkai
Tang Xu
Xiang Zhiyong
Publication venue: MDPI AG
Publication date: 01/11/2018

Radiosonde is extensively used for understanding meteorological parameters in the vertical direction. Four typhoon events, including three landfalls (MERANTI, NEPARTAK, and MEGI) and one non-landfall (MALAKAS), were chosen for analysing the precipitable water vapour (PWV) characteristics in this study. The spatial distribution of the three radiosonde stations in Zhejiang province does not meet the requirement for analysing changes in PWV during a typhoon event. Global Positioning System (GPS) observations are an alternative method for deriving the PWV. This enables improvements in the temporal-spatial resolution of PWV computed by the radiosonde measurements. The National Centers for Environmental Prediction (NCEP) re-analysed data were employed for interpolating temperature and atmospheric pressure at the GPS antenna height. The PWV computed from GPS observations and NCEP re-analysed data were then compared with the true PWV. The maximum difference between radiosonde and GPS PWV was not more than 30 mm at Taiz station. The Root-Mean-Square (RMS) of PWV differences between radiosonde and GPS was not more than 5 mm in January, February, March, November, and December. It was slightly greater than 5 mm in April. High RMS in May, June, July, August, September, and October implies that differences between GPS and radiosonde PWVs are evident in these months. Correlation coefficients of GPS and radiosonde PWVs were more than 0.9, indicating that the changes in GPS and radiosonde PWVs are similar. Radiosonde-calculated PWVs were used for GPS PWV calibration for understanding the PWV changes during the period of a typhoon event. The results from the three landfall typhoons show that the average PWV over Zhejiang province increases as the typhoon approaches China mainland. In contrast, MALAKAS did not make landfall and shows a decreasing PWV trend, although it was heading to China mainland. Generally, the PWV change can be used to predict whether the typhoon will make landfall in these cases. The PWV spatial distribution of MERANTI shows that PWV peaks change along the typhoon epicenter over Zhejiang province.

Nottingham ePrints

Nottingham eTheses

Directory of Open Access Journals