Lifelong Sequence Generation with Dynamic Module Expansion and Adaptation
Lifelong sequence generation (LSG), a problem in continual learning, aims to
continually train a model on a sequence of generation tasks to learn constantly
emerging new generation patterns while avoiding the forgetting of previous
knowledge. Existing LSG methods mainly focus on maintaining old knowledge while
paying little attention to knowledge transfer across tasks. In contrast, humans
can better learn new tasks by leveraging previously acquired knowledge from
similar tasks. Inspired by the learning paradigm of humans, we propose Dynamic
Module Expansion and Adaptation (DMEA), which enables the model to dynamically
determine the architecture for acquiring new knowledge based on task
correlation and select the most similar previous tasks to facilitate adaptation
to new tasks. In addition, as the learning process can easily be biased towards
the current task which might cause more severe forgetting of previously learned
knowledge, we propose dynamic gradient scaling to balance the learning of the
current task and replayed tasks. With extensive experiments, we demonstrate
that DMEA can consistently outperform existing methods in different LSG
settings.
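The abstract does not spell out DMEA's dynamic gradient scaling rule, so the following is only a generic illustration of the underlying idea: damping the current-task gradient when it dominates the replayed-task gradient so that rehearsal is not drowned out. The function name, the norm-ratio heuristic, and the `alpha` parameter are all assumptions for illustration, not the paper's actual method.

```python
def scale_gradients(current_grad, replay_grad, alpha=0.5):
    # Generic sketch of balancing two gradient sources (NOT the exact
    # DMEA rule, which the abstract does not specify): if the current
    # task's gradient is larger in norm than the replay gradient,
    # partially rescale it toward the replay norm before combining.
    norm = lambda g: sum(x * x for x in g) ** 0.5
    nc, nr = norm(current_grad), norm(replay_grad)
    if nc > nr > 0:
        factor = (nr / nc) ** alpha  # alpha in [0, 1]: 0 = no scaling, 1 = full match
        current_grad = [x * factor for x in current_grad]
    # combined update applied to the shared parameters
    return [c + r for c, r in zip(current_grad, replay_grad)]
```

With `alpha=0` this reduces to plain summation of the two gradients; larger `alpha` shifts the balance toward the replayed tasks.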
Learning to Initialize: Can Meta Learning Improve Cross-task Generalization in Prompt Tuning?
Prompt tuning (PT) which only tunes the embeddings of an additional sequence
of tokens per task, keeping the pre-trained language model (PLM) frozen, has
shown remarkable performance in few-shot learning. Despite this, PT has been
shown to rely heavily on good initialization of the prompt embeddings. In this
work, we study meta prompt tuning (MPT) to systematically explore how
meta-learning can help improve (if it can) cross-task generalization in PT
through learning to initialize the prompt embeddings from other relevant tasks.
We empirically analyze a representative set of meta learning algorithms in a
wide range of adaptation settings with different source/target task
configurations on a large set of few-shot tasks. With extensive experiments and
analysis, we demonstrate the effectiveness of MPT. We find the improvement to
be significant particularly on classification tasks. For other kinds of tasks
such as question answering, we observe that while MPT can outperform PT in most
cases, it does not always outperform multi-task learning. We further provide an
in-depth analysis from the perspective of task similarity.
In-Context Learning with Iterative Demonstration Selection
Spurred by advancements in scale, large language models (LLMs) have
demonstrated strong few-shot learning ability via in-context learning (ICL).
However, the performance of ICL has been shown to be highly sensitive to the
selection of few-shot demonstrations. Selecting the most suitable examples as
context remains an ongoing challenge and an open problem. Existing literature
has highlighted the importance of selecting examples that are diverse or
semantically similar to the test sample while ignoring the fact that the
optimal selection dimension, i.e., diversity or similarity, is task-specific.
Leveraging the merits of both dimensions, we propose Iterative Demonstration
Selection (IDS). Using zero-shot chain-of-thought reasoning (Zero-shot-CoT),
IDS iteratively selects examples that are diverse but still strongly correlated
with the test sample as ICL demonstrations. Specifically, IDS applies
Zero-shot-CoT to the test sample before demonstration selection. The output
reasoning path is then used to choose demonstrations that are prepended to the
test sample for inference. The generated answer is accompanied by its
corresponding reasoning path for extracting a new set of demonstrations in the
next iteration. After several iterations, IDS adopts majority voting to obtain
the final result. Through extensive experiments on tasks including commonsense
reasoning, question answering, topic classification, and sentiment analysis, we
demonstrate that IDS can consistently outperform existing ICL demonstration
selection methods.
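The iterative loop described above can be sketched in toy form. Everything in this sketch is an assumption for illustration: `llm` stands in for a real model call that returns an answer together with its reasoning path, and a bag-of-words cosine similarity stands in for a real embedding model; the diversity component of IDS is omitted for brevity.

```python
from collections import Counter

def cosine(a, b):
    # Bag-of-words cosine similarity as a cheap stand-in for embeddings.
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sum(v * v for v in ca.values()) ** 0.5
    nb = sum(v * v for v in cb.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def ids(test_sample, pool, llm, k=2, iterations=3):
    """Toy sketch of Iterative Demonstration Selection.
    llm(prompt) must return (answer, reasoning_path)."""
    # Step 1: zero-shot CoT on the bare test sample.
    answer, reasoning = llm(test_sample)
    answers = []
    for _ in range(iterations):
        # Step 2: pick pool examples most correlated with the reasoning path.
        demos = sorted(pool, key=lambda ex: cosine(ex, reasoning), reverse=True)[:k]
        # Step 3: prepend the demonstrations and query again.
        prompt = "\n".join(demos + [test_sample])
        answer, reasoning = llm(prompt)
        answers.append(answer)
    # Step 4: majority vote over the iterations.
    return Counter(answers).most_common(1)[0][0]
```

Any model wrapper that returns an (answer, reasoning) pair can be plugged in as `llm`; the loop itself is model-agnostic.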
Is ChatGPT a General-Purpose Natural Language Processing Task Solver?
Spurred by advancements in scale, large language models (LLMs) have
demonstrated the ability to perform a variety of natural language processing
(NLP) tasks zero-shot -- i.e., without adaptation on downstream data. Recently,
the debut of ChatGPT has drawn a great deal of attention from the NLP
community because it can generate high-quality responses to human input
and self-correct previous mistakes based
on subsequent conversations. However, it is not yet known whether ChatGPT can
serve as a generalist model that can perform many NLP tasks zero-shot. In this
work, we empirically analyze the zero-shot learning ability of ChatGPT by
evaluating it on 20 popular NLP datasets covering 7 representative task
categories. With extensive empirical studies, we demonstrate both the
effectiveness and limitations of the current version of ChatGPT. We find that
ChatGPT performs well on many tasks favoring reasoning capabilities (e.g.,
arithmetic reasoning) while it still faces challenges when solving specific
tasks such as sequence tagging. We additionally provide in-depth analysis
through qualitative case studies.
Hearing Lips in Noise: Universal Viseme-Phoneme Mapping and Transfer for Robust Audio-Visual Speech Recognition
Shaping a subwavelength needle with ultra-long focal length by focusing azimuthally polarized light
Scientific Reports. DOI: 10.1038/srep09977
Solution and type curves for the seepage model of the water-bearing coalbed with leakage recharge
To analyze the effects of aquifer leakage recharge on the initial dewatering of coalbed methane (CBM) wells, a mathematical model of water seepage in the coalbed accounting for aquifer leakage was established by means of the leakage coefficient, according to unsteady seepage theory. The model was solved via Laplace transform, and Stehfest numerical inversion was used to obtain the solution in real space. Log-log type curves of pressure and the pressure derivative were then created with new combinations of parameters. Based on the physical seepage mechanism, the influence of aquifer leakage on curve shape was assessed: the radial flow regime ends earlier as the leakage coefficient increases. Moreover, type-curve matching was proposed for obtaining reservoir permeability, skin factor, and the leakage coefficient. The type curves are useful for quantitatively evaluating the level of leakage, thereby guiding the adjustment of the subsequent production schedule for CBM wells.
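The Stehfest numerical inversion mentioned in the abstract is a standard algorithm for recovering f(t) from a Laplace-space function F(s), so it can be sketched directly. This is a minimal general-purpose implementation under the usual formulation (even number of terms N, weights V_k), not the authors' code; the function names are my own.

```python
import math

def stehfest_coefficients(n):
    # Stehfest weights V_k for an even number of terms n.
    half = n // 2
    v = []
    for k in range(1, n + 1):
        s = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            s += (j ** half * math.factorial(2 * j)
                  / (math.factorial(half - j) * math.factorial(j)
                     * math.factorial(j - 1) * math.factorial(k - j)
                     * math.factorial(2 * j - k)))
        v.append((-1) ** (half + k) * s)
    return v

def stehfest_invert(F, t, n=12):
    # f(t) ~ (ln 2 / t) * sum_{k=1..n} V_k * F(k * ln 2 / t)
    ln2 = math.log(2.0)
    v = stehfest_coefficients(n)
    return ln2 / t * sum(vk * F((k + 1) * ln2 / t) for k, vk in enumerate(v))

# Example: F(s) = 1/(s+1) has the exact inverse f(t) = exp(-t).
approx = stehfest_invert(lambda s: 1.0 / (s + 1.0), t=1.0)
```

The method works well for smooth, monotone solutions like the pressure transients discussed above; oscillatory functions need other inversion schemes (e.g. Talbot's method).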
Retrieving Multimodal Information for Augmented Generation: A Survey
As Large Language Models (LLMs) become popular, an important trend has
emerged of using multimodal information to augment LLMs' generation
ability, which enables LLMs to better interact with the world. However,
there is no unified understanding of at which stage, and in what way,
different modalities should be incorporated. In
this survey, we review methods that assist and augment generative models by
retrieving multimodal knowledge, whose formats range from images, codes,
tables, graphs, to audio. Such methods offer a promising solution to important
concerns such as factuality, reasoning, interpretability, and robustness. By
providing an in-depth review, this survey is expected to provide scholars with
a deeper understanding of the methods' applications and encourage them to adapt
existing techniques to the fast-growing field of LLMs.