56 research outputs found
Statistical NLG for Generating the Content and Form of Referring Expressions
Acknowledgments: We gratefully acknowledge the anonymous reviewers for their very helpful comments.
Improving Variational Autoencoder for Text Modelling with Timestep-Wise Regularisation
Accepted by COLING 2020, final camera-ready version.
Effective Distillation of Table-based Reasoning Ability from LLMs
Large Language Models (LLMs) have demonstrated remarkable performance across a wide range of natural language processing tasks. However, their enormous parameter size and extremely high requirements for compute power pose challenges for their practical deployment. Recent research has revealed that specific capabilities of LLMs, such as numerical reasoning, can be transferred to smaller models through distillation. Some studies explore the potential of leveraging LLMs to perform table-based reasoning. However, there has been no prior work focusing on table reasoning skills in smaller models specifically tailored for scientific table-to-text generation tasks. In this paper, we propose a novel table-based reasoning distillation approach, with the aim of distilling LLMs into tailored smaller models. Our experimental results show that a 220 million parameter model (Flan-T5-base) fine-tuned using distilled data not only achieves a significant improvement over traditionally fine-tuned baselines, but also surpasses specific LLMs on a scientific table-to-text generation dataset. Our code is available at https://github.com/Bernard-Yang/DistillTableCoT
A Dual-Attention Hierarchical Recurrent Neural Network for Dialogue Act Classification
Acknowledgment: This work is supported by the award made by the UK Engineering and Physical Sciences Research Council (Grant number: EP/P011829/1).
Effective Distillation of Table-based Reasoning Ability from LLMs
Large Language Models (LLMs) have demonstrated remarkable performance across
a wide range of natural language processing tasks. However, their enormous
parameter size and high demand for computing resources pose challenges for
their practical deployment. Recent research has revealed that specific
capabilities of LLMs, such as numerical reasoning, can be transferred to
smaller models through distillation. Some studies explore the potential of
leveraging LLMs to perform table-based reasoning. Nevertheless, prior to our
work, there has been no investigation into specialising table reasoning
skills in smaller models specifically tailored for table-to-text generation
tasks. In this paper, we propose a novel table-based reasoning distillation
approach, with the aim of distilling LLMs into tailored, smaller models
specifically designed for table-based reasoning tasks. Experimental results
show that a 0.22 billion parameter model (Flan-T5-base) fine-tuned using
distilled data not only achieves a significant improvement over traditionally
fine-tuned baselines but also surpasses specific LLMs such as gpt-3.5-turbo
on the scientific table-to-text generation dataset (SciGen). The code and
data are released at https://github.com/Bernard-Yang/TableDistill
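As a rough illustration of the distillation recipe described above, the sketch below fine-tunes Flan-T5-base on reasoning-augmented examples previously generated by a teacher LLM. The file name, field names, and prompt format are illustrative assumptions, not the released pipeline.

```python
# Minimal sketch: fine-tuning Flan-T5-base on LLM-distilled table-reasoning data.
# Assumption (not from the paper): the teacher LLM's outputs are stored in
# "distilled.json" with "table", "reasoning", and "description" fields.
import json
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

# Each record pairs a linearised table with the teacher's reasoning and description.
records = json.load(open("distilled.json"))  # hypothetical file produced by the teacher LLM

def collate(batch):
    inputs = tokenizer(
        ["describe the table with reasoning: " + r["table"] for r in batch],
        padding=True, truncation=True, max_length=1024, return_tensors="pt")
    labels = tokenizer(
        [r["reasoning"] + " Therefore, " + r["description"] for r in batch],
        padding=True, truncation=True, max_length=512, return_tensors="pt").input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    return inputs, labels

loader = DataLoader(records, batch_size=8, shuffle=True, collate_fn=collate)
optim = torch.optim.AdamW(model.parameters(), lr=3e-5)

model.train()
for epoch in range(3):
    for inputs, labels in loader:
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optim.step()
        optim.zero_grad()
```

The student is trained to emit the chain of reasoning before the final description, which is the general shape of reasoning distillation; the exact prompting and filtering steps are deliberately omitted here.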
Length is a Curse and a Blessing for Document-level Semantics
In recent years, contrastive learning (CL) has been extensively utilized to
recover sentence and document-level encoding capability from pre-trained
language models. In this work, we question the length generalizability of
CL-based models, i.e., their vulnerability towards length-induced semantic
shift. We verify not only that length vulnerability is a significant yet
overlooked research gap, but also that unsupervised CL methods can be devised solely
depending on the semantic signal provided by document length. We first derive
the theoretical foundations underlying length attacks, showing that elongating
a document would intensify the high intra-document similarity that is already
brought by CL. Moreover, we find that the isotropy promised by CL is highly
dependent on the length range of text exposed in training. Inspired by these
findings, we introduce a simple yet universal document representation learning
framework, LA(SER): length-agnostic self-reference for semantically
robust sentence representation learning, achieving state-of-the-art
unsupervised performance on the standard information retrieval benchmark.
Comment: Accepted at EMNLP 2023. Our code is publicly available at
https://github.com/gowitheflow-1998/LA-SER-cube
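As an illustration of how a length-only self-reference signal could drive unsupervised contrastive training, the sketch below pairs each sentence with an elongated copy of itself under an InfoNCE-style loss. The encoder, the repetition-based elongation, and the temperature are assumptions made for illustration, not the exact LA(SER) method.

```python
# Sketch: unsupervised contrastive learning where the only "augmentation" is
# document length, i.e. a text and an elongated copy of itself form the positive pair.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    out = encoder(**enc).last_hidden_state
    mask = enc["attention_mask"].unsqueeze(-1)
    return (out * mask).sum(1) / mask.sum(1)      # mean pooling over tokens

def info_nce(anchors, positives, temperature=0.05):
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature              # in-batch negatives
    labels = torch.arange(a.size(0))
    return F.cross_entropy(logits, labels)

batch = ["contrastive learning recovers sentence encoders",
         "length can shift document-level semantics"]
elongated = [" ".join([s] * 3) for s in batch]    # length-only self-reference
loss = info_nce(embed(batch), embed(elongated))
loss.backward()
```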
Audio Contrastive based Fine-tuning
Audio classification plays a crucial role in speech and sound processing
tasks with a wide range of applications. A challenge remains in striking the right
balance between fitting the model to the training data (avoiding overfitting) and
enabling it to generalise well to a new domain.
Leveraging the transferability of contrastive learning, we introduce Audio
Contrastive-based Fine-tuning (AudioConFit), an efficient approach
characterised by robust generalisability. Empirical experiments on a variety of
audio classification tasks demonstrate the effectiveness and robustness of our
approach, which achieves state-of-the-art results in various settings.
Comment: Under review
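To make the contrastive fine-tuning idea concrete, the sketch below combines a supervised contrastive term over clip embeddings of the same class with a standard cross-entropy term. The random embeddings standing in for an audio encoder and the equal loss weighting are assumptions for illustration; this is not the AudioConFit implementation.

```python
# Sketch: supervised contrastive fine-tuning for audio classification.
# Embeddings of clips sharing a label are pulled together; a classifier head
# is trained jointly with cross-entropy.
import torch
import torch.nn.functional as F

def sup_con_loss(embeddings, labels, temperature=0.07):
    """Supervised contrastive loss over a batch of clip embeddings."""
    z = F.normalize(embeddings, dim=-1)
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool)
    sim = (z @ z.t() / temperature).masked_fill(eye, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # positives: other clips in the batch with the same label
    pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    pos_log_prob = torch.where(pos, log_prob, torch.zeros_like(log_prob))
    return -(pos_log_prob.sum(1) / pos.sum(1).clamp(min=1)).mean()

# Toy usage: random vectors stand in for a pretrained audio encoder's outputs.
emb = torch.randn(8, 128, requires_grad=True)
labels = torch.randint(0, 3, (8,))
classifier = torch.nn.Linear(128, 3)
loss = sup_con_loss(emb, labels) + F.cross_entropy(classifier(emb), labels)
loss.backward()
```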
Improving variational autoencoder for text modelling with timestep-wise regularisation
The Variational Autoencoder (VAE) is a popular and powerful model applied to text modelling to generate diverse sentences. However, an issue known as posterior collapse (or KL loss vanishing) arises when the VAE is used for text modelling: the approximate posterior collapses to the prior, and the model completely ignores the latent variables, degrading into a plain language model during text generation. This issue is particularly prevalent when RNN-based VAE models are employed for text modelling. In this paper, we propose a simple, generic architecture called Timestep-Wise Regularisation VAE (TWR-VAE), which can effectively avoid posterior collapse and can be applied to any RNN-based VAE models. The effectiveness and versatility of our model are demonstrated on different tasks, including language modelling and dialogue response generation.
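The sketch below illustrates the timestep-wise idea: a Gaussian posterior is derived from the encoder's hidden state at every timestep and each is KL-regularised towards the prior, rather than regularising only the final state. Layer sizes, the decoder, and the way the latent seeds decoding are simplified assumptions, not the TWR-VAE implementation.

```python
# Sketch: timestep-wise KL regularisation for an RNN-based text VAE.
import torch
import torch.nn as nn

class TWRSketchVAE(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, z_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.to_mu = nn.Linear(hid_dim, z_dim)
        self.to_logvar = nn.Linear(hid_dim, z_dim)
        self.decoder = nn.GRU(emb_dim, z_dim, batch_first=True)
        self.out = nn.Linear(z_dim, vocab_size)

    def forward(self, tokens):
        x = self.embed(tokens)
        h, _ = self.encoder(x)                    # hidden states at every timestep
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation
        # timestep-wise KL: averaged over all timesteps, not only the last one
        kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(-1).mean()
        dec, _ = self.decoder(x, z[:, -1].unsqueeze(0).contiguous())  # final-step z seeds decoding
        return self.out(dec), kl

model = TWRSketchVAE()
tokens = torch.randint(0, 1000, (4, 12))          # toy batch of token ids
logits, kl = model(tokens)
recon = nn.functional.cross_entropy(logits[:, :-1].reshape(-1, 1000),
                                    tokens[:, 1:].reshape(-1))
loss = recon + kl
loss.backward()
```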
DGST: a dual-generator network for text style transfer
We propose DGST, a novel and simple Dual-Generator network architecture for text Style Transfer. Our model employs only two generators and does not rely on any discriminators or parallel corpora for training. Both quantitative and qualitative experiments on the Yelp and IMDb datasets show that our model gives competitive performance compared to several strong baselines with more complicated architecture designs.
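A skeleton of the two-generator, discriminator-free setup is sketched below: one generator maps towards the target style and the other maps back, and training combines a cycle term (transfer, then transfer back) with a reconstruction term. Continuous vectors stand in for sentences so the example stays differentiable; the architectures and losses are placeholders, not DGST's actual objectives.

```python
# Sketch: dual-generator style transfer trained without discriminators.
import torch
import torch.nn as nn

dim = 64
g_xy = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))  # style X -> Y
g_yx = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))  # style Y -> X
optim = torch.optim.Adam(list(g_xy.parameters()) + list(g_yx.parameters()), lr=1e-3)
mse = nn.MSELoss()

for step in range(100):
    x = torch.randn(32, dim)   # toy batch of style-X "sentence" vectors
    y = torch.randn(32, dim)   # toy batch of style-Y "sentence" vectors

    # cycle: transfer to the other style and back, then reconstruct the original
    cycle_loss = mse(g_yx(g_xy(x)), x) + mse(g_xy(g_yx(y)), y)
    # reconstruction: each generator should leave sentences of its target style unchanged
    recon_loss = mse(g_xy(y), y) + mse(g_yx(x), x)

    loss = cycle_loss + recon_loss
    optim.zero_grad()
    loss.backward()
    optim.step()
```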
Ethyne Reducing Metal-Organic Frameworks to Control Fabrications of Core/shell Nanoparticles as Catalysts
An approach using cobalt metal-organic frameworks (Co-MOF) as precursors is established for the fabrication of cobalt nanoparticles in porous carbon shells (core/shell Co@C). Chemical vapor deposition of ethyne is used to control the reduction of cobalt nanoclusters in the MOF and the spontaneous formation of the porous carbon shells. The metallic cobalt cores formed are up to 4-6 nm, with the crystal phase varying between hexagonal close-packed (hcp) and face-centred cubic (fcc). The porous carbon shells change from amorphous to graphene as the ethyne deposition temperature increases from 400 to 600 °C. The core/shell Co@C nanoparticles exhibit high catalytic activity in selectively converting syngas (CTY: 254.1-312.1 μmolCO·gCo⁻¹·s⁻¹) into hydrocarbons (4.0-5.2 gHC·gcat⁻¹·h⁻¹) at 260 °C. As well as the crystal size and phase, the coordination numbers of cobalt to oxygen and to other cobalt atoms on the surface of the cobalt nanoparticles, and the permeability of the porous carbon shell, have been related to the catalytic performance in Fischer-Tropsch synthesis (FTS) reactions.