
62 research outputs found

How to Fine-Tune BERT for Text Classification?

Authors: Huang Xuanjing, Qiu Xipeng, Sun Chi, Xu Yige
Publication date: 05/02/2020

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.

arXiv.org e-Print Archive

Crossref

SDCL: Self-Distillation Contrastive Learning for Chinese Spell Checking

Authors: Qiu Xipeng, Sun Yu, Yan Hang, Zhang Xiaotian
Publication date: 07/11/2022

Due to the ambiguity of homophones, Chinese Spell Checking (CSC) has widespread applications. Existing systems typically utilize BERT for text encoding. However, CSC requires the model to account for both phonetic and graphemic information. To adapt BERT to the CSC task, we propose a token-level self-distillation contrastive learning method. We employ BERT to encode both the corrupted sentence and the corresponding correct sentence. Then, we use a contrastive learning loss to regularize the corrupted tokens' hidden states to be closer to their counterparts in the correct sentence. On three CSC datasets, we confirm that our method provides a significant improvement over baselines.

arXiv.org e-Print Archive

PVC-LOT-008-J-038

Authors: Chua Chee Kai, Sun Zhongji, Tan Xipeng, Tor Shu Beng
Publication venue: Digital Kenyon: Research, Scholarship, and Creative Exchange
Publication date: 31/03/2000

Laser-based powder-bed fusion additive manufacturing or three-dimensional printing technology has gained tremendous attention due to its controllable, digital, and automated manufacturing process, which can afford a refined microstructure and superior strength. However, it is a major challenge to additively manufacture metal parts with satisfactory ductility and toughness. Here we report a novel selective laser melting process to simultaneously enhance the strength and ductility of stainless steel 316L by in-process engineering its microstructure into a crystallographic texture. We find that the tensile strength and ductility of SLM-built stainless steel 316L samples could be enhanced by ~16% and ~40% respectively, with the engineered textured microstructure compared to the common textured microstructure. This is because the favorable nano-twinning mechanism was significantly more activated in the textured stainless steel 316L samples during plastic deformation. In addition, kinetic simulations were performed to unveil the relationship between the melt pool geometry and crystallographic texture. The new additive manufacturing strategy of engineering the crystallographic texture can be applied to other metals and alloys with twinning-induced plasticity. This work paves the way to additively manufacture metal parts with high strength and high ductility. NRF (Natl Research Foundation, S’pore). Published version.

Directory of Open Access Journals

Kenyon College: Digital Kenyon - Research, Scholarship, and Creative Exchange

Do Large Language Models Know What They Don't Know?

Authors: Guo Qipeng, Huang Xuanjing, Qiu Xipeng, Sun Qiushi, Wu Jiawen, Yin Zhangyue
Publication date: 30/05/2023

Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks. Current research focuses on enhancing their performance within their existing knowledge. Despite their vast knowledge, LLMs are still limited by the amount of information they can accommodate and comprehend. Therefore, the ability to understand their own limitations on the unknowns, referred to as self-knowledge, is of paramount importance. This study aims to evaluate LLMs' self-knowledge by assessing their ability to identify unanswerable or unknowable questions. We introduce an automated methodology to detect uncertainty in the responses of these models, providing a novel measure of their self-knowledge. We further introduce a unique dataset, SelfAware, consisting of unanswerable questions from five diverse categories and their answerable counterparts. Our extensive analysis, involving 20 LLMs including GPT-3, InstructGPT, and LLaMA, discovers an intrinsic capacity for self-knowledge within these models. Moreover, we demonstrate that in-context learning and instruction tuning can further enhance this self-knowledge. Despite this promising insight, our findings also highlight a considerable gap between the capabilities of these models and human proficiency in recognizing the limits of their knowledge. Comment: 10 pages, 9 figures, accepted by Findings of ACL202

arXiv.org e-Print Archive