
109 research outputs found

HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models

Authors: Cheng Xiaoxue, Li Junyi, Nie Jian-Yun, Wen Ji-Rong, Zhao Wayne Xin
Publication date: 22/10/2023

Large language models (LLMs), such as ChatGPT, are prone to generating hallucinations, i.e., content that conflicts with the source or cannot be verified against factual knowledge. To understand what types of content LLMs are apt to hallucinate and to what extent, we introduce the Hallucination Evaluation benchmark for Large Language Models (HaluEval), a large collection of generated and human-annotated hallucinated samples for evaluating the performance of LLMs in recognizing hallucination. To generate these samples, we propose a ChatGPT-based two-step framework, i.e., sampling-then-filtering. In addition, we hire human labelers to annotate the hallucinations in ChatGPT responses. The empirical results suggest that ChatGPT is likely to generate hallucinated content on specific topics by fabricating unverifiable information (i.e., in about 19.5% of responses). Moreover, existing LLMs face great challenges in recognizing hallucinations in text. However, our experiments also show that providing external knowledge or adding reasoning steps can help LLMs recognize hallucinations. Our benchmark can be accessed at https://github.com/RUCAIBox/HaluEval.

Comment: Accepted to EMNLP 2023 Main Conference (Long Paper)

arXiv.org e-Print Archive

The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models

Authors: Chen Jie, Cheng Xiaoxue, Li Junyi, Nie Jian-Yun, Ren Ruiyang, Wen Ji-Rong, Zhao Wayne Xin
Publication date: 06/01/2024

In the era of large language models (LLMs), hallucination (i.e., the tendency to generate factually incorrect content) poses a great challenge to the trustworthy and reliable deployment of LLMs in real-world applications. To tackle LLM hallucination, three key questions should be well studied: how to detect hallucinations (detection), why LLMs hallucinate (source), and what can be done to mitigate them (mitigation). To address these challenges, this work presents a systematic empirical study on LLM hallucination, focused on the three aspects of hallucination detection, source, and mitigation. Specifically, we construct a new hallucination benchmark, HaluEval 2.0, and design a simple yet effective detection method for LLM hallucination. Furthermore, we zoom into the different training or utilization stages of LLMs and extensively analyze the potential factors that lead to LLM hallucination. Finally, we implement and examine a series of widely used techniques to mitigate hallucinations in LLMs. Our work has led to several important findings for understanding the origin of hallucination and mitigating hallucinations in LLMs. Our code and data can be accessed at https://github.com/RUCAIBox/HaluEval-2.0.

Comment: 24 pages, 8 figures, 13 tables

arXiv.org e-Print Archive

Present-day kinematics and seismic potential of the Ganzi-Yushu fault, eastern Tibetan plateau, constrained from InSAR

Authors: Jinshuo Wang, Lingyun Ji, Ningyuan Zhao, Wenting Zhang, Xiaoxue Xu
Publication venue: Frontiers Media SA
Publication date: 01/01/2023

In recent years, earthquakes have occurred frequently on the southeastern edge of the Tibetan Plateau, and the seismic hazard is high. However, because of the remote location of the Ganzi-Yushu fault zone, no high-resolution geodetic measurements of this region have been made. The radar line-of-sight deformation field of the Ganzi-Yushu fault was obtained using seven-track ascending and descending Sentinel-A/B interferometric synthetic aperture radar (InSAR) data from 2014 to 2020. Using the InSAR and published Global Navigation Satellite System (GNSS) data, we calculated the 3D deformation field in the study area, investigated the segment-specific fault slip rate, and inverted the fault slip distribution pattern using the steepest descent method. We then evaluated the seismic hazard using the strain rate field and slip deficit rate. The main findings of this study are as follows. 1) The slip rate of the Ganzi-Yushu fault gradually increases from 2.5 to 6.8 mm/yr from northwest to southeast. 2) A high-resolution strain rate map shows high-value anomalies in the Yushu and Dangjiang areas. 3) Our comprehensive analysis suggests that the seismic hazard of the Dangjiang and Dengke segments, which have high slip deficits, cannot be ignored.

Directory of Open Access Journals

A review on fundamentals for designing hydrogen evolution electrocatalyst

Authors: Du Shangfeng, Farid Muhammad Asim, Huang Zhen-Feng, Qadeer Muhammad Abdul, Tahir Muhammad, Tanveer M., Yan Yichang, Zhang Xiaoxue, Zou Ji-Jun
Publication date: 01/09/2024

As a clean, efficient, and renewable energy source, hydrogen has long been recognized as a favourable replacement for fossil fuels. A primary challenge is generating hydrogen efficiently enough to meet demand on a commercial scale. The electrocatalytic hydrogen evolution reaction (HER), a primary step in water electrolysis for H2 production, has been studied comprehensively over recent decades. Electrolytic water splitting presents a promising route to efficient hydrogen generation for energy conversion and storage, with electrolysis or catalysis playing a pivotal role. The development of electrocatalysts that are effective, durable, and economical is a necessary prerequisite for practical electrolytic hydrogen generation from water splitting, and is the primary emphasis of this article. In this extensive review, we first summarize the basics of the hydrogen evolution reaction and examine the latest progress in economical and highly efficient catalysts based on both non-noble and noble metals. Moreover, we discuss recent breakthroughs in electrolytic HER that employ more affordable and widely available nanoparticles, with a specific focus on economical, non-platinum electrocatalysts based on metal-free (MF) and transition metal composite catalysts.

University of Birmingham Research Portal

TextBox 2.0: A Text Generation Library with Pre-trained Language Models

Authors: Chen Zhipeng, Cheng Xiaoxue, Dai Wenxun, Dong Zican, Hu Yiwen, Li Junyi, Nie Jian-Yun, Tang Tianyi, Wang Yuhao, Wen Ji-Rong, Yu Zhuohao, Zhao Wayne Xin
Publication date: 25/12/2022

To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox.

Comment: Accepted by EMNLP 202

arXiv.org e-Print Archive

Integrated single-cell and bulk RNA sequencing analyses reveal a prognostic signature of cancer-associated fibroblasts in head and neck squamous cell carcinoma

Authors: Ben Ma, Litao Han, Ning Qu, Qinghai Ji, Tian Liao, Weibo Xu, Wenjun Wei, Xiaoxue Du, Yichen Yang, Yu Wang
Publication venue: Frontiers Media SA
Publication date: 01/12/2022

Objectives: To identify a prognosis-related subtype of cancer-associated fibroblasts (CAFs) in head and neck squamous cell carcinoma (HNSCC) and comprehend its contributions to molecular characteristics, immune characteristics, and their potential benefits in immunotherapy and chemotherapy for HNSCC.

Materials and Methods: We performed single-cell RNA sequencing (scRNA-seq) analysis of CAFs from the samples of HNSCC patients derived from Gene Expression Omnibus (GEO), to identify the prognosis-related subtype of CAFs. CAFs were clustered into five subtypes, and a prognosis-related subtype was identified. Univariate and multivariate Cox regression analyses were performed on the cohort selected from The Cancer Genome Atlas (TCGA) to determine signature construction, which was validated in GSE65858 and GSE42743. A prognostic signature based on 4 genes was constructed, which were derived from prognosis-related CAFs. The molecular characteristics, immune characteristics, and the predicted chemosensitivity and immunotherapeutic response in the signature-defined subgroups were subsequently analyzed.

Results: Higher CAF scores correlated with poor survival outcomes. Additionally, a high CAF score correlated with lower infiltration levels of many immune cells, including M1 macrophages, CD8+ T cells, follicular T helper cells, monocytes, and naïve B cells. The high CAF score group also showed different enrichment pathways, mutation genes and copy number variated genes. Furthermore, patients with high CAF scores showed lower sensitivity to chemotherapy and immunotherapy than those with low CAF scores.

Conclusion: The results of our study indicate the potential of the CAF signature as a biomarker for the prognosis of HNSCC patients. Furthermore, the signature could be a prospective therapeutic target in HNSCC.

Directory of Open Access Journals

Fixed-Time Synchronization for Different Dimensional Complex Network Systems with Unknown Parameters via Adaptive Control

Authors: Shan Su, Xiaoxue Bai, Yude Ji, Yunli Gong
Publication venue: Hindawi Limited
Publication date: 01/01/2021

This article addresses the fixed-time synchronization of different dimensional complex network systems with unknown parameters. Two suitable adaptive controllers and dynamic parameter estimations are proposed such that the complex network driving and response systems can be synchronized within the settling time. Based on fixed-time control theory and the Lyapunov functional method, novel sufficient conditions are provided to guarantee synchronization within a fixed time, and the settling times, which are independent of the initial synchronization errors, are explicitly evaluated. Finally, a numerical example is given to illustrate the effectiveness of the proposed control algorithms.

Directory of Open Access Journals

Sequential FISH and GISH karyotypes of M8003 (a), Austrian rye (b), N9116H (c) and N9116M (d).

Authors: Changyou Wang, Hao Li, Wanquan Ji, Xiaoxue Guo

(a, c, d) 4',6-diamidino-2-phenylindole (DAPI), blue fluorescence; rye genomic DNA and Oligo-pTa535, red fluorescence; Oligo-pSc119.2, green fluorescence. (b) Oligo-pSc119.2, red fluorescence. Alterations of wheat chromosomes are indicated by white boxes.

FigShare