76 research outputs found
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Large language models (LLMs), such as ChatGPT, are prone to generating
hallucinations, i.e., content that conflicts with the source or cannot be
verified against factual knowledge. To understand what types of content and to
what extent LLMs are apt to hallucinate, we introduce the Hallucination
Evaluation benchmark for Large Language Models (HaluEval), a large collection
of generated and human-annotated hallucinated samples for evaluating the
performance of LLMs in recognizing hallucination. To generate these samples, we
propose a ChatGPT-based two-step framework, i.e., sampling-then-filtering.
We also hire human labelers to annotate the hallucinations in
ChatGPT responses. The empirical results suggest that ChatGPT is likely to
generate hallucinated content in specific topics by fabricating unverifiable
information (i.e., about responses). Moreover, existing LLMs face
great challenges in recognizing the hallucinations in texts. However, our
experiments also show that providing external knowledge or adding reasoning
steps can help LLMs recognize hallucinations. Our benchmark can be accessed at
https://github.com/RUCAIBox/HaluEval. Comment: Accepted to EMNLP 2023 Main Conference (Long Paper)
Present-day kinematics and seismic potential of the Ganzi-Yushu fault, eastern Tibetan plateau, constrained from InSAR
In recent years, earthquakes have occurred frequently on the southeastern edge of the Tibetan Plateau, and the seismic hazard is high. However, because of the remote location of the Ganzi-Yushu fault zone, no high-resolution geodetic measurements of this region have been made. The radar line-of-sight deformation field of the Ganzi-Yushu fault was obtained using seven-track ascending and descending Sentinel-A/B interferometric synthetic aperture radar (InSAR) data from 2014 to 2020. Using the InSAR and published Global Navigation Satellite System (GNSS) data, we calculated the 3D deformation field in the study area, investigated the segment-specific fault slip rate, and inverted the fault slip distribution pattern using the steepest descent method. We then evaluated the seismic hazard using the strain rate field and slip deficit rate. The main findings of this study include the following. 1) The slip rate of the Ganzi-Yushu fault gradually increases from 2.5 to 6.8 mm/yr from northwest to southeast. 2) A high-resolution strain rate map shows high-value anomalies in the Yushu and Dangjiang areas. 3) Our comprehensive analysis suggests that the seismic hazard of the Dangjiang and Dengke segments with high slip deficits cannot be ignored.
TextBox 2.0: A Text Generation Library with Pre-trained Language Models
To facilitate research on text generation, this paper presents a
comprehensive and unified library, TextBox 2.0, focusing on the use of
pre-trained language models (PLMs). To be comprehensive, our library covers
common text generation tasks and their corresponding datasets and
further incorporates PLMs covering general, translation, Chinese,
dialogue, controllable, distilled, prompting, and lightweight PLMs. We also
implement efficient training strategies and provide generation
objectives for pre-training new PLMs from scratch. To be unified, we design the
interfaces to support the entire research pipeline (from data loading to
training and evaluation), ensuring that each step can be fulfilled in a unified
way. Despite the rich functionality, it is easy to use our library, either
through the friendly Python API or command line. To validate the effectiveness
of our library, we conduct extensive experiments and exemplify four types of
research scenarios. The project is released at the link:
https://github.com/RUCAIBox/TextBox. Comment: Accepted by EMNLP 202
Integrated single-cell and bulk RNA sequencing analyses reveal a prognostic signature of cancer-associated fibroblasts in head and neck squamous cell carcinoma
Objectives: To identify a prognosis-related subtype of cancer-associated fibroblasts (CAFs) in head and neck squamous cell carcinoma (HNSCC) and to understand its contributions to molecular characteristics, immune characteristics, and potential benefits in immunotherapy and chemotherapy for HNSCC. Materials and Methods: We performed single-cell RNA sequencing (scRNA-seq) analysis of CAFs from samples of HNSCC patients obtained from the Gene Expression Omnibus (GEO) to identify the prognosis-related subtype of CAFs. CAFs were clustered into five subtypes, and a prognosis-related subtype was identified. Univariate and multivariate Cox regression analyses were performed on a cohort selected from The Cancer Genome Atlas (TCGA) to construct the signature, which was validated in GSE65858 and GSE42743. A prognostic signature based on 4 genes derived from the prognosis-related CAFs was constructed. The molecular characteristics, immune characteristics, and predicted chemosensitivity and immunotherapeutic response in the signature-defined subgroups were subsequently analyzed. Results: Higher CAF scores correlated with poor survival outcomes. Additionally, a high CAF score correlated with lower infiltration levels of many immune cells, including M1 macrophages, CD8+ T cells, follicular T helper cells, monocytes, and naïve B cells. The high CAF score group also showed different enriched pathways, mutated genes, and genes with copy number variation. Furthermore, patients with high CAF scores showed lower sensitivity to chemotherapy and immunotherapy than those with low CAF scores. Conclusion: The results of our study indicate the potential of the CAF signature as a biomarker for the prognosis of HNSCC patients. Furthermore, the signature could be a prospective therapeutic target in HNSCC.
Fixed-Time Synchronization for Different Dimensional Complex Network Systems with Unknown Parameters via Adaptive Control
This article addresses fixed-time synchronization of different dimensional complex network systems with unknown parameters. Two suitable adaptive controllers and dynamic parameter estimations are proposed such that the complex network driving and response systems can be synchronized within the settling time. Based on fixed-time control theory and the Lyapunov functional method, novel sufficient conditions are provided to guarantee synchronization within fixed times, and the settling times, which are independent of the initial synchronization errors, are explicitly evaluated. Finally, a numerical example is given to illustrate the effectiveness of the proposed control algorithms.
Sequential FISH and GISH karyotypes of M8003 (a), Austrian rye (b), N9116H (c) and N9116M (d).
(a, c, d) 4',6-diamidino-2-phenylindole (DAPI), blue fluorescence; rye genomic DNA and Oligo-pTa535, red fluorescence; Oligo-pSc119.2, green fluorescence. (b) Oligo-pSc119.2, red fluorescence. Alterations of wheat chromosomes are indicated by white boxes.
Breeding scheme showing the development of N9116H and N9116M.
Morphologic traits of M8003, Austrian rye, N9116H and N9116M.
a Plant of M8003, Austrian rye, N9116M and N9116H; b Spikes of M8003, Austrian rye, N9116M and N9116H; c Spikelets and kernels of M8003, Austrian rye, N9116H and N9116M; d Resistance of M8003, Austrian rye, N9116H and N9116M for powdery mildew and stripe rust. 1–4 in figures represent M8003, Austrian rye, N9116M and N9116H, respectively.