76 research outputs found

    HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models

    Large language models (LLMs), such as ChatGPT, are prone to generating hallucinations, i.e., content that conflicts with the source or cannot be verified against factual knowledge. To understand what types of content LLMs are apt to hallucinate, and to what extent, we introduce the Hallucination Evaluation benchmark for Large Language Models (HaluEval), a large collection of generated and human-annotated hallucinated samples for evaluating the performance of LLMs in recognizing hallucination. To generate these samples, we propose a ChatGPT-based two-step framework, i.e., sampling-then-filtering. In addition, we hire human labelers to annotate the hallucinations in ChatGPT responses. The empirical results suggest that ChatGPT is likely to generate hallucinated content on specific topics by fabricating unverifiable information (about 19.5% of responses). Moreover, existing LLMs face great challenges in recognizing the hallucinations in texts. However, our experiments also show that providing external knowledge or adding reasoning steps can help LLMs recognize hallucinations. Our benchmark can be accessed at https://github.com/RUCAIBox/HaluEval. Comment: Accepted to EMNLP 2023 Main Conference (Long Paper).

    Present-day kinematics and seismic potential of the Ganzi-Yushu fault, eastern Tibetan plateau, constrained from InSAR

    In recent years, earthquakes have occurred frequently on the southeastern edge of the Tibetan Plateau, and the seismic hazard is high. However, because of the remote location of the Ganzi-Yushu fault zone, no high-resolution geodetic measurements of this region have been made. The radar line-of-sight deformation field of the Ganzi-Yushu fault was obtained from seven tracks of ascending and descending Sentinel-1A/B interferometric synthetic aperture radar (InSAR) data spanning 2014 to 2020. Using the InSAR and published Global Navigation Satellite System (GNSS) data, we calculated the 3D deformation field in the study area, investigated the segment-specific fault slip rate, and inverted the fault slip distribution pattern using the steepest descent method. We then evaluated the seismic hazard using the strain rate field and slip deficit rate. The main findings of this study are as follows. 1) The slip rate of the Ganzi-Yushu fault gradually increases from 2.5 to 6.8 mm/yr from northwest to southeast. 2) A high-resolution strain rate map shows high-value anomalies in the Yushu and Dangjiang areas. 3) Our comprehensive analysis suggests that the seismic hazard of the Dangjiang and Dengke segments, which have high slip deficits, cannot be ignored.

    TextBox 2.0: A Text Generation Library with Pre-trained Language Models

    To facilitate research on text generation, this paper presents a comprehensive and unified library, TextBox 2.0, focusing on the use of pre-trained language models (PLMs). To be comprehensive, our library covers 13 common text generation tasks and their corresponding 83 datasets and further incorporates 45 PLMs covering general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight PLMs. We also implement 4 efficient training strategies and provide 4 generation objectives for pre-training new PLMs from scratch. To be unified, we design the interfaces to support the entire research pipeline (from data loading to training and evaluation), ensuring that each step can be fulfilled in a unified way. Despite the rich functionality, it is easy to use our library, either through the friendly Python API or command line. To validate the effectiveness of our library, we conduct extensive experiments and exemplify four types of research scenarios. The project is released at the link: https://github.com/RUCAIBox/TextBox. Comment: Accepted by EMNLP 202

    Integrated single-cell and bulk RNA sequencing analyses reveal a prognostic signature of cancer-associated fibroblasts in head and neck squamous cell carcinoma

    Objectives: To identify a prognosis-related subtype of cancer-associated fibroblasts (CAFs) in head and neck squamous cell carcinoma (HNSCC) and to understand its contributions to molecular characteristics, immune characteristics, and potential benefits in immunotherapy and chemotherapy for HNSCC. Materials and Methods: We performed single-cell RNA sequencing (scRNA-seq) analysis of CAFs from samples of HNSCC patients obtained from the Gene Expression Omnibus (GEO) to identify the prognosis-related subtype of CAFs. CAFs were clustered into five subtypes, and a prognosis-related subtype was identified. Univariate and multivariate Cox regression analyses were performed on a cohort selected from The Cancer Genome Atlas (TCGA) to construct the signature, which was validated in GSE65858 and GSE42743. A prognostic signature based on 4 genes derived from prognosis-related CAFs was constructed. The molecular characteristics, immune characteristics, and predicted chemosensitivity and immunotherapeutic response in the signature-defined subgroups were subsequently analyzed. Results: Higher CAF scores correlated with poor survival outcomes. Additionally, a high CAF score correlated with lower infiltration levels of many immune cells, including M1 macrophages, CD8+ T cells, follicular T helper cells, monocytes, and naïve B cells. The high CAF score group also showed different enriched pathways, mutated genes, and genes with copy number variation. Furthermore, patients with high CAF scores showed lower sensitivity to chemotherapy and immunotherapy than those with low CAF scores. Conclusion: The results of our study indicate the potential of the CAF signature as a biomarker for the prognosis of HNSCC patients. Furthermore, the signature could be a prospective therapeutic target in HNSCC.

    Fixed-Time Synchronization for Different Dimensional Complex Network Systems with Unknown Parameters via Adaptive Control

    This article addresses fixed-time synchronization of complex network systems with different dimensions and unknown parameters. Two adaptive controllers and dynamic parameter estimators are proposed such that the complex network drive and response systems synchronize within the settling time. Based on fixed-time control theory and the Lyapunov functional method, novel sufficient conditions are provided to guarantee synchronization within a fixed time, and the settling times are explicitly evaluated and shown to be independent of the initial synchronization errors. Finally, a numerical example is given to illustrate the effectiveness of the proposed control algorithms.
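The abstract above does not give the controllers or the network model, so the following is only a minimal sketch of the general fixed-time idea, not the paper's method. It assumes a hypothetical scalar error system dx/dt = -alpha*sign(x)|x|^p - beta*sign(x)|x|^q with 0 < p < 1 < q, for which the standard fixed-time stability result gives the settling-time bound T <= 1/(alpha(1-p)) + 1/(beta(q-1)), whatever the initial error. All names and parameter values (`settling_time`, `fixed_time_bound`, `alpha`, `beta`, `p`, `q`) are illustrative assumptions.

```python
import math


def settling_time(x0, alpha=1.0, beta=1.0, p=0.5, q=1.5,
                  dt=1e-4, tol=1e-6, t_max=10.0):
    """Forward-Euler integration of the illustrative scalar error system
    dx/dt = -alpha*sign(x)*|x|**p - beta*sign(x)*|x|**q.
    Returns the first time at which |x| <= tol (or t_max if never reached)."""
    x, t = float(x0), 0.0
    while abs(x) > tol and t < t_max:
        mag = abs(x)
        # The low-power term dominates near zero, the high-power term far away.
        x -= dt * math.copysign(alpha * mag ** p + beta * mag ** q, x)
        t += dt
    return t


def fixed_time_bound(alpha=1.0, beta=1.0, p=0.5, q=1.5):
    """Standard fixed-time settling bound; it does not depend on x0."""
    return 1.0 / (alpha * (1.0 - p)) + 1.0 / (beta * (q - 1.0))


if __name__ == "__main__":
    bound = fixed_time_bound()
    for x0 in (1.0, 100.0, 1.0e4):
        t = settling_time(x0)
        print(f"x0 = {x0:>8g}: settled at t = {t:.3f} (bound {bound:.3f})")
```

Increasing the initial error by four orders of magnitude lengthens the simulated settling time only slightly, and it stays below the bound (4.0 for these assumed parameters), which is the sense in which a fixed-time settling time is independent of the initial synchronization error.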

    Sequential FISH and GISH karyotypes of M8003 (a), Austrian rye (b), N9116H (c) and N9116M (d).

    (a, c, d) 4',6-diamidino-2-phenylindole (DAPI), blue fluorescence; rye genomic DNA and Oligo-pTa535, red fluorescence; Oligo-pSc119.2, green fluorescence. (b) Oligo-pSc119.2, red fluorescence. Alterations of wheat chromosomes are indicated by white boxes.

    Breeding scheme showing the development of N9116H and N9116M.


    Morphologic traits of M8003, Austrian rye, N9116H and N9116M.

    (a) Plants of M8003, Austrian rye, N9116M and N9116H; (b) spikes of M8003, Austrian rye, N9116M and N9116H; (c) spikelets and kernels of M8003, Austrian rye, N9116H and N9116M; (d) resistance of M8003, Austrian rye, N9116H and N9116M to powdery mildew and stripe rust. Numbers 1–4 in the figures represent M8003, Austrian rye, N9116M and N9116H, respectively.