41 research outputs found

    Do LLMs Know about Hallucination? An Empirical Investigation of LLM's Hidden States

    Large Language Models (LLMs) can make up answers that are not real, and this is known as hallucination. This research aims to see if, how, and to what extent LLMs are aware of hallucination. More specifically, we check whether and how an LLM reacts differently in its hidden states when it answers a question right versus when it hallucinates. To do this, we introduce an experimental framework which allows examining an LLM's hidden states in different hallucination situations. Building upon this framework, we conduct a series of experiments with language models in the LLaMA family (Touvron et al., 2023). Our empirical findings suggest that LLMs react differently when processing a genuine response versus a fabricated one. We then apply various model interpretation techniques to help understand and explain the findings better. Moreover, informed by the empirical observations, we show great potential of using the guidance derived from an LLM's hidden representation space to mitigate hallucination. We believe this work provides insights into how LLMs produce hallucinated answers and how to make them occur less often. Comment: 9 pages, 8 figures, 2 tables (13 pages, 12 figures, 13 tables including references and appendices).

    Exploring the Relationship between In-Context Learning and Instruction Tuning

    In-Context Learning (ICL) and Instruction Tuning (IT) are two primary paradigms of adapting Large Language Models (LLMs) to downstream applications. However, they are significantly different. In ICL, a set of demonstrations is provided at inference time but the LLM's parameters are not updated. In IT, a set of demonstrations is used to tune the LLM's parameters at training time but no demonstrations are used at inference time. Although a growing body of literature has explored ICL and IT, studies on these topics have largely been conducted in isolation, leading to a disconnect between these two paradigms. In this work, we explore the relationship between ICL and IT by examining how the hidden states of LLMs change in these two paradigms. Through carefully designed experiments conducted with LLaMA-2 (7B and 13B), we find that ICL is implicit IT. In other words, ICL changes an LLM's hidden states as if the demonstrations were used to instructionally tune the model. Furthermore, the convergence between ICL and IT is largely contingent upon several factors related to the provided demonstrations. Overall, this work offers a unique perspective to explore the connection between ICL and IT and sheds light on understanding the behaviors of LLMs.

    Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads

    Transformer-based pretrained large language models (PLMs) such as BERT and GPT have achieved remarkable success in NLP tasks. However, PLMs are prone to encoding stereotypical biases. Although a burgeoning literature has emerged on stereotypical bias mitigation in PLMs, such as work on debiasing gender and racial stereotyping, how such biases manifest and behave internally within PLMs remains largely unknown. Understanding the internal stereotyping mechanisms may allow better assessment of model fairness and guide the development of effective mitigation strategies. In this work, we focus on attention heads, a major component of the Transformer architecture, and propose a bias analysis framework to explore and identify a small set of biased heads that are found to contribute to a PLM's stereotypical bias. We conduct extensive experiments to validate the existence of these biased heads and to better understand how they behave. We investigate gender and racial bias in the English language in two types of Transformer-based PLMs: the encoder-based BERT model and the decoder-based autoregressive GPT model. Overall, the results shed light on understanding the bias behavior in pretrained language models.

    Exploring the Common Appearance-Boundary Adaptation for Nighttime Optical Flow

    We investigate the challenging task of nighttime optical flow, which suffers from weakened texture and amplified noise. These degradations weaken discriminative visual features, thus causing invalid motion feature matching. Typically, existing methods employ domain adaptation to transfer knowledge from an auxiliary domain to the nighttime domain in either the input visual space or the output motion space. However, this direct adaptation is ineffective, since there exists a large domain gap due to the intrinsically heterogeneous nature of the feature representations between the auxiliary and nighttime domains. To overcome this issue, we explore a common latent space as the intermediate bridge to reinforce the feature alignment between the auxiliary and nighttime domains. In this work, we exploit two auxiliary domains, daytime and event, and propose a novel common appearance-boundary adaptation framework for nighttime optical flow. In appearance adaptation, we employ intrinsic image decomposition to embed the auxiliary daytime image and the nighttime image into a reflectance-aligned common space. We discover that the motion distributions of the two reflectance maps are very similar, which allows us to consistently transfer motion appearance knowledge from the daytime to the nighttime domain. In boundary adaptation, we theoretically derive the motion correlation formula between the nighttime image and accumulated events within a spatiotemporal gradient-aligned common space. We find that the correlations of the two spatiotemporal gradient maps differ significantly, which allows us to contrastively transfer boundary knowledge from the event to the nighttime domain. Moreover, appearance adaptation and boundary adaptation are complementary to each other, since they can jointly transfer global motion and local boundary knowledge to the nighttime domain.

    Potential Blood Pressure Goals in IgA Nephropathy: Prevalence, Awareness, and Treatment Rates in Chronic Kidney Disease Among Patients with Hypertension in China (PATRIOTIC) Study

    Background/Aims: IgA nephropathy is the most prevalent form of primary glomerulonephritis worldwide. Among patients with kidney disease, hypertension is one of the most important risk factors of disease progression. Considering the limited evidence regarding the appropriate blood pressure (BP) goal for patients with IgA nephropathy, our aim was to critically appraise the potential BP goal in IgA nephropathy. Methods: We performed a retrospective analysis of the BP data from 1055 patients with IgA nephropathy, extracted from the database of a nationwide, multi-center, cross-sectional study, including 61 tertiary hospitals in China. Hypertension was defined by a BP ≥140/90 mmHg. Three BP cutoff levels were evaluated as control values: < 140/90 mmHg, < 130/80 mmHg and < 125/75 mmHg. The primary outcome of our study was the prevalence of BP control among patients with a 24-h proteinuria < 1 g/d or ≥ 1 g/d. Multivariate logistic regression analysis was used to identify demographic and clinical factors associated with a decrease in renal function for the different target levels of BP. Results: The overall prevalence of hypertension was 63.3%. BP was controlled under 140/90 mmHg in 49.1% of patients, with 34.3% of patients with proteinuria < 1 g/d reaching the target BP < 130/80 mmHg and only 12.9% of patients with proteinuria > 1 g/d achieving a BP < 125/75 mmHg. Among patients with proteinuria < 1 g/d, the adjusted odds ratios (OR) and 95% confidence interval (95% CI) of a decrease in renal function, for the 3 target BP levels, were as follows (P > 0.05): < 140/90 mmHg, 0.9 (0.5 - 1.6); < 130/80 mmHg, 1.0 (0.5 - 1.8); and < 125/75 mmHg, 1.0 (0.5 - 2.0). With proteinuria ≥1 g/d, the adjusted ORs (95% CI) of attaining the BP targets of < 140/90 mmHg, < 130/80 mmHg and < 125/75 mmHg were 0.4 (0.2 - 0.6), 0.2 (0.1 - 0.4) and 0.3 (0.1 - 0.5), respectively (P < 0.05). Conclusion: Hypertension was common in IgA nephropathy and hypertensive control was suboptimal. Our results support a benefit of intensive control of BP < 130/80 mmHg for patients with proteinuria ≥1 g/d. However, in patients with proteinuria < 1 g/d, a renoprotective effect of this BP goal was not identified.
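
    As a rough illustration of the three BP control cutoffs named in the abstract above, the sketch below checks a single reading against each target. It is not the study's analysis code. The function names are hypothetical, and it assumes that "≥140/90" means either component is at or above its threshold, while "< X/Y" means both components are below theirs.

```python
from typing import Dict, Tuple

# Control cutoffs (systolic, diastolic) in mmHg, as listed in the abstract.
BP_TARGETS: Dict[str, Tuple[int, int]] = {
    "<140/90": (140, 90),
    "<130/80": (130, 80),
    "<125/75": (125, 75),
}


def is_hypertensive(systolic: float, diastolic: float) -> bool:
    """Assumption: a reading is hypertensive if either component
    reaches its threshold (systolic >= 140 or diastolic >= 90)."""
    return systolic >= 140 or diastolic >= 90


def targets_met(systolic: float, diastolic: float) -> Dict[str, bool]:
    """Assumption: a target counts as met only when both components
    are strictly below the corresponding cutoff."""
    return {
        label: systolic < sys_cut and diastolic < dia_cut
        for label, (sys_cut, dia_cut) in BP_TARGETS.items()
    }


if __name__ == "__main__":
    # Example reading chosen for illustration only, not study data.
    print(targets_met(128, 78))
```

    Keeping the cutoffs in one table makes it easy to tabulate, per proteinuria group, the share of patients meeting each target.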

    Construction of an immunogenic cell death-based risk score prognosis model in breast cancer

    Immunogenic cell death (ICD) is a form of regulated cell death that elicits an immune response. Common inducers of ICD include cancer chemotherapy and radiation therapy. A better understanding of ICD might help modify the current regimens of anti-cancer therapy, especially immunotherapy. This study aimed to identify ICD-related prognostic gene signatures in breast cancer (BC). An ICD-based gene prognostic signature was developed using LASSO-Cox regression and Kaplan-Meier survival analysis based on datasets acquired from The Cancer Genome Atlas and Gene Expression Omnibus. A nomogram model was developed to predict the prognosis of BC patients. Gene Set Enrichment Analysis (GSEA) and Gene Set Variation Analysis (GSVA) were used to explore the differentially expressed signaling pathways in high- and low-risk groups. The CIBERSORT and ESTIMATE algorithms were used to investigate the difference in immune status in the tumor microenvironment of different risk groups. Six genes (CALR, CLEC9A, BAX, TLR4, CXCR3, and PIK3CA) were selected for construction and validation of the prognosis model of BC based on public data. GSEA and GSVA analysis found that immune-related gene sets were enriched in the low-risk group. Moreover, immune cell infiltration analysis showed that the immune features of the high-risk group were characterized by higher infiltration of tumor-associated macrophages and a lower proportion of CD8+ T cells, suggesting an immune evasive tumor microenvironment. We constructed and validated an ICD-based gene signature for predicting prognosis of breast cancer patients. Our model provides a tool with good discrimination and calibration abilities to predict the prognosis of BC, especially triple-negative breast cancer (TNBC).

    Quantum hydrogen-bond symmetrization in the superconducting hydrogen sulfide system.

    The quantum nature of the proton can crucially affect the structural and physical properties of hydrogen compounds. For example, in the high-pressure phases of H2O, quantum proton fluctuations lead to symmetrization of the hydrogen bond and reduce the boundary between asymmetric and symmetric structures in the phase diagram by 30 gigapascals (ref. 3). Here we show that an analogous quantum symmetrization occurs in the recently discovered sulfur hydride superconductor with a superconducting transition temperature Tc of 203 kelvin at 155 gigapascals, the highest Tc reported for any superconductor so far. Superconductivity occurs via the formation of a compound with chemical formula H3S (sulfur trihydride) with sulfur atoms arranged on a body-centred cubic lattice. If the hydrogen atoms are treated as classical particles, then for pressures greater than about 175 gigapascals they are predicted to sit exactly halfway between two sulfur atoms in a structure with Im3m symmetry. At lower pressures, the hydrogen atoms move to an off-centre position, forming a short H-S covalent bond and a longer H···S hydrogen bond in a structure with R3m symmetry. X-ray diffraction experiments confirm the H3S stoichiometry and the sulfur lattice sites, but were unable to discriminate between the two phases. Ab initio density-functional-theory calculations show that quantum nuclear motion lowers the symmetrization pressure by 72 gigapascals for H3S and by 60 gigapascals for D3S. Consequently, we predict that the Im3m phase dominates the pressure range within which the high Tc was measured. The observed pressure dependence of Tc is accurately reproduced in our calculations for the Im3m phase, but not for the R3m phase. Therefore, the quantum nature of the proton fundamentally changes the superconducting phase diagram of H3S. We acknowledge financial support from the Spanish Ministry of Economy and Competitiveness (FIS2013-48286-C2-2-P), French Agence Nationale de la Recherche (Grant No. ANR-13-IS10-0003- 392 01), EPSRC (UK) (Grant No. EP/J017639/1), Cambridge Commonwealth Trust, National Natural Science Foundation of China (Grants No. 11204111, 11404148, and 11274136), and 2012 Changjiang Scholars Program of China. Work at Carnegie was supported by EFree, an Energy Frontier Research Center funded by the DOE, Office of Science, Basic Energy Sciences under Award No. DE-SC-0001057. Computer facilities were provided by the PRACE project AESFT and the Donostia International Physics Center (DIPC). This is the author accepted manuscript. The final version is available from Nature Publishing Group via http://dx.doi.org/10.1038/nature1717

    The Phylogenetic Origin of oskar Coincided with the Origin of Maternally Provisioned Germ Plasm and Pole Cells at the Base of the Holometabola

    The establishment of the germline is a critical, yet surprisingly evolutionarily labile, event in the development of sexually reproducing animals. In the fly Drosophila, germ cells acquire their fate early during development through the inheritance of the germ plasm, a specialized maternal cytoplasm localized at the posterior pole of the oocyte. The gene oskar (osk) is both necessary and sufficient for assembling this substance. Both maternal germ plasm and oskar are evolutionary novelties within the insects, as the germline is specified by zygotic induction in basally branching insects, and osk has until now only been detected in dipterans. In order to understand the origin of these evolutionary novelties, we used comparative genomics, parental RNAi, and gene expression analyses in multiple insect species. We have found that the origin of osk and its role in specifying the germline coincided with the innovation of maternal germ plasm and pole cells at the base of the holometabolous insects and that losses of osk are correlated with changes in germline determination strategies within the Holometabola. Our results indicate that the invention of the novel gene osk was a key innovation that allowed the transition from the ancestral late zygotic mode of germline induction to a maternally controlled establishment of the germline found in many holometabolous insect species. We propose that the ancestral role of osk was to connect an upstream network ancestrally involved in mRNA localization and translational control to a downstream regulatory network ancestrally involved in executing the germ cell program.

    Space advanced technology demonstration satellite

    The Space Advanced Technology demonstration satellite (SATech-01), a mission for low-cost space science and new technology experiments organized by the Chinese Academy of Sciences (CAS), was successfully launched into a Sun-synchronous orbit at an altitude of approximately 500 km on July 27, 2022, from the Jiuquan Satellite Launch Centre. Serving as an experimental platform for space science exploration and the demonstration of advanced common technologies in orbit, SATech-01 is equipped with 16 experimental payloads, including the solar upper transition region imager (SUTRI), the lobster eye imager for astronomy (LEIA), the high energy burst searcher (HEBS), and a High Precision Magnetic Field Measurement System based on a CPT Magnetometer (CPT). It also incorporates an imager with freeform optics, an integrated thermal imaging sensor, and a multi-functional integrated imager, among others. This paper provides an overview of SATech-01, including a technical description of the satellite and its scientific payloads, along with their on-orbit performance.