
    Evaluating Human-Language Model Interaction

    Many real-world applications of language models (LMs), such as writing assistance and code autocomplete, involve human-LM interaction. However, most benchmarks are non-interactive in that a model produces output without human involvement. To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics. Compared to standard, non-interactive evaluation, HALIE captures (i) the interactive process, not only the final output; (ii) the first-person subjective experience, not just a third-party assessment; and (iii) notions of preference beyond quality (e.g., enjoyment and ownership). We then design five tasks to cover different forms of interaction: social dialogue, question answering, crossword puzzles, summarization, and metaphor generation. With four state-of-the-art LMs (three variants of OpenAI's GPT-3 and AI21 Labs' Jurassic-1), we find that better non-interactive performance does not always translate to better human-LM interaction. In particular, we highlight three cases where the results from non-interactive and interactive metrics diverge and underscore the importance of human-LM interaction for LM evaluation. Comment: Authored by the Center for Research on Foundation Models (CRFM) at the Stanford Institute for Human-Centered Artificial Intelligence (HAI).

    Genome-Wide Association Study Confirming Association of HLA-DP with Protection against Chronic Hepatitis B and Viral Clearance in Japanese and Korean

    Hepatitis B virus (HBV) infection can lead to serious liver diseases, including liver cirrhosis (LC) and hepatocellular carcinoma (HCC); however, about 85–90% of infected individuals become inactive carriers with sustained biochemical remission and very low risk of LC or HCC. To identify host genetic factors contributing to HBV clearance, we conducted genome-wide association studies (GWAS) and replication analysis using samples from HBV carriers and spontaneously HBV-resolved Japanese and Korean individuals. Association analysis in the Japanese and Korean data identified the HLA-DPA1 and HLA-DPB1 genes with Pmeta = 1.89×10^−12 for rs3077 and Pmeta = 9.69×10^−10 for rs9277542. We also found that the HLA-DPA1 and HLA-DPB1 genes were significantly associated with protective effects against chronic hepatitis B (CHB) in Japanese, Korean and other Asian populations, including Chinese and Thai individuals (Pmeta = 4.40×10^−19 for rs3077 and Pmeta = 1.28×10^−15 for rs9277542). These results suggest that the associations between the HLA-DP locus and the protective effects against persistent HBV infection and with clearance of HBV were replicated widely in East Asian populations; however, there are no reports of GWAS in Caucasian or African populations. Based on the GWAS in this study, there were no significant SNPs associated with HCC development. To clarify the pathogenesis of CHB and the mechanisms of HBV clearance, further studies are necessary, including functional analyses of the HLA-DP molecule.

    Additional file 1: of An adaptive detection method for fetal chromosomal aneuploidy using cell-free DNA from 447 Korean women

    Figure S1 showed optimally adaptive reference samples extracted from all reference samples. Figure S2 showed that GC correction played an important role in reducing the CV. Figures S3.1, S3.2, S4.1, S4.2, S5 and S6 represented similar results to our adaptive sample selection. Figure S7 represented the relationship of the reads fractions and the GC contents of samples. (DOCX 2063 kb)