79 research outputs found

    Safe RLHF: Safe Reinforcement Learning from Human Feedback

    With the development of large language models (LLMs), striking a balance between the performance and safety of AI systems has never been more critical. However, the inherent tension between the objectives of helpfulness and harmlessness presents a significant challenge during LLM training. To address this issue, we propose Safe Reinforcement Learning from Human Feedback (Safe RLHF), a novel algorithm for human value alignment. Safe RLHF explicitly decouples human preferences regarding helpfulness and harmlessness, effectively avoiding the crowdworkers' confusion about the tension and allowing us to train separate reward and cost models. We formalize the safety concern of LLMs as an optimization task of maximizing the reward function while satisfying specified cost constraints. Leveraging the Lagrangian method to solve this constrained problem, Safe RLHF dynamically adjusts the balance between the two objectives during fine-tuning. Through a three-round fine-tuning using Safe RLHF, we demonstrate a superior ability to mitigate harmful responses while enhancing model performance compared to existing value-aligned algorithms. Experimentally, we fine-tuned the Alpaca-7B using Safe RLHF and aligned it with collected human preferences, significantly improving its helpfulness and harmlessness according to human evaluations.

    Sulfur and mercury MIF suggest volcanic contributions to Earth’s atmosphere at 2.7 Ga

    This study received funding from a Natural Environment Research Council Standard Grant NE/M001156/1 (ALZ, EGN), and from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (Grant 678812 to MWC). The Archean eon is associated with large-scale changes in Earth’s geosphere and biosphere, including the onset of plate tectonics and the expansion of oxygenic photosynthesis, although the full impacts of these changes on the atmosphere remain unclear. Here we present coupled records of mass independent fractionation of sulfur (S-MIF) and mercury (Hg-MIF) isotopes from well preserved sediments of the ∼2.7 billion year old (Ga) Manjeri Formation, Belingwe Greenstone Belt, Zimbabwe. These palaeoatmospheric proxies record different trends for S-MIF and odd number Hg-MIF versus even number Hg-MIF, providing novel constraints on atmospheric chemistry during this time. S-MIF and odd number Hg-MIF values are muted in comparison to values preserved in later Archean sediments, representing a combination of enhanced volcanic input and local mixing. Even number Hg-MIF is absent from these sediments, consistent with complete photo-oxidation of gaseous Hg0, which could have been driven by increased halogen emissions from arc volcanism. When considered within a global geodynamic context, these MIF data suggest an important role for subduction zone-related volcanism associated with early plate tectonics in modulating the ∼2.7 Ga atmosphere. Publisher PDF. Peer reviewed.

    Large AI Models in Health Informatics: Applications, Challenges, and the Future

    Large AI models, or foundation models, are models recently emerging with massive scales both parameter-wise and data-wise, the magnitudes of which can reach beyond billions. Once pretrained, large AI models demonstrate impressive performance in various downstream tasks. A prime example is ChatGPT, whose capability has compelled people's imagination about the far-reaching influence that large AI models can have and their potential to transform different domains of our lives. In health informatics, the advent of large AI models has brought new paradigms for the design of methodologies. The scale of multi-modal data in the biomedical and health domain has been ever-expanding especially since the community embraced the era of deep learning, which provides the ground to develop, validate, and advance large AI models for breakthroughs in health-related areas. This article presents a comprehensive review of large AI models, from background to their applications. We identify seven key sectors in which large AI models are applicable and might have substantial influence, including 1) bioinformatics; 2) medical diagnosis; 3) medical imaging; 4) medical informatics; 5) medical education; 6) public health; and 7) medical robotics. We examine their challenges, followed by a critical discussion about potential future directions and pitfalls of large AI models in transforming the field of health informatics. Comment: This article has been accepted for publication in IEEE Journal of Biomedical and Health Informatics.

    Baichuan 2: Open Large-scale Language Models

    Large language models (LLMs) have demonstrated remarkable performance on a variety of natural language tasks based on just a few examples of natural language instructions, reducing the need for extensive feature engineering. However, most powerful LLMs are closed-source or limited in their capability for languages other than English. In this technical report, we present Baichuan 2, a series of large-scale multilingual language models containing 7 billion and 13 billion parameters, trained from scratch, on 2.6 trillion tokens. Baichuan 2 matches or outperforms other open-source models of similar size on public benchmarks like MMLU, CMMLU, GSM8K, and HumanEval. Furthermore, Baichuan 2 excels in vertical domains such as medicine and law. We will release all pre-training model checkpoints to benefit the research community in better understanding the training dynamics of Baichuan 2. Comment: Baichuan 2 technical report. GitHub: https://github.com/baichuan-inc/Baichuan

    Combining the Tyrosine Kinase Inhibitor Cabozantinib and the mTORC1/2 Inhibitor Sapanisertib Blocks ERK Pathway Activity and Suppresses Tumor Growth in Renal Cell Carcinoma.

    Current treatment approaches for renal cell carcinoma (RCC) face challenges in achieving durable tumor responses due to tumor heterogeneity and drug resistance. Combination therapies that leverage tumor molecular profiles could offer an avenue for enhancing treatment efficacy and addressing the limitations of current therapies. To identify effective strategies for treating RCC, we selected ten drugs guided by tumor biology to test in six RCC patient-derived xenograft (PDX) models. The multitargeted tyrosine kinase inhibitor (TKI) cabozantinib and mTORC1/2 inhibitor sapanisertib emerged as the most effective drugs, particularly when combined. The combination demonstrated favorable tolerability and inhibited tumor growth or induced tumor regression in all models, including two from patients who experienced treatment failure with FDA-approved TKI and immunotherapy combinations. In cabozantinib-treated samples, imaging analysis revealed a significant reduction in vascular density, and single-nucleus RNA sequencing (snRNA-seq) analysis indicated a decreased proportion of endothelial cells in the tumors. SnRNA-seq data further identified a tumor subpopulation enriched with cell-cycle activity that exhibited heightened sensitivity to the cabozantinib and sapanisertib combination. Conversely, activation of the epithelial-mesenchymal transition pathway, detected at the protein level, was associated with drug resistance in residual tumors following combination treatment. The combination effectively restrained ERK phosphorylation and reduced expression of ERK downstream transcription factors and their target genes implicated in cell-cycle control and apoptosis. This study highlights the potential of the cabozantinib plus sapanisertib combination as a promising treatment approach for patients with RCC, particularly those whose tumors progressed on immune checkpoint inhibitors and other TKIs.
    SIGNIFICANCE: The molecular-guided therapeutic strategy of combining cabozantinib and sapanisertib restrains ERK activity to effectively suppress growth of renal cell carcinomas, including those unresponsive to immune checkpoint inhibitors.

    Spatially restricted drivers and transitional cell populations cooperate with the microenvironment in untreated and chemo-resistant pancreatic cancer

    Pancreatic ductal adenocarcinoma is a lethal disease with limited treatment options and poor survival. We studied 83 spatial samples from 31 patients (11 treatment-naïve and 20 treated) using single-cell/nucleus RNA sequencing, bulk-proteogenomics, spatial transcriptomics and cellular imaging. Subpopulations of tumor cells exhibited signatures of proliferation, KRAS signaling, cell stress and epithelial-to-mesenchymal transition. Mapping mutations and copy number events distinguished tumor populations from normal and transitional cells, including acinar-to-ductal metaplasia and pancreatic intraepithelial neoplasia. Pathology-assisted deconvolution of spatial transcriptomic data identified tumor and transitional subpopulations with distinct histological features. We showed coordinated expression of TIGIT in exhausted and regulatory T cells and Nectin in tumor cells. Chemo-resistant samples contain a threefold enrichment of inflammatory cancer-associated fibroblasts that upregulate metallothioneins. Our study reveals a deeper understanding of the intricate substructure of pancreatic ductal adenocarcinoma tumors that could help improve therapy for patients with this disease.