173 research outputs found

    LawBench: Benchmarking Legal Knowledge of Large Language Models

    Large language models (LLMs) have demonstrated strong capabilities in various aspects. However, when applying them to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal-related tasks. To address this gap, we propose a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously crafted to precisely assess LLMs' legal capabilities at three cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize needed legal concepts, articles and facts; (2) Legal knowledge understanding: whether LLMs can comprehend entities, events and relationships within legal text; (3) Legal knowledge applying: whether LLMs can properly utilize their legal knowledge and make necessary reasoning steps to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings certain improvements, we are still a long way from obtaining usable and reliable LLMs in legal tasks. All data, model predictions and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.

    GNL3L stabilizes the TRF1 complex and promotes mitotic transition

    Telomeric repeat binding factor 1 (TRF1) is a component of the multiprotein complex “shelterin,” which organizes the telomere into a high-order structure. TRF1 knockout embryos suffer from severe growth defects without apparent telomere dysfunction, suggesting an obligatory role for TRF1 in cell cycle control. To date, the mechanism regulating the mitotic increase in TRF1 protein expression and its function in mitosis remain unclear. Here, we identify guanine nucleotide-binding protein-like 3 (GNL3L), a GTP-binding protein most similar to nucleostemin, as a novel TRF1-interacting protein in vivo. GNL3L binds TRF1 in the nucleoplasm and is capable of promoting the homodimerization and telomeric association of TRF1, preventing promyelocytic leukemia body recruitment of telomere-bound TRF1, and stabilizing TRF1 protein by inhibiting its ubiquitylation and binding to FBX4, an E3 ubiquitin ligase for TRF1. Most importantly, the TRF1 protein-stabilizing activity of GNL3L mediates the mitotic increase of TRF1 protein and promotes the metaphase-to-anaphase transition. This work reveals novel aspects of TRF1 modulation by GNL3L.

    Potential of UAV-Based Active Sensing for Monitoring Rice Leaf Nitrogen Status

    Unmanned aerial vehicle (UAV)-based active canopy sensors can serve as a promising sensing solution for the estimation of crop nitrogen (N) status with great applicability and flexibility. This study aimed to determine the feasibility of UAV-based active sensing to monitor the leaf N status of rice (Oryza sativa L.) and to examine the transferability of handheld-based predictive models to UAV-based active sensing. In this 3-year multi-locational study, field experiments with varied N rates (0–405 kg N ha⁻¹) were conducted using five rice varieties. Plant samples and sensing data were collected at critical growth stages for growth analysis and monitoring. The portable active canopy sensor RapidSCAN CS-45, with red, red edge, and near-infrared wavebands, was used in handheld mode and in aerial mode on a gimbal under a multi-rotor UAV. The results showed the great potential of UAV-based active sensing for monitoring rice leaf N status. Vegetation index-based regression models were built and evaluated based on the Akaike information criterion and independent validation to predict rice leaf dry matter, leaf area index, and leaf N accumulation. Vegetation indices composed of near-infrared and red edge bands (NDRE or RERVI) acquired at a 1.5 m flight height performed well for practical application. Future studies are needed on the proper operation mode and means for precision N management with this system.

    Comparison of Peptide Array Substrate Phosphorylation of c-Raf and Mitogen Activated Protein Kinase Kinase Kinase 8

    Kinases are pivotal regulators of cellular physiology. The human genome contains more than 500 putative kinases, which exert their action via the phosphorylation of specific substrates. The determinants of this specificity are still only partly understood and, as a consequence, it is difficult to predict kinase substrate preferences from the primary structure, hampering the understanding of kinase function in physiology and prompting the development of technologies that allow easy assessment of kinase substrate consensus sequences. Hence, we decided to explore the usefulness of phosphorylation of peptide arrays comprising 1176 different peptide substrates with recombinant kinases for determining kinase substrate preferences, based on the contribution of individual amino acids to total array phosphorylation. Employing this technology, we were able to determine the consensus peptide sequences for substrates of both c-Raf and Mitogen Activated Protein Kinase Kinase Kinase 8, two highly homologous kinases with distinct signalling roles in cellular physiology. The results show that although consensus sequences for these two kinases identified through our analysis share important chemical similarities, there is still some sequence specificity that could explain the different biological action of the two enzymes. Thus, peptide arrays are a useful instrument for deducing substrate consensus sequences, and highly homologous kinases can differ in their requirement for phosphorylation events.

    Secrets of RLHF in Large Language Models Part I: PPO

    Large language models (LLMs) have formulated a blueprint for the advancement of artificial general intelligence. Their primary objective is to function as a human-centric (helpful, honest, and harmless) assistant. Alignment with humans assumes paramount significance, and reinforcement learning with human feedback (RLHF) emerges as the pivotal technological paradigm underpinning this pursuit. Current technical routes usually include reward models to measure human preferences, Proximal Policy Optimization (PPO) to optimize policy model outputs, and process supervision to improve step-by-step reasoning capabilities. However, due to the challenges of reward design, environment interaction, and agent training, coupled with the huge trial-and-error cost of large language models, there is a significant barrier for AI researchers to advance the development of technical alignment and safe landing of LLMs. Stable RLHF training remains a puzzle. In this first report, we dissect the framework of RLHF, re-evaluate the inner workings of PPO, and explore how the parts comprising PPO algorithms impact policy agent training. We identify policy constraints as the key factor for the effective implementation of the PPO algorithm. Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model. Based on our main results, we perform a comprehensive analysis of RLHF abilities compared with SFT models and ChatGPT. The absence of open-source implementations has posed significant challenges to the investigation of LLM alignment. Therefore, we are eager to release technical reports, reward models and PPO code.

    The LDL Receptor Clustering Motif Interacts with the Clathrin Terminal Domain in a Reverse Turn Conformation

    Previously, the hexapeptide motif FXNPXY807 in the cytoplasmic tail of the LDL receptor was shown to be essential for clustering in clathrin-coated pits. We used nuclear magnetic resonance line-broadening and transferred nuclear Overhauser effect measurements to identify the molecule in the clathrin lattice that interacts with this hexapeptide, and determined the structure of the bound motif. The wild-type peptide bound in a single conformation with a reverse turn at residues NPVY. Tyr807Ser, a peptide that harbors a mutation that disrupts receptor clustering, displayed markedly reduced interactions. Clustering motif peptides interacted with clathrin cages assembled in the presence or absence of AP2, with recombinant clathrin terminal domains, but not with clathrin hubs. The identification of terminal domains as the primary site of interaction for FXNPXY807 suggests that adaptor molecules are not required for receptor-mediated endocytosis of LDL, and that at least two different tyrosine-based internalization motifs exist for clustering receptors in coated pits.

    Nucleostemin prevents telomere damage by promoting PML-IV recruitment to SUMOylated TRF1

    Continuously dividing cells must be protected from telomeric and nontelomeric DNA damage in order to maintain their proliferative potential. Here, we report a novel telomere-protecting mechanism regulated by nucleostemin (NS). NS depletion increased the number of telomere damage foci in both telomerase-active (TA(+)) and alternative lengthening of telomere (ALT) cells and decreased the percentage of damaged telomeres associated with ALT-associated PML bodies (APB) and the number of APB in ALT cells. Mechanistically, NS could promote the recruitment of PML-IV to SUMOylated TRF1 in TA(+) and ALT cells. This event was stimulated by DNA damage. Supporting the importance of NS and PML-IV in telomere protection, we demonstrate that loss of NS or PML-IV increased the frequency of telomere damage and aberration, reduced telomeric length, and perturbed the TRF2(ΔBΔM)-induced telomeric recruitment of RAD51. Conversely, overexpression of either NS or PML-IV protected ALT and TA(+) cells from telomere damage. This work reveals a novel mechanism in telomere protection.

    CKI isoforms α and ε regulate Star–PAP target messages by controlling Star–PAP poly(A) polymerase activity and phosphoinositide stimulation

    Star–PAP is a non-canonical, nuclear poly(A) polymerase (PAP) that is regulated by the lipid signaling molecule phosphatidylinositol 4,5 bisphosphate (PI4,5P2), and is required for the expression of a select set of mRNAs. It was previously reported that a PI4,5P2-sensitive CKI isoform, CKIα, associates with and phosphorylates Star–PAP in its catalytic domain. Here, we show that the oxidative stress induced by tBHQ treatment stimulates the CKI-mediated phosphorylation of Star–PAP, which is critical for both its polyadenylation activity and stimulation by PI4,5P2. CKI activity was required for the expression and efficient 3′-end processing of its target mRNAs in vivo as well as the polyadenylation activity of Star–PAP in vitro. Specific CKI activity inhibitors (IC261 and CKI7) block in vivo Star–PAP activity, but the knockdown of CKIα did not equivalently inhibit the expression of Star–PAP targets. We show that in addition to CKIα, Star–PAP associates with another CKI isoform, CKIε, in the Star–PAP complex that phosphorylates Star–PAP and complements the loss of CKIα. Knockdown of both CKI isoforms (α and ε) resulted in the loss of expression and the 3′-end processing of Star–PAP targets similar to the CKI activity inhibitors. Our results demonstrate that CKI isoforms α and ε modulate Star–PAP activity and regulate Star–PAP target messages.

    Deciphering the Arginine-Binding Preferences at the Substrate-Binding Groove of Ser/Thr Kinases by Computational Surface Mapping

    Protein kinases are key signaling enzymes that catalyze the transfer of γ-phosphate from an ATP molecule to a phospho-accepting residue in the substrate. Unraveling the molecular features that govern the preference of kinases for particular residues flanking the phosphoacceptor is important for understanding kinase specificities toward their substrates and for designing substrate-like peptidic inhibitors. We applied ANCHORSmap, a new fragment-based computational approach for mapping amino acid side chains on protein surfaces, to predict and characterize the preference of kinases toward Arginine binding. We focus on positions P−2 and P−5, commonly occupied by Arginine (Arg) in substrates of basophilic Ser/Thr kinases. The method accurately identified all the P−2/P−5 Arg binding sites previously determined by X-ray crystallography and produced Arg preferences that corresponded to those experimentally found by peptide arrays. The predicted Arg-binding positions and their associated pockets were analyzed in terms of shape, physicochemical properties, amino acid composition, and in-silico mutagenesis, providing structural rationalization for previously unexplained trends in kinase preferences toward Arg moieties. This methodology sheds light on several kinases that were described in the literature as having non-trivial preferences for Arg, and provides some surprising departures from the prevailing views regarding residues that determine kinase specificity toward Arg. In particular, we found that the preference for a P−5 Arg is not necessarily governed by the 170/230 acidic pair, as was previously assumed, but by several different pairs of acidic residues, selected from positions 133, 169, and 230 (PKA numbering). The acidic residue at position 230 serves as a pivotal element in recognizing Arg from both the P−2 and P−5 positions.