
    Decentralized Non-Convex Learning with Linearly Coupled Constraints

    Motivated by the need for decentralized learning, this paper aims at designing a distributed algorithm for solving nonconvex problems with general linear constraints over a multi-agent network. In the considered problem, each agent owns some local information and a local variable for jointly minimizing a cost function, but the local variables are coupled by linear constraints. Most existing methods for such problems are applicable only to convex problems or to problems with specific linear constraints; a distributed algorithm for the nonconvex setting with general linear constraints is still lacking. To tackle this problem, we propose a new algorithm, called the "proximal dual consensus" (PDC) algorithm, which combines a proximal technique with a dual consensus method. We establish theoretical convergence conditions and show that the proposed PDC algorithm converges to an ϵ-Karush-Kuhn-Tucker solution within O(1/ϵ) iterations. To reduce computation, the PDC algorithm can instead perform a cheap gradient descent step per iteration while preserving the same O(1/ϵ) iteration complexity. Numerical results demonstrate the good performance of the proposed algorithms on a regression problem and a classification problem over a network in which agents have only partial observations of the data features.
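    The dual consensus component relies on agents agreeing on shared quantities through neighbor-to-neighbor communication. The abstract does not specify the PDC updates, so the following is only a minimal illustrative sketch of the underlying gossip-style consensus-averaging primitive, using a hypothetical four-agent ring network:

```python
# Minimal sketch of decentralized consensus averaging, the communication
# primitive that dual consensus methods build on (illustrative only; this
# is NOT the PDC algorithm from the paper). Each agent repeatedly moves
# toward the average of its neighbors' values.

def consensus_average(values, neighbors, num_iters=200, step=0.5):
    """Run gossip-style averaging over an undirected graph.

    values:    list of each agent's local scalar
    neighbors: dict mapping agent index -> list of neighbor indices
    """
    x = list(values)
    for _ in range(num_iters):
        x_new = []
        for i, xi in enumerate(x):
            nbr_avg = sum(x[j] for j in neighbors[i]) / len(neighbors[i])
            x_new.append(xi + step * (nbr_avg - xi))
        x = x_new
    return x

# Four agents on a ring; all agents converge to the global mean (2.5).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
result = consensus_average([1.0, 2.0, 3.0, 4.0], ring)
```

    On this ring the disagreement contracts geometrically, so after a few hundred rounds every agent holds the network-wide average even though no agent ever sees all the data.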

    LawBench: Benchmarking Legal Knowledge of Large Language Models

    Large language models (LLMs) have demonstrated strong capabilities in various respects. However, when applying them to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal tasks. To address this gap, we propose LawBench, a comprehensive evaluation benchmark. LawBench has been meticulously crafted to provide a precise assessment of LLMs' legal capabilities at three cognitive levels: (1) legal knowledge memorization: whether LLMs can memorize needed legal concepts, articles, and facts; (2) legal knowledge understanding: whether LLMs can comprehend entities, events, and relationships within legal text; (3) legal knowledge application: whether LLMs can properly utilize their legal knowledge and perform the reasoning steps necessary to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction, and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs, and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings certain improvements, we are still a long way from obtaining usable and reliable LLMs for legal tasks. All data, model predictions, and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.

    DomPep—A General Method for Predicting Modular Domain-Mediated Protein-Protein Interactions

    Protein-protein interactions (PPIs) are frequently mediated by the binding of a modular domain in one protein to a short, linear peptide motif in its partner. The advent of proteomic methods such as peptide and protein arrays has led to the accumulation of a wealth of interaction data for modular interaction domains. Although several computational programs have been developed to predict modular domain-mediated PPI events, they are often restricted to a given domain type. We describe DomPep, a method that can potentially be used to predict PPIs mediated by any modular domain. DomPep combines proteomic data with sequence information to achieve high accuracy and high coverage in PPI prediction. Proteomic binding data were employed to determine a simple yet novel parameter, Ligand-Binding Similarity, which, in turn, is used to calibrate Domain Sequence Identity and Position-Weighted-Matrix distance, two parameters used in constructing the prediction models. Moreover, DomPep can be used to predict PPIs both for domains with experimental binding data and for those without. Using the PDZ and SH2 domain families as test cases, we show that DomPep can predict PPIs with accuracies superior to existing methods. To evaluate DomPep as a discovery tool, we deployed it to identify interactions mediated by three human PDZ domains. Subsequent in-solution binding assays validated the high accuracy of DomPep in predicting authentic PPIs at the proteome scale. Because DomPep makes use of only interaction data and the primary sequence of a domain, it can be readily expanded to include other types of modular domains.
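    The Position-Weighted-Matrix distance mentioned above presupposes a PWM built from known binding peptides. As a hedged illustration (the example peptides, pseudocount, and log-odds form are generic assumptions, not DomPep's actual parameters), a PWM can be constructed from aligned peptides and used to score candidates like this:

```python
from collections import Counter
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_pwm(peptides, pseudocount=1.0):
    """Build a position weight matrix (log-odds vs. a uniform background)
    from equal-length aligned peptides. Pseudocounts avoid log(0)."""
    length = len(peptides[0])
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        total = len(peptides) + pseudocount * len(AMINO_ACIDS)
        col = {aa: math.log(((counts[aa] + pseudocount) / total) / background)
               for aa in AMINO_ACIDS}
        pwm.append(col)
    return pwm

def score_peptide(pwm, peptide):
    """Sum of per-position log-odds scores for a candidate peptide."""
    return sum(col[aa] for col, aa in zip(pwm, peptide))

# Invented 3-mer binding peptides for a hypothetical domain.
pwm = build_pwm(["TKV", "TRV", "SKV", "TKI"])
```

    A consensus-like peptide (e.g. "TKV") then scores higher than an unrelated one, which is the basic signal a PWM-distance calibration would exploit.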

    Phosphorylation of Mcm2 modulates Mcm2–7 activity and affects the cell’s response to DNA damage

    The S-phase kinase DDK controls DNA replication through phosphorylation of the replicative helicase Mcm2–7. We show that phosphorylation of Mcm2 at S164 and S170 is not essential for viability. However, the relevance of Mcm2 phosphorylation is demonstrated by the sensitivity of a strain carrying alanine at these positions (mcm2AA) to methyl methanesulfonate (MMS) and caffeine. Consistent with a role for Mcm2 phosphorylation in the response to DNA damage, the mcm2AA strain accumulates more RPA foci than wild type. An allele with the phosphomimetic mutations S164E and S170E (mcm2EE) suppresses the MMS and caffeine sensitivity caused by deficiencies in DDK function. In vitro, phosphorylation of Mcm2, or the Mcm2EE mutation, reduces the helicase activity of Mcm2–7 while increasing its DNA binding. The reduced helicase activity likely results from the increased DNA binding, since relaxing DNA binding with salt restores helicase activity. The finding that the ATP-site mutant mcm2K549R has higher DNA binding and lower ATPase activity than mcm2EE, yet like mcm2AA confers drug sensitivity, supports a model whereby a specific range of Mcm2–7 activity is required in the response to MMS and caffeine. We propose that phosphorylation of Mcm2 fine-tunes the activity of Mcm2–7, which in turn modulates DNA replication in response to DNA damage.

    Identification and Characterization of a Leucine-Rich Repeat Kinase 2 (LRRK2) Consensus Phosphorylation Motif

    Mutations in LRRK2 (leucine-rich repeat kinase 2) have been identified as major genetic determinants of Parkinson's disease (PD). The most prevalent mutation, G2019S, increases LRRK2's kinase activity; therefore, understanding the sites and substrates that LRRK2 phosphorylates is critical to understanding its role in disease aetiology. Since the physiological substrates of this kinase are unknown, we set out to reveal potential targets of LRRK2 G2019S by identifying its favored phosphorylation motif. A non-biased screen of an oriented peptide library elucidated F/Y-x-T-x-R/K as the core dependent substrate sequence. Bioinformatic analysis of the consensus phosphorylation motif identified several novel candidate substrates that potentially function in neuronal pathophysiology. Peptides corresponding to the most PD-relevant proteins were efficiently phosphorylated by LRRK2 in vitro. Interestingly, the phosphomotif was also identified within LRRK2 itself. Autophosphorylation was detected by mass spectrometry and biochemical means at the only F-x-T-x-R site (Thr1410) within LRRK2. The relevance of this site was assessed by measuring the effects of mutations on autophosphorylation, kinase activity, GTP binding, GTP hydrolysis, and LRRK2 multimerization. These studies indicate that modification of Thr1410 subtly regulates GTP hydrolysis by LRRK2, with minimal effects on the other parameters measured. Together, the identification of LRRK2's phosphorylation consensus motif and the functional consequences of its phosphorylation provide insights into downstream LRRK2-signaling pathways.
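    Given the reported consensus F/Y-x-T-x-R/K, candidate sites can be located in any protein sequence with a simple overlapping regular-expression scan. A minimal sketch (the sequence below is invented for illustration; only the motif pattern comes from the abstract):

```python
import re

# F/Y-x-T-x-R/K from the abstract, where "x" is any residue; the central
# T is the candidate phosphoacceptor (match start + 2). The lookahead
# makes the scan tolerant of overlapping matches.
MOTIF = re.compile(r"(?=([FY].T.[RK]))")

def find_motif_sites(sequence):
    """Return (0-based start, matched 5-mer) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence)]

# Invented toy sequence containing two motif instances.
hits = find_motif_sites("MKFATARLLYQTGKEND")
```

    On the toy sequence this reports the F-anchored site at position 2 and the Y-anchored site at position 9; a genuine candidate search would of course run over proteome-scale sequence databases.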

    Computational Prediction and Experimental Verification of New MAP Kinase Docking Sites and Substrates Including Gli Transcription Factors

    To fully understand protein kinase networks, new methods are needed to identify regulators and substrates of kinases, especially for weakly expressed proteins. Here we have developed a hybrid computational search algorithm that combines machine learning and expert knowledge to identify kinase docking sites, and used this algorithm to search the human genome for novel MAP kinase substrates and regulators, focusing on the JNK family of MAP kinases. Predictions were tested by peptide array, followed by rigorous biochemical verification with in vitro binding and kinase assays on wild-type and mutant proteins. Using this procedure, we found new ‘D-site’ class docking sites in previously known JNK substrates (hnRNP-K, PPM1J/PP2Czeta), as well as in new JNK-interacting proteins (MLL4, NEIL1). Finally, we identified new D-site-dependent MAPK substrates, including the hedgehog-regulated transcription factors Gli1 and Gli3, suggesting that a direct connection between MAP kinase and hedgehog signaling may occur at the level of these key regulators. These results demonstrate that a genome-wide search for MAP kinase docking sites can be used to find new docking sites and substrates.

    Federated Learning-Based Multi-Energy Load Forecasting Method Using CNN-Attention-LSTM Model

    Integrated Energy Microgrids (IEMs) have emerged as a critical energy utilization mechanism for alleviating environmental and economic pressures. As part of demand-side energy prediction, multi-energy load forecasting is a vital precondition for the planning and operation scheduling of IEMs. To increase data diversity and improve model generalization while protecting data privacy, this paper proposes a method that uses a CNN-Attention-LSTM model based on federated learning to forecast the multi-energy load of IEMs. CNN-Attention-LSTM serves as the global model for extracting features, while federated learning (FL) lets IEMs train a forecasting model in a distributed manner without sharing local data. This paper compares the individual, central, and federated models under four federated learning strategies (FedAvg, FedAdagrad, FedYogi, and FedAdam). Moreover, since FL relies on communication technology, the impact of false data injection attacks (FDIA) is also investigated. The results show that the federated models achieve accuracy comparable to the central model while attaining higher precision than the individual models, with FedAdagrad giving the best prediction performance. Furthermore, FedAdagrad maintains stability when attacked by false data injection.
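    Of the aggregation strategies compared, FedAvg is the simplest: the server replaces the global parameters with a data-size-weighted average of the client parameters. A minimal sketch, with plain Python lists standing in for the CNN-Attention-LSTM weights (which the abstract does not specify):

```python
# Minimal sketch of the FedAvg aggregation step: the server forms a
# data-size-weighted average of client model parameters. Illustrative
# only; real deployments aggregate full tensors over many rounds.

def fed_avg(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size.

    client_weights: list of parameter lists (one flat list per client)
    client_sizes:   number of local training samples per client
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, size in zip(client_weights, client_sizes):
        for k in range(n_params):
            global_weights[k] += (size / total) * weights[k]
    return global_weights

# Two clients; the client with more data pulls the average toward it.
avg = fed_avg([[1.0, 2.0], [3.0, 4.0]], client_sizes=[1, 3])
```

    Only these parameter averages cross the network, which is how FL preserves data privacy, and also why injecting false parameters (the FDIA scenario studied in the paper) can perturb the global model.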

    Progress on China nuclear data processing code system

    China is developing the nuclear data processing code Ruler, which produces multi-group cross sections and related quantities from evaluated nuclear data in the ENDF format [1]. Ruler includes modules for reconstructing cross sections over the entire energy range, generating Doppler-broadened cross sections at a given temperature, producing effective self-shielded cross sections in the unresolved resonance range, calculating scattering cross sections in the thermal energy range, generating group cross sections and matrices, and preparing WIMS-D format data files for the reactor physics code WIMS-D [2]. Ruler is written in Fortran-90 and has been tested on 32-bit computers under the Windows XP and Linux operating systems. Verification of Ruler has been performed by comparison with results obtained from the NJOY99 processing code [3]. Validation has been performed using the WIMSD5B code.
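    The central operation in generating group cross sections is a flux-weighted collapse, σ_g = ∫_g σ(E)φ(E) dE / ∫_g φ(E) dE over each energy group g. A minimal sketch with invented pointwise data (not Ruler's actual algorithm, which must also handle resonance reconstruction and self-shielding):

```python
# Illustrative flux-weighted group collapse of pointwise cross sections.
# Trapezoidal integration on a shared energy grid; all data are made up.

def collapse_to_groups(energies, sigma, flux, group_bounds):
    """Collapse pointwise cross sections to group-averaged values.

    energies, sigma, flux: ascending pointwise data on the same grid
    group_bounds: ascending group boundary energies (len = n_groups + 1)
    """
    def trapz(ys, xs):
        return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
                   for i in range(len(xs) - 1))

    groups = []
    for lo, hi in zip(group_bounds[:-1], group_bounds[1:]):
        idx = [i for i, e in enumerate(energies) if lo <= e <= hi]
        e = [energies[i] for i in idx]
        # Numerator: reaction-rate integral; denominator: flux integral.
        num = trapz([sigma[i] * flux[i] for i in idx], e)
        den = trapz([flux[i] for i in idx], e)
        groups.append(num / den)
    return groups

# Sanity check: a constant cross section collapses to itself for any flux.
energies = [1.0, 2.0, 3.0, 4.0, 5.0]
result = collapse_to_groups(energies, [2.0] * 5, [1.0, 0.8, 0.5, 0.3, 0.2],
                            group_bounds=[1.0, 3.0, 5.0])
```

    A production code performs this collapse over hundreds of groups with pointwise grids of many thousands of energies, but the weighting identity shown is the same.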