22 research outputs found

    BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

    Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

    Tandem affinity purification to identify cytosolic and nuclear Gβγ-interacting proteins

    It has become clear in recent years that the Gβγ subunits of heterotrimeric G proteins serve broad roles in the regulation of cellular activity and interact with many proteins in different subcellular locations, including the nucleus. Protein affinity purification is a common method to identify and confirm protein interactions. When used in conjunction with mass spectrometry, it can identify novel protein interactions with a given bait protein. The tandem affinity purification (TAP) technique identifies partner proteins bound to a tagged bait protein. Combined with protocols that enrich the nuclear fraction of whole-cell lysate through sucrose cushions, TAP allows for purification of interacting proteins found specifically in the nucleus. Here we describe the use of the TAP technique on cytosolic and nuclear lysates to identify, through mass spectrometry, candidate proteins that bind to Gβ1 subunits.

    Computational framework for analysis of prey–prey associations in interaction proteomics identifies novel human protein–protein interactions and networks

    Large-scale protein-protein interaction data sets have been generated for several species including yeast and human and have enabled the identification, quantification, and prediction of cellular molecular networks. Affinity purification-mass spectrometry (AP-MS) is the preeminent methodology for large-scale analysis of protein complexes, performed by immunopurifying a specific “bait” protein and its associated “prey” proteins. The analysis and interpretation of AP-MS data sets is, however, not straightforward. In addition, although yeast AP-MS data sets are relatively comprehensive, current human AP-MS data sets only sparsely cover the human interactome. Here we develop a framework for analysis of AP-MS data sets that addresses the issues of noise, missing data, and sparsity of coverage in the context of a current, real-world human AP-MS data set. Our goal is to extend and increase the density of the known human interactome by integrating bait-prey and cocomplexed preys (prey-prey associations) into networks. Our framework incorporates a score for each identified protein, as well as elements of signal processing to improve the confidence of identified protein-protein interactions. We identify many protein networks enriched in known biological processes and functions. In addition, we show that integrated bait-prey and prey-prey interactions can be used to refine network topology and extend known protein networks.