85 research outputs found

    Direct and Absolute Quantification of over 1800 Yeast Proteins via Selected Reaction Monitoring

    Defining intracellular protein concentration is critical in molecular systems biology. Although strategies for determining relative protein changes are available, defining robust absolute values in copies per cell has proven significantly more challenging. Here we present a reference data set quantifying over 1800 Saccharomyces cerevisiae proteins by direct means using protein-specific stable-isotope labeled internal standards and selected reaction monitoring (SRM) mass spectrometry, far exceeding any previous study. This was achieved by careful design of over 100 QconCAT recombinant proteins as standards, defining 1167 proteins in terms of copies per cell and upper limits on a further 668, with robust CVs routinely less than 20%. The selected reaction monitoring-derived proteome is compared with existing quantitative data sets, highlighting the disparities between methodologies. Coupled with a quantification of the transcriptome by RNA-seq taken from the same cells, these data support revised estimates of several fundamental molecular parameters: a total protein count of ∼100 million molecules-per-cell, a median of ∼1000 proteins-per-transcript, and a linear model of protein translation explaining 70% of the variance in translation rate. This work contributes a “gold-standard” reference yeast proteome (including 532 values based on high quality, dual peptide quantification) that can be widely used in systems models and for other comparative studies. Reliable and accurate quantification of the proteins present in a cell or tissue remains a major challenge for post-genome scientists. Proteins are the primary functional molecules in biological systems and knowledge of their abundance and dynamics is an important prerequisite to a complete understanding of natural physiological processes, or dysfunction in disease. 
Accordingly, much effort has been spent in the development of reliable, accurate and sensitive techniques to quantify the cellular proteome, the complement of proteins expressed at a given time under defined conditions (1). Moreover, the ability to model a biological system and thus characterize it in kinetic terms requires that protein concentrations be defined in absolute numbers (2, 3). Given the high demand for accurate quantitative proteome data sets, there has been a continual drive to develop methodology to accomplish this, typically using mass spectrometry (MS) as the analytical platform. Many recent studies have highlighted the capabilities of MS to provide good coverage of the proteome at high sensitivity, often using yeast as a demonstrator system (4–10), suggesting that quantitative proteomics has now “come of age” (1). However, given that MS is not inherently quantitative, most of the approaches produce relative quantitation and do not typically measure the absolute concentrations of individual molecular species by direct means. For the yeast proteome, epitope tagging studies using green fluorescent protein or tandem affinity purification tags provide an alternative to MS. Here, collections of modified strains are generated that incorporate a detectable, and therefore quantifiable, tag that supports immunoblotting or fluorescence techniques (11, 12). However, such strategies for copies per cell (cpc) quantification rely on genetic manipulation of the host organism and hence do not quantify endogenous, unmodified protein. Similarly, the tagging can alter protein levels, in some instances hindering protein expression completely (11). Even so, epitope tagging methods have been of value to the community, yielding high coverage quantitative data sets for the majority of the yeast proteome (11, 12). MS-based methods do not rely on such nonendogenous labels, and can reach genome-wide levels of coverage.
Accurate estimation of absolute concentrations, i.e. protein copy number per cell, also usually necessitates the use of (one or more) external or internal standards from which to derive absolute abundance (4). Examples include a comprehensive quantification of the Leptospira interrogans proteome that used a 19-protein subset quantified using selected reaction monitoring (SRM) to calibrate their label-free data (8, 13). It is worth noting that epitope tagging methods, although also absolute, rely on a very limited set of standards for the quantitative western blots and necessitate incorporation of a suitable immunogenic tag (11). Other recent, innovative approaches exploiting total ion signal and internal scaling to estimate protein cellular abundance (10, 14) avoid the use of internal standards, though they do rely on targeted proteomic data to validate their approach. The use of targeted SRM strategies to derive proteomic calibration standards highlights its advantages in comparison to label-free in terms of accuracy, precision, dynamic range and limit of detection and has gained currency for its reliability and sensitivity (3, 15–17). Indeed, SRM is often referred to as the “gold standard proteomic quantification method,” being particularly well-suited when the proteins to be quantified are known, when appropriate surrogate peptides for protein quantification can be selected a priori, and matched with stable isotope-labeled (SIL) standards (18–20). In combination with SIL peptide standards that can be generated through a variety of means (3, 15), SRM can be used to quantify low copy number proteins, reaching down to ∼50 cpc in yeast (5).
However, although SRM methodology has been used extensively for S. cerevisiae protein quantification by us and others (19, 21, 22), it has not been used for large protein cohorts because of the requirement to generate the large numbers of attendant SIL peptide standards; the largest published data set is only for a few tens of proteins. It remains a challenge therefore to robustly quantify an entire eukaryotic proteome in absolute terms by direct means using targeted MS and this is the focus of our present study, the Census Of the Proteome of Yeast (CoPY). We present here direct and absolute quantification of nearly 2000 endogenous proteins from S. cerevisiae grown in steady state in a chemostat culture, using the SRM-based QconCAT approach. Although arguably not quantification of the entire proteome, this represents an accurate and rigorous collection of direct yeast protein quantifications, providing a gold-standard data set of endogenous protein levels for future reference and comparative studies. The highly reproducible SIL-SRM MS data, with robust CVs typically less than 20%, are compared with other extant data sets that were obtained via alternative analytical strategies. We also report a matched high quality transcriptome from the same cells using RNA-seq, which supports additional calculations including a refined estimate of the total protein content in yeast cells, and a simple linear model of translation explaining 70% of the variance between RNA and protein levels in yeast chemostat cultures. These analyses confirm the validity of our data and approach, which we believe represents a state-of-the-art absolute quantification compendium of a significant proportion of a model eukaryotic proteome.

    Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images

    Beyond sample curation and basic pathologic characterization, the digitized H&E-stained images of TCGA samples remain underutilized. To highlight this resource, we present mappings of tumor-infiltrating lymphocytes (TILs) based on H&E images from 13 TCGA tumor types. These TIL maps are derived through computational staining using a convolutional neural network trained to classify patches of images. Affinity propagation revealed local spatial structure in TIL patterns and correlation with overall survival. TIL map structural patterns were grouped using standard histopathological parameters. These patterns are enriched in particular T cell subpopulations derived from molecular measures. TIL densities and spatial structure were differentially enriched among tumor types, immune subtypes, and tumor molecular subtypes, implying that spatial infiltrate state could reflect particular tumor cell aberration states. Obtaining spatial lymphocytic patterns linked to the rich genomic characterization of TCGA samples demonstrates one use for the TCGA image archives with insights into the tumor-immune microenvironment.

    Genomic, Pathway Network, and Immunologic Features Distinguishing Squamous Carcinomas

    This integrated, multiplatform PanCancer Atlas study co-mapped and identified distinguishing molecular features of squamous cell carcinomas (SCCs) from five sites associated with smokin

    Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context

    Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts.

    Pan-cancer Alterations of the MYC Oncogene and Its Proximal Network across the Cancer Genome Atlas

    Although the MYC oncogene has been implicated in cancer, a systematic assessment of alterations of MYC, related transcription factors, and co-regulatory proteins, forming the proximal MYC network (PMN), across human cancers is lacking. Using computational approaches, we define genomic and proteomic features associated with MYC and the PMN across the 33 cancers of The Cancer Genome Atlas. Pan-cancer, 28% of all samples had at least one of the MYC paralogs amplified. In contrast, the MYC antagonists MGA and MNT were the most frequently mutated or deleted members, proposing a role as tumor suppressors. MYC alterations were mutually exclusive with PIK3CA, PTEN, APC, or BRAF alterations, suggesting that MYC is a distinct oncogenic driver. Expression analysis revealed MYC-associated pathways in tumor subtypes, such as immune response and growth factor signaling; chromatin, translation, and DNA replication/repair were conserved pan-cancer. This analysis reveals insights into MYC biology and is a reference for biomarkers and therapeutics for cancers with alterations of MYC or the PMN.

    Materializing digital collecting: an extended view of digital materiality

    If digital objects are abundant and ubiquitous, why should consumers pay for, much less collect them? The qualities of digital code present numerous challenges for collecting, yet digital collecting can and does occur. We explore the role of companies in constructing digital consumption objects that encourage and support collecting behaviours, identifying material configuration techniques that materialise these objects as elusive and authentic. Such techniques, we argue, may facilitate those pleasures of collecting otherwise absent in the digital realm. We extend theories of collecting by highlighting the role of objects and the companies that construct them in materialising digital collecting. More broadly, we extend theories of digital materiality by highlighting processes of digital material configuration that occur in the pre-objectification phase of materialisation, acknowledging the role of marketing and design in shaping the qualities exhibited by digital consumption objects and consequently related consumption behaviours and experiences

    Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas

    Sarcomas are a broad family of mesenchymal malignancies exhibiting remarkable histologic diversity. We describe the multi-platform molecular landscape of 206 adult soft tissue sarcomas representing 6 major types. Along with novel insights into the biology of individual sarcoma types, we report three overarching findings: (1) unlike most epithelial malignancies, these sarcomas (excepting synovial sarcoma) are characterized predominantly by copy-number changes, with low mutational loads and only a few genes (, , ) highly recurrently mutated across sarcoma types; (2) within sarcoma types, genomic and regulomic diversity of driver pathways defines molecular subtypes associated with patient outcome; and (3) the immune microenvironment, inferred from DNA methylation and mRNA profiles, associates with outcome and may inform clinical trials of immune checkpoint inhibitors. Overall, this large-scale analysis reveals previously unappreciated sarcoma-type-specific changes in copy number, methylation, RNA, and protein, providing insights into refining sarcoma therapy and relationships to other cancer types

    Integrated genomic characterization of pancreatic ductal adenocarcinoma

    We performed integrated genomic, transcriptomic, and proteomic profiling of 150 pancreatic ductal adenocarcinoma (PDAC) specimens, including samples with characteristic low neoplastic cellularity. Deep whole-exome sequencing revealed recurrent somatic mutations in KRAS, TP53, CDKN2A, SMAD4, RNF43, ARID1A, TGFβR2, GNAS, RREB1, and PBRM1. KRAS wild-type tumors harbored alterations in other oncogenic drivers, including GNAS, BRAF, CTNNB1, and additional RAS pathway genes. A subset of tumors harbored multiple KRAS mutations, with some showing evidence of biallelic mutations. Protein profiling identified a favorable prognosis subset with low epithelial-mesenchymal transition and high MTOR pathway scores. Associations of non-coding RNAs with tumor-specific mRNA subtypes were also identified. Our integrated multi-platform analysis reveals a complex molecular landscape of PDAC and provides a roadmap for precision medicine