Cancer proteogenomics : connecting genotype to molecular phenotype
The central dogma of molecular biology describes the one-way road from DNA to RNA and finally to protein. Yet, how this flow of information encoded in DNA as genes (genotype) is regulated in order to produce the observable traits of an individual (phenotype) remains unanswered. Recent advances in high-throughput data, i.e., ‘omics’, have allowed the quantification of DNA, RNA and protein levels, leading to integrative analyses that essentially probe the central dogma along all of its constituent molecules. Evidence from these analyses suggests that mRNA abundances are at best a moderate proxy for proteins, which are the main functional units of cells and thus closer to the phenotype.
Cancer proteogenomic studies consider the ensemble of proteins, the so-called proteome, as the readout of the functional molecular phenotype and investigate how it is influenced by upstream events, for example DNA copy number alterations. In typical proteogenomic studies, however, the identified proteome is a simplification of its actual composition, because the methods disregard events such as splicing, proteolytic cleavage and post-translational modifications that generate unique protein species – proteoforms.
The scope of this thesis is to study the proteome diversity in terms of: a) the complex genetic background of three tumor types, i.e. breast cancer, childhood acute lymphoblastic leukemia and lung cancer, and b) the proteoform composition, describing a computational method for detecting protein species based on their distinct quantitative profiles.
In Paper I, we present a proteogenomic landscape of 45 breast cancer samples representative of the five PAM50 intrinsic subtypes. We studied the effect of copy number alterations (CNA) on mRNA and protein levels, overlaying a public dataset of drug-perturbed protein degradation.
In Paper II, we describe a proteogenomic analysis of 27 B-cell precursor acute lymphoblastic leukemia clinical samples that compares high hyperdiploid versus ETV6/RUNX1-positive cases. We examined the impact of the amplified chromosomes on mRNA and protein abundance, specifically the linear trend between the amplification level and the dosage effect. Moreover, we investigated mRNA-protein quantitative discrepancies with regard to post-transcriptional and post-translational effects such as mRNA/protein stability and miRNA targeting.
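As a minimal illustration of the kind of linear dosage trend mentioned for Paper II, the sketch below fits a least-squares slope of log2 abundance against chromosome copy number. The function name `ols_slope` and all numbers are hypothetical toy values chosen for illustration; they are not data or methods from the thesis.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Hypothetical toy values: copy number of a chromosome and the mean log2
# abundance change of its genes at the mRNA and the protein level.
copy_number = [2, 3, 4]
mrna_log2 = [0.0, 0.5, 1.0]
protein_log2 = [0.0, 0.3, 0.6]

mrna_slope = ols_slope(copy_number, mrna_log2)
protein_slope = ols_slope(copy_number, protein_log2)
print(f"mRNA slope: {mrna_slope:.2f}, protein slope: {protein_slope:.2f}")
```

In this toy setup, a smaller slope at the protein level than at the mRNA level would correspond to a weaker dosage effect on proteins.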
In Paper III, we describe a proteogenomic cohort of 141 non-small cell lung cancer clinical samples. We used clustering methods to identify six distinct proteome-based subtypes. We integrated the protein abundances in pathways using protein-protein correlation networks, bioinformatically deconvoluted the immune composition and characterized the neoantigen burden.
In Paper IV, we developed a pipeline for proteoform detection from bottom-up mass-spectrometry-based proteomics. Using an in-depth proteomics dataset of 18 cancer cell lines, we identified proteoforms related to splice variant peptides supported by RNA-seq data.
This thesis adds to the existing literature on proteogenomic studies by analyzing the tumor proteome and its regulation along the flow of the central dogma of molecular biology. It is anticipated that some of these findings will lead to novel insights about tumor biology and set the stage for clinical applications that improve current cancer patient care.
Proteogenomics and Hi-C reveal transcriptional dysregulation in high hyperdiploid childhood acute lymphoblastic leukemia.
Hyperdiploidy, i.e. gain of whole chromosomes, is one of the most common genetic features of childhood acute lymphoblastic leukemia (ALL), but its pathogenetic impact is poorly understood. Here, we report a proteogenomic analysis on matched datasets from genomic profiling, RNA-sequencing, and mass spectrometry-based analysis of >8,000 genes and proteins as well as Hi-C of primary patient samples from hyperdiploid and ETV6/RUNX1-positive pediatric ALL. We show that CTCF and cohesin, which are master regulators of chromatin architecture, display low expression in hyperdiploid ALL. In line with this, a general genome-wide dysregulation of gene expression in relation to topologically associating domain (TAD) borders was seen in the hyperdiploid group. Furthermore, Hi-C of a limited number of hyperdiploid childhood ALL cases revealed that 2/4 cases displayed a clear loss of TAD boundary strength and 3/4 showed reduced insulation at TAD borders, with putative leukemogenic effects.
Kidney Issues Associated with COVID-19 Disease
Infection with SARS-CoV-2 and the resulting COVID-19 can cause both lung and kidney damage. SARS-CoV-2 can directly infect renal cells expressing ACE2 receptors, resulting in kidney damage, and acute kidney injury (AKI) has been reported in hospitalized COVID-19 patients. The pathophysiology of COVID-19-associated AKI is multifactorial. Local and systemic inflammation, immune system dysregulation, blood coagulation disorders, and activation of the renin-angiotensin-aldosterone system (RAAS) are factors that contribute to the development of AKI in COVID-19. COVID-19 patients with kidney involvement have a poor prognosis, and patients with chronic kidney disease (CKD) infected with SARS-CoV-2 have an increased mortality risk. CKD patients with COVID-19 may develop end-stage renal disease (ESRD) requiring dialysis. In particular, patients infected with SARS-CoV-2 and requiring dialysis, as well as patients who have undergone kidney transplantation, have an increased risk of mortality and require special consideration. Nephrologists and infectious disease specialists face several clinical dilemmas in the prophylaxis and treatment of CKD patients with COVID-19. This entry presents recent data showing the effects of COVID-19 on the kidneys and CKD patients and the challenges in the management of CKD patients with COVID-19, and discusses treatment strategies for these patients.
Prediction model for drug response of acute myeloid leukemia patients
Despite some encouraging successes, predicting the therapy response of acute myeloid leukemia (AML) patients remains highly challenging due to tumor heterogeneity. Here we aim to develop and validate MDREAM, a robust ensemble-based prediction model for drug response in AML based on an integration of omics data, including mutations and gene expression, and large-scale drug testing. Briefly, MDREAM is first trained in the BeatAML cohort (n = 278), and then validated in the BeatAML (n = 183) and two external cohorts, including a Swedish AML cohort (n = 45) and a relapsed/refractory acute leukemia cohort (n = 12). The final prediction is based on 122 ensemble models, each corresponding to a drug. A confidence score metric is used to convey the uncertainty of predictions; among predictions with a confidence score >0.75, the validated proportion of good responders is 77%. The Spearman correlations between the predicted and the observed drug response are 0.68 (95% CI: [0.64, 0.68]) in the BeatAML validation set, -0.49 (95% CI: [-0.53, -0.44]) in the Swedish cohort and 0.59 (95% CI: [0.51, 0.67]) in the relapsed/refractory cohort. A web-based implementation of MDREAM is publicly available at https://www.meb.ki.se/shiny/truvu/MDREAM/.
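For readers unfamiliar with the evaluation metric quoted above, the following is a minimal sketch of a Spearman rank correlation between predicted and observed drug responses, written in plain Python. It is not the MDREAM implementation; the function names and the toy values are assumptions made purely for illustration.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values, not taken from any cohort.
predicted = [0.2, 0.5, 0.9, 0.4]
observed = [0.1, 0.6, 0.8, 0.3]
rho = spearman(predicted, observed)
print(f"Spearman rho = {rho:.2f}")
```

Because the statistic depends only on ranks, it measures whether samples predicted to respond more strongly are also observed to respond more strongly, regardless of the scale of either measurement.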
Proteogenomics of non-small cell lung cancer reveals molecular subtypes associated with specific therapeutic targets and immune-evasion mechanisms
Despite major advancements in lung cancer treatment, long-term survival is still rare and a deeper understanding of molecular phenotypes would allow the identification of specific cancer dependencies and immune-evasion mechanisms. Here we performed in-depth mass-spectrometry-based proteogenomic analysis of 141 tumors representing all major histologies of non-small cell lung cancer (NSCLC). We identified six distinct proteome subtypes with striking differences in immune cell composition and subtype-specific expression of immune checkpoints. Unexpectedly, high neoantigen burden was linked to global hypomethylation and complex neoantigens mapped to genomic regions, such as endogenous retroviral elements and introns, in immune-cold subtypes. Further, we linked immune evasion with LAG-3 via STK11 mutation-dependent HNF1A activation and FGL1 expression. Finally, we develop a data-independent acquisition mass-spectrometry-based NSCLC subtype classification method, validate it in an independent cohort of 208 NSCLC cases and demonstrate its clinical utility by analyzing an additional cohort of 84 late-stage NSCLC biopsy samples.
Breast cancer quantitative proteome and proteogenomic landscape
In the preceding decades, molecular characterization has revolutionized breast cancer (BC) research and therapeutic approaches. Presented herein, an unbiased analysis of breast tumor proteomes, inclusive of 9995 proteins quantified across all tumors, for the first time recapitulates BC subtypes. Additionally, poor-prognosis basal-like and luminal B tumors are further subdivided by immune component infiltration, suggesting the current classification is incomplete. Proteome-based networks distinguish functional protein modules for breast tumor groups, with co-expression of EGFR and MET marking ductal carcinoma in situ regions of normal-like tumors and lending to a more accurate classification of this poorly defined subtype. Genes included within prognostic mRNA panels have significantly higher than average mRNA-protein correlations, and gene copy number alterations are dampened at the protein level, underscoring the value of proteome quantification for prognostication and phenotypic classification. Furthermore, protein products mapping to non-coding genomic regions are identified, highlighting a potential new class of tumor-specific immunotherapeutic targets.
Proteogenomics refines the molecular classification of chronic lymphocytic leukemia
Cancer heterogeneity at the proteome level may explain differences in therapy response and prognosis beyond the currently established genomic and transcriptomic-based diagnostics. The relevance of proteomics for disease classifications remains to be established in clinically heterogeneous cancer entities such as chronic lymphocytic leukemia (CLL). Here, we characterize the proteome and transcriptome alongside genetic and ex-vivo drug response profiling in a clinically annotated CLL discovery cohort (n = 68). Unsupervised clustering of the proteome data reveals six subgroups. Five of these proteomic groups are associated with genetic features, while one group is only detectable at the proteome level. This new group is characterized by accelerated disease progression, high spliceosomal protein abundances associated with aberrant splicing, and low B cell receptor signaling protein abundances (ASB-CLL). Classifiers developed to identify ASB-CLL based on its characteristic proteome or splicing signature in two independent cohorts (n = 165, n = 169) confirm that ASB-CLL comprises about 20% of CLL patients. The inferior overall survival in ASB-CLL is also independent of both TP53- and IGHV mutation status. Our multi-omics analysis refines the classification of CLL and highlights the potential of proteomics to improve cancer patient stratification beyond genetic and transcriptomic profiling. Proteomics can be used to refine cancer classification. Here, the authors characterise chronic lymphocytic leukaemia patients by proteogenomics, and identify a subtype of patients with poor prognosis associated with aberrant B cell receptor signalling.
Breast cancer quantitative proteome and proteogenomic landscape
Gene expression profiles can classify breast cancer into five clinically relevant subtypes. Here, the authors perform an in-depth quantitative profiling of the proteome of 45 breast tumors, and show they can recapitulate the transcriptome-based classifications and identify many potentially antigenic tumour-specific peptides.