2,223 research outputs found

    The evolution of lung cancer and impact of subclonal selection in TRACERx

    Lung cancer is the leading cause of cancer-associated mortality worldwide. Here we analysed 1,644 tumour regions sampled at surgery or during follow-up from the first 421 patients with non-small cell lung cancer prospectively enrolled into the TRACERx study. This project aims to decipher lung cancer evolution and address the primary study endpoint: determining the relationship between intratumour heterogeneity and clinical outcome. In lung adenocarcinoma, mutations in 22 out of 40 common cancer genes were under significant subclonal selection, including classical tumour initiators such as TP53 and KRAS. We defined evolutionary dependencies between drivers, mutational processes and whole genome doubling (WGD) events. Despite patients having a history of smoking, 8% of lung adenocarcinomas lacked evidence of tobacco-induced mutagenesis. These tumours also had similar detection rates for EGFR mutations and for RET, ROS1, ALK and MET oncogenic isoforms compared with tumours in never-smokers, which suggests that they have a similar aetiology and pathogenesis. Large subclonal expansions were associated with positive subclonal selection. Patients with tumours harbouring recent subclonal expansions, on the terminus of a phylogenetic branch, had significantly shorter disease-free survival. Subclonal WGD was detected in 19% of tumours, and 10% of tumours harboured multiple subclonal WGDs in parallel. Subclonal, but not truncal, WGD was associated with shorter disease-free survival. Copy number heterogeneity was associated with extrathoracic relapse within 1 year after surgery. These data demonstrate the importance of clonal expansion, WGD and copy number instability in determining the timing and patterns of relapse in non-small cell lung cancer and provide a comprehensive clinical cancer evolutionary data resource.

    ENGINEERING HIGH-RESOLUTION EXPERIMENTAL AND COMPUTATIONAL PIPELINES TO CHARACTERIZE HUMAN GASTROINTESTINAL TISSUES IN HEALTH AND DISEASE

    In recent decades, new high-resolution technologies have transformed how scientists study complex cellular processes and the mechanisms responsible for maintaining homeostasis and for the emergence and progression of gastrointestinal (GI) disease. These advances have paved the way for the use of primary human cells in experimental models, which together can mimic specific aspects of the GI tract such as compartmentalized stem-cell zones, gradients of growth factors, and shear stress from fluid flow. The work presented in this dissertation has focused on integrating high-resolution bioinformatics with novel experimental models of the GI epithelium to describe the complex pathophysiology of the human small intestine, colon, and stomach in homeostasis and disease. Here, I used three novel microphysiological systems and developed four computational pipelines to describe comprehensive gene expression patterns of the GI epithelium in various states of health and disease. First, I used single cell RNAseq (scRNAseq) to establish the transcriptomic landscape of the entire epithelium of the small intestine and colon from three human donors, describing cell-type specific gene expression patterns in high resolution. Second, I used single cell and bulk RNAseq to model intestinal absorption of fatty acids and show that fatty acid oxidation is a critical regulator of the flux of long- and medium-chain fatty acids across the epithelium. Third, I used bulk RNAseq and a machine learning model to describe how inflammatory cytokines can regulate proliferation of intestinal stem cells in an experimental model of inflammatory hypoxia. Finally, I developed a high throughput platform that can associate phenotype to gene expression in clonal organoids, providing unprecedented resolution into the relationship between comprehensive gene expression patterns and their accompanying phenotypic effects. 
Through these studies, I have demonstrated how the integration of computational and experimental approaches can measurably advance our understanding of human GI physiology.

    The effect of pattern recognition receptor RIG-I variant expression during mammalian- or avian-adapted influenza A infection and adaptation in the mouse.

    Influenza A virus (IAV) infections are common worldwide, and their seasonal outbreaks cause substantial harm to health and the economy. Influenza becomes even more life-threatening when followed by a secondary bacterial infection, to which children and immunosuppressed people are especially susceptible. Today, the annually adapted influenza vaccination is an important tool for preventing outbreaks of flu. Medical treatment of acute infection is limited, with the exception of therapies targeting the viral proteins neuraminidase and M2. A deeper understanding of the molecular mechanisms of IAV pathogenicity and of antiviral defense, as well as their interaction, can help to refine vaccination strategies and guide the development of more target-specific and effective antiviral drugs. The pattern recognition receptor RIG-I is one of the most important sensors of the innate immune system for detecting foreign RNA. Upon binding RNA ligands such as the panhandle structures of influenza A virus, RIG-I mediates, among other effects, the expression of type I interferon and interleukin-1β in an ATPase-dependent manner. In addition, the binding of RIG-I to the viral panhandle structures has been confirmed in vitro as a further antiviral function that blocks access for the viral polymerase, as first described by Friedemann Weber in 2015. The main aim of this thesis was to investigate the contribution of the different antiviral effects of RIG-I against the influenza A virus in a mouse infection model, and to gain additional insight into the RIG-I blocking function. To this end, one mouse line deficient in RIG-I signaling and another lacking RIG-I expression were established by genetic engineering. 
Additionally, two recombinant influenza strains were generated that carry an adaptation in the viral polymerase gene to either mammalian or avian hosts (polymerase subunit 2 codon 627K and 627E, respectively). Both virus strains were validated against several quality criteria. The recombinant virus strains were used to perform an infection study in RIG-I wild type, signaling-deficient and knockout mice. By investigating the effect of RIG-I variant expression on parameters such as weight reduction, lung virus titer, loss of lung barrier integrity, and interferon and cytokine concentrations in response to influenza A infection, new insights into the antiviral functions of RIG-I were gained. The established mouse lines expressing signaling-deficient or no RIG-I showed no detectable burden attributable to their genotype. As expected, RIG-I-mediated interferon-α induction was abolished in bone marrow derived macrophages of both RIG-I signaling-deficient and RIG-I knockout mice, while it was intact in cells derived from RIG-I wild type mice. Hence, an unburdened mouse line with RIG-I signaling deficiency and one with a RIG-I knockout were generated. The RIG-I PM line is the first of its kind. While the generation of a mammalian-adapted recombinant influenza A strain was successful from the beginning, the generation of the avian-adapted strain was not successful in mammalian cells. Sufficient replication of both strains was achieved in the DF-1 chicken cell line. The resulting stock preparations showed similar characteristics, and the stability of their respective genotypes was confirmed over several passages in different cell lines. While infection of mice with the generated recombinant influenza A strains led to a significant change in the observed infection parameters and cytokine signaling, only a weak effect of RIG-I variant expression on the infection parameters was detectable. 
Additionally, the results hint at a RIG-I-dependent induction of IFN-γ, which has not yet been described in detail. The data also suggest that the antiviral functions of RIG-I may be potently inhibited by the viral nonstructural protein 1 and the mammalian-adapted polymerase variant. Validation of the genetic stability of the virus strain with the avian-adapted polymerase variant in vivo indicates a significant back mutation to the original mammalian-adapted genotype over the course of the infection. This was significantly affected by the time post infection and the type of RIG-I variant expression. A lower rate of back mutation than in RIG-I wild type mice was detected in mice with RIG-I signaling deficiency, and the lowest rate in mice with RIG-I knockout. These findings suggest that both the RIG-I signaling functions and the RIG-I blocking function exert a selective pressure on the influenza A polymerase subunit 2 codon 627. This conclusion is supported by the results of an in vitro competitive infection assay with both virus variants together, which showed a replication advantage of the mammalian-adapted polymerase variant over the avian-adapted variant that is affected by the type of RIG-I expression. Taken together, the results of this study provide deep insights into the interaction between the innate pattern recognition receptor RIG-I and the influenza A virus. The findings suggest that the presence of RIG-I forces viral polymerase variants common in avian hosts to adapt to a mammalian host. Further, the study provides additional data confirming a RIG-I-mediated antiviral effect by blocking the access of the viral polymerase to the panhandle structures of viral RNAs. Additionally, the data suggest a high potency of the nonstructural protein 1 and the mammalian-adapted polymerase subunit 2 codon 627K in preventing the effects of RIG-I. 
The finding that RIG-I may contribute to interferon-γ release should be investigated in future studies, since this interaction is poorly described in the literature but connects two important features of the innate immune system.

    Computational Approaches to Drug Profiling and Drug-Protein Interactions

    Despite substantial increases in R&D spending within the pharmaceutical industry, de novo drug design has become a time-consuming endeavour. High attrition rates led to a long period of stagnation in drug approvals. Due to the extreme costs associated with introducing a drug to the market, locating and understanding the reasons for clinical failure is key to future productivity. As part of this PhD, three main contributions were made in this respect. First, the web platform LigNFam enables users to interactively explore similarity relationships between ‘drug like’ molecules and the proteins they bind. Secondly, two deep-learning-based binding site comparison tools were developed, competing with the state-of-the-art over benchmark datasets. The models have the ability to predict off-target interactions and potential candidates for target-based drug repurposing. Finally, the open-source ScaffoldGraph software was presented for the analysis of hierarchical scaffold relationships and has already been used in multiple projects, including integration into a virtual screening pipeline to increase the tractability of ultra-large screening experiments. Together, and with existing tools, the contributions made will aid in the understanding of drug-protein relationships, particularly in the fields of off-target prediction and drug repurposing, helping to design better drugs faster.

    Using machine learning to predict pathogenicity of genomic variants throughout the human genome

    More than 6,000 diseases are estimated to be caused by genomic variants. This can happen in many possible ways: a variant may stop the translation of a protein, interfere with gene regulation, or alter splicing of the transcribed mRNA into an unwanted isoform. It is necessary to investigate all of these processes in order to evaluate which variant may be causal for the deleterious phenotype. Variant effect scores are a great help in this regard. Implemented as machine learning classifiers, they integrate annotations from different resources to rank genomic variants in terms of pathogenicity. Developing a variant effect score requires multiple steps: annotation of the training data, feature selection, model training, benchmarking, and finally deployment of the model. Here, I present a generalized workflow of this process. It makes it simple to configure how information is converted into model features, enabling the rapid exploration of different annotations. The workflow further implements hyperparameter optimization, model validation and ultimately deployment of a selected model via genome-wide scoring of genomic variants. The workflow is applied to train Combined Annotation Dependent Depletion (CADD), a variant effect model that scores SNVs and InDels genome-wide. I show that the workflow can be quickly adapted to novel annotations by porting CADD to the genome reference GRCh38. Further, I demonstrate the integration of deep-neural network scores as features into a new CADD model, improving the annotation of RNA splicing events. Finally, I apply the workflow to train multiple variant effect models from training data that is based on variants selected by allele frequency. In conclusion, the developed workflow presents a flexible and scalable method to train variant effect scores. 
All software and developed scores are freely available from cadd.gs.washington.edu and cadd.bihealth.org

    Investigating tricky nodes in the Tree of Life


    Discovering circulating protein biomarkers through in-depth plasma proteomics

    Plasma, i.e., the liquid component of blood, is one of the most clinically used samples for biomarker measurement. Although plasma proteins and metabolites are the most frequently analysed biomarkers in practice, the identification and implementation of new circulating protein biomarkers for diagnosis, treatment prediction, prognosis, and disease monitoring has been limited. This PhD thesis describes the discovery, through plasma proteomics, of systemic alterations in the blood plasma proteome and of potential biomarkers related to disease status, prognosis, or treatment. We analysed plasma and serum samples with global proteomics by high-resolution isoelectric focusing (HiRIEF) and liquid chromatography coupled with mass-spectrometry (LC-MS/MS), and targeted proteomics by antibody-based proximity extension assays (PEA) in three diseases that would benefit from blood biomarkers: stage IV metastatic cutaneous melanoma (mCM), glioblastoma (GBM), and coronavirus disease 2019 (COVID-19). Specifically: a.) New treatment options for mCM substantially prolong overall survival (OS), but many patients do not respond to treatment or develop treatment resistance, thus having shorter progression free survival (PFS). Given that the presence of multiple metastases makes biomarker sampling difficult, circulating proteins derived from the tumour and in response to treatment could serve as predictive and prognostic biomarkers in mCM. b.) GBM is the most malignant primary brain tumour with limited treatment options and notoriously short OS. Sampling biomarkers for GBM requires an invasive surgical intervention on the skull, which makes GBM a good candidate for circulating protein biomarkers for prognosis and monitoring. c.) COVID-19 is an inflammation-driven infectious disease that affects multiple organs and systems, thus making the plasma proteome a good source to explore systemic biological processes occurring in COVID-19. 
In papers I and II, using HiRIEF LC-MS/MS and PEA, we explored the treatment-driven plasma proteome alterations in mCM patients treated with anti-PD-1 immune checkpoint inhibitors (ICI) and MAPK-inhibitors (MAPKi), respectively, and identified potential treatment predictive and monitoring biomarkers. mCM patients treated with anti-PD-1 ICI had a strong increase in soluble PD-1 levels during treatment, and upregulation of proteins involved in T-cell response. BRAF[V600]-mutated mCM patients treated with MAPKi had deregulation in proteins involved in immune response and proteolysis. CPB1 had the highest increase in patients treated with BRAF- and MEK-inhibitors and was associated with longer PFS. Higher levels of several proteins involved in inflammation before treatment were associated with shorter PFS regardless of ICI or MAPKi treatment. In paper III, using HiRIEF LC-MS/MS and PEA, we longitudinally analysed the plasma proteome dynamics of GBM patients, collecting plasma samples before surgery and at three timepoints after surgery. Through consensus clustering, based on treatment-naïve plasma protein levels, we identified two patient clusters that differed in median OS. The association between the cluster membership and OS remained consistent after adjustment for age, sex, and treatment. Through machine learning, we identified protein panels that separated the patient clusters and may serve as prognostic biomarkers. The largest alterations in the plasma proteome of GBM patients occurred within two months after surgery, whereas the plasma protein levels at later timepoints did not differ from pre-surgery levels. We observed a decrease in glioma-elevated proteins in the blood after surgery, identifying potential monitoring biomarkers. 
In paper IV, using HiRIEF LC-MS/MS, we analysed serum proteome alterations in hospitalised COVID-19 patients in comparison to healthy controls, and identified a strong upregulation in inflammatory, interferon-induced, and proteasomal proteins. Several protein groups showed association with clinical parameters of COVID-19 severity, including proteasomal proteins. Serum proteome alterations were traceable to proteome alterations induced in a lung adenocarcinoma cell line (Calu-3) by infection with SARS-CoV-2. Finally, we performed the first meta-analysis of global proteomics studies of the soluble blood proteome in COVID-19, providing estimates of standardised mean differences and summary receiver operating characteristics curves. We demonstrate the high accuracy and precision of HiRIEF LC-MS/MS when compared to the meta-analysis estimates and pinpoint proteins that may serve as biomarkers of COVID-19. In summary, this thesis postulates that new circulating protein biomarkers would be clinically useful. By combining mass-spectrometry- and antibody-based proteomics, we demonstrate the potential of in-depth analyses of the plasma proteome in capturing systemic alterations related to treatment, survival, and disease status, pinpointing potentially novel biomarkers that require validation in larger cohorts.
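The abstract reports standardised mean differences without stating which estimator was used. As a rough illustration only, the sketch below computes two common variants for a single protein measured in two independent groups: Cohen's d with a pooled standard deviation, and Hedges' g with the usual small-sample correction. The function name and the example values are hypothetical and are not taken from the thesis.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(cases, controls):
    """Return (Cohen's d, Hedges' g) for two independent samples.

    Assumption: a pooled-SD estimator; the thesis may use a different one.
    """
    n1, n2 = len(cases), len(controls)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")

    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)) / (n1 + n2 - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; the difference cannot be standardised")

    d = (mean(cases) - mean(controls)) / sqrt(pooled_var)

    # Approximate small-sample correction factor for Hedges' g.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d, d * correction


# Hypothetical protein abundances (arbitrary units) in patients and controls.
d, g = standardised_mean_difference([2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
print(f"Cohen's d = {d:.2f}, Hedges' g = {g:.2f}")
```

In a meta-analysis, one such estimate per study would then be combined with inverse-variance weights; that pooling step is not shown here.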

    Molecular signals of arms race evolution between RNA viruses and their hosts

    Viruses are intracellular parasites that hijack their hosts’ cellular machinery to replicate themselves. This creates an evolutionary “arms race” between hosts and viruses, where the former develop mechanisms to restrict viral infection and the latter evolve ways to circumvent these molecular barriers. In this thesis, I explore examples of this virus-host molecular interplay, focusing on events in the evolutionary histories of both viruses and hosts. The thesis begins by examining how recombination, the exchange of genetic material between related viruses, expands the genomic diversity of the Sarbecovirus subgenus, which includes SARS-CoV responsible for the 2002 SARS epidemic and SARS-CoV-2 responsible for the COVID-19 pandemic. On the host side, I examine the evolutionary interaction between RNA viruses and two interferon-stimulated genes expressed in hosts. First, I show how the 2′-5′-oligoadenylate synthetase 1 (OAS1) gene of horseshoe bats (Rhinolophoidea), the reservoir host of sarbecoviruses, lost its anti-coronaviral activity at the base of this bat superfamily. By reconstructing the Rhinolophoidea common ancestor OAS1 protein, I validate the loss of antiviral function and highlight the implications of this event in the virus-host association between sarbecoviruses and horseshoe bat hosts. Second, I focus on the evolution of the human butyrophilin subfamily 3 member A3 (BTN3A3) gene which restricts infection by avian influenza A viruses (IAV). The evolutionary analysis reveals that BTN3A3’s anti-IAV function was gained within the primates and that specific amino acid substitutions need to be acquired in IAVs’ NP protein to evade the human BTN3A3 activity. Gain of BTN3A3-evasion-conferring substitutions correlates with all major human IAV pandemics and epidemics, making these NP residues key markers for IAV transmissibility potential to humans. 
In the final part of the thesis, I present a novel approach for evaluating dinucleotide compositional biases in virus genomes. An application of my metric to the Flaviviridae virus family uncovers how ancestral host shifts of these viruses correlate with adaptive shifts in their genomes’ dinucleotide representation. Collectively, the contents of this thesis extend our understanding of how viruses interact with their hosts along their intertwined evolution and provide insights into virus host switching and pandemic preparedness.
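The abstract does not describe the novel metric itself. For orientation only, the sketch below computes the conventional dinucleotide odds ratio, the observed frequency of a dinucleotide divided by the product of its two single-nucleotide frequencies, which is a widely used baseline measure of dinucleotide bias. It is not the author's method; the function name and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def dinucleotide_odds_ratios(sequence):
    """Return {dinucleotide: f_xy / (f_x * f_y)} for all 16 A/C/G/T pairs.

    Assumption: single-strand counts over overlapping windows; characters
    other than A, C, G, T still count towards the sequence length.
    """
    seq = sequence.upper()
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    mono = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    di = Counter(seq[i:i + 2] for i in range(n - 1))

    ratios = {}
    for x, y in product("ACGT", repeat=2):
        f_x = mono[x] / n
        f_y = mono[y] / n
        f_xy = di[x + y] / (n - 1)
        # A ratio near 1 means no bias; below 1 means under-representation.
        ratios[x + y] = f_xy / (f_x * f_y) if f_x and f_y else float("nan")
    return ratios


# Toy sequence for illustration; a real analysis would use whole genomes.
ratios = dinucleotide_odds_ratios("ACGTACGT")
print({pair: round(value, 2) for pair, value in ratios.items() if value > 0})
```

Values well below 1 for CpG, for example, are the kind of under-representation that such analyses typically look for in RNA virus genomes.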

    Gut-brain interactions affecting metabolic health and central appetite regulation in diabetes, obesity and aging

    The central aim of this thesis was to study the effects of gut microbiota on host energy metabolism and central regulation of appetite. We specifically studied the interaction between gut microbiota-derived short-chain fatty acids (SCFAs), postprandial glucose metabolism and central regulation of appetite. In addition, we studied probable determinants that affect this interaction, specifically: host genetics, bariatric surgery, dietary intake and hypoglycemic medication. First, we studied the involvement of microbiota-derived short-chain fatty acids in glucose tolerance. In an observational study we found an association of intestinal availability of the SCFAs acetate and butyrate with postprandial insulin and glucose responses. We then performed a clinical trial, administering acetate intravenously at a constant rate, and studied the effects on glucose tolerance and central regulation of appetite. The acetate intervention did not have a significant effect on these outcome measures, suggesting that the association between increased gastrointestinal SCFAs and metabolic health seen in the observational study is not reproduced by inducing acute plasma elevations. Second, we looked at other determinants affecting gut-brain interactions in metabolic health and central appetite signaling. We first studied the relation between the microbiota and central appetite regulation in identical twin pairs discordant for BMI. Next, we studied the relation between microbial composition and gastrointestinal symptoms after bariatric surgery. Third, we report the effects of increased protein intake on host microbiota composition and central regulation of appetite. Finally, we explored the effects of combination therapy with GLP-1 agonist exenatide and SGLT2 inhibitor dapagliflozin on brain responses to food stimuli.