A Network Integration Approach to Predict Conserved Regulators Related to Pathogenicity of Influenza and SARS-CoV Respiratory Viruses
Respiratory infections stemming from influenza viruses and the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV) represent a serious public health threat as emerging pandemics. Despite efforts to identify the critical interactions of these viruses with host machinery, the key regulatory events that lead to disease pathology remain poorly targeted with therapeutics. Here we implement an integrated network interrogation approach, in which proteome and transcriptome datasets from infection of both viruses in human lung epithelial cells are utilized to predict regulatory genes involved in the host response. We take advantage of a novel "crowd-based" approach to identify and combine ranking metrics that isolate genes/proteins likely related to the pathogenicity of SARS-CoV and influenza virus. Subsequently, a multivariate regression model is used to compare predicted lung epithelial regulatory influences with data derived from other respiratory virus infection models. We predicted a small set of regulatory factors with conserved behavior for consideration as important components of viral pathogenesis that might also serve as therapeutic targets for intervention. Our results demonstrate the utility of integrating diverse 'omic datasets to predict and prioritize regulatory features conserved across multiple pathogen infection models.
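To make the idea of combining ranking metrics concrete, here is a minimal sketch that uses mean-rank aggregation. The abstract does not say which metrics were used or how they were combined, so the aggregation rule, the metric names (`degree`, `betweenness`, `diff_expr`), the gene labels and all scores are hypothetical.

```python
def rank_desc(scores):
    """Map each gene to its rank (1 = highest score); ties are broken by name."""
    ordered = sorted(scores, key=lambda g: (-scores[g], g))
    return {g: i + 1 for i, g in enumerate(ordered)}


def combine_rankings(metrics):
    """Order genes by their mean rank across metrics (lowest mean rank first).

    Only genes scored by every metric are kept. Mean-rank aggregation is an
    assumption made for illustration, not the method of the study.
    """
    genes = set.intersection(*(set(m) for m in metrics.values()))
    ranks = {
        name: rank_desc({g: m[g] for g in genes})
        for name, m in metrics.items()
    }
    mean_rank = {g: sum(r[g] for r in ranks.values()) / len(ranks) for g in genes}
    return sorted(genes, key=lambda g: (mean_rank[g], g))


# Hypothetical scores for three placeholder genes under three placeholder metrics.
metrics = {
    "degree": {"G1": 10, "G2": 4, "G3": 7},
    "betweenness": {"G1": 0.9, "G2": 0.1, "G3": 0.5},
    "diff_expr": {"G1": 2.5, "G2": 2.0, "G3": 3.1},
}
prioritized = combine_rankings(metrics)
```

Averaging ranks rather than raw scores keeps metrics on different scales comparable, which is one common reason to aggregate this way.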
The Integration of Multiscale and Multi-Omics Data for Investigating Host-Pathogen Interactions, Using Pathogenic Fungi as an Example
The ongoing development and improvement of novel measurement techniques for scientific research result in a huge amount of available data coming from heterogeneous sources. Amongst others, these sources comprise diverse temporal and spatial scales including different omics levels. The integration of such multiscale and multi-omics data enables a comprehensive understanding of the complexity and dynamics of biological systems and their processes. However, due to the biologically and methodically induced data heterogeneity, the integration process is a well-known challenge in today's life sciences. Applying several computational integration approaches, the present doctoral thesis aimed at gaining new insights into the field of infection biology regarding host-pathogen interactions. In this context, the focus was on fungal pathogens causing a variety of local and systemic infections. Based on current examples of research, on the one hand, several well-established approaches for the analysis of multiscale and multi-omics data have been presented. On the other hand, the novel ModuleDiscoverer approach was introduced to identify regulatory modules in protein-protein interaction networks. It has been shown that ModuleDiscoverer effectively supports the integration of multi-omics data and, in addition, allows the detection of potential key factors that cannot be detected by other classical approaches. This thesis provides deeper insights into the complex relationships and dynamics of biological systems and, thus, represents an important contribution to the investigation of host-pathogen interactions. Due to the complexity of these interactions and the limitations of the currently available knowledge databases as well as the bioinformatic tools, further research is necessary to gain a comprehensive understanding of the complexity of biological systems.
Computational Methods for the Analysis of Genomic Data and Biological Processes
In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that carry out an analysis of these data with reliability and efficiency. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data, and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality
Investigating normal human gene expression in tissues with high-throughput transcriptomic and proteomic data.
With the improvement of high-throughput technologies during the last decade, several studies exploring the normal gene expression in human tissues have been published. Many studies examine the transcriptome with RNA sequencing (RNA-Seq), and others probe the proteome with unlabelled bottom-up Mass Spectrometry. As the sampling of undiseased tissues is difficult, the community often refers to expression atlases, which are collating these studies, to support or validate new findings.
Despite many overlapping tissues between the studies, few atlases attempt to integrate all the data.
In this thesis, I investigate the consistency of gene expression across tissues and studies in human with the help of transcriptomics captured with high-throughput sequencing (RNA-Seq) and proteomics generated with label-free bottom-up Mass Spectrometry (MS).
After describing the transcriptomic and proteomic data and their state-of-the-art processing (Chapter 2), I review several identified sources of bias and my approaches to limit their effects (Chapter 3).
The integration of the various transcriptomic datasets (Chapter 4) shows that the biological signal dominates the technical noise for RNA-Seq data. Tissue samples correlate more strongly with samples of the same tissue from other studies than with samples of different tissues from the same dataset. In other words, interstudy correlations for identical tissues are higher than correlations between different tissues within the same study. Globally, genes show similar expression profiles across studies for a given set of tissues. All gene categories are involved, including the tissue-specific genes and the ubiquitously expressed ones.
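The comparison described above can be sketched as follows, assuming Pearson correlation over a shared gene list. The abstract does not state which correlation measure or preprocessing the thesis uses. The two studies, the tissue names and every expression value below are made-up toy data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def compare(study_a, study_b):
    """Return (same-tissue cross-study, different-tissue within-study) correlations."""
    shared = [t for t in study_a if t in study_b]
    same_tissue = [pearson(study_a[t], study_b[t]) for t in shared]
    tissues = list(study_a)
    diff_tissue = [
        pearson(study_a[t], study_a[u])
        for i, t in enumerate(tissues)
        for u in tissues[i + 1:]
    ]
    return same_tissue, diff_tissue


# Hypothetical expression profiles: the same five genes, in the same order.
study_a = {"liver": [9.0, 1.0, 0.5, 4.0, 2.0], "heart": [1.0, 8.0, 0.7, 3.5, 6.0]}
study_b = {"liver": [8.2, 1.4, 0.3, 4.5, 2.6], "heart": [0.8, 7.1, 1.0, 3.0, 6.6]}

same_tissue, diff_tissue = compare(study_a, study_b)
# Biological signal dominates in this toy data if every same-tissue,
# cross-study correlation exceeds every different-tissue, same-study one.
signal_dominates = min(same_tissue) > max(diff_tissue)
```

With real data, a rank-based measure such as Spearman correlation may be preferable because expression values are heavily skewed. That choice is left open here.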
After briefly discussing comparisons of proteomic data, I introduce a new proteomic quantification method, PPKM (Chapter 5). The PPKM method allows me to quantify about twice as many proteins compared to usual methods.
A limited number of previous studies combining high-throughput transcriptomics and proteomics have shown various correlation levels between the expression of protein and mRNA. I show that, for most tissues, we can observe quite good correlation levels (i.e. significantly better than expected by chance), even when the samples have different biological and technical backgrounds as they have been independently sourced. Many genes share similar patterns of expression between the two biological layers, e.g. genes that have a protein detected in a single tissue are more likely to have their mRNA showing specificity for the same tissue. Additionally, three groups of genes present functional enrichments of biological processes. Genes having highly correlated protein and mRNA expressions across tissues are enriched in catabolic processes. Genes having the most anticorrelated expressions are enriched for ribosomes and ncRNA regulation. Genes with a protein detected in a single tissue are enriched in signalling processes.
Overall, this thesis describes a global picture of the current consolidated knowledge we can extract from the joint study of public transcriptomic and proteomic data. Beyond confirming or improving observations reported in the literature, this work provides new insights into the ubiquitous and tissue-specific genes. To the best of my knowledge, this work has also established the most extensive list of genes with robust transcriptomic and proteomic expression across tissues and studies. Furthermore, it shows that joint study approaches can help the development of new methods, like the new proteomic PPKM quantification method. Finally, the highlighting of distinct functional enrichment profiles for groups of genes across tissues and studies lays a framework for further research.
EMBL International PhD Programm
Multi-omics characterization of pancreatic neuroendocrine neoplasms
Pancreatic neuroendocrine neoplasms (PNENs) are biologically and clinically heterogeneous neoplasms in which pathogenic alterations are often indiscernible. Treatments for PNENs are insufficient in part due to lack of alternatives once current options are exhausted. Despite previous efforts to characterize PNENs at the molecular level, there remains a lack of molecular subgroups and molecular features with clinical utility for PNENs. In this work, I describe the identification and characterization of four molecularly distinct subgroups from primary PNEN specimens using whole-exome sequencing, RNA-sequencing and global proteome profiling. A Proliferative subgroup with molecular features of proliferating cells was associated with an inferior overall survival probability. A PDX1-high subgroup consisted of PNENs demonstrating genetic and transcriptomic indications of NRAS or HRAS activation. An Alpha cell-like subgroup, enriched in PNENs with deleterious MEN1 and DAXX mutations, bore transcriptomic similarity to pancreatic α-cells and harbored proteomic cues of dysregulated metabolism involving glutamine and arginine. Lastly, a Stromal/Mesenchymal subgroup exhibited increased expression and activation of the Hippo signaling pathway effectors YAP1 and WWTR1 that are of emerging interest as potentially actionable targets in other cancer types. Whole-genome and whole-transcriptome analysis of PNEN metastases identified novel molecular events likely contributing to pathogenesis, including one case presumably driven by MYCN amplification. In agreement with the findings in primary PNENs, four of the metastatic PNENs displayed a substantial Alpha cell-like subgroup signature and all harboured concurrent mutations in MEN1 and DAXX. Collectively, the identified subgroups present a potential stratification scheme that facilitates the identification of therapeutic vulnerabilities amidst PNEN heterogeneity to improve the effective management of PNENs
Discovering Master Regulators of Single-Cell Transcriptional States in the Tumor Immune Microenvironment to Reveal Immuno-Therapeutic Targets and Synergistic Treatments
The development of checkpoint immunotherapy has been a paradigm shift in the treatment of cancer, leading to dramatic improvement in treatment outcomes across a broad range of tumor types. Nevertheless, our current understanding of the tumor immune microenvironment and of the mediators of resistance to therapy is limited. The recent development of high-throughput single-cell RNA-Sequencing (scRNA-Seq) technology has opened up an unprecedented window into the transcriptional states of distinct tumor-infiltrating immune and stromal cells. However, even this technology has its limitations, with very high levels of data dropout caused by low total mRNA counts and low capture efficiency. This thesis explores the application of a transcriptional regulatory protein activity inference approach to single-cell data in order to resolve gene dropout and more deeply characterize upstream drivers of cell state within the micro-environment of several distinct tumor types.
To this end, algorithms for inference of protein activity, drug sensitivity, and cell-cell interaction have been adapted to scRNA-Seq data, along with an approach for querying enrichment of single-cell-derived population marker gene sets patient-by-patient in larger bulk-RNA-Seq cohorts. By applying these tools systematically, we have identified distinct cellular sub-populations associated with clinical outcome in different tumor types, including a novel population of C1Q+/TREM2+/APOE+ macrophages associated with post-surgical tumor recurrence in clear cell renal carcinoma, a sub-population of fibroblasts associated with improved response to immunotherapy in head and neck squamous cell carcinoma, tumor cell subpopulations with distinct inferred drug sensitivities in cholangiocarcinoma and prostate cancer, as well as tumor-specific regulatory T-cells (Tregs), active as a mechanism of immunotherapy resistance across a range of tumor types. In ongoing clinical trials from both primary and metastatic prostate cancer as well as clear cell renal carcinoma, we are able to assess which of these populations are enriched in non-responders to checkpoint immunotherapy. The proteomic master regulators of each of these single-cell types have direct utility as potential biomarkers for treatment response, but they may also be therapeutically modulated as novel targets for combination immunotherapy, potentially improving treatment response rates and treatment outcomes in future clinical trials.
Finally, this thesis also presents a discovery-to-validation platform to accelerate micro-environment-directed drug repurposing in the context of immunotherapy resistance and rapid CRISPRko validation of novel therapeutic targets. This platform has been developed specifically to validate newly identified master regulators of tumor-specific immunosuppressive regulatory T-cells (Tregs), resulting in the discovery of low-dose gemcitabine as a tumor-specific Treg-modulating drug synergistic with anti-PD1 checkpoint immunotherapy, and of TRPS1 as a proteomic master regulator with a clinically significant effect on tumor Treg infiltration and tumor growth rate. However, the platform itself may be readily extended in future work to prioritize agents against immunosuppressive macrophage and fibroblast populations for clinical development and trials. As we have discovered, different cancers have different populations of cells driving therapy response and resistance. Taken together, the analytical and validation tools presented in this thesis represent an opportunity to tailor future immuno-therapies at the single-cell level to particular tumor types and to individual patients.
Evaluation of dormancy states in cancer and associated therapeutic opportunities
Tumour mass dormancy and cancer cell quiescence represent the two facets of cancer dormancy and play key roles in cancer development and progression. Quiescence describes the reversible, proliferative arrest of individual cancer cells that has been observed as a contributing factor of resistance to chemotherapy and other treatments targeting cycling cells. In contrast, tumour mass dormancy describes the state of no net tumour growth, which can arise due to inadequate tumour vascularisation or anti-tumour immune response, during which tumours can acquire additional mutations and establish a microenvironment permissive for growth. Currently, both dormancy states remain poorly characterised. This thesis presents computational frameworks for evaluating the two states and comprehensively profiles their abundance and associated genomic and cellular features across 31 solid cancers from the Cancer Genome Atlas. Using machine learning approaches, I demonstrate that cancer cell quiescence preferentially arises in less mutated tumours with intact TP53 and DNA damage repair pathways. I also highlight novel genomic dependencies, such as CEP89 amplification, which drive an impairment of quiescence. Similarly, mutations within CASP8 and HRAS oncogenes are shown to be enriched and positively selected in samples with tumour mass dormancy. I also highlight an association between APOBEC mutagenesis and both dormancy states. Moreover, tumour mass dormancy is shown to be associated with infiltration with macrophages and cytotoxic and regulatory T cells but a decreased infiltration with Th17 cells. Lastly, using single-cell data, I demonstrate that quiescence underlies resistance to a wide range of therapies, including treatments targeting cell cycle regulation, proliferative kinase signalling and epigenetic regulation. Ultimately, this analysis sheds light on the underlying biology of cancer dormancy states, potentially highlighting vulnerabilities that can be targeted in the clinic. It also provides a transcriptional signature of therapy-tolerant quiescent cells that could be explored further in the clinic to monitor patient therapy response.