
    Model based approaches to characterize heterogeneity in gene regulation across cells and disease types

    Access to large genome-wide biological datasets has now enabled computational researchers to tackle long-standing questions in biomedicine through the lens of Machine Learning (ML) and Artificial Intelligence (AI). The potential benefits of such computational approaches to biological research are immense. For example, efficient yet interpretable machine learning models of disease, drug response, or phenotype can affect our lives at both personal and social levels. However, heterogeneity is found at multiple scales in biology, manifested as the context-specificity of biological processes. This context-specific heterogeneity poses a major challenge to ML models. Even though context-specific models are often trained, this is mostly done without the benefit of mechanistic insights about the biological processes being modeled, and as such the models do not help improve our biological understanding. This dissertation addresses these challenges and their limitations by: a) designing appropriate features and ML models motivated by the current biological hypothesis at hand, b) building pipelines to analyze multiple context-specific models together, and c) developing data integration and imputation methods to address the problems of insufficient and missing data. The first project studies loss of methylation, or hypo-methylation, in large blocks causing aberrant gene activity, a well-known phenomenon in cancer. To find the associated markers, I designed a classification model of hypo-methylated block boundaries and non-boundaries in colon cancer. The second project models the binding of transcription factors (TFs) to specific DNA elements in the genome, one of the principal components of gene regulation. Since the condition specificity of TF binding is not yet well understood, this dissertation examines a design of cell type-specific models for TF binding using ChIPSeq data. 
A meta-analysis pipeline, called TRISECT, is applied to multiple TF binding models to understand the heterogeneity of cell specificity across those models. Next, models for breast cancer metastasis using gene expression data are discussed. In breast cancer metastasis, the affinity towards distant tissues, called secondary tissues, is not yet well understood. Therefore, going beyond mere discriminatory models, I propose another meta-analysis pipeline, MONTAGE, intended to understand the organotropism of breast cancer metastasis across secondary tissues. Building ML models can be hindered by limited data size, especially for rare diseases. Therefore, by necessity, molecular data have been merged across multiple studies and across multiple technical platforms, which makes the merged data vulnerable to so-called batch effects that dilute the actual biological signal. Existing methods cannot remove multivariate confounding artifacts, which leads to inaccurate models. To circumvent this issue, this dissertation examines a deep learning-based technique (deepSavior) which ‘translates’ the gene expression profile from samples of one technical platform to another platform. To summarize, this dissertation makes three distinct contributions: a) designing an effective ML model to explore the determinants of cancer-associated hypomethylation, b) designing meta-analysis pipelines to compare multiple related but context-specific ML models to understand heterogeneous relations among biological processes, and c) developing a new method to overcome the data integration and imputation challenges.

    A multiscale model for collagen alignment in wound healing

    It is thought that collagen alignment plays a significant part in scar tissue formation during dermal wound healing. We present a multiscale model for collagen deposition and alignment during this process. We consider fibroblasts as discrete units moving within an extracellular matrix of collagen and fibrin modelled as continua. Our model includes flux-induced alignment of collagen by fibroblasts, and contact guidance of fibroblasts by collagen fibres. We can use the model to predict the effects of certain manipulations, such as varying fibroblast speed, or placing an aligned piece of tissue in the wound. We also simulate experiments which alter the TGF-β concentrations in a healing dermal wound and use the model to offer an explanation of the observed influence of this growth factor on scarring.

    Integrative analysis identifies candidate tumor microenvironment and intracellular signaling pathways that define tumor heterogeneity in NF1

    Neurofibromatosis type 1 (NF1) is a monogenic syndrome that gives rise to numerous symptoms including cognitive impairment, skeletal abnormalities, and growth of benign nerve sheath tumors. Nearly all NF1 patients develop cutaneous neurofibromas (cNFs), which occur on the skin surface, whereas 40-60% of patients develop plexiform neurofibromas (pNFs), which are deeply embedded in the peripheral nerves. Patients with pNFs have a ~10% lifetime chance of these tumors becoming malignant peripheral nerve sheath tumors (MPNSTs). These tumors have a severe prognosis and few treatment options other than surgery. Given the lack of therapeutic options available to patients with these tumors, identification of druggable pathways or other key molecular features could aid ongoing therapeutic discovery studies. In this work, we used statistical and machine learning methods to analyze 77 NF1 tumors with genomic data to characterize key signaling pathways that distinguish these tumors and identify candidates for drug development. We identified subsets of latent gene expression variables that may be important in the identification and etiology of cNFs, pNFs, other neurofibromas, and MPNSTs. Furthermore, we characterized the association between these latent variables and genetic variants, immune deconvolution predictions, and protein activity predictions.

    Cancer modelling: Getting to the heart of the problem

    Paradoxically, improvements in healthcare that have enhanced the life expectancy of humans in the Western world have, indirectly, increased the prevalence of certain types of cancer such as prostate and breast. It remains unclear whether this phenomenon should be attributed to the ageing process itself or the cumulative effect of prolonged exposure to harmful environmental stimuli such as ultraviolet light, radiation and carcinogens (Franks and Teich, 1988). Equally, there is also compelling evidence that certain genetic abnormalities can predispose individuals to specific cancers (Ilyas et al., 1999). The variety of factors that have been implicated in the development of solid tumours stems, to a large extent, from the fact that ‘cancer’ is a generic term, often used to characterize a series of disorders that share common features. At this generic level of description, cancer may be viewed as a cellular disease in which controls that usually regulate growth and maintain homeostasis are disrupted. Cancer is typically initiated by genetic mutations that lead to enhanced mitosis of a cell lineage and the formation of an avascular tumour. Since it receives nutrients by diffusion from the surrounding tissue, the size of an avascular tumour is limited to several millimeters in diameter. Further growth relies on the tumour acquiring the ability to stimulate the ingrowth of a new, circulating blood supply from the host vasculature via a process termed angiogenesis (Folkman, 1974). Once vascularised, the tumour has access to a vast nutrient source and rapid growth ensues. Further, tumour fragments that break away from the primary tumour, on entering the vasculature, may be transported to other organs in which they may establish secondary tumours or metastases that further compromise the host. 
Invasion is another key feature of solid tumours whereby contact with the tissue stimulates the production of enzymes that digest the tissue, liberating space into which the tumour cells migrate. Thus, cancer is a complex, multiscale process. The spatial scales of interest range from the subcellular level to the cellular and macroscopic (or tissue) levels, while the timescales may vary from seconds (or less) for signal transduction pathways to months for tumour doubling times. The variety of phenomena involved, the range of spatial and temporal scales over which they act and the complex way in which they are inter-related mean that the development of realistic theoretical models of solid tumour growth is extremely challenging. While there is now a large literature focused on modelling solid tumour growth (for a review, see, for example, Preziosi, 2003), existing models typically focus on a single spatial scale and, as a result, are unable to address the fundamental problem of how phenomena at different scales are coupled or to combine, in a systematic manner, data from the various scales. In this article, a theoretical framework will be presented that is capable of integrating a hierarchy of processes occurring at different scales into a detailed model of solid tumour growth (Alarcon et al., 2004). The model is formulated as a hybrid cellular automaton and contains interlinked elements that describe processes at each spatial scale: progress through the cell cycle and the production of proteins that stimulate angiogenesis are accounted for at the subcellular level; cell-cell interactions are treated at the cellular level; and, at the tissue scale, attention focuses on the vascular network whose structure adapts in response to blood flow and angiogenic factors produced at the subcellular level. Further coupling between the different spatial scales arises from the transport of blood-borne oxygen into the tissue and its uptake at the cellular level. 
Model simulations will be presented to illustrate the effect that spatial heterogeneity induced by blood flow through the vascular network has on the tumour’s growth dynamics and explain how the model may be used to compare the efficacy of different anti-cancer treatment protocols.

    INTEGRATIVE ANALYSIS OF OMICS DATA IN ADULT GLIOMA AND OTHER TCGA CANCERS TO GUIDE PRECISION MEDICINE

    Transcriptomic profiling and gene expression signatures have been widely applied as effective approaches for enhancing the molecular classification, diagnosis, prognosis or prediction of therapeutic response towards personalized therapy for cancer patients. Thanks to modern genome-wide profiling technology, scientists are able to build engines that leverage massive genomic variations and integrate them with clinical data to identify “at risk” individuals for the sake of prevention, diagnosis and therapeutic interventions. In my graduate work for my Ph.D. thesis, I have investigated genomic sequencing data mining to comprehensively characterise molecular classifications and aberrant genomic events associated with clinical prognosis and treatment response, applying high-dimensional omics data to promote the understanding of gene signatures and somatic molecular alterations contributing to cancer progression and clinical outcomes. Following this motivation, my dissertation has been focused on the following three topics in translational genomics. 1) Characterization of transcriptomic plasticity and its association with the tumor microenvironment in glioblastoma (GBM). I have integrated transcriptomic, genomic, protein and clinical data to increase the accuracy of GBM classification, and identified the association between the GBM mesenchymal subtype and reduced tumor purity, accompanied by an increased presence of tumor-associated microglia. 
Then I have identified the tumor bulk, but not the corresponding neurosphere cells, as the sole source of microglia, through both transcriptional and protein level analysis using a panel of sphere-forming glioma cultures and their parent GBM samples. Furthermore, I have demonstrated my hypothesis through longitudinal analysis of paired primary and recurrent GBM samples that the phenotypic alterations of GBM subtypes are not due to intrinsic proneural-to-mesenchymal transition in tumor cells; rather, they are intertwined with an increased level of microglia upon disease recurrence. Collectively, I have elucidated the critical role of the tumor microenvironment (microglia and macrophages from the central nervous system) in contributing to intra-tumor heterogeneity and to the accurate classification of GBM patients based on transcriptomic profiling, which will not only have a significant clinical impact but also pave the way for preclinical cancer research. 2) Identification of prognostic gene signatures that stratify adult diffuse glioma patients harboring 1p/19q co-deletions. I have compared multiple statistical methods and derived a gene signature significantly associated with survival by applying a machine learning algorithm. Then I have identified inflammatory response and acetylation activity that are associated with malignant progression of 1p/19q co-deleted glioma. In addition, I showed that this signature translates to other types of adult diffuse glioma, suggesting its universality in the pathobiology of other glioma subsets. My efforts on integrative data analysis of this highly curated data set using optimized statistical models will reflect the pending update to the WHO classification system of tumors in the central nervous system (CNS). 3) Comprehensive characterization of somatic fusion transcripts in Pan-Cancers. I have identified a panel of novel fusion transcripts across all of TCGA cancer types through transcriptomic profiling. 
Then I have predicted fusion proteins with kinase activity and hub functions in pathway networks based on the annotation of genetically mobile domains and functional domain architectures. I have evaluated a panel of in-frame gene fusions as potential driver mutations based on the network fusion centrality hypothesis. I have also characterised the emerging complexity of genetic architecture in fusion transcripts by integrating genomic structure and somatic variants and delineating the distinct genomic patterns of fusion events across different cancer types. Overall, my exploration of the pathogenetic impact and clinical relevance of candidate gene fusions has provided fundamental insights into the management of a subset of cancer patients by predicting the oncogenic signalling and specific drug targets encoded by these fusion genes. Taken together, the translational genomic research I have conducted during my Ph.D. study will shed new light on precision medicine and contribute to the cancer research community. The novel classification concept, gene signature and fusion transcripts I have identified will address several hotly debated issues in translational genomics, such as complex interactions between tumor bulks and their adjacent microenvironments, prognostic markers for clinical diagnostics and personalized therapy, and distinct patterns of genomic structure alterations and oncogenic events in different cancer types, thereby facilitating our understanding of genomic alterations and moving us towards the development of precision medicine.

    Quantifying cancer epithelial-mesenchymal plasticity and its association with stemness and immune response

    Cancer cells can acquire a spectrum of stable hybrid epithelial/mesenchymal (E/M) states during epithelial-mesenchymal transition (EMT). Cells in these hybrid E/M phenotypes often combine epithelial and mesenchymal features and tend to migrate collectively, commonly as small clusters. Such collectively migrating cancer cells play a pivotal role in seeding metastases, and their presence in cancer patients is an adverse prognostic factor. Moreover, cancer cells in hybrid E/M phenotypes tend to be more associated with stemness, which endows them with tumor-initiation ability and therapy resistance. Most recently, cells undergoing EMT have been shown to promote immune suppression for better survival. A systematic understanding of the emergence of hybrid E/M phenotypes and the connection of EMT with stemness and immune suppression would contribute to more effective therapeutic strategies. In this review, we first discuss recent efforts combining theoretical and experimental approaches to elucidate mechanisms underlying EMT multi-stability (i.e. the existence of multiple stable phenotypes during EMT) and the properties of hybrid E/M phenotypes. Next, we discuss non-cell-autonomous regulation of EMT by cell cooperation and extracellular matrix. Afterwards, we discuss various metrics that can be used to quantify the EMT spectrum. We further describe possible mechanisms underlying the formation of clusters of circulating tumor cells. Last but not least, we summarize recent systems biology analyses of the role of EMT in the acquisition of stemness and immune suppression. Comment: 50 pages, 6 figures