1,274 research outputs found

    A NOVEL COMPUTATIONAL FRAMEWORK FOR TRANSCRIPTOME ANALYSIS WITH RNA-SEQ DATA

    The advance of high-throughput sequencing technologies and their application to mRNA transcriptome sequencing (RNA-seq) have enabled comprehensive and unbiased profiling of the landscape of transcription in a cell. To address current limitations in the accuracy and scalability of transcriptome analysis, a novel computational framework has been developed for large-scale RNA-seq datasets with no dependence on transcript annotations. Directly from raw reads, a probabilistic approach is first applied to infer the best transcript fragment alignments from paired-end reads. Empowered by the identification of alternative splicing modules, this framework then performs precise and efficient differential analysis at automatically detected alternative splicing variants, which circumvents the need for full transcript reconstruction and quantification. Beyond the scope of classical group-wise analysis, a clustering scheme is further described for mining prominent consistency among samples in transcription, removing the restriction of presumed grouping. The performance of the framework has been demonstrated on a series of simulation studies and real datasets, including the Cancer Genome Atlas (TCGA) breast cancer analysis. These successful applications suggest an unprecedented opportunity to use differential transcription analysis to reveal variations in the mRNA transcriptome in response to cellular differentiation or the effects of disease.

    Master of Science

    Acquired drug resistance is a frequent challenge in breast cancer. A tumor may initially respond to chemotherapy, but later become resistant and relapse. This is largely due to intra-tumoral heterogeneity; tumor genetic subclones with higher fitness in response to chemotherapy survive, and the patient’s cancer becomes drug-resistant. In addition, non-genetic phenotypic alterations in response to chemotherapy may promote drug resistance. How breast cancers evolve to become drug-resistant is unclear. A better understanding of how this occurs could lead to alternative therapeutic regimens to treat drug-resistant breast cancer or prevent its development. To address this problem, we performed genomic and phenotypic analysis of four breast cancers through 2 to 15 years of diverse treatments, using a unique set of longitudinal samples from these patients. This revealed genetic events likely leading to drug resistance, including acquisition of BRCA2 reversions and ABCB1 fusions. Further, cancer phenotypes evolved dramatically after treatment, including increased post-treatment mesenchymal, receptor tyrosine kinase, and immune avoidance gene expression profiles. In one patient, treatment of cultured patient cells with drugs targeting the receptor tyrosine kinase phenotype was more effective in post-treatment, receptor tyrosine kinase-high cells than in the pre-treatment cells. Thus, we have identified both mutations and phenotypes that may promote breast cancer drug resistance, some of which may be targeted to treat drug-resistant cancers.

    Combating subclonal evolution of resistant cancer phenotypes

    Metastatic breast cancer remains challenging to treat, and most patients ultimately progress on therapy. This acquired drug resistance is largely due to drug-refractory sub-populations (subclones) within heterogeneous tumors. Here, we track the genetic and phenotypic subclonal evolution of four breast cancers through years of treatment to better understand how breast cancers become drug-resistant. Recurrently appearing post-chemotherapy mutations are rare. However, bulk and single-cell RNA sequencing reveal acquisition of malignant phenotypes after treatment, including enhanced mesenchymal and growth factor signaling, which may promote drug resistance, and decreased antigen presentation and TNF-α signaling, which may enable immune system avoidance. Some of these phenotypes pre-exist in pre-treatment subclones that become dominant after chemotherapy, indicating selection for resistance phenotypes. Post-chemotherapy cancer cells are effectively treated with drugs targeting acquired phenotypes. These findings highlight cancer's ability to evolve phenotypically and suggest a phenotype-targeted treatment strategy that adapts to cancer as it evolves.

    Unravelling the Genomic Landscape of Metastatic Prostate Cancer

    An EMT-Driven Alternative Splicing Program Occurs in Human Breast Cancer and Modulates Cellular Phenotype

    Epithelial-mesenchymal transition (EMT), a mechanism important for embryonic development, plays a critical role during malignant transformation. While much is known about transcriptional regulation of EMT, and alternative splicing of several genes has also been correlated with EMT progression, the extent of splicing changes and their contributions to the morphological conversion accompanying EMT have not been investigated comprehensively. Using an established cell culture model and RNA–Seq analyses, we determined an alternative splicing signature for EMT. Genes encoding key drivers of EMT–dependent changes in cell phenotype, such as actin cytoskeleton remodeling, regulation of cell–cell junction formation, and regulation of cell migration, were enriched among EMT–associated alternative splicing events. Our analysis suggested that most EMT–associated alternative splicing events are regulated by one or more members of the RBFOX, MBNL, CELF, hnRNP, or ESRP classes of splicing factors. The EMT alternative splicing signature was confirmed in human breast cancer cell lines, which could be classified into basal and luminal subtypes based exclusively on their EMT–associated splicing pattern. Expression of EMT–associated alternative mRNA transcripts was also observed in primary breast cancer samples, indicating that EMT–dependent splicing changes occur commonly in human tumors. The functional significance of EMT–associated alternative splicing was tested by expression of the epithelial-specific splicing factor ESRP1 or by depletion of RBFOX2 in mesenchymal cells, both of which elicited significant changes in cell morphology and motility towards an epithelial phenotype, suggesting that splicing regulation alone can drive critical aspects of EMT–associated phenotypic changes. The molecular description obtained here may aid in the development of new diagnostic and prognostic markers for analysis of breast cancer progression.
    Funding: National Institutes of Health (U.S.) (R01-HG002439); National Science Foundation (U.S.) (equipment grant); National Institutes of Health (U.S.) (Integrative Cancer Biology Program Grant U54-CA112967); David H. Koch Institute for Integrative Cancer Research at MIT (Ludwig Center for Metastasis Research); David H. Koch Institute for Integrative Cancer Research at MIT; Massachusetts Institute of Technology (Croucher Scholarship); Massachusetts Institute of Technology (Ludwig Fund postdoctoral fellowship); National Institutes of Health (U.S.) (NIH CA100324); National Institutes of Health (U.S.) (AECC9526-5267

    Machine learning and computational methods to identify molecular and clinical markers for complex diseases – case studies in cancer and obesity

    In biomedical research, applied machine learning and bioinformatics are essential disciplines heavily involved in translating data-driven findings into medical practice. This is accomplished chiefly by developing computational tools and algorithms that assist in detecting and clarifying the underlying causes of diseases. Continuous advances in high-throughput technologies, coupled with recently promoted data sharing policies, have produced a massive wealth of data with remarkable potential to improve human health care. To keep pace with this boost in data production, innovative data analysis tools and methods are required. The data analyzed by bioinformaticians and computational biology experts can be broadly divided into molecular and conventional clinical data categories. The aim of this thesis was to develop novel statistical and machine learning tools and to incorporate existing state-of-the-art methods to analyze bio-clinical data with medical applications. The findings of the studies demonstrate the impact of computational approaches on clinical decision making by improving patient risk stratification and prediction of disease outcomes. This thesis comprises five studies explaining method development for 1) genomic data, 2) conventional clinical data and 3) integration of genomic and clinical data. With genomic data, the main focus is detection of differentially expressed genes, the most common task in transcriptome profiling projects. In addition to reviewing available differential expression tools, a data-adaptive statistical method called Reproducibility Optimized Test Statistic (ROTS) is proposed for detecting differential expression in RNA-sequencing studies. To demonstrate the efficacy of ROTS in real biomedical applications, the method is used to identify prognostic markers in clear cell renal cell carcinoma (ccRCC). In addition to previously known markers, novel genes with a potential prognostic and therapeutic role in ccRCC are detected. For conventional clinical data, ensemble-based predictive models are developed to provide clinical decision support in the treatment of patients with metastatic castration-resistant prostate cancer (mCRPC). The proposed predictive models cover treatment and survival stratification tasks for both trial-based and real-world patient cohorts. Finally, genomic and conventional clinical data are integrated to demonstrate the importance of including genomic data for the predictive ability of clinical models. Again utilizing ensemble-based learners, a novel model is proposed to predict adulthood obesity using both genetic and social-environmental factors. Overall, the ultimate objective of this work is to demonstrate the importance of clinical bioinformatics and machine learning for bio-clinical marker discovery in complex diseases with high heterogeneity. In the case of cancer, the interpretability of clinical models strongly depends on predictive markers with high reproducibility supported by validation data. The discovery of these markers would increase the chance of early detection and improve prognosis assessment and treatment choice.

    Maintenance Of Mammary Epithelial Phenotype By Transcription Factor Runx1 Through Mitotic Gene Bookmarking

    Breast cancer arises from a series of acquired mutations that disrupt normal mammary epithelial homeostasis and create multi-potent cancer stem cells that can differentiate into clinically distinct breast cancer subtypes. Despite improved therapies and advances in early detection, breast cancer remains the most commonly diagnosed cancer in women. A predominant mechanism initiating invasion and migration for a variety of cancers, including breast cancer, is epithelial-to-mesenchymal transition (EMT). EMT, a trans-differentiation process through which mammary epithelial cells acquire a more aggressive mesenchymal phenotype, is a regulated process during early mammary gland development and involves many transcription factors that function in cell lineage commitment, proliferation, and growth. Despite accumulating evidence for a broad understanding of EMT regulation, the mechanism(s) by which mammary epithelial cells maintain their phenotype is unknown. Mitotic gene bookmarking, i.e., transcription factor binding to target genes during mitosis for post-mitotic regulation, is a key epigenetic mechanism to convey regulatory information for cell proliferation, growth, and identity through successive cell divisions. Many phenotypic transcription factors, including the hematopoietic Runt-Related Transcription Factor 1 (RUNX1/AML1), bookmark target genes during mitosis. Despite growing evidence, a role for mitotic gene bookmarking in maintaining mammary epithelial phenotype has not been investigated. RUNX1 has recently been identified as playing key roles in breast cancer development and progression. Importantly, RUNX1 stabilizes the normal breast epithelial phenotype and prevents EMT through repression of EMT-initiating pathways. Findings reported in this thesis demonstrate that RUNX1 mitotically bookmarks both RNA Pol I and II transcribed genes involved in proliferation, growth, and mammary epithelial phenotype maintenance. Inhibition of RUNX1 DNA binding by a specific small molecule inhibitor led to phenotypic changes, apoptosis, differences in global protein synthesis, and differential expression of ribosomal RNA as well as protein coding genes and long non-coding RNA genes involved in cellular phenotype. Together these findings reveal a novel epigenetic regulatory role of RUNX1 in normal-like breast epithelial cells and strongly suggest that mitotic bookmarking of target genes by RUNX1 is required to maintain breast epithelial phenotype. Disruption of RUNX1 bookmarking results in initiation of epithelial-to-mesenchymal transition, an essential first step in the onset of breast cancer.

    Modeling Complex Patterns of Differential DNA Methylation That Associate with Expression Change

    Gene expression is driven by specific combinations of transcription factors binding to regulatory sequences to define cell type expression profiles. Changes in DNA sequence alter transcription factor binding affinities and gene expression, and DNA methylation is an additional source of variation that is maintained throughout cellular division. Numerous genomic studies are underway to determine which genes are abnormally regulated by DNA methylation in disease. However, we have a poor understanding of how disease-specific methylation variation affects expression. Global DNA demethylation agents have been clinically approved for use in cancer, which has spurred interest in identifying genes that would be most susceptible to targeted demethylation therapies. In this work, I developed multiple tools to increase our knowledge about the relationship between methylation and gene expression in both tissue specificity and disease. I first developed a computational strategy to identify amplifications and deletions from restriction enzyme-based methylation datasets. In a model of endocrine therapy resistant breast cancer, I identified ESR1 as the most amplified genomic region in response to estrogen deprivation. I developed a qPCR-based assay to probe the amplification in cell lines, formalin-fixed paraffin embedded samples, patient tumors, and xenograft samples. These data are consistent with the hypothesis that, in a subset of patients, the ESR1 amplification results in increased levels of ER, which are produced in response to estrogen deprivation to sensitize breast cancer to the low quantities of estrogen available for cellular growth. Next, to explain specific variation in methylation that associates with expression change in both disease and tissue-specificity, I developed an integrative analysis tool, Methylation-based Gene Expression Classification (ME-Class). This model captures the complexity of methylation changes around a gene promoter. Using whole-genome bisulfite sequencing and RNA-seq datasets from different tissue samples, ME-Class significantly outperforms published methods that use methylation to predict differential gene expression change. To demonstrate its utility, I used ME-Class to analyze different hematopoietic cell types, and identified that expression-associated methylation changes were predominantly found when comparing cells from distantly related lineages, implying that changes in the cell’s transcriptional program precede associated methylation changes. Training ME-Class on normal-tumor pairs indicated that cancer-specific expression-associated methylation changes differ from tissue-specific changes. I further show that ME-Class can detect functionally relevant cancer-specific, expression-associated methylation changes that are reversed upon the removal of methylation in a model of colon cancer. Lastly, I extended ME-Class to incorporate 5-hydroxymethylcytosine and uncovered gene regulatory logic involving 5hmC and 5mC in mammalian development and disease. As more large-scale, genome-wide, differential DNA methylation studies become available, tools such as ME-Class will prove invaluable for understanding how specific methylation changes affect transcription. Our results show this toolset can identify genes that are dysregulated by methylation in disease, and could be used to facilitate the identification of patients who may benefit from clinically approved demethylating therapeutics.