    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) are analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges and exacerbates those associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.

    Integrative methods for analyzing big data in precision medicine

    We provide an overview of recent developments in big data analyses in the context of precision medicine and health informatics. With the advance in technologies capturing molecular and medical data, we have entered the era of “Big Data” in biology and medicine. These data offer many opportunities to advance precision medicine. We outline key challenges in precision medicine and present recent advances in data integration-based methods to uncover personalized information from big data produced by various omics studies. We survey recent integrative methods for disease subtyping, biomarker discovery, and drug repurposing, and list the tools that are available to domain scientists. Given the ever-growing nature of these big data, we highlight key issues that big data integration methods will face.

    Transcriptomic data integration for precision medicine in leukemia

    This thesis comprises three studies demonstrating the application of different statistical and bioinformatic approaches to address distinct challenges of implementing precision medicine strategies for hematological malignancies. The approaches focus on the analysis of next-generation sequencing data, both genomic and transcriptomic, to deconvolute disease biology and the underlying mechanisms of drug sensitivity and resistance. The outcomes of the studies have clinical implications for advancing current diagnosis and treatment paradigms in patients with hematological diseases. Study I: RNA sequencing has not been widely adopted in clinical diagnostic settings due to continuous development and lack of standardization. Here, the aim was to evaluate the efficiency of two different RNA-seq library preparation protocols applied to cells collected from acute myeloid leukemia (AML) and acute lymphoblastic leukemia (ALL) patients. The poly-A-tailed mRNA selection (PA) and ribo-depletion (RD) based RNA-seq library preparation protocols were compared and evaluated for detection of gene fusions, variant calling and gene expression profiling. Overall, both protocols produced broadly consistent results. However, the PA protocol was more efficient in quantifying expression of leukemia marker genes and drug targets. It also provided higher sensitivity and specificity for expression-based classification of leukemia. In contrast, the RD protocol was more suitable for gene fusion detection and captured a greater number of transcripts. Importantly, high technical variation was observed in samples from two leukemia patient cases, suggesting a need for further development of strategies for transcriptomic quantification and data analysis. Study II: The BCL-2 inhibitor venetoclax is an approved and effective agent in combination with hypomethylating agents or low-dose cytarabine for AML patients unfit for intensive induction chemotherapy. 
However, the limited number of patients who respond to venetoclax and the development of resistance to the treatment make it challenging to use the drug to benefit the majority of AML patients. The aim was to investigate genomic and transcriptomic biomarkers for venetoclax sensitivity and enable identification of the patients who are most responsive to venetoclax treatment. We found that venetoclax-sensitive samples are enriched with WT1 and IDH1/IDH2 mutations. Intriguingly, HOX family genes, including HOXB9, HOXA5, HOXB3 and HOXB4, were found to be significantly overexpressed in venetoclax-sensitive patients. Thus, these HOX-cluster gene expression biomarkers can be explored in a clinical trial setting to stratify AML patients responding to venetoclax-based therapies. Study III: Venetoclax treatment does not benefit all AML patients, which calls for biomarkers to identify patients who should be excluded from venetoclax-based therapies. The aim was to investigate transcriptomic biomarkers for ex vivo venetoclax resistance in AML patients. Correlating ex vivo venetoclax response with gene expression profiles using a machine learning approach revealed significant overexpression of the S100 family genes S100A8 and S100A9. Moreover, high expression of S100A9 was found to be associated with sensitivity to birabresib (a BET inhibitor). The overexpression of S100A8 and S100A9 could potentially be used to detect and monitor venetoclax resistance. The combination of BCL-2 and BET inhibitors may sensitize AML cells to venetoclax upon BET inhibition and block leukemic cell survival. In this thesis, the aim was to utilize gene expression information to advance precision medicine outcomes in patients with hematological malignancies. 
In study I, the contemporary mainstream library preparation protocols used for RNA sequencing, ribo-depletion and poly-A enrichment, were compared in order to select the protocol that best serves the goal of the experiment, especially in patients with acute leukemias. In study II, we applied bioinformatics approaches to identify IDH1/2 mutations and HOX family gene expression correlated with ex vivo sensitivity to the BCL-2 inhibitor venetoclax in acute myeloid leukemia (AML) patients. In study III, statistical and machine learning methods were implemented to identify S100A8/A9 gene expression biomarkers for ex vivo resistance to venetoclax in AML patients. In summary, this thesis addresses the challenges of utilizing gene expression information to stratify patients based on biomarkers to promote precision medicine practice in hematological malignancies.

    Next-generation sequencing identifies mechanisms of tumourigenesis caused by loss of SMARCB1 in Malignant Rhabdoid Tumours

    PhD Thesis. Introduction: Malignant Rhabdoid Tumours (MRT) are unique malignancies caused by biallelic inactivation of a single gene (SMARCB1). SMARCB1 encodes a protein that is part of the SWI/SNF chromatin remodelling complex, responsible for the regulation of hundreds of downstream genes/pathways. Despite the simple biology of these tumours, no studies have identified the critical pathways involved in tumourigenesis. Understanding the downstream effects is essential to identifying therapeutic targets that can improve the outcome of MRT patients. Methods: RNA-seq and 450K-methylation analyses were performed in MRT human primary malignancies (n > 39) and in 4 MRT cell lines in which lentivirus was used to re-express SMARCB1 (G401, A204, CHLA-266, and STA-WT1). The MRT cell lines were treated with 5-aza-2′-deoxycytidine followed by global gene transcription analysis (RNA-seq and 450K-methylation) to investigate how changes in methylation lead to tumourigenesis. Results: We show that primary Malignant Rhabdoid Tumours present a unique and distinct expression/methylation profile, which confirms that MRT broadly constitute a single tumour type distinct from other paediatric malignancies. However, despite their common cause, MRT can be sub-grouped by location (i.e. CNS or kidney). We observe that re-expression of SMARCB1 in MRT cell lines determines activation/inactivation of specific downstream pathways such as IL-6/TGF beta. We also observe a direct correlation between alterations in methylation and gene expression in CD44, GLI2, GLI3, CDKN1A, CDKN2A and JARID after SMARCB1 re-expression. Loss of SMARCB1 also promotes expression of aberrant isoforms and novel transcripts and causes genome-wide changes in SWI/SNF binding. Conclusion: Next-generation transcriptome and methylome analysis in primary MRT and in functional models gives us detailed downstream effects of SMARCB1 loss in Malignant Rhabdoid Tumours. 
The integration of data from both primary and functional models has provided, for the first time, a genome-wide catalogue of SMARCB1 tumourigenic changes (validated using systems biology). Here we show how a single V deletion of SMARCB1 is responsible for deregulation of expression, methylation status and binding at the promoter regions of potent tumour-suppressor genes. The genes, pathways and biological mechanisms indicated as key in tumour development may ultimately be targetable therapeutically and may lead to better treatments for what is currently one of the most lethal paediatric cancers. Funding: NECCR, Children with Cancer UK, Brain Trust, Love Oliver, CCLG, Karen and Iain Wark, The Smiley Ridley Fund, whose financial support made this project possible.

    Undisclosed, unmet and neglected challenges in multi-omics studies

    Multi-omics approaches have become a reality in both large genomics projects and small laboratories. However, the multi-omics research community still faces a number of issues that have either not been sufficiently discussed or for which current solutions are still limited. In this Perspective, we elaborate on these limitations and suggest points of attention for future research. We finally discuss new opportunities and challenges brought to the field by the rapid development of single-cell high-throughput molecular technologies. This work has been funded by the Spanish Ministry of Science and Innovation with grant number BES-2016-076994 to A.A.-L. Tarazona, S.; Arzalluz-Luque, Á.; Conesa, A. (2021). Undisclosed, unmet and neglected challenges in multi-omics studies. Nature Computational Science. 1(6):395-402. https://doi.org/10.1038/s43588-021-00086-z

    Kernel methods in genomics and computational biology

    Support vector machines and kernel methods are increasingly popular in genomics and computational biology, due to their good performance in real-world applications and strong modularity that makes them suitable for a wide range of problems, from the classification of tumors to the automatic annotation of proteins. Their ability to work in high dimensions, to process non-vectorial data, and the natural framework they provide to integrate heterogeneous data are particularly relevant to various problems arising in computational biology. In this chapter we survey some of the most prominent applications published so far, highlighting the particular developments in kernel methods triggered by problems in biology, and mention a few promising research directions likely to expand in the future.
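
    As a rough illustration of how kernels can integrate heterogeneous data, the sketch below builds one Gram matrix per data view and combines them with a non-negative weighted sum, which is itself a valid kernel. It is not taken from the chapter: the function names, the toy `expression` and `methylation` values, the equal weights and the `gamma` setting are all assumptions made for the example.

```python
import math

def linear_kernel(X):
    """Gram matrix of dot products between the rows of X."""
    return [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]

def rbf_kernel(X, gamma=0.5):
    """Gram matrix of exp(-gamma * squared Euclidean distance)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj)))
             for xj in X] for xi in X]

def combine_kernels(kernels, weights):
    """Weighted sum of Gram matrices; non-negative weights keep it a valid kernel."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for K, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: the same three samples described by two views.
expression = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
methylation = [[0.2], [0.8], [0.5]]

# One kernel per view, then a single combined similarity matrix.
K = combine_kernels(
    [linear_kernel(expression), rbf_kernel(methylation)],
    weights=[0.5, 0.5],
)
```

    The combined matrix `K` could then be passed to any learner that accepts a precomputed kernel; how the weights are chosen is left open here.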

    The altered transcriptome and DNA methylation profiles of docetaxel resistance in breast cancer PDX models

    Taxanes are standard therapy in clinical practice for metastatic breast cancer; however, primary or acquired chemoresistance is a common cause of mortality. Breast cancer patient-derived xenografts (PDX) are powerful tools for the study of cancer biology and drug treatment response. Specific DNA methylation patterns have been associated with different breast cancer subtypes, but their association with chemoresistance remains unstudied. Aiming to elucidate docetaxel resistance mechanisms, we performed genome-wide DNA methylation profiling in breast cancer PDX models, including luminal and triple-negative breast cancer (TNBC) models sensitive to docetaxel, their matched models after emergence of chemoresistance, and residual disease after short-term docetaxel treatment. We found that DNA methylation profiles from breast cancer PDX models maintain the subtype-specific methylation patterns of clinical samples. Two main DNA methylation clusters were found in TNBC PDX and remained stable during the emergence of docetaxel resistance; however, some genes/pathways were differentially methylated according to docetaxel response. A DNA methylation signature of resistance able to segregate TNBC based on chemotherapy response was identified. Transcriptomic profiling of selected sensitive/resistant pairs and integrative analysis with methylation data demonstrated correlation between some differentially methylated and expressed genes in docetaxel-resistant TNBC PDX models. Multiple gene expression changes were found after the emergence of docetaxel resistance in TNBC. DNA methylation and transcriptional changes identified between docetaxel-sensitive and -resistant TNBC PDX models or residual disease may have predictive value for chemotherapy response in TNBC. IMPLICATIONS: Subtype-specific DNA methylation patterns are maintained in breast cancer PDX models. 
While no global methylation changes were found, we uncovered differentially methylated and expressed genes/pathways associated with the emergence of docetaxel resistance in TNBC.