
    Development of statistical methods for integrative omics analysis in precision medicine

    Precision medicine is an integrative approach to the prevention and treatment of complex diseases, such as cardiovascular disease, that considers an individual’s lifestyle, clinical information, and omics profile. In the last decade, advances in omics technologies have allowed researchers to gain insight into biological systems and progress towards precision medicine. Many omics technologies now enable us to rapidly generate, store and analyse data at a large scale. Many efforts have been made to integrate large-scale multi-batch and multi-omics data. While many strategies have been developed, challenges remain in developing a robust method capable of pre-processing large-scale datasets, handling mislabelled information, and performing integrative analysis. Pre-processing any omics data is essential to remove technical factors whilst preserving biological variance. However, many methods still struggle to mitigate the batch effect, particularly for protracted acquisitions. Furthermore, robust visualisation tools for processing, quality control diagnostics, and integrative analysis of omics data are still lacking. Lastly, cell type annotation remains a key challenge in single-cell transcriptomic data analysis due to the incompleteness of our current knowledge and the human subjectivity involved in manual curation. Together, these may result in cell type mislabelling and potentially lead to false discoveries in downstream analysis. This thesis first introduces each of the above challenges in detail (Chapter 1). We then introduce novel strategies and robust methods for the removal of unwanted variation in large-scale metabolomics data (Chapter 2), visualisation tools for omics data diagnostics and integrative analysis (Chapter 3), and cell-type identification methods in single-cell transcriptomics data (Chapter 4). Chapter 5 summarises the contributions of each chapter to precision medicine and concludes the thesis.

    Protocol for the processing and downstream analysis of phosphoproteomic data with PhosR

    Summary: Analysis of phosphoproteomic data requires advanced computational methodologies. To this end, we developed PhosR, a set of tools and methodologies implemented in R to allow the comprehensive analysis of phosphoproteomic data. PhosR enables processing steps such as imputation and normalization, and functional analyses such as kinase activity inference and signalome construction. Together, these tools facilitate interpretation and discovery from large-scale phosphoproteomic data sets. For complete details on the use and execution of this protocol, please refer to Kim et al. (2021).

    PAD2: interactive exploration of transcription factor genomic colocalization using ChIP-seq data

    Summary: Characterizing transcription factor (TF) genomic colocalization is essential for identifying cooperative binding of TFs in controlling gene expression. Here, we introduce a protocol for using PAD2, an interactive web application that enables the investigation of colocalization of various TFs and chromatin-regulating proteins from mouse embryonic stem cells at various functional genomic regions. We describe steps for accessing and searching the PAD2 database and selecting and submitting genomic regions. We then detail protein colocalization analysis using a heatmap and a ranked correlation plot. For complete details on the use and execution of this protocol, please refer to Kim et al. (2022). Publisher’s note: Undertaking any experimental protocol requires adherence to local institutional guidelines for laboratory safety and ethics.

    PhosR enables processing and functional analysis of phosphoproteomic data

    Mass spectrometry (MS)-based phosphoproteomics has revolutionised our ability to profile phosphorylation-based signalling in cells and tissues on a global scale. To infer the action of kinases and signalling pathways in phosphoproteomic experiments, we present PhosR, a set of tools and methodologies implemented in a suite of R packages facilitating comprehensive analysis of phosphoproteomic data. By applying PhosR to both published and new phosphoproteomic datasets, we demonstrate capabilities in data imputation and normalisation using a novel set of ‘stably phosphorylated sites’, and in functional analysis for inferring active kinases and signalling pathways. In particular, we introduce a ‘signalome’ construction method for identifying a collection of signalling modules to summarise and visualise the interaction of kinases and their collective actions on signal transduction. Together, our data and findings demonstrate the utility of PhosR in processing and generating novel biological knowledge from MS-based phosphoproteomic data.