305 research outputs found

    Design of new algorithms for gene network reconstruction applied to in silico modeling of biomedical data

    Doctoral Programme in Biotechnology, Engineering and Chemical Technology. Research line: Engineering, Data Science and Bioinformatics. Programme code: DBI. Line code: 111. The root causes of disease are still poorly understood. The success of current therapies is limited because persistent diseases are frequently treated based on their symptoms rather than the underlying cause of the disease. Therefore, biomedical research is experiencing a technology-driven shift to data-driven holistic approaches to better characterize the molecular mechanisms causing disease. Using omics data as an input, emerging disciplines like network biology attempt to model the relationships between biomolecules. To this effect, gene co-expression networks arise as a promising tool for deciphering the relationships between genes in large transcriptomic datasets. However, because of their low specificity and high false positive rate, they demonstrate a limited capacity to retrieve the disrupted mechanisms that lead to disease onset, progression, and maintenance. Within the context of statistical modeling, we examined the reconstruction of gene co-expression networks in greater depth, with the specific goal of discovering disease-specific features directly from expression data. Using ensemble techniques, which combine the results of various metrics, we were able to capture biologically significant relationships between genes more precisely. With the help of prior biological knowledge and the development of new network inference techniques, we were able to find potential disease-specific features de novo. Through our different approaches, we analyzed large gene sets across multiple samples and used gene expression as a surrogate marker for the inherent biological processes, reconstructing robust gene co-expression networks that are simple to explore. 
By mining disease-specific gene co-expression networks we provide a useful framework for identifying new omics-phenotype associations from conditional expression datasets. In this sense, understanding diseases from the perspective of biological network perturbations will improve personalized medicine, impacting rational biomarker discovery, patient stratification and drug design, and ultimately leading to more targeted therapies. Universidad Pablo de Olavide de Sevilla. Departamento de Deporte e Informátic
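The ensemble idea in the abstract above (combining several metrics before accepting a gene-gene relationship) can be illustrated with a minimal sketch. It is not the thesis's method: the choice of Pearson and Spearman correlation, the unanimous-vote rule, the 0.8 threshold, and the toy gene names and values are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def ensemble_network(expression, threshold=0.8):
    """Keep an edge only if every metric agrees (hypothetical voting rule)."""
    edges = set()
    for g1, g2 in combinations(sorted(expression), 2):
        scores = (
            pearson(expression[g1], expression[g2]),
            spearman(expression[g1], expression[g2]),
        )
        if all(abs(score) >= threshold for score in scores):
            edges.add((g1, g2))
    return edges


# Toy expression profiles (five samples per gene); values are made up.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
network = ensemble_network(expression)
print(network)
```

Requiring agreement between metrics trades sensitivity for specificity, which is one plausible way to reduce the false positives the abstract mentions; other combination rules (averaging scores, rank aggregation) would be equally valid sketches.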

    Computational approaches for single-cell omics and multi-omics data

    Single-cell omics and multi-omics technologies have enabled the study of cellular heterogeneity with unprecedented resolution and the discovery of new cell types. Identifying heterogeneous cell types, both existing and novel ones, relies on efficient computational approaches, especially cluster analysis. Additionally, gene regulatory network analysis and various integrative approaches are needed to combine data across studies and different multi-omics layers. This thesis comprehensively compared Bayesian clustering models for single-cell RNA-sequencing (scRNA-seq) data, and selected integrative approaches were used to study the cell-type-specific gene regulation of the uterus. Additionally, single-cell multi-omics data integration approaches for cell heterogeneity analysis were investigated. Article I investigated analytical approaches for cluster analysis in scRNA-seq data, in particular latent Dirichlet allocation (LDA) and hierarchical Dirichlet process (HDP) models. The comparison of LDA and HDP with existing state-of-the-art methods revealed that topic modeling-based models can be useful in scRNA-seq cluster analysis. Evaluation of the cluster qualities for LDA and HDP with intrinsic and extrinsic cluster quality metrics indicated that the clustering performance of these methods is dataset dependent. Article II and Article III focused on cell-type-specific integrative analysis of uterine or decidual stromal (dS) and natural killer (dNK) cells, which are important for successful pregnancy. Article II integrated the existing preeclampsia RNA-seq studies of the decidua with recent scRNA-seq datasets in order to investigate cell-type-specific contributions of early onset preeclampsia (EOP) and late onset preeclampsia (LOP). It was discovered that the dS marker genes were enriched for LOP downregulated genes and the dNK marker genes were enriched for upregulated EOP genes. 
Article III presented a gene regulatory network analysis for the subpopulations of dS and dNK cells. This study identified novel subpopulation-specific transcription factors that promote decidualization of stromal cells and dNK-mediated maternal immunotolerance. In Article IV, different strategies and methodological frameworks for data integration in single-cell multi-omics data analysis were reviewed in detail. Data integration methods were grouped into early, late and intermediate data integration strategies. The specific stage and order of data integration can have a substantial effect on the results of the integrative analysis. The central details of the approaches were presented, and potential future directions were discussed. Computational methods for the analysis of single-cell sequencing and multi-omics results. Single-cell sequencing techniques enable the study of cellular heterogeneity at unprecedented resolution and the discovery of new cell types. Grouping, that is, cluster analysis, plays a central role in identifying cell types. Combining gene regulatory networks and different molecular data layers is also central to the analysis. The thesis compares Bayesian clustering methods and combines data collected with different methods in a cell-type-specific gene regulation analysis of the uterus. In addition, integration methods for single-cell data are surveyed comprehensively. Publication I focuses on studying analytical methods, in particular models based on latent Dirichlet allocation (LDA) and the hierarchical Dirichlet process (HDP), in the cluster analysis of single-cell data. A comprehensive comparison of these two models with existing methods revealed that topic modeling-based methods can be useful in the cluster analysis of single-cell data. The performance of the methods also depended on the characteristics of each dataset analyzed. 
Publications II and III focus on cell-type-specific analysis of uterine stromal cells and NK immune cells, which are important for female reproductive health. Article II combined existing results on preeclampsia with the latest single-cell sequencing results and found cell-type-specific effects of early onset preeclampsia (EOP) and late onset preeclampsia (LOP). It was observed that the expression of marker genes of differentiated stroma decreased in LOP and the expression of NK marker genes increased in EOP. Publication III analyzes the subpopulation-specific gene regulatory networks of stromal and NK cells and their transcription factors. The study identified new subpopulation-specific regulators that promote stromal differentiation and NK-cell-mediated immunotolerance. Publication IV examines in detail strategies and methods for integrating different single-cell data layers (multi-omics). Integration methods were grouped into early, late and intermediate strategies, and the methods of each approach were presented in more detail. In addition, possible future directions were discussed.

    Proteomic Analysis of the Unfolded Protein Response in Melanoma

    Despite enormous advances in the treatment of metastatic melanoma over the last decade, drug resistance and drug toxicities mean that new therapies and treatment strategies are needed. Additionally, there are no prognostic biomarkers for metastatic disease able to predict outcome. Thirty-two potential biomarkers were analysed by selected reaction monitoring (SRM) in 30 stage III melanoma patients. A 14-protein panel was discovered that was able to predict which patients are likely to have a poor outcome and could therefore potentially benefit from aggressive therapeutic strategies. This study revealed the Unfolded Protein Response (UPR) to be a major cellular pathway up-regulated in patients with poor outcome. The UPR is a cellular stress response, which is initiated by a build-up of unfolded protein in the endoplasmic reticulum (ER). Increased activation of the UPR is associated with several cancers; however, the mechanisms used to promote tumour progression and metastases are not well understood. To characterise this stress response, the UPR was activated in melanoma cell line models. Using quantitative mass spectrometry, 64 proteins were identified as differentially abundant with increased UPR activation. Among them, eight UPR-associated proteins were validated by SRM, identifying these proteins as core modulators of the UPR. An in silico analysis of the eight UPR-associated proteins in pan-cancer patient data across 16 solid tumour types revealed that they were markers of poor survival across cancer types. The combined data demonstrate that the UPR is a major contributor to cancer progression. The study contributes to our knowledge of melanoma biology by elucidating the broad impact of the UPR on several cellular pathways and mechanisms that would promote tumour growth and increase the metastatic potential of melanoma. 
Furthermore, novel UPR drug targets were identified, including cooperative pathways that could be targeted in combinatorial therapie

    Multiparametric Magnetic Resonance Imaging Artificial Intelligence Pipeline for Oropharyngeal Cancer Radiotherapy Treatment Guidance

    Oropharyngeal cancer (OPC) is a widespread disease and one of the few domestic cancers that is rising in incidence. Radiographic images are crucial for assessment of OPC and aid in radiotherapy (RT) treatment. However, RT planning with conventional imaging approaches requires operator-dependent tumor segmentation, which is the primary source of treatment error. Further, OPC expresses differential tumor/node mid-RT response (rapid response) rates, resulting in significant differences between planned and delivered RT dose. Finally, clinical outcomes for OPC patients can also be variable, which warrants the investigation of prognostic models. Multiparametric MRI (mpMRI) techniques that incorporate simultaneous anatomical and functional information coupled to artificial intelligence (AI) approaches could improve clinical decision support for OPC by providing immediately actionable clinical rationale for adaptive RT planning. If tumors could be reproducibly segmented, rapid response could be classified, and prognosis could be reliably determined, overall patient outcomes would be optimized to improve the therapeutic index as a function of more risk-adapted RT volumes. Consequently, there is an unmet need for automated and reproducible imaging which can simultaneously segment tumors and provide predictive value for actionable RT adaptation. This dissertation primarily seeks to explore and optimize image processing, tumor segmentation, and patient outcomes in OPC through a combination of advanced imaging techniques and AI algorithms. In the first specific aim of this dissertation, we develop and evaluate mpMRI pre-processing techniques for use in downstream segmentation, response prediction, and outcome prediction pipelines. Various MRI intensity standardization and registration approaches were systematically compared and benchmarked. Moreover, synthetic image algorithms were developed to decrease MRI scan time in an effort to optimize our AI pipelines. 
We demonstrated that proper intensity standardization and image registration can improve mpMRI quality for use in AI algorithms, and developed a novel method to decrease mpMRI acquisition time. Subsequently, in the second specific aim of this dissertation, we investigated underlying questions regarding the implementation of RT-related auto-segmentation. Firstly, we quantified interobserver variability for an unprecedented large number of observers for various radiotherapy structures in several disease sites (with a particular emphasis on OPC) using a novel crowdsourcing platform. We then trained an AI algorithm on a series of extant matched mpMRI datasets to segment OPC primary tumors. Moreover, we validated and compared our best model's performance to clinical expert observers. We demonstrated that AI-based mpMRI OPC tumor auto-segmentation offers decreased variability and comparable accuracy to clinical experts, and certain mpMRI input channel combinations could further improve performance. Finally, in the third specific aim of this dissertation, we predicted OPC primary tumor mid-therapy (rapid) treatment response and prognostic outcomes. Using co-registered pre-therapy and mid-therapy primary tumor manual segmentations of OPC patients, we generated and characterized treatment sensitive and treatment resistant pre-RT sub-volumes. These sub-volumes were used to train an AI algorithm to predict individual voxel-wise treatment resistance. Additionally, we developed an AI algorithm to predict OPC patient progression free survival using pre-therapy imaging from an international data science competition (ranking 1st place), and then translated these approaches to mpMRI data. We demonstrated AI models could be used to predict rapid response and prognostic outcomes using pre-therapy imaging, which could help guide treatment adaptation, though further work is needed. 
In summary, the completion of these aims facilitates the development of an image-guided fully automated OPC clinical decision support tool. The resultant deliverables from this project will positively impact patients by enabling optimized therapeutic interventions in OPC. Future work should consider investigating additional imaging timepoints, imaging modalities, uncertainty quantification, perceptual and ethical considerations, and prospective studies for eventual clinical implementation. A dynamic version of this dissertation is publicly available and assigned a digital object identifier through Figshare (doi: 10.6084/m9.figshare.22141871)

    Improving medical care for adults with Prader-Willi syndrome


    A Robust Unified Graph Model Based on Molecular Data Binning for Subtype Discovery in High-dimensional Spaces

    Machine learning (ML) is a subfield of artificial intelligence (AI) that has already revolutionised the world around us. It is a widely employed process for discovering patterns and groups within datasets. It has a wide range of applications including disease subtyping, which aims to discover intrinsic subtypes of disease in large-scale unlabelled data. Whilst the groups discovered in multi-view high-dimensional data by ML algorithms are promising, their capacity to identify pertinent and meaningful groups is limited by the presence of data variability and outliers. Since outlier values represent potential but unlikely outcomes, they are statistically and philosophically fascinating. Therefore, the primary aim of this thesis was to propose a robust approach that discovers meaningful groups while considering the presence of data variability and outliers in the data. To achieve this aim, a novel robust approach (ROMDEX) was developed that utilised the proposed intermediate graph models (IMGs) for robust computation of proximity between observations in the data. Finally, a robust multi-view graph-based clustering approach was developed based on ROMDEX that improved the discovery of meaningful groups that were hidden behind the noise in the data. The proposed approach was validated on real-world and synthetic data for disease subtyping. Additionally, the stability of the approach was assessed by evaluating its performance across different levels of noise in clustering data. The results were evaluated through Kaplan-Meier survival time analysis for disease subtyping. The concordance index (CI) and normalised mutual information (NMI) were also used to evaluate the predictive ability of the proposed clustering model. Additionally, the accuracy, Kappa statistic and Rand index were computed to evaluate the clustering stability against various levels of Gaussian noise. 
The proposed approach outperformed the existing state-of-the-art approaches MRGC, PINS, SNF, Consensus Clustering, and Icluster+ on these datasets. The findings for all datasets were outstanding, demonstrating the predictive ability of the proposed unsupervised graph-based clustering approach
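Two of the evaluation measures named in the abstract above, the Rand index and normalised mutual information, can be computed directly from a pair of label vectors. The sketch below is a generic illustration and not the thesis's evaluation code: the function names, the arithmetic-mean normalisation of NMI, and the toy label vectors are assumptions.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(a, b):
    """Fraction of sample pairs on which two clusterings agree
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def entropy(labels):
    """Shannon entropy (natural log) of a label vector."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Mutual information divided by the mean of the two entropies
    (one common normalisation; others exist)."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    mi = sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )
    denom = (entropy(a) + entropy(b)) / 2
    # Convention assumed here: two trivial single-cluster labelings agree.
    return mi / denom if denom > 0 else 1.0


# Toy labelings: same partition with swapped names, and a noisy variant.
truth = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]
noisy = [0, 0, 1, 1, 1, 1]

print(rand_index(truth, relabelled), nmi(truth, relabelled))
print(rand_index(truth, noisy), nmi(truth, noisy))
```

Both measures depend only on the partition, not on the label names, which is why the relabelled vector scores as a perfect match; this is the property that makes them suitable for comparing unsupervised clusterings against reference subtypes.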

    Intervention study on weight loss and changes in the values of metabolic and anthropometric parameters, modulation of intermediate phenotypes and the appearance of final phenotypes after following a hypocaloric Mediterranean diet and physical activity

    Metabolic syndrome together with obesity are the great pandemics of the 21st century, with an estimated global prevalence of more than 30%. Metabolic syndrome begins with abdominal obesity that often leads to the appearance of other alterations and risk factors (insulin resistance, high blood pressure and dyslipidemia). Preventing obesity and metabolic syndrome will reduce the currently high rate of chronic diseases, including diabetes, cardiovascular diseases and different types of cancer, which are all increasingly important causes of disability and premature death. Various factors are involved in the development of metabolic syndrome. 
In recent years, omic sciences have revolutionized biomedical research, allowing us to investigate the molecular mechanisms of this type of pathology. There are now new genomic tools for analyzing disease risk and these provide a great opportunity, as an individual's risk of developing a disease is affected not only by their genome (genetic susceptibility) but also by exposure to environmental factors (referred to as the exposome in a broad definition). Hence, extrinsic factors such as changes in lifestyle and diet, drug treatments, psychosocial aspects, etc. can modulate these epigenetic variations and, therefore, the expression of those genes linked to the development of the disease. This Thesis focuses on the analysis of participants in the PREDIMED-Plus-Valencia study. This is an elderly population with metabolic syndrome and overweight/obesity at high risk of cardiovascular disease and other related cardiometabolic phenotypes. The dietary habits of the population were unhealthy, and a high percentage of participants were sedentary. An intensive intervention with a hypocaloric Mediterranean diet together with an increase in physical activity in a randomized intervention trial proved to be effective, achieving weight loss and an improvement in cardiometabolic parameters, resulting in a reduction in cardiovascular risk and other related phenotypes, in comparison with the control group. In addition, we analyzed other relevant variables in nutritional studies such as taste perception and food and taste preferences, finding important associations between the perceived intensity of the different tastes and baseline adiposity parameters, with the associations differing between tastes. Moreover, the creation of the so-called "total taste score" as an indicator of global taste perception allowed us to observe that this has a strong inverse association with adiposity (p<0.001). 
Other variables of interest (impulsivity, cognitive, sleep) revealed associations between cardiometabolic risk patterns and being more impulsive or having an evening chronotype, which also influenced dietary patterns. Moreover, results were obtained for several omic markers (genomic, epigenomic, transcriptomic, metabolomic), underlining the need to incorporate these into such studies to better understand their contribution to the different phenotypes and exposures analyzed, and to assess their possible application in so-called precision health. Carrying out GWAS to investigate the influence of genomic markers on the perception of the five tastes in participants of the Valencia field center confirmed a strong genetic influence on the ability to perceive bitter taste, specifically in the TAS2R38 gene. Regarding genetic variants associated with different taste preferences, the most relevant results were found for sweet taste. Several SNPs were identified in the PTPRN2 gene that exceeded statistical significance at the GWAS level. Furthermore, in epigenomic analyses we detected a significant influence of exposome factors on DNA methylation. This association was highly significant between smoking status and DNA methylation, allowing us to characterize the methylomic fingerprint of tobacco exposure (smokers, ex-smokers, and never smokers) in this population. Of all the statistically associated methylation sites, we can highlight cg21566642 on chromosome 2 as the most significant, where tobacco consumption causes hypomethylation, as it does at other methylation sites of the DNA methylation signature. This hypomethylation, at some CpG sites, can be modulated by increased adherence to the Mediterranean diet. Likewise, we observed in a pilot analysis that a Mediterranean diet intervention can modulate the expression of several genes, both at the level of candidate genes and of the whole transcriptome. 
Despite the intervention period coinciding with the COVID-19 pandemic, contingency measures were taken to carry out online interventions and other successful alternatives that were able to mitigate the impact of the pandemic on the results