27 research outputs found

    Computational approaches and resources to support translational research in human diseases

    In the last two decades, the volume and variety of biomedical data have increased dramatically. The data are heterogeneous and scattered across many resources. This produces bottlenecks in the analysis and extraction of knowledge from this sea of information. To overcome this hurdle, we need better catalogs that integrate different data types, offer easy access to users, and support automatic workflows. With this in mind, we have developed DisGeNET, a discovery platform that contains information on more than 17,000 genes related to over 14,000 diseases. We have used DisGeNET to study the properties of disease genes in the context of protein interaction networks. To produce an accurate analysis of the mesoscale properties of the human interactome, we first compared the network partitions generated by two popular clustering algorithms, to assess how the choice of algorithm would affect the follow-up biological analysis. Using the best-performing algorithm, we then explored the network properties of disease genes. Next, we evaluated the relationship between the network properties of different groups of disease genes and their tolerance to likely deleterious germline variants across human populations. Finally, we have developed a new network medicine approach to study disease comorbidities, and applied it to the analysis of COPD comorbidities.
    Technological advances over the last two decades have produced a dramatic increase in the quantity and diversity of available biomedical data. This process has been fragmented, and as a result the data are stored in different repositories, which imposes barriers to integrating them, analyzing them, and extracting knowledge from them. Overcoming these barriers requires computational resources that integrate this information and offer easy access to it, while also enabling its automated analysis. In response to this need we have developed DisGeNET, a platform oriented to exploring the genetic causes of human diseases, which currently contains information on more than 14,000 diseases and 17,000 genes. In this thesis, we describe the use of DisGeNET to study the properties of disease-associated genes in the context of protein interaction networks. To that end, we first evaluated how the use of different network community detection algorithms affects the results of the analyses and influences their biological interpretation. We then characterized the network properties of disease-associated genes as a whole and also in subgroups, using different disease classification criteria. Subsequently, we evaluated how these properties relate to the tolerance to possibly deleterious mutations in different groups of genes, by analyzing data generated by new sequencing technologies. Finally, we developed a new systems medicine methodology to explore the molecular mechanisms of comorbidities, and applied it to the study of the comorbidities of chronic obstructive pulmonary disease

    In silico models in drug development: where we are

    The use and utility of computational models in drug development have grown significantly in recent decades, fostered by the availability of high-throughput datasets and new data analysis strategies. These in silico approaches are demonstrating their ability to generate reliable predictions as well as new knowledge on the mode of action of drugs and the mechanisms underlying their side effects, altogether helping to reduce the costs of drug development. The aim of this review is to provide a panorama of developments in the field in the last two years.
    We acknowledge support from ISCIII-FEDER (CPII16/00026), the EU H2020 Programme 2014-2020 under grant agreements no. 681002 (EU-ToxRisk) and no. 676559 (ELIXIR-EXCELERATE), and the IMI2-JU under grant agreements no. 116030 (TransQST) and no. 777365 (eTRANSAFE), resources of which are composed of financial contributions from the EU-H2020 and in-kind contributions from EFPIA companies. The Research Programme on Biomedical Informatics (GRIB) is a member of the Spanish National Bioinformatics Institute (INB), funded by ISCIII and FEDER. The DCEXS is a ‘Unidad de Excelencia María de Maeztu’, funded by the MINECO (ref: MDM-2014-0370)

    Benchmarking post-GWAS analysis tools in major depression: Challenges and implications

    Our knowledge of complex disorders has increased in recent years thanks to the identification, by genome-wide association studies (GWAS), of genetic variants (GVs) significantly associated with disease phenotypes. However, we do not yet understand how these GVs functionally impact disease pathogenesis or their underlying biological mechanisms. Among the multiple post-GWAS methods available, fine-mapping and colocalization approaches are commonly used to identify causal GVs, meaning those with a biological effect on the trait, and their functional effects. Despite the variety of post-GWAS tools available, there is no guideline for method eligibility or validity, even though these methods work under different assumptions when accounting for linkage disequilibrium and integrating molecular annotation data. Moreover, there is no benchmarking of the available tools. In this context, we have applied two different fine-mapping and colocalization methods to the same GWAS on major depression (MD) and expression quantitative trait loci (eQTL) datasets. Our goal is to perform a systematic comparison of the results obtained by the different tools. To that end, we have evaluated their results at different levels: fine-mapped and colocalizing GVs, their target genes and tissue specificity according to gene expression information, as well as the biological processes in which they are involved. Our findings highlight the importance of fine-mapping as a key step for subsequent analysis. Notably, the colocalizing variants, altered genes and targeted tissues differed between methods, even regarding their biological implications. This contribution illustrates an important issue in post-GWAS analysis with relevant consequences for the use of GWAS results in the elucidation of disease pathobiology, drug target prioritization and biomarker discovery.
    Funding: IMI2-JU resources, which are composed of financial contributions from the European Union’s Horizon 2020 Research and Innovation Programme and EFPIA (GA: 116030 TransQST and GA: 777365 eTRANSAFE), and the EU H2020 Programme 2014–2020 (GA: 676559 Elixir-Excelerate); Project 001-P-001647 (Valorisation of EGA for Industry and Society), funded by the European Regional Development Fund (ERDF) and Generalitat de Catalunya; Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya (2017SGR00519); and the Institute of Health Carlos III (project IMPaCT-Data, exp. IMP/00019), co-funded by the European Union, European Regional Development Fund (ERDF, “A way to make Europe”). The Research Programme on Biomedical Informatics (GRIB) is a member of the Spanish National Bioinformatics Institute (INB), funded by ISCIII and ERDF (PRB2-ISCIII (PT13/0001/0023, of the PE I + D + i 2013–2016)). The MELIS is a ‘Unidad de Excelencia María de Maeztu’, funded by the MINECO (MDM-2014-0370). JP-G was supported by Instituto de Salud Carlos III-Fondo Social Europeo (FI18/00034). This statement is a requirement from our funding agencies and therefore has to be included in the Funding section

    ResMarkerDB: a database of biomarkers of response to antibody therapy in breast and colorectal cancer

    The clinical efficacy of therapeutic monoclonal antibodies for breast and colorectal cancer has greatly contributed to the improvement of patients' outcomes by individualizing their treatments according to their genomic background. However, primary or acquired resistance to treatment reduces its efficacy. In this context, the identification of biomarkers predictive of drug response would support research and development of new alternative treatments. Biomarkers play a major role in the genomic revolution, supporting disease diagnosis and treatment decision-making. Currently, several molecular biomarkers of treatment response for breast and colorectal cancer have been described. However, information on these biomarkers is scattered across several resources, and needs to be identified, collected and properly integrated to be fully exploited to inform monitoring of drug response in patients. Therefore, there is a need for resources that offer biomarker data to the user in a harmonized manner, to support the identification of actionable biomarkers of response to treatment in cancer. ResMarkerDB was developed as a comprehensive resource of biomarkers of drug response in colorectal and breast cancer. It integrates data on biomarkers of drug response from existing repositories with new data extracted and curated from the literature (referred to as ResCur). ResMarkerDB currently features 266 biomarkers of diverse nature. Twenty-five percent of these biomarkers are exclusive to ResMarkerDB. Furthermore, ResMarkerDB is one of the few resources offering non-coding DNA data in response to drug treatment. The database contains more than 500 biomarker-drug-tumour associations, covering more than 100 genes. ResMarkerDB provides a web interface to facilitate the exploration of the current knowledge of biomarkers of response in breast and colorectal cancer. It aims to enhance translational research efforts in identifying actionable biomarkers of drug response in cancer.
    Funding: Instituto de Salud Carlos III-Fondo Europeo de Desarrollo Regional [grant numbers: PIE15/00008, CP10/00524, CPII16/00026]; Instituto de Salud Carlos III-Fondo Social Europeo [FI18/00034]; and the European Commission Horizon 2020 Programme 2014–2020 under grant agreements MedBioinformatics [grant number: 634143] and Elixir-Excelerate [grant number: 676559]. The Research Programme on Biomedical Informatics is a member of the Spanish National Bioinformatics Institute, Plataforma de Recursos Biomoleculares y Bioinformáticos-Instituto de Salud Carlos III [grant number: PT13/0001/0023], of the PE I + D + i 2013–2016, funded by Instituto de Salud Carlos III and Fondo Europeo de Desarrollo Regional. The Departamento de Ciencias Experimentales y de la Salud is a Unidad de Excelencia María de Maeztu, funded by the Ministerio de Economía y Competitividad (reference number: MDM-2014-0370)

    Mining the modular structure of protein interaction networks.

    BACKGROUND: Cluster-based descriptions of biological networks have received much attention in recent years, fostered by accumulated evidence of meaningful correlations between topological network clusters and biological functional modules. Several well-performing clustering algorithms exist to infer topological network partitions. However, due to their respective technical idiosyncrasies, they might produce dissimilar modular decompositions of a given network. In this contribution, we aimed to analyze how alternative modular descriptions could condition the outcome of follow-up network biology analysis. METHODOLOGY: We considered a human protein interaction network and two paradigmatic cluster recognition algorithms, namely the Clauset-Newman-Moore and the infomap procedures. We analyzed to what extent the two methodologies yielded different results in terms of granularity and biological congruency. In addition, taking into account Guimera's cartographic role characterization of network nodes, we explored how the adoption of a given clustering methodology affected the ability to highlight relevant network meso-scale connectivity patterns. RESULTS: As a case study we considered a set of aging-related proteins and showed that only the high-resolution modular description provided by infomap could unveil statistically significant associations between them and inter/intra modular cartographic features. Besides reporting novel biological insights that could be gained from the discovered associations, our contribution warns against possible technical concerns that might affect the tools used to mine for interaction patterns in network biology studies. In particular, our results suggested that partitions that are sub-optimal from the strict point of view of their modularity levels might still be worth analyzing when meso-scale features are to be explored in connection with external sources of biological knowledge.
    This project has been made possible by CONICET (grant PIP0087), UBACyT (grant 20020110200314), ISCIII-FEDER (PI13/00082 and CP10/00524), and IMI JU (grant agreements n° 115002 (eTOX) and n° 115191 (Open PHACTS)), resources of which are composed of financial contribution from the EU's FP7 (FP7/2007–2013) and EFPIA companies’ in-kind contribution
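
    The cartographic role characterization mentioned in this abstract can be illustrated with a short sketch. Assuming the standard Guimerà-Amaral definitions (the within-module degree z-score, and the participation coefficient P_i = 1 - sum_s (k_is / k_i)^2), the hypothetical helper `cartographic_roles` below computes both quantities for every node of an undirected network. The function name, the input format, the use of the population standard deviation, and the toy network are assumptions made for illustration; this is not the authors' code, and the partition is taken as given rather than produced by Clauset-Newman-Moore or infomap.

```python
from collections import defaultdict
from statistics import mean, pstdev


def cartographic_roles(edges, module_of):
    """Return {node: (z, P)} for an undirected network.

    edges     -- iterable of (u, v) pairs; self-loops are ignored
    module_of -- dict mapping every node to its module label (assumed complete)
    z is the within-module degree z-score, P the participation coefficient.
    """
    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:
            neighbors[u].add(v)
            neighbors[v].add(u)

    # Count the links each node sends into each module.
    links = {n: defaultdict(int) for n in module_of}
    for n in module_of:
        for m in neighbors[n]:
            links[n][module_of[m]] += 1

    # Mean and (population) standard deviation of within-module degree.
    within = defaultdict(list)
    for n, mod in module_of.items():
        within[mod].append(links[n][mod])
    stats = {mod: (mean(ks), pstdev(ks)) for mod, ks in within.items()}

    roles = {}
    for n, mod in module_of.items():
        k = len(neighbors[n])
        mu, sigma = stats[mod]
        # Convention assumed here: z = 0 when all module members tie.
        z = (links[n][mod] - mu) / sigma if sigma > 0 else 0.0
        # Isolated nodes get P = 0 by convention.
        p = 1.0 - sum((c / k) ** 2 for c in links[n].values()) if k else 0.0
        roles[n] = (z, p)
    return roles


# Toy network: two triangles joined by the single edge c-d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"),
             ("d", "e"), ("e", "f"), ("d", "f"),
             ("c", "d")]
toy_modules = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
roles = cartographic_roles(toy_edges, toy_modules)
```

    In this toy network, nodes `c` and `d` have links into both modules and therefore a non-zero participation coefficient, while the remaining nodes connect only within their own module. Running the same function on partitions of different granularity is one way to see how the choice of clustering changes the roles assigned to a given set of proteins.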

    Functional genomics analysis to disentangle the role of genetic variants in major depression

    Understanding the molecular basis of major depression is critical for identifying new potential biomarkers and drug targets to alleviate its burden on society. Leveraging available GWAS data and functional genomic tools to assess regulatory variation could help explain the role of major depression-associated genetic variants in disease pathogenesis. We have conducted a fine-mapping analysis of genetic variants associated with major depression and applied a pipeline focused on gene expression regulation by using two complementary approaches: cis-eQTL colocalization analysis and alteration of transcription factor binding sites. The fine-mapping process uncovered putative causally associated variants whose proximal genes were linked with major depression pathophysiology. Four colocalizing genetic variants altered the expression of five genes, highlighting the role of SLC12A5 in neuronal chloride homeostasis and MYRF in nervous system myelination and oligodendrocyte differentiation. The transcription factor binding analysis revealed the potential role of rs62259947 in modulating P4HTM expression by altering the YY1 binding site, altogether regulating hypoxia response. Overall, our pipeline could prioritize putative causal genetic variants in major depression. More importantly, it can be applied when only index genetic variants are available. Finally, the presented approach enabled the proposal of mechanistic hypotheses about these genetic variants and their role in disease pathogenesis.
    Funding: IMI2-JU resources, which are composed of financial contributions from the European Union’s Horizon 2020 Research and Innovation Programme and EFPIA [GA: 116030 TransQST and GA: 777365 eTRANSAFE], and the EU H2020 Programme 2014–2020 [GA: 676559 Elixir-Excelerate]; Project 001-P-001647 (Valorisation of EGA for Industry and Society), funded by the European Regional Development Fund (ERDF) and Generalitat de Catalunya; Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya [2017SGR00519]; and the Institute of Health Carlos III (project IMPaCT-Data, exp. IMP/00019), co-funded by the European Union, European Regional Development Fund (ERDF, “A way to make Europe”). The Research Programme on Biomedical Informatics (GRIB) is a member of the Spanish National Bioinformatics Institute (INB), funded by ISCIII and ERDF (PRB2-ISCIII [PT13/0001/0023, of the PE I + D + i 2013–2016]). The MELIS is a ‘Unidad de Excelencia María de Maeztu’, funded by the MINECO [MDM-2014-0370]. AMR was supported by CONACYT-FORDECYT-PRONACES grant no. [11311], and Programa de Apoyo a Proyectos de Investigación e Innovación Tecnológica–Universidad Nacional Autónoma de México (PAPIIT-UNAM) grant no. IA203021. JPG was supported by Instituto de Salud Carlos III-Fondo Social Europeo [FI18/00034]; Instituto de Salud Carlos III [MV20]. This work reflects only the authors' view, and the IMI2-JU is not responsible for any use that may be made of the information it contains

    Uncovering disease mechanisms through network biology in the era of next generation sequencing.

    Characterizing the behavior of disease genes in the context of biological networks has the potential to shed light on disease mechanisms, and to reveal both new candidate disease genes and therapeutic targets. Previous studies addressing the network properties of disease genes have produced contradictory results. Here we have explored the causes of these discrepancies and assessed the relationship between the network roles of disease genes and their tolerance to deleterious germline variants in human populations, leveraging the abundance of interactome resources, a comprehensive catalog of disease genes, and exome variation data. We found that the most salient network features of disease genes are driven by cancer genes, and that genes related to different types of diseases play network roles whose centrality is inversely correlated with their tolerance to likely deleterious germline mutations. This proved to be a multiscale signature, including global, mesoscopic and local network centrality features. Cancer driver genes, the most sensitive to deleterious variants, occupy the most central positions, followed by dominant disease genes and then by recessive disease genes, which are tolerant to variants and isolated within their network modules.
    We received support from UBACyT (20020130100582BA) and MinCyT (PICT2014-2701), ISCIII-FEDER (PI13/00082, CP10/00524), IMI-JU under grant agreements n° 115002 (eTOX), n° 115191 (Open PHACTS), n° 115372 (EMIF) and n° 115735 (iPiE), resources of which are composed of financial contribution from the European Union’s Seventh Framework Programme (FP7/2007-2013) and EFPIA companies’ in-kind contribution, and the EU H2020 Programme 2014-2020 under grant agreements no. 634143 (MedBioinformatics) and no. 676559 (Elixir-Excelerate). The Research Programme on Biomedical Informatics (GRIB) is a node of the Spanish National Institute of Bioinformatics (INB). A.G.-P. is supported by a Ramon y Cajal scholarship funded by the Spanish Ministry of Economy. The authors would like to thank the Exome Aggregation Consortium and the groups that provided exome variant data for comparison. A full list of contributing groups can be found at http://exac.broadinstitute.org/about