14 research outputs found

    CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources

    During the past years, advances in high-throughput technologies have produced unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today, there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from optimal. Some of the most common problems are that the information is spread across many small databases; standards frequently differ among repositories; and some databases are no longer supported or contain information that is too specific and unconnected. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make it very difficult to extract and integrate information from different sources, to analyze experiments, or to access and query this information programmatically. CellBase addresses the growing need for integration by easing access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted on our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase.
    Funding: The Spanish Ministry of Science and Innovation (MICINN) [BIO2011-27069]; the Conselleria de Educacio of the Valencian Community [PROMETEO/2010/001]; National Institute of Bioinformatics (www.inab.org); CIBER de Enfermedades Raras (CIBERER), ISCIII and MICINN; Red Tematica de Investigacion Cooperativa en Cancer (RTICC) [RD06/0020/1019] ISCIII, MICINN and INNPACTO [IPT-010000-2010-43], MICINN. Funding for open access charge: MICINN [BIO2011-27069].
    Bleda, M.; Tarraga, J.; De María, A.; Salavert, F.; García-Alonso, L.; Celma Giménez, M.; Martín Mayordomo, A.... (2012). CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. Nucleic Acids Research. 40(1):609-614. https://doi.org/10.1093/nar/gks575

    VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy

    Background -- The ability of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration carries risks, such as the unintentional activation of oncogenes, which can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use. Results -- Here we present VISMapper, a vector integration site analysis web server, to analyze next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org. Conclusions -- Because it uses novel mapping algorithms, VISMapper is remarkably faster than previously available programs. It also provides a useful graphical interface to analyze the integration sites found in their genomic context.

    Epigenetic Modifications in the Biology of Nonalcoholic Fatty Liver Disease: The Role of DNA-Hydroxymethylation and TET Proteins

    5-Hydroxymethylcytosine (5-hmC) is an epigenetic modification whose role in the pathogenesis of metabolic-related complex diseases remains unexplored; 5-hmC appears to be prevalent in the mitochondrial genome. The Ten-Eleven-Translocation (TET) family of proteins is responsible for catalyzing the conversion of 5-methylcytosine to 5-hmC. We hypothesized that epigenetic editing by 5-hmC might be a novel mechanism through which nonalcoholic fatty liver disease (NAFLD)-associated molecular traits could be explained. Hence, we performed an observational study to explore global levels of 5-hmC in fresh liver samples of patients with NAFLD and controls (n = 90) using an enzyme-linked-immunosorbent serologic assay and immunohistochemistry. We also screened for genetic variation in TET 1–3 loci by next-generation sequencing to explore its contribution to the disease biology. The study was conducted in 2 stages (discovery and replication) and included 476 participants. We observed that the amount of 5-hmC in the liver of both NAFLD patients and controls was relatively low (up to 0.1%); a significant association was found with liver mitochondrial DNA copy number (R = 0.50, P = 0.000382) and PPARGC1A-mRNA levels (R = −0.57, P = 0.04). We did not observe any significant difference in the 5-hmC nuclear immunostaining score between NAFLD patients and controls; nevertheless, we found that patients with NAFLD (0.4 ± 0.5) had significantly lower nonnuclear-5-hmC staining compared with controls (1.8 ± 0.8), means ± standard deviation, P = 0.028. The missense p.Ile1123Met variant (TET1-rs3998860) was significantly associated with serum levels of the caspase-generated CK-18 fragment cell-death biomarker in the discovery and replication stages, and with disease severity (odds ratio: 1.47, 95% confidence interval: 1.10–1.97; P = 0.005). The p.Ile1762Val substitution (TET2-rs2454206) was associated with liver PPARGC1A-methylation and transcriptional levels, and with Type 2 diabetes. Our results suggest that 5-hmC might be involved in the pathogenesis of NAFLD by regulating liver mitochondrial biogenesis and PPARGC1A expression. Genetic diversity at TET loci suggests an “epigenetic” regulation of programmed liver-cell death and a TET-mediated fine-tuning of the liver PPARGC1A-transcriptional program.
    Affiliation: Pirola, Carlos José. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
    Affiliation: Scian, Romina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
    Affiliation: Fernández Gianotti, Tomás. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
    Affiliation: Dopazo, Hernán Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Ecología, Genética y Evolución de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Ecología, Genética y Evolución de Buenos Aires; Argentina
    Affiliation: Rohr, Cristian Oscar. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Ecología, Genética y Evolución de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Ecología, Genética y Evolución de Buenos Aires; Argentina
    Affiliation: San Martino, Julio. Provincia de Buenos Aires. Ministerio de Salud. Hospital Municipal Dr. Diego Thompson; Argentina
    Affiliation: Castaño, Gustavo Osvaldo. Gobierno de la Ciudad de Buenos Aires. Hospital "Dr. Abel Zubizarreta"; Argentina
    Affiliation: Sookoian, Silvia Cristina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina

    Characterization of genetic variants in 70 Portuguese individuals

    Master's dissertation in Bioinformatics.
    The in-depth study of the genomics of single populations has contributed significantly to the growth in the number of known SNVs in databases. Each single-population study has contributed 18 to 57% novel SNVs. The new genetic information is particularly relevant as a reference for clinical purposes. Global-scale initiatives such as the 1000 Genomes Project (1kG) already include Iberian populations; however, no Portuguese individuals were included in this cohort. Furthermore, to our knowledge, gnomAD, the most extensive genomic dataset, does not include Portuguese individuals either. We believe that a Portuguese collection of genomic information would greatly benefit molecular diagnosis in Portuguese patients. Variants detected in 70 Portuguese individuals were inserted into a MongoDB NoSQL database. The 1kG and gnomAD information for each variant was uploaded to the same database. Allele frequencies for seven gnomAD populations, five 1kG populations, and five 1kG European subpopulations were compared to the values calculated for our data. Allele distribution differences were tested with Fisher’s exact test. P-values were corrected for False Discovery Rate (FDR). The exomes of the Portuguese individuals contained 224,155 variants filtered according to defined quality criteria. Approximately 16.4% of the variants had not been previously reported by the 1kG or gnomAD projects. The present work supports the evidence, previously reported in the literature, for a correlation between genetic and geographic distance. Finally, significant differences in allele distribution between our population and the other 1kG European subpopulations were found for 7,284 variants distributed across 2,571 genes. The results suggest the existence of population genetic markers and may prompt future studies aimed at detecting Portuguese-specific genetic markers. The present study is a significant contribution both to enriching large-scale genomic initiatives and to establishing a useful auxiliary reference for genetic analyses of Portuguese patients.
    This work was carried out within the In2Genome project, ref. CENTRO-01-0247-FEDER-017800, supported by the Centro Portugal Regional Operational Programme (CENTRO 2020), under the Portugal 2020 Partnership Agreement, through the European Regional Development Fund (FEDER).

    Universal DNA methylation age across mammalian tissues

    Aging, often considered a result of random cellular damage, can be accurately estimated using DNA methylation profiles, the foundation of pan-tissue epigenetic clocks. Here, we demonstrate the development of universal pan-mammalian clocks, using 11,754 methylation arrays from our Mammalian Methylation Consortium, which encompass 59 tissue types across 185 mammalian species. These predictive models estimate mammalian tissue age with high accuracy (r > 0.96). Age deviations correlate with human mortality risk, mouse somatotropic axis mutations and caloric restriction. We identified specific cytosines with methylation levels that change with age across numerous species. These sites, highly enriched in polycomb repressive complex 2-binding locations, are near genes implicated in mammalian development, cancer, obesity and longevity. Our findings offer new evidence suggesting that aging is evolutionarily conserved and intertwined with developmental processes across all mammals.
