14 research outputs found

CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources

Author: Bleda Marta
Celma Giménez Matilde
De María Alejandro
Dopazo Joaquín
García-Alonso Luz
Martín Mayordomo Ainoha
Medina Ignacio
Salavert Francisco
Tarraga Joaquín
Publication venue: 'Oxford University Press (OUP)'
Publication date: 12/06/2012

During the past years, advances in high-throughput technologies have produced unprecedented growth in the number and size of repositories and databases storing relevant biological data. Today there is more biological information than ever but, unfortunately, the current status of many of these repositories is far from optimal. Some of the most common problems are that the information is spread across many small databases, standards frequently differ among repositories, and some databases are no longer supported or contain information that is too specific and unconnected. In addition, data size is increasingly becoming an obstacle when accessing or storing biological data. All these issues make it very difficult to extract and integrate information from different sources, to analyze experiments, or to access and query this information programmatically. CellBase provides a solution to the growing need for integration by easing access to biological data. CellBase implements a set of RESTful web services that query a centralized database containing the most relevant biological data sources. The database is hosted on our servers and is regularly updated. CellBase documentation can be found at http://docs.bioinfo.cipf.es/projects/cellbase.

Funding: The Spanish Ministry of Science and Innovation (MICINN) [BIO2011-27069]; the Conselleria de Educacio of the Valencian Community [PROMETEO/2010/001]; National Institute of Bioinformatics (www.inab.org); CIBER de Enfermedades Raras (CIBERER), ISCIII and MICINN; Red Tematica de Investigacion Cooperativa en Cancer (RTICC) [RD06/0020/1019] ISCIII, MICINN and INNPACTO [IPT-010000-2010-43], MICINN. Funding for open access charge: MICINN [BIO2011-27069].

Citation: Bleda, M.; Tarraga, J.; De María, A.; Salavert, F.; García-Alonso, L.; Celma Giménez, M.; Martín Mayordomo, A.... (2012). CellBase, a comprehensive collection of RESTful web services for retrieving relevant biological information from heterogeneous sources. Nucleic Acids Research. 40(1):609-614. https://doi.org/10.1093/nar/gks575
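The abstract above describes RESTful web services that query a centralized database. As a rough illustration of how a client might compose such a request, here is a minimal sketch. The host `example.org`, the path layout (species, category, resource, identifiers) and the `of` parameter are hypothetical placeholders chosen for illustration; they are not taken from the CellBase documentation, which should be consulted for the real endpoints.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url, species, category, resource, ids, **params):
    """Compose a REST-style URL for a batch of identifiers.

    The path layout used here is an assumption for illustration only,
    not the documented CellBase URL scheme.
    """
    # Percent-encode each path segment so unusual characters stay valid.
    path = "/".join(quote(part, safe="") for part in (species, category, resource))
    # Several identifiers are sent in one request, joined by commas.
    id_part = ",".join(quote(str(i), safe="") for i in ids)
    url = f"{base_url.rstrip('/')}/{path}/{id_part}"
    if params:
        # Sort parameters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(params.items()))
    return url


# Hypothetical usage: ask for information on two genes in JSON format.
example_url = build_query_url(
    "https://example.org/rest/", "hsa", "gene", "info", ["BRCA2", "TP53"], of="json"
)
```

Batching identifiers into a single URL is one common way REST clients reduce round trips; whether a given service supports it depends on that service.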

Crossref

PubMed Central

RiuNet

VISMapper: ultra-fast exhaustive cartography of viral insertion sites for gene therapy

Author: A Arens
A Paruzynski
AR Schroder
Asunción Gallego
Felipe J. Chaves
H Li
HB Gaspar
I Medina
Ignacio Medina
J Tarraga
JD Hocum
Joaquín Dopazo
Joaquín Tárraga
José M. Juanes
JU Appelt
M Bleda
M Cavazzana-Calvo
N Cartier
Pablo Marín-Garcia
RS Mitchell
SA Forbes
SF Altschul
Vicente Arnau
WJ Kent
X Wu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Background -- The ability of integrating viral vectors to become a persistent part of the host genome makes them a crucial element of clinical gene therapy. However, viral integration has associated risks, such as the unintentional activation of oncogenes, which can result in cancer. Therefore, the analysis of integration sites of retroviral vectors is a crucial step in developing safer vectors for therapeutic use. Results -- Here we present VISMapper, a vector integration site analysis web server, to analyze next-generation sequencing data for retroviral vector integration sites. VISMapper can be found at: http://vismapper.babelomics.org. Conclusions -- Because it uses novel mapping algorithms, VISMapper is remarkably faster than previously available programs. It also provides a useful graphical interface to analyze the integration sites found in their genomic context.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori d'Objectes Digitals per a l'Ensenyament la Recerca i la Cultura

Directory of Open Access Journals

Fondo Bibliográfico Digital Institucional

Identification of epistatic interactions through genome-wide association studies in sporadic medullary and juvenile papillary thyroid carcinomas

Author: A Bell
AL Price
Ana Torroglosa
BA Binzak
Berta Luzón-Toro
C Eng
C Xie
CL Holley
Cristina Y. Gonzalez
D Warde-Farley
Elena Navarro
F Frasca
G Sassolas
G Stelzer
GM Clarke
Guillermo Antiñolo
GW Randolph
H He
I Landa
Ignacio Medina
JJ Figge
Joaquin Dopazo
KC Bulusu
LA Arnaldi
LA Hindorff
Luz García-Alonso
M Bleda
M Uhlen
M Zou
Macarena Ruiz-Ferrer
Marta Bleda
Marta Martín-Sánchez
MD Ritchie
ME Dottorini
N Hod
P Gaudet
PI Bakker de
R Alonso
Raquel M. Fernández
RM Fernandez
S Purcell
S Ruiz-Llorente
SA Forbes
Salud Borrego
SJ Schonfeld
SK Musani
T Barrett
T Kondo
TA Manolio
TF Mackay
W Huang da
Y Benjamini
Y Wang
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Epigenetic Modifications in the Biology of Nonalcoholic Fatty Liver Disease: The Role of DNA-Hydroxymethylation and TET Proteins

Author: Castaño Gustavo Osvaldo
Dopazo Hernán Javier
Fernández Gianotti Tomás
Pirola Carlos José
Rohr Cristian Oscar
San Martino Julio
Scian Romina
Sookoian Silvia Cristina
Publication venue: 'Ovid Technologies (Wolters Kluwer Health)'
Publication date: 01/09/2015

5-Hydroxymethylcytosine (5-hmC) is an epigenetic modification whose role in the pathogenesis of metabolic-related complex diseases remains unexplored; 5-hmC appears to be prevalent in the mitochondrial genome. The Ten-Eleven-Translocation (TET) family of proteins is responsible for catalyzing the conversion of 5-methylcytosine to 5-hmC. We hypothesized that epigenetic editing by 5-hmC might be a novel mechanism through which nonalcoholic fatty liver disease (NAFLD)-associated molecular traits could be explained. Hence, we performed an observational study to explore global levels of 5-hmC in fresh liver samples of patients with NAFLD and controls (n = 90) using an enzyme-linked-immunosorbent serologic assay and immunohistochemistry. We also screened for genetic variation in TET 1–3 loci by next generation sequencing to explore its contribution to the disease biology. The study was conducted in 2 stages (discovery and replication) and included 476 participants. We observed that the amount of 5-hmC in the liver of both NAFLD patients and controls was relatively low (up to 0.1%); a significant association was found with liver mitochondrial DNA copy number (R = 0.50, P = 0.000382) and PPARGC1A-mRNA levels (R = −0.57, P = 0.04). We did not observe any significant difference in the 5-hmC nuclear immunostaining score between NAFLD patients and controls; nevertheless, we found that patients with NAFLD (0.4 ± 0.5) had significantly lower nonnuclear-5-hmC staining compared with controls (1.8 ± 0.8), means ± standard deviation, P = 0.028. The missense p.Ile1123Met variant (TET1-rs3998860) was significantly associated with serum levels of caspase-generated CK-18 fragment-cell death biomarker in the discovery and replication stages, and with disease severity (odds ratio: 1.47, 95% confidence interval: 1.10–1.97; P = 0.005). The p.Ile1762Val substitution (TET2-rs2454206) was associated with liver PPARGC1A-methylation and transcriptional levels, and Type 2 diabetes. Our results suggest that 5-hmC might be involved in the pathogenesis of NAFLD by regulating liver mitochondrial biogenesis and PPARGC1A expression. Genetic diversity at TET loci suggests an “epigenetic” regulation of programmed liver-cell death and a TET-mediated fine-tuning of the liver PPARGC1A-transcriptional program.

Affiliation: Pirola, Carlos José. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
Affiliation: Scian, Romina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
Affiliation: Fernández Gianotti, Tomás. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina
Affiliation: Dopazo, Hernán Javier. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Ecología, Genética y Evolución de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Ecología, Genética y Evolución de Buenos Aires; Argentina
Affiliation: Rohr, Cristian Oscar. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Ciudad Universitaria. Instituto de Ecología, Genética y Evolución de Buenos Aires. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Instituto de Ecología, Genética y Evolución de Buenos Aires; Argentina
Affiliation: San Martino, Julio. Provincia de Buenos Aires. Ministerio de Salud. Hospital Municipal Dr. Diego Thompson; Argentina
Affiliation: Castaño, Gustavo Osvaldo. Gobierno de la Ciudad de Buenos Aires. Hospital "Dr. Abel Zubizarreta"; Argentina
Affiliation: Sookoian, Silvia Cristina. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Houssay. Instituto de Investigaciones Médicas. Universidad de Buenos Aires. Facultad de Medicina. Instituto de Investigaciones Médicas; Argentina

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

PubMed Central

Characterization of genetic variants in 70 Portuguese individuals

Author: Martins Daniel Eduardo Fernandes
Publication date: 23/11/2018

Master's dissertation in Bioinformatics.

The in-depth study of the genomics of single populations has contributed significantly to the growth in the number of known SNVs in databases. Each single-population study has contributed 18 to 57% novel SNVs. The new genetic information is particularly relevant as a reference for clinical purposes. Global-scale initiatives such as the 1000 Genomes Project (1kG) already include Iberian populations; however, no Portuguese individuals were included in this cohort. Furthermore, to our knowledge, gnomAD, the most extensive genomic dataset, does not include Portuguese individuals either. We believe that a Portuguese collection of genomic information would greatly benefit molecular diagnosis in Portuguese patients. Variants detected in 70 Portuguese individuals were inserted into a MongoDB NoSQL database. The 1kG and gnomAD information for each variant was uploaded to the same database. Allele frequencies for seven gnomAD populations, five 1kG populations, and five 1kG European subpopulations were compared with the values calculated for our data. Differences in allele distribution were tested with Fisher’s exact test. P-values were corrected for False Discovery Rate (FDR). The exomes of the Portuguese individuals contained 224,155 variants filtered according to defined quality criteria. Approximately 16.4% of the variants had not been previously reported by the 1kG or gnomAD projects. The present work supports the evidence, previously reported in the literature, for a correlation between genetic and geographic distance. Finally, significant differences in allele distribution between our population and the other 1kG European subpopulations were found for 7,284 variants distributed across 2,571 genes. The results suggest the existence of population genetic markers and may prompt future studies to detect Portuguese-specific genetic markers. The present study is a significant contribution both to enriching large-scale genomic initiatives and to establishing a useful auxiliary reference for genetic analyses of Portuguese patients.

This work was carried out within the In2Genome project, ref. CENTRO-01-0247-FEDER-017800, supported by the Centro Portugal Regional Operational Programme (CENTRO 2020), under the Portugal 2020 Partnership Agreement, through the European Regional Development Fund (ERDF).
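The abstract above compares allele distributions with Fisher's exact test and corrects the p-values for FDR. The sketch below shows one way such a comparison could be set up using only the Python standard library. The abstract does not name the FDR procedure, so the Benjamini-Hochberg step-up method is assumed here; the allele counts in the usage lines are invented for illustration and are not data from the dissertation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows could be two populations; columns alternate vs. reference allele counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical allele counts (alt, ref) per variant: study cohort vs. reference panel.
tables = [(12, 128, 30, 970), (5, 135, 40, 960), (70, 70, 480, 520)]
raw_p = [fisher_exact_two_sided(*t) for t in tables]
adj_p = benjamini_hochberg(raw_p)
```

With 70 diploid individuals each variant contributes at most 140 alleles on the study side, so the exact hypergeometric sum stays cheap even against a much larger reference panel.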

Universidade do Minho: RepositoriUM

Development of computational approaches for whole-genome sequence variation and deep phenotyping

Author: Haimel Matthias
Publication venue: University of Cambridge
Publication date: 10/10/2018

The rare disease pulmonary arterial hypertension (PAH) results in high blood pressure in the lungs caused by narrowing of the lung arteries. Genes causative in PAH were discovered through family studies and very often harbour rare variants. However, the genetic cause in heritable (31%) and idiopathic (79%) PAH cases is not yet known, but these cases are speculated to be caused by rare variants. Advances in high-throughput sequencing (HTS) technologies have made it possible to detect variants in 98% of the human genome. A drop in sequencing costs made it feasible to sequence 10,000 individuals, including 1,250 subjects diagnosed with PAH and their relatives, as part of the NIHR BioResource – Rare Diseases (BR-RD) study. This large cohort allows the genome-wide identification of rare variants to discover novel causative genes associated with PAH in a case-control study and so advance our understanding of the underlying aetiology. In the first part of my thesis, I establish a phenotype capture system that allows research nurses to record clinical measurements and other patient-related information of PAH patients recruited to the NIHR BR-RD study. The implemented extensions provide programmatic data transfer and an automated data release pipeline for analysis-ready data. The second part is dedicated to the discovery of novel disease genes in PAH. I focus on one well-characterised PAH disease gene to establish variant filter strategies that enrich for rare disease-causing variants. I apply these filter strategies to all known PAH disease genes and describe the phenotypic differences based on clinically relevant values. Genome-wide results from different filter strategies are tested for association with PAH. I describe the findings of the rare variant association tests and provide a detailed interrogation of two novel disease genes. The last part describes the data characteristics of variant information and the available non SQL (NoSQL) implementations, and evaluates the suitability and scalability of distributed compute frameworks to store and analyse population-scale variation data. Based on the evaluation, I implement a variant analysis platform that incrementally merges samples, annotates variants and enables the analysis of 10,000 individuals in minutes. An incremental design for variant merging and annotation has not been described before. Using the framework, I develop a quality score to reduce technical variation and other biases. The result from the rare variant association test is compared with traditional methods.
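The abstract above mentions a platform that incrementally merges samples and annotates variants. The thesis's actual design is not given here, so the following is only a toy in-memory sketch of the general idea: adding a sample touches only that sample's variants, and annotation runs once per newly seen variant. The class name `VariantStore`, the key layout and the `annotate` callback are all hypothetical.

```python
class VariantStore:
    """Toy incremental variant store (illustrative only, not the thesis platform)."""

    def __init__(self, annotate):
        self.annotate = annotate   # hypothetical callback: variant key -> annotation
        self.genotypes = {}        # variant key -> {sample_id: genotype tuple}
        self.annotations = {}      # variant key -> annotation
        self.samples = []

    def add_sample(self, sample_id, calls):
        """Merge one sample's calls; return how many variants were new.

        `calls` maps (chrom, pos, ref, alt) to a genotype such as (0, 1).
        """
        new_variants = 0
        for key, genotype in calls.items():
            if key not in self.genotypes:
                # Annotate only the first time a variant is seen.
                self.genotypes[key] = {}
                self.annotations[key] = self.annotate(key)
                new_variants += 1
            self.genotypes[key][sample_id] = genotype
        self.samples.append(sample_id)
        return new_variants

    def alt_allele_count(self, key):
        """Count alternate alleles across merged samples (biallelic sites)."""
        return sum(sum(gt) for gt in self.genotypes.get(key, {}).values())


# Hypothetical usage with a stand-in annotator that records what it was asked for.
annotated = []
store = VariantStore(annotate=lambda key: annotated.append(key) or {"seen": True})
v1, v2 = ("1", 1000, "A", "G"), ("2", 2000, "C", "T")
first = store.add_sample("s1", {v1: (0, 1)})
second = store.add_sample("s2", {v1: (1, 1), v2: (0, 1)})
```

Because earlier samples are never re-read when a new one arrives, the cost of a merge in this sketch scales with the size of the incoming sample and not with the size of the cohort.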

Apollo (Cambridge)

Universal DNA methylation age across mammalian tissues

Author: Ablaeva J
Acosta-Rodriguez V A
Adams D M
Almunia J
Aloysius A
Ardehali R
Arneson A
Baker C S
Banks G
Belov K
Bennett N C
Black P
Blumstein D T
Bors E K
Breeze C E
Brooke R T
Brown J L
Carter G G
Caulton A
Cavin J M
Fei Z
Haghani A
Krützen Michael
Li C Z
Lowe R
Lu A T
Robeck T R
Vu H
Yan Q
Zhang J
Zoller J A
Publication venue: Nature Publishing Group
Publication date: 10/08/2023

Aging, often considered a result of random cellular damage, can be accurately estimated using DNA methylation profiles, the foundation of pan-tissue epigenetic clocks. Here, we demonstrate the development of universal pan-mammalian clocks, using 11,754 methylation arrays from our Mammalian Methylation Consortium, which encompass 59 tissue types across 185 mammalian species. These predictive models estimate mammalian tissue age with high accuracy (r > 0.96). Age deviations correlate with human mortality risk, mouse somatotropic axis mutations and caloric restriction. We identified specific cytosines with methylation levels that change with age across numerous species. These sites, highly enriched in polycomb repressive complex 2-binding locations, are near genes implicated in mammalian development, cancer, obesity and longevity. Our findings offer new evidence suggesting that aging is evolutionarily conserved and intertwined with developmental processes across all mammals.

ZORA