114 research outputs found

    The true story behind the annotation of a pathway

    In 2010 we worked on the annotation of the N-glycosylation pathway in the Reactome database. During this process, we found some unclear points and errors in other databases, and we reported them and helped fix them. After we finished, we realized that the work of reporting errors to a database goes largely unacknowledged by the scientific community. This is unfortunate: if this process were more widely recognized and more transparent, we could have better data in the databases and a more active community. Moreover, the fact that many databases tend to keep error reporting private creates serious problems for the reproducibility of published work.

Another way to look at this talk is: if you dedicate 6 months of your PhD thesis to carefully annotating a set of genes (in this case, a pathway I have been studying), how many errors should you expect to find in other databases, and what should you be careful about?

    VCF2Networks: applying genotype networks to single-nucleotide variants data

    Summary: A wealth of large-scale genome sequencing projects opens the doors to new approaches to study the relationship between genotype and phenotype. One such opportunity is the possibility to apply genotype networks analysis to population genetics data. Genotype networks are a representation of the set of genotypes associated with a single phenotype, and they allow one to estimate properties such as the robustness of the phenotype to mutations, and the ability of its associated genotypes to evolve new adaptations. So far, though, genotype networks analysis has rarely been applied to population genetics data. To help fill this gap, here we present VCF2Networks, a tool to determine and study genotype network structure from single-nucleotide variant data.
    Availability and implementation: VCF2Networks is available at https://bitbucket.org/dalloliogm/vcf2networks.
    Contact: [email protected]
    Supplementary information: Supplementary data are available at Bioinformatics online.
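
The idea of a genotype network described in this abstract can be illustrated with a minimal sketch. It assumes haplotypes coded as binary strings (0 = reference allele, 1 = alternative allele) and an edge between any two distinct haplotypes that differ at exactly one site. The function names and the toy data are hypothetical; this is not the VCF2Networks implementation.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def build_genotype_network(haplotypes):
    """Return an adjacency dict: distinct haplotypes are nodes, and two
    nodes are linked when they differ at exactly one site (assumption)."""
    nodes = sorted(set(haplotypes))
    edges = {node: set() for node in nodes}
    for a, b in combinations(nodes, 2):
        if hamming(a, b) == 1:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def connected_components(edges):
    """Split the network into connected components (depth-first search)."""
    seen = set()
    components = []
    for start in edges:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        components.append(component)
    return components


# Toy data (hypothetical): five sampled haplotypes over three variant sites.
haplotypes = ["000", "001", "011", "110", "000"]
network = build_genotype_network(haplotypes)
components = connected_components(network)
```

Properties such as the number of components or the average node degree can then be read off `network` and `components` as rough summaries of network structure.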

    CSDE1 Intracellular Distribution as a Biomarker of Melanoma Prognosis

    Keywords: RNA-binding protein; Biomarker; Melanoma
    RNA-binding proteins are emerging as critical modulators of oncogenic cell transformation, malignancy and therapy resistance. We have previously found that the RNA-binding protein Cold Shock Domain containing protein E1 (CSDE1) promotes invasion and metastasis of melanoma, the deadliest form of skin cancer and also a highly heterogeneous disease in need of predictive biomarkers and druggable targets. Here, we design a monoclonal antibody useful for IHC in the clinical setting and use it to evaluate the prognostic potential of CSDE1 in an exploratory cohort of 149 whole tissue sections including benign nevi, primary tumors and metastases from melanoma patients. Contrary to expectations for an oncoprotein, we observed a global decrease in CSDE1 levels with increasing malignancy. However, the CSDE1 cytoplasmic/nuclear ratio exhibited a positive correlation with adverse clinical features of primary tumors and emerged as a robust indicator of progression-free survival in cutaneous melanoma, highlighting the potential of CSDE1 as a biomarker of prognosis. Our findings provide a novel feature for prognosis assessment and highlight the intricacies of RNA-binding protein dynamics in cancer progression.
    A.I. and P.E. were supported by PhD4MD fellowships from the CRG and the Emerald program (Marie Skłodowska-Curie grant agreement 101034290), respectively. This work was supported by the following grants to F.G.: PGC2018-099697-B-I00 and PID2021-127948NB-I00 from the Spanish Ministry of Science and Innovation (MCIN) funded by MCIN/AEI/10.13039/501100011033 and by ERDF; “la Caixa” Foundation (ID 100010434) under the Grant LCF/PR/HR17/52150016; the Catalan Agency for Research and Universities (SGR-Cat-2021-01215) and intramural funds from the CRG on emergent translational research. We acknowledge the support of the Spanish Ministry of Science and Innovation through the Centro de Excelencia Severo Ochoa (CEX2020-001049-S, MCIN/AEI/10.13039/501100011033) and the Generalitat de Catalunya through the CERCA programme.

    Thermal evolution of gene expression profiles in Drosophila subobscura

    BACKGROUND: Despite its pervasiveness, the genetic basis of adaptation resulting in variation directly or indirectly related to temperature (climatic) gradients is poorly understood. By using 3-fold replicated laboratory thermal stocks covering much of the physiologically tolerable temperature range for the temperate (i.e., cold tolerant) species Drosophila subobscura we have assessed whole-genome transcriptional responses after three years of thermal adaptation, when the populations had already diverged for inversion frequencies, pre-adult life history components, and morphological traits. Total mRNA from each population was compared to a reference pool mRNA in a standard, highly replicated two-colour competitive hybridization experiment using cDNA microarrays.
    RESULTS: A total of 306 (6.6%) cDNA clones were identified as 'differentially expressed' (following a false discovery rate correction) after contrasting the two furthest apart thermal selection regimes (i.e., 13°C vs. 22°C), also including four previously reported candidate genes for thermotolerance in Drosophila (Hsp26, Hsp68, Fst, and Treh). On the other hand, correlated patterns of gene expression were similar in cold- and warm-adapted populations. Analysis of functional categories defined by the Gene Ontology project points to an overrepresentation of genes involved in carbohydrate metabolism, nucleic acids metabolism and regulation of transcription, among other categories. Although the location of differentially expressed genes was approximately random with respect to chromosomes, a physical mapping of 88 probes to the polytene chromosomes of D. subobscura has shown that a larger than expected number mapped inside inverted chromosomal segments.
    CONCLUSION: Our data suggest that a sizeable number of genes appear to be involved in thermal adaptation in Drosophila, with a substantial fraction implicated in metabolism. This apparently illustrates the formidable challenge to understanding the adaptive evolution of complex trait variation. Furthermore, some clustering of genes within inverted chromosomal sections was detected. Disentangling the effects of inversions will obviously be required in any future approach if we want to identify the relevant candidate genes.

    1000 Genomes Selection Browser 1.0: A genome browser dedicated to signatures of natural selection in modern humans

    This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
    Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented and applied a large number of neutrality tests as well as summary statistics informative for the action of selection, such as Tajima's D, CLR, Fay and Wu's H, Fu and Li's F* and D*, XPEHH, ΔiHH, iHS, FST, ΔDAF and XPCLR among others, to low coverage sequencing data from the 1000 Genomes Project (Phase 1; release April 2012). We have implemented a publicly available genome-wide browser to communicate the results from three different populations of West African, Northern European and East Asian ancestry (YRI, CEU, CHB). Information is provided in UCSC-style format to facilitate the integration with the rich UCSC browser tracks, and an access page is provided with instructions for convenient visualization. We believe that this expandable resource will facilitate the interpretation of signals of selection on different temporal, geographical and genomic scales.
    © 2013 The Author(s). Published by Oxford University Press.
    Ministerio de Ciencia y Tecnología (Spain); Direcció General de Recerca, Generalitat de Catalunya (Grup de Recerca Consolidat 2009 SGR 1101); Subprogram BMC [BFU2010-19443 awarded to J.B.]; Post-doctoral scholarship from the Volkswagenstiftung [Az: I/85 198 to J.E.]; Spanish government [BFU-2008-01046; SAF2011-29239]; The Spanish government FPI scholarships [BES-2009-017731 and BES-2011-04502 to G.M.D. and M.P., respectively]; PhD fellowship from ‘Acción Estratégica de Salud, en el marco del Plan Nacional de Investigación Científica, Desarrollo e Innovación Tecnológica 2008-2011’ from Instituto de Salud Carlos III (to P.L.). Funding for open access charge: Prof. Jaume Bertranpetit.
    Peer reviewed

    Enhancers with tissue-specific activity are enriched in intronic regions

    Tissue function and homeostasis reflect the gene expression signature by which the combination of ubiquitous and tissue-specific genes contributes to tissue maintenance and stimuli-responsive function. Enhancers are central to controlling this tissue-specific gene expression pattern. Here, we explore the correlation between the genomic location of enhancers and their role in tissue-specific gene expression. We find that enhancers showing tissue-specific activity are highly enriched in intronic regions and regulate the expression of genes involved in tissue-specific functions, whereas housekeeping genes are more often controlled by intergenic enhancers, common to many tissues. Notably, an intergenic-to-intronic continuum of active enhancers is observed in the transition from developmental to adult stages: the most differentiated tissues present higher rates of intronic enhancers, whereas the lowest rates are observed in embryonic stem cells. Altogether, our results suggest that the genomic location of active enhancers is key for the tissue-specific control of gene expression.

    Decay of linkage disequilibrium within genes across HGDP-CEPH human samples: most population isolates do not show increased LD

    9 pages, 2 figures, 4 additional files.
    [Background] It is well known that the pattern of linkage disequilibrium varies between human populations, with remarkable geographical stratification. Indirect association studies routinely exploit linkage disequilibrium around genes, particularly in isolated populations where it is assumed to be higher. Here, we explore both the amount and the decay of linkage disequilibrium with physical distance along 211 gene regions, most of them related to complex diseases, across 39 HGDP-CEPH population samples, focusing particularly on the populations defined as isolates. Within each gene region and population we use r² between all possible single nucleotide polymorphism (SNP) pairs as a measure of linkage disequilibrium and focus on the proportion of SNP pairs with r² greater than 0.8.
    [Results] Although the average r² was found to be significantly different both between and within continental regions, a much higher proportion of r² variance could be attributed to differences between continental regions (2.8% vs. 0.5%, respectively). Similarly, while the proportion of SNP pairs with r² > 0.8 was significantly different across continents for all distance classes, it was generally much more homogeneous within continents, except in the case of Africa and the Americas. The only isolated populations with consistently higher LD in all distance classes with respect to their continent are the Kalash (Central South Asia) and the Surui (America). Moreover, isolated populations showed only slightly higher proportions of SNP pairs with r² > 0.8 per gene region than non-isolated populations in the same continent. Thus, the number of SNPs in isolated populations that need to be genotyped may be only slightly less than in non-isolates.
    [Conclusion] The "isolated population" label by itself does not guarantee a greater genotyping efficiency in association studies, and properties other than increased linkage disequilibrium may make these populations interesting in genetic epidemiology.
    This research was supported by "Fundación Genoma España" (proyectos piloto CEGEN 2004–2005), Dirección General de Investigación, Ministerio de Educación y Ciencia of Spain (grants BFU2005-00243, BFU2006-01235, BFU2006-15413-CO2-01, SEJ2006-13537) and Direcció General de Recerca, Generalitat de Catalunya (2005SGR00608). SNP genotyping services were provided by the Spanish "Centro Nacional de Genotipado".
    Peer reviewed
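
The linkage disequilibrium measure used in this abstract, r² between SNP pairs and the proportion of pairs above 0.8, can be sketched as follows. This is a minimal illustration that assumes phased haplotypes coded 0/1 per SNP; the function names and the toy data are hypothetical and are not taken from the study. For two biallelic SNPs with allele frequencies p_A and p_B and joint frequency p_AB, it uses the standard definition D = p_AB - p_A·p_B and r² = D² / (p_A(1 - p_A)·p_B(1 - p_B)).

```python
from itertools import combinations


def r_squared(hap_a, hap_b):
    """r^2 between two biallelic SNPs from phased 0/1 haplotypes.

    Returns None when either SNP is monomorphic (r^2 is undefined).
    """
    n = len(hap_a)
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None
    d = p_ab - p_a * p_b
    return d * d / denom


def proportion_high_ld(snps, threshold=0.8):
    """Fraction of SNP pairs whose r^2 exceeds the threshold."""
    values = []
    for name_a, name_b in combinations(snps, 2):
        r2 = r_squared(snps[name_a], snps[name_b])
        if r2 is not None:
            values.append(r2)
    if not values:
        return None
    return sum(v > threshold for v in values) / len(values)


# Toy data (hypothetical): three SNPs typed on eight haplotypes.
snps = {
    "snp1": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp2": [0, 0, 1, 1, 0, 1, 0, 1],
    "snp3": [0, 1, 0, 1, 0, 1, 0, 1],
}
high_ld_fraction = proportion_high_ld(snps)
```

In this toy example only the snp1/snp2 pair is in complete LD, so one of the three pairs exceeds the 0.8 threshold.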

    Ten Simple Rules for Getting Help from Online Scientific Communities

    The increasing complexity of research requires scientists to work at the intersection of multiple fields and to face problems for which their formal education has not prepared them. For example, biologists with little or no background in programming are now often using complex scripts to handle the results from their experiments; conversely, programmers wishing to enter the world of bioinformatics must know about biochemistry, genetics, and other fields. In this context, communication tools such as mailing lists, web forums, and online communities acquire increasing importance. These tools permit scientists to quickly contact people skilled in a specialized field. A question posed properly to the right online scientific community can help in solving difficult problems, often faster than screening literature or writing to publication authors. The growth of active online scientific communities, such as those listed in Table S1, demonstrates how these tools are becoming an important source of support for an increasing number of researchers. Nevertheless, making proper use of these resources is not easy. Adhering to the social norms of World Wide Web communication, loosely termed “netiquette”, is both important and non-trivial. In this article, we take inspiration from our experience on Internet-shared scientific knowledge, and from similar documents such as “Asking the Questions the Smart Way” and “Getting Answers”, to provide guidelines and suggestions on how to use online communities to solve scientific problems.

    The annotation and the usage of scientific databases could be improved with public issue tracker software

    Since the publication of their longtime predecessor The Atlas of Protein Sequences and Structures in 1965 by Margaret Dayhoff, scientific databases have become a key factor in the organization of modern science. All the information and knowledge described in the novel scientific literature is translated into entries in many different scientific databases, making it possible to obtain very accurate information on a biological entity such as a gene or a protein without having to manually review the literature on it. However, even for the databases with the finest annotation procedures, errors or unclear parts sometimes appear in the publicly released version and influence the research of unaware scientists using them. A researcher who finds an error in a database is often left in an uncertain state, and often abandons the effort of reporting it because there is no standard procedure for doing so. In the present work, we propose that the simple adoption of a public error tracker application, as in many open software projects, could improve the quality of the annotations in many databases and encourage feedback from the scientific community on the data annotated publicly. In order to illustrate the situation, we describe a series of errors that we found and helped solve on the genes of a very well-known pathway in various biomedically relevant databases. We would like to show that, even if a majority of the most important scientific databases have procedures for reporting errors, these are usually not publicly visible, making the process of reporting errors time-consuming and of limited use. Also, the effort made by the user who reports the error often goes unacknowledged, putting them in a discouraging position.

    1000 Genomes Selection Browser 1.0: a genome browser dedicated to signatures of natural selection in modern humans

    ABSTRACT Searching for Darwinian selection in natural populations has been the focus of a multitude of studies over the last decades. Here we present the 1000 Genomes Selection Browser 1.0 (http://hsb.upf.edu) as a resource for signatures of recent natural selection in modern humans. We have implemented and applied a large number of neutrality tests as well as summary statistics informative for the action of selection such as Tajima's D, CLR