IDconverter and IDClight: Conversion and annotation of gene and protein IDs
Background: Researchers involved in the annotation of large numbers of gene, clone or protein
identifiers are usually required to perform a one-by-one conversion for each identifier. When the
field of research is one such as microarray experiments, this number may be around 30,000.
Results: To help researchers map accession numbers and identifiers among clones, genes, proteins
and chromosomal positions, we have designed and developed IDconverter and IDClight. They are
two user-friendly, freely available web server applications that also provide additional functional
information by mapping the identifiers on to pathways, Gene Ontology terms, and literature
references. Both tools are high-throughput oriented and include identifiers for the most common
genomic databases. These tools have been compared to other similar tools, showing that they are
among the fastest and the most up-to-date.
Conclusion: These tools provide a fast and intuitive way of enriching the information coming out
of high-throughput experiments like microarrays. They can be valuable both to wet-lab researchers
and to bioinformaticians.
Funding has been provided by Fundación de Investigación
Médica Mutua Madrileña and Project TIC2003-09331-C02-02 of the Spanish
Ministry of Education and Science (MEC). RD-U is partially supported
by the Ramón y Cajal programme of the Spanish ME
Asterias: integrated analysis of expression and aCGH data using an open-source, web-based, parallelized software suite
Asterias (http://www.asterias.info) is an open-source, web-based suite for the analysis of gene expression and aCGH data. Asterias implements validated statistical methods, and most of the applications use parallel computing, which permits taking advantage of multicore CPUs and computing clusters. Access to, and further analysis of, additional biological information and annotations (PubMed references, Gene Ontology terms, KEGG and Reactome pathways) are available either for individual genes (from clickable links in tables and figures) or for sets of genes. The applications range from array normalization, imputation and preprocessing to differential gene expression analysis, class and survival prediction, and aCGH analysis. The source code is available, allowing for extension and reuse of the software. The links and analysis of additional functional information, parallelization of computation and open-source availability of the code make Asterias a unique suite that can exploit features specific to web-based environments.
Asterias: A Parallelized Web-based Suite for the Analysis of Expression and aCGH Data
The analysis of expression and CGH arrays plays a central role in the study of complex diseases, especially cancer, including finding markers for early diagnosis and prognosis, choosing an optimal therapy, or increasing our understanding of cancer development and metastasis. Asterias (http://www.asterias.info) is an integrated collection of freely accessible web tools for the analysis of gene expression and aCGH data. Most of the tools use parallel computing (via MPI) and run on a server with 60 CPUs for computation; compared to a desktop or server-based but not parallelized application, parallelization provides speedups of up to a factor of 50. Most of our applications allow the user to obtain additional information for user-selected genes (chromosomal location, PubMed ids, Gene Ontology terms, etc.) by using clickable links in tables and/or figures. Our tools include: normalization of expression and aCGH data (DNMAD); converting between different types of gene/clone and protein identifiers (IDconverter/IDClight); filtering and imputation (preP); finding differentially expressed genes related to patient class and survival data (Pomelo II); searching for models of class prediction (Tnasas); using random forests to search for minimal models for class prediction or for large subsets of genes with predictive capacity (GeneSrF); searching for molecular signatures and predictive genes with survival data (SignS); detecting regions of genomic DNA gain or loss (ADaCGH). The capability to send results between different applications, access to additional functional information, and parallelized computation make our suite unique and exploit features only available to web-based applications.
Helicase Lymphoid-specific enzyme contributes to the maintenance of methylation of SST1 pericentromeric repeats that are frequently demethylated in colon cancer and associated with genomic damage
DNA hypomethylation at repetitive elements accounts for the genome-wide DNA hypomethylation common in cancer, including colorectal cancer (CRC). We identified a pericentromeric repeat element called SST1 frequently hypomethylated (>5% demethylation compared with matched normal tissue) in several cancers, including 28 of 128 (22%) CRCs. SST1 somatic demethylation associated with genome damage, especially in tumors with wild-type TP53. Seven percent of the 128 CRCs exhibited a higher ("severe") level of demethylation (≥10%) that co-occurred with TP53 mutations. SST1 demethylation correlated with distinct histone marks in CRC cell lines and primary tumors: demethylated SST1 associated with high levels of the repressive histone 3 lysine 27 trimethylation (H3K27me3) mark and lower levels of histone 3 lysine 9 trimethylation (H3K9me3). Furthermore, induced demethylation of SST1 by 5-aza-dC led to increased H3K27me3 and reduced H3K9me3. Thus, in some CRCs, SST1 demethylation reflects an epigenetic reprogramming associated with changes in chromatin structure that may affect chromosomal integrity. The chromatin remodeler factor, the helicase lymphoid-specific (HELLS) enzyme, called the "epigenetic guardian of repetitive elements", interacted with SST1 as shown by chromatin immunoprecipitation, and down-regulation of HELLS by shRNA resulted in demethylation of SST1 in vitro. Altogether these results suggest that HELLS contributes to SST1 methylation maintenance. Alterations in HELLS recruitment and function could contribute to the somatic demethylation of SST1 repeat elements undergone before and/or during CRC pathogenesis
Mutational spectrum by phenotype: panel-based NGS testing of patients with clinical suspicion of RASopathy and children with multiple café-au-lait macules
Children with neurofibromatosis type 1 (NF1) may exhibit an incomplete clinical presentation, making it difficult to reach a clinical diagnosis. A phenotypic overlap may exist in children with other RASopathies or with other genetic conditions if only multiple café‐au‐lait macules (CALMs) are present. The syndromes that can converge in these inconclusive phenotypes have different clinical courses. In this context, early genetic testing has been proposed to be clinically useful to manage these patients. We present the validation and implementation into diagnostics of a custom NGS panel (I2HCP, ICO‐IMPPC Hereditary Cancer Panel) for testing patients with a clinical suspicion of a RASopathy (n = 48) and children presenting multiple CALMs (n = 102). We describe the mutational spectrum and the detection rates identified in these two groups of individuals. We identified pathogenic variants in 21 out of 48 patients with clinical suspicion of RASopathy, with mutations in NF1 accounting for 10% of cases. Furthermore, we identified pathogenic mutations mainly in the NF1 gene, but also in SPRED1, in more than 50% of children with multiple CALMs, exhibiting an NF1 mutational spectrum different from a group of clinically diagnosed NF1 patients (n = 80). An NGS panel strategy for the genetic testing of these two phenotype‐defined groups outperforms previous strategies.
HumMeth27QCReport: an R package for quality control and primary analysis of Illumina Infinium methylation data
Background: The study of the human DNA methylome has gained particular interest in the last few years. Researchers can nowadays investigate the potential role of DNA methylation in common disorders by taking advantage of new high-throughput technologies. Among these, Illumina Infinium assays can interrogate the methylation levels of hundreds of thousands of CpG sites, offering an ideal solution for genome-wide methylation profiling. However, like for other high-throughput technologies, the main bottleneck remains at the stage of data analysis rather than data production.
Findings: We have developed HumMeth27QCReport, an R package devoted to researchers wanting to quickly analyse their Illumina Infinium methylation arrays. This package automates quality control steps by generating a report including sample-independent and sample-dependent quality plots, and performs primary analysis of raw methylation calls by computing data normalization, statistics, and sample similarities. This package is available at CRAN repository, and can be integrated in any Galaxy instance through the implementation of ad-hoc scripts accessible at Galaxy Tool Shed.
Conclusions: Our package provides users of the Illumina Infinium Methylation assays with a simplified, automated, open-source quality control and primary analysis of their methylation data. Moreover, to enhance its use by experimental researchers, the tool is being distributed along with the scripts necessary for its implementation in the Galaxy workbench. Finally, although it was originally developed for HumanMethylation27, we proved its compatibility with data generated with the HumanMethylation450 Bead Chip.
Using a structural and logics systems approach to infer bHLH–DNA binding specificity determinants
Numerous efforts are underway to determine gene regulatory networks that describe physical relationships between transcription factors (TFs) and their target DNA sequences. Members of paralogous TF families typically recognize similar DNA sequences. Knowledge of the molecular determinants of protein–DNA recognition by paralogous TFs is of central importance for understanding how small differences in DNA specificities can dictate target gene selection. Previously, we determined the in vitro DNA binding specificities of 19 Caenorhabditis elegans basic helix-loop-helix (bHLH) dimers using protein binding microarrays. These TFs bind E-box (CANNTG) and E-box-like sequences. Here, we combine these data with logics, bHLH–DNA co-crystal structures and computational modeling to infer which bHLH monomer can interact with which CAN E-box half-site and we identify a critical residue in the protein that dictates this specificity. Validation experiments using mutant bHLH proteins provide support for our inferences. Our study provides insights into the mechanisms of DNA recognition by bHLH dimers as well as a blueprint for system-level studies of the DNA binding determinants of other TF families in different model organisms and humans.
National Institute of General Medical Sciences (U.S.) (DK068429); National Institute of General Medical Sciences (U.S.) (HG003985); European Union (PROSPECTS HEALTH-F4-2008-201648
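To make the E-box consensus mentioned in this abstract concrete, the following is a minimal sketch of scanning a DNA string for CANNTG hexamers, where N is any base. The function name `find_eboxes` and the example sequence are hypothetical and are not taken from the study. It illustrates the motif only, not the authors' inference method.

```python
import re

# E-box consensus CANNTG: "CA", any two bases, then "TG".
# The lookahead lets overlapping occurrences be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (0-based start, hexamer) for every E-box in seq, case-insensitively."""
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq.upper())]

if __name__ == "__main__":
    # Illustrative sequence only (not from the paper).
    for start, hexamer in find_eboxes("ttCACGTGaaCAGCTGcc"):
        print(start, hexamer)
```

Each hit splits into two CAN half-sites (the first three bases on one strand and the reverse complement of the last three), which is the unit the abstract assigns to individual bHLH monomers.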
Using protein design algorithms to understand the molecular basis of disease caused by protein–DNA interactions: the Pax6 example
Quite often a single or a combination of protein mutations is linked to specific diseases. However, distinguishing from sequence information which mutations have real effects in the protein’s function is not trivial. Protein design tools are commonly used to explain mutations that affect protein stability, or protein–protein interaction, but not for mutations that could affect protein–DNA binding. Here, we used the protein design algorithm FoldX to model all known missense mutations in the paired box domain of Pax6, a highly conserved transcription factor involved in eye development and in several diseases such as aniridia. The validity of FoldX to deal with protein–DNA interactions was demonstrated by showing that high levels of accuracy can be achieved for mutations affecting these interactions. Also we showed that protein-design algorithms can accurately reproduce experimental DNA-binding logos. We conclude that 88% of the Pax6 mutations can be linked to changes in intrinsic stability (77%) and/or to its capabilities to bind DNA (30%). Our study emphasizes the importance of structure-based analysis to understand the molecular basis of diseases and shows that protein–DNA interactions can be analyzed to the same level of accuracy as protein stability, or protein–protein interactions
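As a rough consistency check on the percentages in this abstract, inclusion-exclusion gives the share of mutations linked to both effects. This assumes all three figures share the same denominator (all modeled Pax6 mutations), which the abstract does not state explicitly. Here S denotes mutations linked to changes in intrinsic stability and B those linked to changes in DNA binding:

```latex
\begin{align*}
P(S \cup B) &= P(S) + P(B) - P(S \cap B) \\
0.88 &= 0.77 + 0.30 - P(S \cap B) \\
P(S \cap B) &= 0.19
\end{align*}
```

Under that assumption, roughly 19% of the mutations would be linked to both stability and DNA binding.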
Molecular basis of engineered meganuclease targeting of the endogenous human RAG1 locus
Homing endonucleases recognize long target DNA sequences generating an accurate double-strand break that promotes gene targeting through homologous recombination. We have modified the homodimeric I-CreI endonuclease through protein engineering to target a specific DNA sequence within the human RAG1 gene. Mutations in RAG1 produce severe combined immunodeficiency (SCID), a monogenic disease leading to defective immune response in the individuals, leaving them vulnerable to infectious diseases. The structures of two engineered heterodimeric variants and one single-chain variant of I-CreI, in complex with a 24-bp oligonucleotide of the human RAG1 gene sequence, show how the DNA binding is achieved through interactions in the major groove. In addition, the introduction of the G19S mutation in the neighborhood of the catalytic site lowers the reaction energy barrier for DNA cleavage without compromising DNA recognition. Gene-targeting experiments in human cell lines show that the designed single-chain molecule preserves its in vivo activity with higher specificity, further enhanced by the G19S mutation. This is the first time that an engineered meganuclease variant targets the human RAG1 locus by stimulating homologous recombination in human cell lines up to 265 bp away from the cleavage site. Our analysis illustrates the key features for à la carte procedure in protein–DNA recognition design, opening new possibilities for SCID patients whose illness can be treated ex vivo
Molecular diagnostic techniques in cancer
Neoplastic transformation is caused by the consecutive accumulation of genetic and epigenetic alterations providing selective growth advantages to cancer cells, ultimately leading to tumor development by clonal selection and evolution. This study aims to provide an overview of several epigenetic and genetic alteration diagnostic techniques commonly used in basic research in the molecular study of human cancers, some of which are currently implemented or are close to implementation in the clinical setting. Among the techniques for detecting genetic alterations we describe those that identify chromosomal alterations and point mutations. Among the methods of detecting epigenetic alterations we describe those that deal with somatic changes in DNA methylation.
Keywords: cancer, molecular diagnostics, mutation, techniques, sequencing