SVC: structured visualization of evolutionary sequence conservation
We have developed a web application for the detailed analysis and visualization of evolutionary sequence conservation in complex vertebrate genes. Given a pair of orthologous genes, the protein-coding sequences are aligned. When these sequences are mapped back onto their encoding exons in the genomes, a scaffold of the conserved gene structure naturally emerges. Sequence similarity between exons and introns is analysed and embedded into the gene structure scaffold. The visualization on the SVC server provides detailed information about evolutionarily conserved features of these genes. It further allows concise representation of complex splice patterns in the context of evolutionary conservation. A particular application of our tool arises from the fact that around mRNA editing sites both exonic and intronic sequences are highly conserved. This aids in delineation of these sites. SVC is available at
MACAT—microarray chromosome analysis tool
By linking differential gene expression to the chromosomal localization of genes, one can investigate microarray data for characteristic patterns of expression phenomena involving sizeable parts of specific chromosomes. We have implemented a statistical approach for identifying significantly differentially expressed chromosome regions. We demonstrate the applicability of the approach on a publicly available data set on acute lymphocytic leukemia
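As a rough illustration of the idea of scoring chromosome regions rather than single genes, the sketch below averages per-gene differential expression scores over windows of neighbouring genes and estimates a permutation p-value for the strongest window. It is a generic sliding-window scheme, not the statistical approach implemented in MACAT; the function names, the window width and the example scores are all hypothetical.

```python
import random

def window_means(scores, width):
    """Mean score over each run of `width` consecutive genes.

    `scores` is assumed to be ordered by chromosomal position.
    """
    return [sum(scores[i:i + width]) / width
            for i in range(len(scores) - width + 1)]

def max_window_p_value(scores, width, n_perm=1000, seed=0):
    """Permutation p-value for the largest absolute window mean.

    The null distribution is built by shuffling gene order along the
    chromosome. This is an illustrative assumption, not necessarily
    the permutation scheme used by MACAT.
    """
    rng = random.Random(seed)
    observed = max(abs(m) for m in window_means(scores, width))
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max(abs(m) for m in window_means(shuffled, width)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-gene scores with a run of four strongly up-regulated genes.
example_scores = [0.1, -0.2, 0.0, 3.0, 3.2, 2.9, 3.1, 0.2, -0.1, 0.0, 0.1, -0.3]
p_value = max_window_p_value(example_scores, width=4)
```

A small p-value here only says that such a run of high scores is unlikely if gene order were random; a real analysis would also have to account for gene density and multiple testing.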
Issues in designing effective physical protection systems
In order to screen for differentially expressed genes that might be useful in diagnosis or therapy of prostate cancer we have used a custom made Affymetrix GeneChip containing 3950 cDNA fragments. Expression profiles were obtained from 42 matched pairs of mRNAs isolated from microdissected malignant and benign prostate tissues. Applying three different bioinformatic approaches to define differential gene expression, we found 277 differentially expressed genes, of which 98 were identified by all three methods. Fourteen per cent of these genes were not found in other expression studies, which were based on bulk tissue. Resultant candidate genes were further validated by quantitative RT-PCR, mRNA in situ hybridization and immunohistochemistry. AGR2 was over-expressed in 89% of prostate carcinomas, but did not have prognostic significance. Immunohistologically detected over-expression of MEMD and CD24 was identified in 86% and 38.5% of prostate carcinomas respectively, and both were predictive of PSA relapse. Combined marker analysis using MEMD and CD24 expression proved to be an independent prognostic factor (RR = 4.7, p = 0.006) in a Cox regression model, and was also superior to conventional markers. This combination of molecular markers thus appears to allow improved prediction of patient prognosis, but should be validated in larger studies. Copyright © 2004 Pathological Society of Great Britain and Ireland. Published by John Wiley & Sons, Ltd
FISim: A new similarity measure between transcription factor binding sites based on the fuzzy integral
Background
Regulatory motifs describe sets of related transcription factor binding sites (TFBSs) and can be represented as position frequency matrices (PFMs). De novo identification of TFBSs is a crucial problem in computational biology which includes the issue of comparing putative motifs with one another and with motifs that are already known. The relative importance of each nucleotide within a given position in the PFMs should be considered in order to compute PFM similarities. Furthermore, biological data are inherently noisy and imprecise. Fuzzy set theory is particularly suitable for modeling imprecise data, whereas fuzzy integrals are highly appropriate for representing the interaction among different information sources.
Results
We propose FISim, a new similarity measure between PFMs, based on the fuzzy integral of the distance of the nucleotides with respect to the information content of the positions. Unlike existing methods, FISim is designed to consider the higher contribution of better conserved positions to the binding affinity. FISim provides excellent results when dealing with sets of randomly generated motifs, and outperforms the remaining methods when handling real datasets of related motifs. Furthermore, we propose a new cluster methodology based on kernel theory together with FISim to obtain groups of related motifs potentially bound by the same TFs, providing more robust results than existing approaches.
Conclusion
FISim corrects a design flaw of the most popular methods, whose measures favour similarity of low information content positions. We use our measure to successfully identify motifs that describe binding sites for the same TF and to solve real-life problems. In this study the reliability of fuzzy technology for motif comparison tasks is proven. This work has been carried out as part of projects P08-TIC-4299 of J. A., Sevilla and TIN2006-13177 of DGICT, Madrid
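The abstract refers to the information content of PFM positions without defining it. The sketch below computes the standard per-position information content in bits, assuming a uniform background over A, C, G and T. It does not implement FISim or its fuzzy integral; the function names and the example matrix are hypothetical.

```python
import math

def position_information_content(counts):
    """Information content (bits) of one PFM position.

    `counts` holds the nucleotide counts for A, C, G, T at that position.
    Assumes a uniform background, so the maximum is log2(4) = 2 bits.
    """
    total = sum(counts)
    ic = 2.0
    for count in counts:
        p = count / total
        if p > 0:  # treat 0 * log2(0) as 0
            ic += p * math.log2(p)
    return ic

def pfm_information_content(pfm):
    """Per-position information content for a PFM given as a list of positions."""
    return [position_information_content(position) for position in pfm]

# Hypothetical 3-position PFM; each row is one position (counts for A, C, G, T).
example_pfm = [
    [10, 0, 0, 0],  # fully conserved position
    [5, 5, 0, 0],   # two equally frequent nucleotides
    [5, 5, 5, 5],   # no preference at all
]
ic_values = pfm_information_content(example_pfm)
```

Under these assumptions the fully conserved position carries 2 bits, the two-nucleotide position 1 bit, and the uniform position 0 bits, which is the sense in which better conserved positions can be given more weight in a similarity measure.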
A Novel Bayesian DNA Motif Comparison Method for Clustering and Retrieval
Characterizing the DNA-binding specificities of transcription factors is a key problem in computational biology that has been addressed by multiple algorithms. These usually take as input sequences that are putatively bound by the same factor and output one or more DNA motifs. A common practice is to apply several such algorithms simultaneously to improve coverage at the price of redundancy. In interpreting such results, two tasks are crucial: clustering of redundant motifs, and attributing the motifs to transcription factors by retrieval of similar motifs from previously characterized motif libraries. Both tasks inherently involve motif comparison. Here we present a novel method for comparing and merging motifs, based on Bayesian probabilistic principles. This method takes into account both the similarity in positional nucleotide distributions of the two motifs and their dissimilarity to the background distribution. We demonstrate the use of the new comparison method as a basis for motif clustering and retrieval procedures, and compare it to several commonly used alternatives. Our results show that the new method outperforms other available methods in accuracy and sensitivity. We incorporated the resulting motif clustering and retrieval procedures in a large-scale automated pipeline for analyzing DNA motifs. This pipeline integrates the results of various DNA motif discovery algorithms and automatically merges redundant motifs from multiple training sets into a coherent annotated library of motifs. Application of this pipeline to recent genome-wide transcription factor location data in S. cerevisiae successfully identified DNA motifs in a manner that is as good as semi-automated analysis reported in the literature. Moreover, we show how this analysis elucidates the mechanisms of condition-specific preferences of transcription factors
A Novel Putative miRNA Target Enhancer Signal
It is known that miRNA target sites are very short, and the miRNA-target site interaction alone appears to be unspecific. Recent experiments suggest that further context signals are involved in miRNA target site recognition and regulation. Here, we present a novel GC-rich RNA motif downstream of experimentally supported miRNA target sites in human mRNAs with no similarity to previously reported functional motifs. We demonstrate that the novel motif can be found in at least one third of all transcripts regulated by miRNAs. Furthermore, we show that motif occurrence correlates with the frequency of miRNA target sites and with the stability of their duplex structures. The finding that the novel motif is significantly associated with miRNA target sites suggests a functional role of the motif in miRNA target site biology. Moreover, the novel motif has the potential to improve the prediction of miRNA target sites significantly
An expression module of WIPF1-coexpressed genes identifies patients with favorable prognosis in three tumor types
Wiskott–Aldrich syndrome (WAS) predisposes patients to leukemia and lymphoma. WAS is caused by mutations in the protein WASP which impair its interaction with the WIPF1 protein. Here, we aim to identify a module of WIPF1-coexpressed genes and to assess its use as a prognostic signature for colorectal cancer, glioma, and breast cancer patients. Two public colorectal cancer microarray data sets were used for discovery and validation of the WIPF1 co-expression module. We then classified more than 400 additional tumors, with microarray data from our own experiments or from publicly available data sets, according to their WIPF1 signature expression. This allowed us to separate patient populations for colorectal cancers, breast cancers, and gliomas for which clinical characteristics like survival times and times to relapse were analyzed. Groups of colorectal cancer, breast cancer, and glioma patients with low expression of the WIPF1 co-expression module generally had a favorable prognosis. In addition, the majority of WIPF1 signature genes are individually correlated with disease outcome in different studies. Literature gene network analysis revealed that known direct transcriptional targets of c-myc, ESR1 and p53 are enriched among WIPF1 co-expressed genes. The mean expression profile of WIPF1 signature genes is correlated with the profile of a proliferation signature. The WIPF1 signature is the first microarray-based prognostic expression signature primarily developed for colorectal cancer that is instrumental in other tumor types: low expression of the WIPF1 module is associated with better prognosis
Sex-specific pathways in early cardiac response to pressure overload in mice
Pressure overload (PO) first causes cardiac hypertrophy and then heart failure (HF), which are associated with sex differences in cardiac morphology and function. We aimed to identify genes that may cause HF-related sex differences. We used a transverse aortic constriction (TAC) mouse model leading to hypertrophy without sex differences in cardiac function after 2 weeks, but with sex differences in hypertrophy 6 and 9 weeks after TAC. Cardiac gene expression was analyzed 2 weeks after surgery. Deregulated genes were classified into functional gene ontology (GO) categories and used for pathway analysis. Classical marker genes of hypertrophy were similarly upregulated in both sexes (α-actin, ANP, BNP, CTGF). Thirty-five genes controlling mitochondrial function (PGC-1, cytochrome oxidase, carnitine palmitoyl transferase, acyl-CoA dehydrogenase, pyruvate dehydrogenase kinase) had lower expression in males compared to females after TAC. Genes encoding ribosomal proteins and genes associated with extracellular matrix remodeling exhibited relatively higher expression in males (collagen 3, matrix metalloproteinase 2, TIMP2, and TGFβ2, all about twofold) after TAC. We confirmed 87% of the gene expression by real-time polymerase chain reaction. By GO classification, female-specific genes were related to mitochondria and metabolism, and male-specific genes to matrix and biosynthesis. Promoter studies confirmed the upregulation of PGC-1 by E2. Less downregulation of metabolic genes in female hearts and increased protein synthesis capacity and deregulation of matrix remodeling in male hearts characterize the sex-specific early response to PO. These differences could contribute to subsequent sex differences in cardiac function and HF