87 research outputs found

    CisMols Analyzer: identification of compositionally similar cis-element clusters in ortholog conserved regions of coordinately expressed genes

    Combinatorial interactions of sequence-specific trans-acting factors with localized genomic cis-element clusters are the principal mechanism for regulating tissue-specific and developmental gene expression. With the emergence of expanding numbers of genome-wide expression analyses, the identification of the cis-elements responsible for specific patterns of transcriptional regulation represents a critical area of investigation. Computational methods for the identification of functional cis-regulatory modules are difficult to devise, principally because of the short length and degenerate nature of individual cis-element binding sites and the inherent complexity that is generated by combinatorial interactions within cis-clusters. Filtering candidate cis-element clusters based on phylogenetic conservation is helpful for an individual ortholog gene pair, but combining data from cis-conservation and coordinate expression across multiple genes is a more difficult problem. To approach this, we have extended an ortholog gene-pair database with additional analytical architecture to allow for the analysis and identification of maximal numbers of compositionally similar and phylogenetically conserved cis-regulatory element clusters from a list of user-selected genes. The system has been successfully tested with a series of functionally related and microarray profile-based co-expressed ortholog pairs of promoters and genes using known regulatory regions as training sets and co-expressed genes in the olfactory and immunohematologic systems as test sets. CisMols Analyzer is accessible via a Web interface at

    GenomeTrafac: a whole genome resource for the detection of transcription factor binding site clusters associated with conventional and microRNA encoding genes conserved between mouse and human gene orthologs

    Transcriptional cis-regulatory control regions frequently are found within non-coding DNA segments conserved across multi-species gene orthologs. Adopting a systematic gene-centric pipeline approach, we report here the development of a web-accessible database resource, GenomeTraFac (), that allows genome-wide detection and characterization of compositionally similar cis-clusters that occur in gene orthologs between any two genomes for both microRNA genes and conventional RNA-encoding genes. Each ortholog gene pair can be scanned to visualize overall conserved sequence regions, and within these, the relative density of conserved cis-element motif clusters forms graph peak structures. The results of these analyses can be mined en masse to identify the most frequently represented cis-motifs in a list of genes. The system also provides a method for rapid evaluation and visualization of gene model-consistency between orthologs, and facilitates consideration of the potential impact of sequence variation in conserved non-coding regions on complex cis-element structures. Using the mouse and human genomes via the NCBI Reference Sequence database and the Sanger Institute miRBase, the system demonstrated the ability to identify validated transcription factor targets within promoter and distal genomic regulatory regions of both conventional and microRNA genes.

    PU.1 positively regulates GATA-1 expression in mast cells

    Coexpression of PU.1 and GATA-1 is required for proper specification of the mast cell lineage; however, in the myeloid and erythroid lineages, PU.1 and GATA-1 are functionally antagonistic. In this study, we report a transcriptional network in which PU.1 positively regulates GATA-1 expression in mast cell development. We isolated a variant mRNA isoform of GATA-1 in murine mast cells that is significantly upregulated during mast cell differentiation. This isoform contains an alternatively spliced first exon (IB) that is distinct from the first exon (IE) incorporated in the major erythroid mRNA transcript. In contrast to erythroid and megakaryocyte cells, in mast cells we show that PU.1 and GATA-2 predominantly occupy potential cis-regulatory elements in the IB exon region in vivo. Using reporter assays, we identify an enhancer flanking the IB exon that is activated by PU.1. Furthermore, we observe that in PU.1 -/- fetal liver cells, low levels of the IE GATA-1 isoform are expressed, but the variant IB isoform is absent. Reintroduction of PU.1 restores the variant IB isoform and upregulates total GATA-1 protein expression, which is concurrent with mast cell differentiation. Our results are consistent with a transcriptional hierarchy in which PU.1, possibly in concert with GATA-2, activates GATA-1 expression in mast cells in a pathway distinct from that seen in the erythroid and megakaryocytic lineages. Copyright © 2010 by The American Association of Immunologists, Inc.

    Exome Sequencing Identifies Candidate Genetic Modifiers of Syndromic and Familial Thoracic Aortic Aneurysm Severity

    Thoracic aortic aneurysm (TAA) is a genetic disease predisposing to aortic dissection. It is important to identify the genetic modifiers controlling penetrance and expressivity to improve clinical prognostication. Exome sequencing was performed in 27 subjects with syndromic or familial TAA presenting with extreme phenotypes (15 with severe TAA; 12 with mild or absent TAA). Family-based analysis of a subset of the cohort identified variants, genes, and pathways segregating with TAA severity among three families. A rare missense variant in ADCK4 (p.Arg63Trp) segregated with mild TAA in each family. Genes and pathways identified in families were further investigated in the entire cohort using the optimal unified sequence kernel association test, finding significance for the gene COL15A1 (p = 0.025) and the retina homeostasis pathway (p = 0.035). Thus, we identified candidate genetic modifiers of TAA severity by exome-based study of extreme phenotypes, which may lead to improved risk stratification and development of new medical therapies.

    Staging of biliary atresia at diagnosis by molecular profiling of the liver

    Background: Young age at portoenterostomy has been linked to improved outcome in biliary atresia, but pre-existing biological factors may influence the rate of disease progression. In this study, we aimed to determine whether molecular profiling of the liver identifies stages of disease at diagnosis. Methods: We examined liver biopsies from 47 infants with biliary atresia enrolled in a prospective observational study. Biopsies were scored for inflammation and fibrosis, used for gene expression profiles, and tested for association with indicators of disease severity, response to surgery, and survival at 2 years. Results: Fourteen of 47 livers displayed predominant histological features of inflammation (N = 9) or fibrosis (N = 5), with the remainder showing similar levels of both simultaneously. By differential profiling of gene expression, the 14 livers had a unique molecular signature containing 150 gene probes. Applying prediction analysis models, the probes classified 29 of the remaining 33 livers into inflammation or fibrosis. Molecular classification into the two groups was validated by the findings of increased hepatic population of lymphocyte subsets or tissue accumulation of matrix substrates. The groups had no association with traditional markers of liver injury or function, response to surgery, or complications of cirrhosis. However, infants with an inflammation signature were younger, while those with a fibrosis signature had decreased transplant-free survival. Conclusions: Molecular profiling at diagnosis of biliary atresia uncovers a signature of inflammation or fibrosis in most livers. This signature may relate to staging of disease at diagnosis and has implications for clinical outcomes. http://deepblue.lib.umich.edu/bitstream/2027.42/112492/1/13073_2010_Article_154.pd

    ToppCluster: a multiple gene list feature analyzer for comparative enrichment clustering and network-based dissection of biological systems

    ToppCluster is a web server application that leverages a powerful enrichment analysis and underlying data environment for comparative analyses of multiple gene lists. It generates heatmaps or connectivity networks that reveal functional features shared or specific to multiple gene lists. ToppCluster uses hypergeometric tests to obtain list-specific feature enrichment P-values for currently 17 categories of annotations of human-ortholog genes, and provides user-selectable cutoffs and multiple testing correction methods to control false discovery. Each nameable gene list represents a column input to a resulting matrix whose rows are overrepresented features, and whose individual cells hold per-list P-values and the corresponding genes per feature. ToppCluster provides users with choices of tabular outputs, hierarchical clustering and heatmap generation, or the ability to interactively select features from the functional enrichment matrix to be transformed into XGMML or GEXF network format documents for use in Cytoscape or Gephi applications, respectively. Here, as an example, we demonstrate the ability of ToppCluster to enable identification of list-specific phenotypic and regulatory element features (both cis-elements and 3′UTR microRNA binding sites) among tissue-specific gene lists. ToppCluster's functionalities enable the identification of specialized biological functions and regulatory networks and systems biology-based dissection of biological states. ToppCluster can be accessed freely at http://toppcluster.cchmc.org
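The hypergeometric enrichment test mentioned in the ToppCluster abstract can be illustrated with a minimal sketch. This is a generic textbook formulation, not ToppCluster's actual code: the function names, the example counts, and the choice of Bonferroni as the multiple testing correction are all assumptions made for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, list_size, overlap):
    """Upper-tail hypergeometric P-value, P(X >= overlap).

    population: number of genes in the background (assumed universe)
    annotated:  background genes carrying the feature
    list_size:  genes in the user's list
    overlap:    list genes carrying the feature
    """
    upper = min(annotated, list_size)
    total = comb(population, list_size)
    # Sum the probability of every overlap at least as large as the one observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def bonferroni(p_values):
    """Bonferroni correction: scale each P-value by the number of tests, cap at 1."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the call pattern.
    raw = [
        hypergeom_enrichment_p(20000, 150, 200, 12),
        hypergeom_enrichment_p(20000, 400, 200, 5),
    ]
    for p_raw, p_adj in zip(raw, bonferroni(raw)):
        print(f"raw={p_raw:.3e} adjusted={p_adj:.3e}")
```

Each feature tested against one gene list yields one such P-value, which would fill a single cell of the feature-by-list matrix described in the abstract.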

    Regression based predictor for p53 transactivation

    Background: The p53 protein is a master regulator that controls the transcription of many genes in various pathways in response to a variety of stress signals. The extent of this regulation depends in part on the binding affinity of p53 to its response elements (REs). Traditional profile scores for p53 based on position weight matrices (PWM) are only a weak indicator of binding affinity because the level of binding also depends on various other factors such as interaction between the nucleotides and, in case of p53-REs, the extent of the spacer between the dimers. Results: In the current study we introduce a novel in-silico predictor for p53-RE transactivation capability based on a combination of multidimensional scaling and multinomial logistic regression. Experimentally validated known p53-REs along with their transactivation capabilities are used for training. Through cross-validation studies we show that our method outperforms other existing methods. To demonstrate the utility of this method we (a) rank putative p53-REs of target genes and target microRNAs based on the predicted transactivation capability and (b) study the implication of polymorphisms overlapping p53-RE on its transactivation capability. Conclusion: Taking into account both nucleotide interactions and the spacer length of p53-RE, we have created a novel in-silico regression-based transactivation capability predictor for p53-REs and used it to analyze validated and novel p53-REs and to predict the impact of SNPs overlapping these elements.

    ToppGene Suite for gene list enrichment analysis and candidate gene prioritization

    ToppGene Suite (http://toppgene.cchmc.org; this web site is free and open to all users and does not require a login to access) is a one-stop portal for (i) gene list functional enrichment, (ii) candidate gene prioritization using either functional annotations or network analysis and (iii) identification and prioritization of novel disease candidate genes in the interactome. Functional annotation-based disease candidate gene prioritization uses a fuzzy-based similarity measure to compute the similarity between any two genes based on semantic annotations. The similarity scores from individual features are combined into an overall score using statistical meta-analysis. A P-value of each annotation of a test gene is derived by random sampling of the whole genome. The protein–protein interaction network (PPIN)-based disease candidate gene prioritization uses social and Web network analysis algorithms (extended versions of the PageRank and HITS algorithms, and the K-Step Markov method). We demonstrate the utility of ToppGene Suite using 20 recently reported GWAS-based gene–disease associations (including novel disease genes) representing five diseases. ToppGene ranked 19 of 20 (95%) candidate genes within the top 20%, while ToppNet ranked 12 of 16 (75%) candidate genes among the top 20%.
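As a rough illustration of network-based prioritization, the sketch below runs a plain personalized PageRank (random walk with restart at known disease genes) on a toy interaction network. It is a standard textbook variant, not the extended PageRank, HITS, or K-Step Markov implementations that the ToppGene abstract refers to; the function name, the damping value, and the toy gene labels are assumptions.

```python
def personalized_pagerank(edges, seeds, damping=0.85, iterations=100):
    """Score nodes of an undirected network by proximity to seed nodes.

    edges: iterable of (node_a, node_b) interaction pairs
    seeds: nodes the random walk restarts from (e.g. known disease genes)
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    nodes = sorted(neighbors)
    seed_set = set(seeds) & set(nodes)
    if not seed_set:
        raise ValueError("no seed node is present in the network")

    # Restart distribution: uniform over the seeds, zero elsewhere.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    score = dict(restart)

    for _ in range(iterations):
        # Restart mass, plus the mass each node passes evenly to its neighbors.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            share = damping * score[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        score = nxt
    return score


if __name__ == "__main__":
    # Toy chain SEED - G1 - G2 - G3 with hypothetical gene labels.
    toy_edges = [("SEED", "G1"), ("G1", "G2"), ("G2", "G3")]
    scores = personalized_pagerank(toy_edges, seeds=["SEED"])
    for gene in sorted(scores, key=scores.get, reverse=True):
        print(gene, round(scores[gene], 4))
```

Candidate genes would then be ranked by score, so that genes closer to the seed set in the network appear nearer the top of the list.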