37 research outputs found

    oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes

    Targeted transcript profiling studies can identify sets of co-expressed genes; however, identification of the underlying functional mechanism(s) is a significant challenge. Established methods for the analysis of gene annotations, particularly those based on the Gene Ontology, can identify functional linkages between genes. Similar methods for the identification of over-represented transcription factor binding sites (TFBSs) have been successful in yeast, but extension to human genomics has largely proved ineffective. Creation of a system for the efficient identification of common regulatory mechanisms in a subset of co-expressed human genes promises to break a roadblock in functional genomics research. We have developed an integrated system that searches for evidence of co-regulation by one or more transcription factors (TFs). oPOSSUM combines a pre-computed database of conserved TFBSs in human and mouse promoters with statistical methods for identification of sites over-represented in a set of co-expressed genes. The algorithm successfully identified mediating TFs in control sets of tissue-specific genes and in sets of co-expressed genes from three transcript profiling studies. Simulation studies indicate that oPOSSUM produces few false positives using empirically defined thresholds and can tolerate up to 50% noise in a set of co-expressed genes
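As an illustration of what "over-represented in a set of co-expressed genes" can mean numerically, the sketch below computes a one-tailed hypergeometric probability (the upper tail of Fisher's exact test). This is a common choice for this kind of enrichment question; the abstract does not say which statistics oPOSSUM applies, so the function name, arguments, and counts here are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k site-bearing genes in a set of n
    genes drawn without replacement from N background genes, K of which carry
    the binding site. Names and parameters are illustrative assumptions."""
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities for k, k+1, ..., min(n, K) hits.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


if __name__ == "__main__":
    # Hypothetical counts: 15,000 background genes, 1,200 with a conserved site,
    # and a co-expressed set of 40 genes, 12 of which carry the site.
    print(hypergeom_upper_tail(12, 40, 1200, 15000))
```

A small probability suggests the site occurs in the co-expressed set more often than the background frequency would predict. Any significance threshold would have to be set empirically, as the abstract describes.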

    The Association of Virulence Factors with Genomic Islands

    Background: It has been noted that many bacterial virulence factor genes are located within genomic islands (GIs; clusters of genes in a prokaryotic genome of probable horizontal origin). However, such studies have been limited to single genera or isolated observations. We have performed the first large-scale analysis of multiple diverse pathogens to examine this association. We additionally identified genes found predominantly in pathogens, but not non-pathogens, across multiple genera using 631 complete bacterial genomes, and we identified common trends in virulence for genes in GIs. Furthermore, we examined the relationship between GIs and clustered regularly interspaced palindromic repeats (CRISPRs) proposed to confer resistance to phage. Methodology/Principal Findings: We show quantitatively that GIs disproportionately contain more virulence factors than the rest of a given genome (p < 1E-40 using three GI datasets) and that CRISPRs are also over-represented in GIs. Virulence factors in GIs and pathogen-associated virulence factors are enriched for proteins having more "offensive" functions, e.g. active invasion of the host, and are disproportionately components of type III/IV secretion systems or toxins. Numerous hypothetical pathogen-associated genes were identified, meriting further study. Conclusions/Significance: This is the first systematic analysis across diverse genera indicating that virulence factors are disproportionately associated with GIs. "Offensive" virulence factors, as opposed to host-interaction factors, may more ofte

    Pseudomonas aeruginosa Genome Database and PseudoCAP: facilitating community-based, continually updated, genome annotation

    Using the Pseudomonas aeruginosa Genome Project as a test case, we have developed a database and submission system to facilitate a community-based approach to continually updated genome annotation (http://www.pseudomonas.com). Researchers submit proposed annotation updates through one of three web-based form options which are then subjected to review, and if accepted, entered into both the database and log file of updates with author acknowledgement. In addition, a coordinator continually reviews literature for suitable updates, as we have found such reviews to be the most efficient. Both the annotations database and updates-log database have Boolean search capability with the ability to sort results and download all data or search results as tab-delimited files. To complement this peer-reviewed genome annotation, we also provide a linked GBrowse view which displays alternate annotations. Additional tools and analyses are also integrated, including PseudoCyc, and knockout mutant information. We propose that this database system, with its focus on facilitating flexible queries of the data and providing access to both peer-reviewed annotations as well as alternate annotation information, may be a suitable model for other genome projects wishing to use a continually updated, community-based annotation approach. The source code is freely available under GNU General Public Licence

    The CD38/NAD/SIRTUIN1/EZH2 Axis Mitigates Cytotoxic CD8 T Cell Function and Identifies Patients with SLE Prone to Infections

    Summary: Patients with systemic lupus erythematosus (SLE) suffer frequent infections that account for significant morbidity and mortality. T cell cytotoxic responses are decreased in patients with SLE, yet the responsible molecular events are largely unknown. We find an expanded CD8CD38high T cell subset in a subgroup of patients with increased rates of infections. CD8CD38high T cells from healthy subjects and patients with SLE display decreased cytotoxic capacity, degranulation, and expression of granzymes A and B and perforin. The key cytotoxicity-related transcription factors T-bet, RUNX3, and EOMES are decreased in CD8CD38high T cells. CD38 leads to increased acetylated EZH2 through inhibition of the deacetylase Sirtuin1. Acetylated EZH2 represses RUNX3 expression, whereas inhibition of EZH2 restores CD8 T cell cytotoxic responses. We propose that high levels of CD38 lead to decreased CD8 T cell-mediated cytotoxicity and increased propensity to infections in patients with SLE, a process that can be reversed pharmacologically. Katsuyama et al. find that an expanded CD8CD38high T cell population in SLE patients is linked to infections. CD8CD38high T cells display decreased cytotoxic capacity by suppressing the expression of related molecules through an NAD+/Sirtuin1/EZH2 pathway. EZH2 inhibitors increase cytotoxicity, offering a means to mitigate infection rates in SLE. Keywords: systemic lupus erythematosus, patients, CD8 T cell, CD38, cytotoxicity, infection, nicotinamide adenine dinucleotide, Sirtuin1, EZH

    oPOSSUM: integrated tools for analysis of regulatory motif over-representation

    The identification of over-represented transcription factor binding sites from sets of co-expressed genes provides insights into the mechanisms of regulation for diverse biological contexts. oPOSSUM, an internet-based system for such studies of regulation, has been improved and expanded in this new release. New features include a worm-specific version for investigating binding sites conserved between Caenorhabditis elegans and C. briggsae, as well as a yeast-specific version for the analysis of co-expressed sets of Saccharomyces cerevisiae genes. The human and mouse applications feature improvements in ortholog mapping, sequence alignments and the delineation of multiple alternative promoters. oPOSSUM2, introduced for the analysis of over-represented combinations of motifs in human and mouse genes, has been integrated with the original oPOSSUM system. Analysis using user-defined background gene sets is now supported. The transcription factor binding site models have been updated to include new profiles from the JASPAR database. oPOSSUM is available at http://www.cisreg.ca/oPOSSUM

    The Stem Cell Discovery Engine: an integrated repository and analysis system for cancer stem cell comparisons

    Mounting evidence suggests that malignant tumors are initiated and maintained by a subpopulation of cancerous cells with biological properties similar to those of normal stem cells. However, descriptions of stem-like gene and pathway signatures in cancers are inconsistent across experimental systems. Driven by a need to improve our understanding of molecular processes that are common and unique across cancer stem cells (CSCs), we have developed the Stem Cell Discovery Engine (SCDE)—an online database of curated CSC experiments coupled to the Galaxy analytical framework. The SCDE allows users to consistently describe, share and compare CSC data at the gene and pathway level. Our initial focus has been on carefully curating tissue and cancer stem cell-related experiments from blood, intestine and brain to create a high quality resource containing 53 public studies and 1098 assays. The experimental information is captured and stored in the multi-omics Investigation/Study/Assay (ISA-Tab) format and can be queried in the data repository. A linked Galaxy framework provides a comprehensive, flexible environment populated with novel tools for gene list comparisons against molecular signatures in GeneSigDB and MSigDB, curated experiments in the SCDE and pathways in WikiPathways. The SCDE is available at http://discovery.hsci.harvard.edu

    Toward interoperable bioscience data

    © The Author(s), 2012. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Nature Genetics 44 (2012): 121-126, doi:10.1038/ng.1054. To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open 'data commoning' culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared 'Investigation-Study-Assay' framework to support that vision. The authors also acknowledge the following funding sources in particular: UK Biotechnology and Biological Sciences Research Council (BBSRC) BB/I000771/1 to S.-A.S. and A.T.; UK BBSRC BB/I025840/1 to S.-A.S.; UK BBSRC BB/I000917/1 to D.F.; EU CarcinoGENOMICS (PL037712) to J.K.; US National Institutes of Health (NIH) 1RC2CA148222-01 to W.H. and the HSCI; US MIRADA LTERS DEB-0717390 and Alfred P. Sloan Foundation (ICoMM) to L.A.-Z.; Swiss Federal Government through the Federal Office of Education and Science (FOES) to L.B. and I.X.; EU Innovative Medicines Initiative (IMI) Open PHACTS 115191 to C.T.E.; US Department of Energy (DOE) DE-AC02-06CH11357 and Arthur P. Sloan Foundation (2011-6-05) to J.G.; UK BBSRC SysMO-DB2 BB/I004637/1 and BBG0102181 to C.G.; UK BBSRC BB/I000933/1 to C.S. and J.L.G.; UK MRC UD99999906 to J.L.G.; US NIH R21 MH087336 (National Institute of Mental Health) and R00 GM079953 (National Institute of General Medical Science) to A.L.; NIH U54 HG006097 to J.C. and C.E.S.; Australian government through the National Collaborative Research Infrastructure Strategy (NCRIS); BIRN U24-RR025736 and BioScholar RO1-GM083871 to G.B. and the 2009 Super Science initiative to C.A.S

    The Constrained Maximal Expression Level Owing to Haploidy Shapes Gene Content on the Mammalian X Chromosome.

    X chromosomes are unusual in many regards, not least of which is their nonrandom gene content. The causes of this bias are commonly discussed in the context of sexual antagonism and the avoidance of activity in the male germline. Here, we examine the notion that, at least in some taxa, functionally biased gene content may more profoundly be shaped by limits imposed on gene expression owing to haploid expression of the X chromosome. Notably, if the X, as in primates, is transcribed at rates comparable to the ancestral rate (per promoter) prior to the X chromosome formation, then the X is not a tolerable environment for genes with very high maximal net levels of expression, owing to transcriptional traffic jams. We test this hypothesis using The Encyclopedia of DNA Elements (ENCODE) and data from the Functional Annotation of the Mammalian Genome (FANTOM5) project. As predicted, the maximal expression of human X-linked genes is much lower than that of genes on autosomes: on average, maximal expression is three times lower on the X chromosome than on autosomes. Similarly, autosome-to-X retroposition events are associated with lower maximal expression of retrogenes on the X than seen for X-to-autosome retrogenes on autosomes. Also as expected, X-linked genes have a lesser degree of increase in gene expression than autosomal ones (compared to the human/Chimpanzee common ancestor) if highly expressed, but not if lowly expressed. The traffic jam model also explains the known lower breadth of expression for genes on the X (and the Z of birds), as genes with broad expression are, on average, those with high maximal expression. As then further predicted, highly expressed tissue-specific genes are also rare on the X and broadly expressed genes on the X tend to be lowly expressed, both indicating that the trend is shaped by the maximal expression level not the breadth of expression per se. 
    Importantly, a limit to the maximal expression level explains biased tissue of expression profiles of X-linked genes. Tissues whose tissue-specific genes are very highly expressed (e.g., secretory tissues, tissues abundant in structural proteins) are also tissues in which gene expression is relatively rare on the X chromosome. These trends cannot be fully accounted for in terms of alternative models of biased expression. In conclusion, the notion that it is hard for genes on the Therian X to be highly expressed, owing to transcriptional traffic jams, provides a simple yet robustly supported rationale for many peculiar features of X's gene content, gene expression, and evolution.

    Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
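For readers unfamiliar with the Weibull form, the following minimal sketch shows the standard two-parameter cumulative distribution, F(x) = 1 - exp(-(x/η)^β), and one textbook way to estimate β and η by least squares on its linearised form. The abstract does not describe the authors' fitting procedure or the exact variable being modelled, so the function names, the choice of x, and the sample values are assumptions made purely for illustration.

```python
import math


def weibull_cdf(x, beta, eta):
    """Two-parameter Weibull CDF with shape beta and scale eta."""
    return 1.0 - math.exp(-((x / eta) ** beta))


def fit_weibull(xs, fs):
    """Estimate (beta, eta) from points (x, F(x)) using the linearised form
    ln(-ln(1 - F)) = beta * ln(x) - beta * ln(eta).
    Requires x > 0 and 0 < F < 1 for every point."""
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in fs]
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    # Ordinary least-squares slope gives the shape parameter beta.
    beta = (sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
            / sum((a - mean_u) ** 2 for a in u))
    # The intercept equals -beta * ln(eta), so solve it for eta.
    eta = math.exp(mean_u - mean_v / beta)
    return beta, eta


if __name__ == "__main__":
    # Hypothetical data generated from known parameters, then recovered.
    xs = [1.0, 2.0, 4.0, 8.0]
    fs = [weibull_cdf(x, 1.5, 3.0) for x in xs]
    print(fit_weibull(xs, fs))
```

With β = 1 the curve reduces to a plain exponential, which is why the shape parameter is the quantity of interest when comparing genomes.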

    Genomic regulatory blocks underlie extensive microsynteny conservation in insects

    Insect genomes contain larger blocks of conserved gene order (microsynteny) than would be expected under a random breakage model of chromosome evolution. We present evidence that microsynteny has been retained to keep large arrays of highly conserved noncoding elements (HCNEs) intact. These arrays span key developmental regulatory genes, forming genomic regulatory blocks (GRBs). We recently described GRBs in vertebrates, where most HCNEs function as enhancers and HCNE arrays specify complex expression programs of their target genes. Here we present a comparison of five Drosophila genomes showing that HCNE density peaks centrally in large synteny blocks containing multiple genes. Besides developmental regulators that are likely targets of HCNE enhancers, HCNE arrays often span unrelated neighboring genes. We describe differences in core promoters between the target genes and the unrelated genes that offer an explanation for the differences in their responsiveness to enhancers. We show examples of a striking correspondence between boundaries of synteny blocks, HCNE arrays, and Polycomb binding regions, confirming that the synteny blocks correspond to regulatory domains. Although few noncoding elements are highly conserved between Drosophila and the malaria mosquito Anopheles gambiae, we find that A. gambiae regions orthologous to Drosophila GRBs contain an equivalent distribution of noncoding elements highly conserved in the yellow fever mosquito Aëdes aegypti and coincide with regions of ancient microsynteny between Drosophila and mosquitoes. The structural and functional equivalence between insect and vertebrate GRBs marks them as an ancient feature of metazoan genomes and as a key to future studies of development and gene regulation