13,340 research outputs found

    An Omega-Based Bacterial One-Hybrid System for the Determination of Transcription Factor Specificity

    From the yeast genome completed in 1996 to the 12 Drosophila genomes published earlier this year, little more than a decade has provided an incredible amount of genomic data. Yet even with this mountain of genetic information the regulatory networks that control gene expression remain relatively undefined. In part, this is due to the enormous amount of non-coding DNA, over 98% of the human genome, which needs to be made sense of. It is also due to the large number of transcription factors, potentially 2,000 such factors in the human genome, which may contribute to any given network directly or indirectly. Certainly, one of the central limitations has been the paucity of transcription factor (TF) specificity data that would aid in the prediction of regulatory targets throughout a genome. The general lack of specificity data has hindered the prediction of regulatory targets for individual TFs as well as groups of factors that function within a common regulatory pathway. A large collection of factor specificities would allow for the combinatorial prediction of regulatory targets that considers all factors actively expressed in a given cell, under a given condition. Herein we describe substantial improvements to a previous bacterial one-hybrid system with increased sensitivity and dynamic range that make it amenable for the high-throughput analysis of sequence-specific TFs. Currently we have characterized 108 (14.3%) of the predicted TFs in Drosophila that fall into a broad range of DNA-binding domain families, demonstrating the feasibility of characterizing a large number of TFs using this technology. To fully exploit our large database of binding specificities, we have created a GBrowse-based search tool that allows an end-user to examine the overrepresentation of binding sites for any number of individual factors as well as combinations of these factors in up to six Drosophila genomes (veda.cs.uiuc.edu/cgi-bin/gbrowse/gbrowse/Dmel4).
We have used this tool to demonstrate that a collection of factor specificities within a common pathway will successfully predict previously validated cis-regulatory modules within a genome. Furthermore, within our database we provide a complete catalog of DNA-binding specificities for all 84 homeodomains in Drosophila. This catalog enabled us to propose and test a detailed set of recognition rules for homeodomains and use this information to predict the specificities of the majority of homeodomains in the human genome.

    Protein-DNA Recognition Models for the Homeodomain and C2H2 Zinc Finger Transcription Factor Families

    Transcription factors (TFs) play a central role in the gene regulatory network of each cell. They can stimulate or inhibit transcription of their target genes by binding to short, degenerate DNA sequence motifs. The goal of this research is to build improved models of TF binding site recognition. This can facilitate the determination of regulatory networks and also allow for the prediction of binding site motifs based only on the TF protein sequence. Recent technological advances have rapidly expanded the amount of quantitative TF binding data available. Protein binding microarrays (PBMs) have recently been implemented in a format that allows all 10mers to be assayed in parallel. There is now PBM data available for hundreds of transcription factors. Another fairly recent technique for determining the binding preference of a TF is an in vivo bacterial one-hybrid (B1H) assay. In this approach a TF is expressed in E. coli, where it can be used to select strong binding sites from a library of randomized sites located upstream of a weak promoter driving expression of a selectable gene. When coupled with high-throughput sequencing and a newly developed analysis method, quantitative binding data can be obtained. In the last few years, the binding specificities of hundreds of TFs have been determined using B1H. The two largest eukaryotic transcription factor families are the zf-C2H2 and homeodomain TF families. Newly available PBM and B1H specificity models were used to develop recognition models for these two families, with the goal of being able to predict the binding specificity of a TF from its protein sequence. We developed a feature selection method based on adjusted mutual information that automatically recovers nearly all of the known key residues for the homeodomain and zf-C2H2 families.
Using those features we find that, for both families, random forest (RF) and support vector machine (SVM) based recognition models outperform the nearest neighbor method, which has previously been considered the best method.

    Vertebrate endothelial lipase: comparative studies of an ancient gene and protein in vertebrate evolution

    Endothelial lipase (gene: LIPG; enzyme: EL) is one of three members of the triglyceride lipase family that contributes to lipoprotein degradation within the circulation system and plays a major role in HDL metabolism in the body. In this study, in silico methods were used to predict the amino acid sequences, secondary and tertiary structures, and gene locations for LIPG genes and encoded proteins using data from several vertebrate genome projects. LIPG is located on human chromosome 18 and is distinct from other human 'neutral lipase' genes, hepatic lipase (gene: LIPC; enzyme: HL) and lipoprotein lipase (gene: LPL; enzyme: LPL) examined. Vertebrate LIPG genes usually contained 10 coding exons located on the positive strand for most primates, as well as for horse, bovine, opossum, platypus and frog genomes. The rat LIPG gene, however, contained only 9 coding exons, apparently due to the presence of a 'stop' codon within exon 9. Vertebrate EL protein subunits shared 58-97% sequence identity as compared with 38-45% sequence identities with human HL and LPL. Four previously reported human EL N-glycosylation sites were predominantly conserved among the 10 potential N-glycosylation sites observed for the vertebrate EL sequences examined. Sequence alignments and identities for key EL amino acid residues were observed as well as conservation of predicted secondary and tertiary structures with those previously reported for horse pancreatic lipase (PL) (Bourne et al. 1994). Several potential sites for regulating LIPG gene expression were observed including CpG islands near the LIPG gene promoter and a predicted microRNA binding site near the 3'-untranslated region. Promoter regions containing functional polymorphisms that regulate HDL cholesterol in baboons were conserved among primates but not retained between primates and rodents.
Phylogenetic analyses examined the relationships and potential evolutionary origins of the vertebrate LIPG gene subfamily with other neutral triglyceride lipase gene families, LIPC and LPL. It is apparent that the triglyceride lipase ancestral gene for the vertebrate LIPG gene predated the appearance of fish during vertebrate evolution >500 million years ago.

    Exploring the DNA-recognition potential of homeodomains

    The recognition potential of most families of DNA-binding domains (DBDs) remains relatively unexplored. Homeodomains (HDs), like many other families of DBDs, display limited diversity in their preferred recognition sequences. To explore the recognition potential of HDs, we utilized a bacterial selection system to isolate HD variants, from a randomized library, that are compatible with each of the 64 possible 3′ triplet sites (i.e., TAANNN). The majority of these selections yielded sets of HDs with overrepresented residues at specific recognition positions, implying the selection of specific binders. The DNA-binding specificity of 151 representative HD variants was subsequently characterized, identifying HDs that preferentially recognize 44 of these target sites. Many of these variants contain novel combinations of specificity determinants that are uncommon or absent in extant HDs. These novel determinants, when grafted into different HD backbones, produce a corresponding alteration in specificity. This information was used to create more explicit HD recognition models, which can inform the prediction of transcriptional regulatory networks for extant HDs or the engineering of HDs with novel DNA-recognition potential. The diversity of recovered HD recognition sequences raises important questions about the fitness barrier that restricts the evolution of alternate recognition modalities in natural systems.

    Histone acetyltransferases: challenges in targeting bi-substrate enzymes

    Histone acetyltransferases (HATs) are epigenetic enzymes that install acetyl groups onto lysine residues of cellular proteins such as histones, transcription factors, nuclear receptors, and enzymes. HATs have been shown to play a role in diseases ranging from cancer and inflammatory diseases to neurological disorders, through acetylation of both histone and non-histone proteins. Several HAT inhibitors, like bi-substrate inhibitors, natural product derivatives, small molecules, and protein-protein interaction inhibitors, have been developed. Despite their potential, a large gap remains between the biological activity of inhibitors in in vitro studies and their potential use as therapeutic agents. To bridge this gap, new potent HAT inhibitors with improved properties need to be developed. However, several challenges have been encountered in the investigation of HATs and HAT inhibitors that hinder the development of new HAT inhibitors. HATs have been shown to function in complexes consisting of many proteins. These complexes play a role in the activity and target specificity of HATs, which limits the translation of in vitro to in vivo experiments. The current HAT inhibitors suffer from undesired properties like anti-oxidant activity, reactivity, instability, low potency, or lack of selectivity between HAT subtypes and other enzymes. A characteristic feature of HATs is that they are bi-substrate enzymes that catalyze reactions between two substrates: the cofactor acetyl coenzyme A (Ac-CoA) and a lysine-containing substrate. This has important, but frequently overlooked, consequences for the determination of the inhibitory potency of small molecule HAT inhibitors and the reproducibility of enzyme inhibition experiments.
We envision that a careful characterization of molecular aspects of HATs and HAT inhibitors, such as the HAT catalytic mechanism and the enzyme kinetics of small molecule HAT inhibitors, will greatly improve the development of potent and selective HAT inhibitors and provide validated starting points for further development towards therapeutic agents.

    Mutation in the pssZ Gene Negatively Impacts Exopolysaccharide Synthesis, Surface Properties, and Symbiosis of Rhizobium leguminosarum bv. trifolii with Clover

    Rhizobium leguminosarum bv. trifolii is a soil bacterium capable of establishing a nitrogen-fixing symbiosis with clover plants (Trifolium spp.). This bacterium secretes large amounts of acidic exopolysaccharide (EPS), which plays an essential role in the symbiotic interaction with the host plant. This polymer is biosynthesized by a multi-enzymatic complex located in the bacterial inner membrane, whose components are encoded by a large chromosomal gene cluster, called Pss-I. In this study, we characterize R. leguminosarum bv. trifolii strain Rt297 that harbors a Tn5 transposon insertion located in the pssZ gene from the Pss-I region. This gene codes for a protein that shares high identity with bacterial serine/threonine protein phosphatases. We demonstrated that the pssZ mutation causes pleiotropic effects in rhizobial cells. Strain Rt297 exhibited several physiological and symbiotic defects, such as lack of EPS production, reduced growth kinetics and motility, altered cell-surface properties, and failure to infect the host plant. These data indicate that the protein encoded by the pssZ gene is indispensable for EPS synthesis, but also required for proper functioning of R. leguminosarum bv. trifolii cells. Polish National Science Centre grant no. DEC-2012/07/B/NZ1/0009