11 research outputs found

    Evolving Notch polyQ tracts reveal possible solenoid interference elements

    <div><p>Polyglutamine (polyQ) tracts in regulatory proteins are extremely polymorphic. As functional elements under selection for length, triplet repeats are prone to DNA replication slippage and indel mutations. Many polyQ tracts are also embedded within intrinsically disordered domains, which are less constrained, fast evolving, and difficult to characterize. To identify structural principles underlying polyQ tracts in disordered regulatory domains, here I analyze deep evolution of metazoan Notch polyQ tracts, which can generate alleles causing developmental and neurogenic defects. I show that Notch features polyQ tract turnover that is restricted to a discrete number of conserved “polyQ insertion slots”. Notch polyQ insertion slots are: (<i>i</i>) identifiable by an amphipathic “slot leader” motif; (<i>ii</i>) conserved as an intact C-terminal array in a 1-to-1 relationship with the N-terminal solenoid-forming ankyrin repeats (ARs); and (<i>iii</i>) enriched in carboxamide residues (Q/N), whose sidechains feature dual hydrogen bond donor and acceptor atoms. Correspondingly, the terminal loop and β-strand of each AR feature conserved carboxamide residues, which would be susceptible to folding interference by hydrogen bonding with residues outside the ARs. I thus suggest that Notch polyQ insertion slots constitute an array of AR interference elements (ARIEs). Notch ARIEs would dynamically compete with the delicate serial folding induced by adjacent ARs. Huntingtin, which harbors solenoid-forming HEAT repeats, also possesses a similar number of polyQ insertion slots. These results suggest that intrinsically disordered interference arrays featuring carboxamide and polyQ enrichment may constitute coupled proteodynamic modulators of solenoids.</p></div>

    Notch AR interference elements (ARIEs) are present as a distinct array within NICD.

    <p><b>(a)</b> The polyQ slot leader motif defines 7 possible insertion sites, of which six have independently evolved polyQ tracts in distinct lineages of flies. The number of slots is also suggestive of an interference with the seven ankyrin repeats or the six pairs of adjacent repeats. <b>(b)</b> The leader sequences from <i>Drosophila</i> and <i>Stomoxys</i> polyQ insertion slots (except for slot C) were used to produce the motif logo shown. Also shown is an amphipathic helix starting with residue number four and showing how the hydrophobic amino acids at positions 6, 9, and 10 of the slot leader motif are segregated to one side of the helix (yellow circles). Interestingly, position 7 frequently features a single glutamine residue, which could interact with an adjacent polyQ tract. <b>(c)</b> <i>Drosophila</i> Notch and vertebrate Notch1 feature a polyQ tract in slot-G, although it is much more expanded in <i>Drosophila</i> than in humans. <b>(d, e)</b> Unlike <i>Drosophila</i> Notch, the Notch proteins from several other fly genera, including <i>Musca</i>, <i>Stomoxys</i>, <i>Glossina</i>, <i>Lucilia</i>, <i>Bactrocera</i>, and <i>Ceratitis</i>, feature a more prominent polyQ tract in slot-F. These flies also have a much smaller polyQ tract in slot-G. The tephritid genera also feature a tract in slot-A, demonstrating that N-terminal slots can also accept polyQ tracts. <b>(f)</b> The Notch protein from <i>Anopheles darlingi</i> features polyQ tracts in slot-B and slot-E in addition to a very long tract in slot-G.</p>

    PolyQ tracts evolve in well-defined slots in Htt and Notch proteins.

    <p><b>(a)</b> Shown is the N-terminal region of Huntingtin (Htt) from humans, tarsier, mouse, and <i>Drosophila</i>. This region evolved a polyglutamine (polyQ) tract at a well-defined location in mammals, which later expanded in the evolution of humans, and is still evolving. While fly Htt is missing the N-terminal polyQ tract, it features a separate tract elsewhere in the protein as shown. Small amino acid residues that are secondary structure breakers are highlighted in blue. Glutamine residues are highlighted in red. Hydrophobic amino acid residues preceding the polyQ tract insertion site are highlighted in cyan. Conserved residues are highlighted in yellow. See <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0174253#pone.0174253.g003" target="_blank">Fig 3</a> for a diagram showing the location of polyQ tracts relative to other NICD domains. <b>(b)</b> Shown is an evolutionary tree of various dipteran genera for which Notch sequences were analyzed in this study. The cladogram is a simplified version of the comprehensive fly tree based on multiple nuclear and mitochondrial genes and morphological markers [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0174253#pone.0174253.ref042" target="_blank">42</a>]. The six different families to which these species belong are listed. <b>(c)</b> Shown are the polyQ tract regions of fly Notch proteins. Several sites (lettered slots) have independently evolved a polyQ tract in different fly genera. Most slots are preceded by a leader motif sequence, even when a polyQ tract is not present. <b>(d)</b> Inset shows the high hydrogen bonding potential of carboxamide amino acid side chains.</p>

    Yeast interactome for the lost genes shows specific associations between lysine biosynthetic enzymes and the ClpB chaperones.

    <p>(<b>A</b>) Yeast interactome graph showing high-throughput physical interactions (pink links), medium throughput TAP-tag physical interaction data for diverse chaperones (gray links), and genetic interactions (green links) amongst the yeast counterparts to the twenty-four lost genetic functions. Genetic interactions are based on 108 separate data sets, while physical interactions, not including the separate TAP-tag data set, are based on 103 data sets. The gene nodes are also colored by functional category. The largest statistically significant category of enrichment is for “biosynthetic process” with an expect value in random sampling of 2.3E-24 (17/27 lost genes). The two lost eukaryotic ClpB chaperones encoded by HSP104 and HSP78 interact physically with only two of the lost genes, LYS4 and LYS20, which are mitochondrial components of the lysine biosynthesis pathway. (<b>B</b>) Gene association analysis shows that Lys4 and Lys20 make contact with ribosome-associated chaperones (RAC) but no other Hsp70 and J-domain co-chaperones. Proteins localized to the mitochondria are indicated with purple labels. Genes with known roles in protein targeting and import to the mitochondria or refolding after import are indicated by asterisks (see key). Genes with HSE4 elements in their promoters (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.t001" target="_blank">Table 1</a>) are indicated by yellow halos. The edge and node color combination is as in (A).</p>

    The <i>clpB</i> genes are co-regulated by HSE4s in Fungi and Holozoa.

    <p>* Each locus matching the indicated motif in each genome is either listed as a named gene or enumerated. For example, “+2” means “plus two more loci” with matches in the upstream regulatory region, and “+2 in ORF” means “plus two more in ORFs”.</p>

    A perfect HSE4 element is specific to a <i>clpB</i> refolding regulon in <i>S</i>. <i>cerevisiae</i>.

    <p>Abbreviations: DP-I, dot plot motif I; IUPAC, International Union of Pure and Applied Chemistry; PWM, Position-Weighted Matrix; Ψ, pseudo-count correction.</p><p>* Models 1–3 describe the genomic distributions of the <i>clpB</i>-specific HSE4 element using different descriptions. IUPAC DNA codes are used as needed except ‘N’ is represented by a dot (‘.’). Y = pYrimidine (C or T), and R = puRine (A or G).</p><p>** Percentages refer to the fraction of all genomic loci matching the model signature that are known to be involved in prion homeostasis. All such loci are listed in bold in the penultimate column. Hits in CDS regions are not listed but are counted for the precision metric.</p>
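    The motif notation described in this table note (IUPAC codes, with a dot in place of ‘N’) can be turned into a searchable pattern. The following is a minimal sketch in Python, not the scanning method used in the study. The motif `TTCY.RGAA`, the test sequence, and the function names are hypothetical placeholders chosen only to exercise the Y, R, and dot codes; the actual HSE4 models are not given in this listing.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
# Following the table note, '.' stands in for 'N' (any base).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # puRine
    "Y": "[CT]",    # pYrimidine
    ".": "[ACGT]",  # any base (written 'N' in standard IUPAC)
    "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC-style motif string into a regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in motif.upper()))


def find_motif(motif, sequence):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets the scan report matches that overlap each other.
    lookahead = re.compile("(?=(" + motif_to_regex(motif).pattern + "))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Hypothetical motif and sequence, for illustration only.
EXAMPLE_MOTIF = "TTCY.RGAA"
EXAMPLE_SEQ = "ACGTTCTAGGAATTTCCGAGAACC"
hits = find_motif(EXAMPLE_MOTIF, EXAMPLE_SEQ)
print(hits)
```

    This sketch scans one strand only; a genome-wide search would presumably also scan the reverse complement and distinguish upstream regions from coding regions, as the table note does.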

    Hsp78 and Hsp104 are required for thermotolerance.

    <p>Viability of <i>clpB</i> mutants cultured in rich media (YPD). Cultures were grown to log phase at 30°C, pre-treated at 37°C and diluted 1:1 into 50°C media (20 min) or diluted directly into cold water. Values are average colony forming unit (cfu) counts from YPD plates incubated at 30°C from three strains of each genotype. Error bars correspond to the standard deviation. Strains are: wild type: BY4730, BY4700, BY4738; <i>hsp78</i>: JF2478, JF2479, JF2480; <i>hsp104</i>: JF2473, JF2474, JF2498; <i>hsp78 hsp104</i>: JF2494, JF2495, JF2516. Double asterisks correspond to a p-value of < 0.001 in an unpaired Student’s t-test.</p>
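    The unpaired Student’s t-test named in this caption compares two group means using a pooled variance. The sketch below computes only the test statistic and degrees of freedom with the Python standard library. The cfu values and the function name `student_t` are hypothetical and are not data from the study.

```python
from math import sqrt
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an unpaired, equal-variance t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    t = (mean(sample_a) - mean(sample_b)) / sqrt(pooled * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical cfu counts for three strains per genotype (not from the study).
wild_type = [200.0, 220.0, 210.0]
mutant = [20.0, 30.0, 25.0]
t_stat, dof = student_t(wild_type, mutant)
print(round(t_stat, 2), dof)
```

    Turning the statistic into a p-value requires the t distribution, which the standard library does not provide; `scipy.stats.ttest_ind(a, b, equal_var=True)` performs the full test where SciPy is available.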

    Twenty-five genetic functions undetectable in animals.

    <p>(<b>A</b>) Heat map of twenty-seven genes (first twenty-seven rows with gene names) from the budding yeast <i>S</i>. <i>cerevisiae</i> (<i>S</i>. <i>cer</i>.) and their orthologs in related organisms with whole genome sequence assemblies and their closest matches in animals. Six of the twenty-seven yeast genes are the result of lineage-specific duplications in yeast (indicated by dark gray brackets, left of table), leaving twenty-four genetic functions either lost or not measurably conserved in metazoans. The heat map is based on the BLASTP expect values of the best match to the indicated yeast gene (see key, lower right of table). Species abbreviations: <i>S</i>. <i>cer</i>., <i>Saccharomyces cerevisiae</i>; <i>S</i>. <i>pom</i>., <i>Schizosaccharomyces pombe</i>; <i>C</i>. <i>neo</i>., <i>Cryptococcus neoformans</i>; <i>P</i>. <i>gra</i>., <i>Puccinia graminis</i>; <i>B</i>. <i>den</i>., <i>Batrachochytrium dendrobatidis; C</i>. <i>owc</i>., <i>Capsaspora owczarzaki; M</i>. <i>bre</i>., <i>Monosiga brevicollis</i>; <i>S</i>. <i>ros</i>., <i>Salpingoeca rosetta; A</i>. <i>que</i>., <i>Amphimedon queenslandica</i>; <i>T</i>. <i>adh</i>., <i>Trichoplax adhaerens; M</i>. <i>lei</i>., <i>Mnemiopsis leidyi; N</i>. <i>vec</i>., <i>Nematostella vectensis; L</i>. <i>gig</i>., <i>Lottia gigantea; C</i>. <i>tel</i>., <i>Capitella teleta</i>; <i>H</i>. <i>rob</i>., <i>Helobdella robusta</i>; <i>D</i>. <i>mel</i>., <i>Drosophila melanogaster</i>; and <i>H</i>. <i>sap</i>., <i>Homo sapiens</i>. Other abbreviations: Chytridio., Chytridiomycota; Placo., Placozoa; Ctenoph., Ctenophora; Ecdyso., Ecdysozoa; and Chor., Chordata. The last row is from an alternative screen for lost animal genes (see text) in which Amoebozoa PFAM domains that are absent in animals were identified (five genes already identified plus one new gene, <i>PURB</i>). (<b>B</b>) Close-up of the lost animal gene columns from panel A. 
This table shows either the taxonomic origin of a weak animal match, or the name of the gene if it is not directly related (<i>i</i>.<i>e</i>., not orthologous) to the named yeast gene (first column) for all matches with BLASTP expectation values < 1e-20. Most of the animal matches to the candidate lost genes correspond to lineage-specific horizontal gene transfers and/or environmental contaminants from unrelated (non-opisthokont) clades. Some matches correspond to non-orthologous genes such as <i>ANKCLP</i> (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.s002" target="_blank">S2 Fig.</a>) in the case of the eukaryotic clpB genes <i>HSP78</i> and <i>HSP104</i>, or <i>ACO1</i> in the case of <i>LYS4</i> (<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.g003" target="_blank">Fig. 3</a>). Only <i>HIS2</i> might have been lost soon after early animal radiation based on the presence of a weak match in <i>Trichoplax adhaerens</i>. This gene is most similar to the version from <i>Capsaspora</i> and not to either of the two choanoflagellates, so it is of uncertain origin.</p>

    Phylogenetic analyses confirming absence of intact <i>clpB</i> genes in animals.

    <p>(<b>A</b>) Phylogenetic inference using Bayesian MCMC methods shows that the closest matching sequence in known animal genome sequences is a partial ClpB fragment in the sponge <i>Amphimedon queenslandica</i>. This fragment does not contain both nucleotide binding domains (NBDs) seen in bacterial or eukaryotic (Hsp78 and Hsp104) ClpB sequences, which are at least 800 amino acids long. The top three sequence hits in GenBank databases using the <i>Amphimedon</i> ClpB fragment as query are included. Two of these are intact ClpB protein sequences from bacterial (γ-proteobacteria) endosymbionts of marine tubeworms. Topological convergence was achieved after 300,000 generations with a relative burn-in phase of 25% (MrBayes) [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref080" target="_blank">80</a>–<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref082" target="_blank">82</a>]. Mixed models were tested, but WAG [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref083" target="_blank">83</a>] was the sole model favored upon completion. Average standard deviation of split frequencies was < 0.002. The tree was rooted between the bacterial clpB clade, including mitochondrial Hsp78, and the eukaryotic Hsp104 clade. (<b>B</b>) Maximum likelihood phylogenetic analysis of the ClpB family with choanoflagellates and chytridiomycetes (<i>Batrachochytrium dendrobatidis</i> and <i>Gonapodya prolifera</i>) representing fungal lineages shows that Hsp78 is the mitochondrially inherited <i>clpB</i> gene based on its sister grouping with α-proteobacteria. Dikarya sequences were omitted to simplify the tree. Numbers indicate bootstrap support from 500 replicates using MEGA5 [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref085" target="_blank">85</a>]. 
(<b>C</b>) A simple, unrooted, neighbor-joining tree of the highly-conserved <i>clpB</i> genes across all of life shows that the ClpB functions are maintained in all major domains of life and in all endosymbiotic eukaryotic compartments. The ubiquity of the <i>clpB</i> genes makes the loss in animals more striking.</p>

    Yeast <i>HSP78</i> and <i>HSP104</i> share a unique regulatory architecture.

    <p>(<b>A</b>) A dot plot between 320 bp sequence blocks located -10 bp and -65 bp upstream of the <i>S</i>. <i>cerevisiae HSP78</i> (vertical) and <i>HSP104</i> (horizontal) open reading frames, respectively. These coordinates were chosen because they anchor the HSE4 elements at the same position (starting 80 bp from the left) for ease of comparison. This comparison reveals a serial ordered alignment corresponding to an HSE4 element (blue box), a short alignment partially overlapping a triple array of STRE elements (pink boxes), and a TATA-containing core promoter element (green box). (<b>B</b>) Graph of the relevant details of the <i>HSP78</i> (top) and <i>HSP104</i> (bottom) upstream regulatory sequences. Each track (numbered lines 1–6) provides locations of the dot plot motifs (DP-I and DP-II) and known binding motifs (colored boxes in lines 1–5). Labels above certain binding sites indicate binding of a known transcription factor at this position under the specific conditions listed [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref039" target="_blank">39</a>,<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref049" target="_blank">49</a>]. Line 6 shows conserved non-coding sequences (CNS) as determined from comparisons of corresponding orthologs from other species of <i>Saccharomyces</i>. Dots and lines in the ruler mark off 10 bp and 100 bp intervals, respectively. The transcriptional arrows represent only approximate positions of the transcriptional +1 based on distance from the TATA-containing promoter motif. (<b>C</b>) Motif key for panel B. 
(<b>D</b>) Results using the Serial Pattern of Expression Levels Locator (SPELL [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref031" target="_blank">31</a>]) algorithm/query engine to identify the genes most closely co-regulated with <i>HSP78</i> (left graph), <i>HSP104</i> (center graph), or both genes combined (right graph). Plotted are the transcriptional correlation scores (SPELL, Adjusted Correlation Scores). The SPELL scores are based on pair-wise Pearson correlation coefficients with Fisher z-transforms across 9,190 gene expression studies in yeast [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0117192#pone.0117192.ref031" target="_blank">31</a>]. These graphs show that the two <i>clpB</i> genes are co-regulated with one another (red arrows) and with Hsp70/DnaK (K) and J-domain/DnaJ co-chaperones (J), which bind and recruit unfolded proteins to chaperone machinery in the cytoplasm (blue bars) and in the mitochondria (purple bars).</p>
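    The caption above describes scores built from pair-wise Pearson correlation coefficients with Fisher z-transforms. As a minimal sketch of those two steps only, assuming two short hypothetical expression profiles, the calculation looks like this in Python. It does not reproduce SPELL's dataset weighting or score adjustment, and the function names and values are illustrative.

```python
from math import atanh, sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def fisher_z(r):
    """Fisher z-transform: z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return atanh(r)


# Hypothetical expression profiles for two genes across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 1.0, 4.0, 3.0]
r = pearson(gene_a, gene_b)
z = fisher_z(r)
print(round(r, 3), round(z, 3))
```

    The transform makes correlation values approximately normally distributed, which is one common reason to apply it before averaging correlations across many datasets.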