15 research outputs found

    Pattern of Sequence Variation Across 213 Environmental Response Genes

    To promote the clinical and epidemiological studies that improve our understanding of human genetic susceptibility to environmental exposure, the Environmental Genome Project (EGP) has scanned 213 environmental response genes involved in DNA repair, cell cycle regulation, apoptosis, and metabolism for single nucleotide polymorphisms (SNPs). Many of these genes have been implicated by loss-of-function mutations associated with severe diseases attributable to decreased protection of genomic integrity. Therefore, the hypothesis for these studies is that individuals with functionally significant polymorphisms within these genes may be particularly susceptible to genotoxic environmental agents. On average, 20.4 kb of baseline genomic sequence or 86% of each gene, including a substantial amount of introns, all exons, and 1.3 kb upstream and downstream, were scanned for variations in the 90 samples of the Polymorphism Discovery Resource panel. The average nucleotide diversity across the 4.2 Mb of these 213 genes is 6.7 × 10⁻⁴, or one SNP every 1500 bp, when two random chromosomes are compared. The average candidate environmental response gene contains 26 PHASE-inferred haplotypes, 34 common SNPs, 6.2 coding SNPs (cSNPs), and 2.5 nonsynonymous cSNPs. SIFT and PolyPhen analysis of 541 nonsynonymous cSNPs identified 57 potentially deleterious SNPs. An additional eight polymorphisms predict altered protein translation. Because these genes represent 1% of all known human genes, extrapolation from these data predicts the total genomic set of cSNPs, nonsynonymous cSNPs, and potentially deleterious nonsynonymous cSNPs. The implications for the use of these data in direct and indirect association studies of environmentally induced diseases are discussed.
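The arithmetic behind two statements in this abstract can be checked with a short sketch. It converts the quoted nucleotide diversity into an average SNP spacing, then scales the per-gene averages up to the whole genome. The linear scaling (dividing by the 1% fraction) is an assumption made here for illustration; the abstract does not state how the authors extrapolated, and the function names are hypothetical.

```python
# Figures quoted in the abstract; everything else below is illustrative.
NUCLEOTIDE_DIVERSITY = 6.7e-4   # average pairwise differences per site
GENES_SCANNED = 213
CSNPS_PER_GENE = 6.2            # coding SNPs per gene (average)
NONSYN_CSNPS_PER_GENE = 2.5     # nonsynonymous cSNPs per gene (average)
PREDICTED_DELETERIOUS = 57      # flagged by SIFT/PolyPhen among 541 nonsynonymous cSNPs
FRACTION_OF_GENES = 0.01        # "1% of all known human genes"


def bp_per_snp(diversity: float) -> float:
    """Average spacing between differences when two chromosomes are compared."""
    return 1.0 / diversity


def scale_to_genome(count_in_sample: float, fraction: float) -> float:
    """ASSUMPTION: simple linear scaling from the sampled genes to all genes."""
    return count_in_sample / fraction


spacing = bp_per_snp(NUCLEOTIDE_DIVERSITY)
total_csnps = scale_to_genome(CSNPS_PER_GENE * GENES_SCANNED, FRACTION_OF_GENES)
total_nonsyn = scale_to_genome(NONSYN_CSNPS_PER_GENE * GENES_SCANNED, FRACTION_OF_GENES)
total_deleterious = scale_to_genome(PREDICTED_DELETERIOUS, FRACTION_OF_GENES)

print(f"one difference every {spacing:.0f} bp")
print(f"cSNPs genome-wide (linear scaling): {total_csnps:.0f}")
print(f"nonsynonymous cSNPs genome-wide (linear scaling): {total_nonsyn:.0f}")
print(f"potentially deleterious genome-wide (linear scaling): {total_deleterious:.0f}")
```

The spacing comes out just under 1500 bp, consistent with the rounded figure in the abstract. The genome-wide totals are only as good as the linear-scaling assumption and should not be read as the paper's reported estimates.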

    Retention of strain HS orthologs in <i>S. glossinidius</i> and SOPE according to COG functional category.

    <p>The dark shaded component of each bar refers to intact genes retained in both <i>S. glossinidius</i> and SOPE. The intermediate shaded component refers to intact genes retained in only <i>S. glossinidius</i> (upper bar) or SOPE (lower bar) and the lighter shaded component refers to genes that are either absent or disrupted in both <i>S. glossinidius</i> and SOPE. The COG categories are organized in five larger groups with red representing genes involved in information storage and processing, blue representing genes involved in cellular processes and signaling, black representing genes involved in metabolism, green representing genes with poorly characterized functions, and yellow representing components of phages and IS-elements.</p>

    Allelic spectrum of pseudogene mutations in strain HS orthologs found in SOPE and <i>S. glossinidius</i>.

    <p>Numbers in parentheses indicate the number of pseudogenes of strain HS orthologs found in the SOPE or <i>S. glossinidius</i> genome sequences. Nonsense mutations are classified as point mutations that introduce a premature stop codon in the reading frame of a strain HS ortholog, independent of the presence of any frameshift resulting from an indel.</p>

    Alignment between strain HS contigs (top) and chromosomes of SOPE (left) and <i>S. glossinidius</i> (right).

    <p>The draft strain HS contigs are depicted in an arbitrary color scheme (outer top ring). Contigs sharing less than 5 kb of synteny with either the SOPE or <i>S. glossinidius</i> genome are uncolored. The uppermost plot (colored in purple and orange) depicts G+C skew, based on a 40 kb sliding window. For upper tracks, grey bars depict genes unique to strain HS whereas green bars depict strain HS genes that share orthologs with the aligned symbiont chromosome. For lower tracks, green and red bars represent (respectively) intact and disrupted orthologs of strain HS genes in the insect symbiont genomes, whereas blue bars highlight prophage and IS-element sequences in the insect symbiont chromosomes. Plots of pairwise nucleotide sequence identity are shown in the lower alignment following <i>in silico</i> removal of prophage and IS-elements from the SOPE and <i>S. glossinidius</i> sequences. Consensus <i>oriC</i> and <i>dif</i> sequences are labeled to indicate putative origins and termini of chromosome replication.</p>

    Base composition bias and mutation rates observed in pairwise comparisons between strain HS, <i>S. glossinidius</i> and SOPE.

    <p>The evolutionary relationships between SOPE, strain HS and <i>S. glossinidius</i> are depicted by bold lines drawn to scale in accordance with levels of genome-wide divergence at 4-fold degenerate (GC4) sites. Upper boxes show genome-wide GC-percentages at 2<sup>nd</sup> codon position (GC2), GC4 and intergenic (GCI) sites. Lower boxes depict the number of substitutions per site for intact genes (dGC2 and dGC4) and pseudogenes (dGC2<sub>Ψ</sub> and dGC4<sub>Ψ</sub>). The data were obtained from pairwise analysis of point mutations in 1,355 intact genes and 1,376 pseudogenes shared between strain HS and <i>S. glossinidius</i>, and 1,414 intact genes and 1,194 pseudogenes shared between strain HS and SOPE.</p>

    Densities of disrupting mutations in SOPE and <i>S. glossinidius</i> pseudogenes.

    <p>The numbers of frameshifting and truncating indels and nonsense mutations were computed from alignments of strain HS, SOPE and <i>S. glossinidius</i> orthologs. Mutation densities were computed according to the original strain HS ORF sizes (left) or the current SOPE or <i>S. glossinidius</i> pseudogene sizes (right).</p>

    General features of the strain HS, SOPE, and <i>S. glossinidius</i> genome sequences.

    <p>The chromosome size of strain HS is estimated based on the combined size of non-redundant contigs in the draft sequence assembly. The number in parentheses indicates the total number of candidate genes identified in strain HS, including representatives that are fragmented in the current assembly.</p><p><sup>a</sup> Estimated based on the current draft assembly.</p><p><sup>b</sup> Total number of genes identified in the strain HS draft genome. Genes containing gaps in the draft assembly were excluded from all comparative analyses.</p>

    Numbers of cryptic pseudogenes in <i>S. glossinidius</i> and SOPE estimated using a Monte Carlo simulation.

    <p>The simulation was repeated with an increasing number of candidate pseudogenes until estimates of pseudogene number (red) and the size difference between pseudogenes and intact genes (blue) matched empirical values shown in <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002990#pgen-1002990-g005" target="_blank">Figure 5</a> and <a href="http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1002990#pgen-1002990-t001" target="_blank">Table 1</a>, as highlighted by bold bars. The densities of disrupting mutations in <i>S. glossinidius</i> and SOPE pseudogenes (which include cryptic pseudogenes) are shown in the upper left inset, corresponding to the data points highlighted in bold.</p>

    Phylogeny of strain HS and related <i>Sodalis</i>-allied endosymbionts and free-living bacteria based on maximum likelihood analyses of a 1.46 kb fragment of 16S rRNA and a 1.68 kb fragment of <i>groEL</i>.

    <p>Insect endosymbionts that do not have proper nomenclature are designated by the prefix “E”, followed by the name of their insect host. The numbers adjacent to nodes indicate maximum likelihood bootstrap values, shown for nodes with bootstrap support >70%.</p>