
    A Selective Force Favoring Increased G+C Content in Bacterial Genes

    Bacteria display considerable variation in their overall base compositions, which range from 13% to over 75% G+C. This variation in genomic base compositions has long been considered to be a strictly neutral character, due solely to differences in the mutational process; however, recent sequence comparisons indicate that mutational input alone cannot produce the observed base compositions, implying a role for natural selection. Because bacterial genomes have high gene content, forces that operate on the base composition of individual genes could help shape the overall genomic base composition. To explore this possibility, we tested whether genes that encode the same protein but vary only in their base compositions at synonymous sites have effects on bacterial fitness. Escherichia coli strains harboring G+C-rich versions of genes display higher growth rates, indicating that despite a pervasive mutational bias toward A+T, a selective force, independent of adaptive codon use, is driving genes toward higher G+C contents
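
The comparison described above rests on measuring base composition at synonymous sites. The following is a minimal sketch of that measurement, not the authors' procedure: it assumes that G+C at third codon positions (GC3) is an acceptable proxy for synonymous-site composition, and the function names `gc_content` and `gc3` are hypothetical.

```python
def gc_content(seq):
    """Fraction of G or C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3(cds):
    """G+C fraction at third codon positions of an in-frame coding sequence.

    Assumption: third positions stand in for synonymous sites.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return gc_content(cds[2::3])


# Two illustrative sequences that both encode Ala-Lys-Leu
# and differ only at third codon positions.
at_rich = "GCAAAATTA"   # GCA AAA TTA
gc_rich = "GCGAAGCTG"   # GCG AAG CTG

print(gc_content(at_rich), gc3(at_rich))
print(gc_content(gc_rich), gc3(gc_rich))
```

The two toy sequences encode the same peptide, so any difference in `gc3` reflects synonymous changes only, which is the kind of contrast the abstract describes.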

    What Is a Microsatellite: A Computational and Experimental Definition Based upon Repeat Mutational Behavior at A/T and GT/AC Repeats

    Microsatellites are abundant in eukaryotic genomes and have high rates of strand slippage-induced repeat number alterations. They are popular genetic markers, and their mutations are associated with numerous neurological diseases. However, the minimal number of repeats required to constitute a microsatellite has been debated, and a definition of a microsatellite that considers its mutational behavior has been lacking. To define a microsatellite, we investigated slippage dynamics for a range of repeat sizes, utilizing two approaches. Computationally, we assessed length polymorphism at repeat loci in ten ENCODE regions resequenced in four human populations, assuming that the occurrence of polymorphism reflects strand slippage rates. Experimentally, we determined the in vitro DNA polymerase-mediated strand slippage error rates as a function of repeat number. In both approaches, we compared strand slippage rates at tandem repeats with the background slippage rates. We observed two distinct modes of mutational behavior. At small repeat numbers, slippage rates were low and indistinguishable from background measurements. A marked transition in mutability was observed as the repeat array lengthened, such that slippage rates at large repeat numbers were significantly higher than the background rates. For both mononucleotide and dinucleotide microsatellites studied, the transition length corresponded to a similar number of nucleotides (approximately 10). Thus, microsatellite threshold is determined not by the presence/absence of strand slippage at repeats but by an abrupt alteration in slippage rates relative to background. These findings have implications for understanding microsatellite mutagenesis, standardization of genome-wide microsatellite analyses, and predicting polymorphism levels of individual microsatellite loci
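
The length threshold reported above (approximately 10 nucleotides for both mono- and dinucleotide repeats) can be illustrated with a short scan for tandem repeats. This is a sketch under stated assumptions, not the study's pipeline: the regular-expression search, the `find_repeats` name, and the exact cutoff constant are all illustrative choices.

```python
import re

# Approximate transition length taken from the abstract; the exact value is an assumption.
THRESHOLD_NT = 10


def find_repeats(seq, motif_sizes=(1, 2), min_units=2):
    """Find perfect tandem repeats and flag those at or above THRESHOLD_NT.

    Returns a sorted list of tuples:
    (start, motif, repeat_units, length_nt, at_or_above_threshold).
    """
    seq = seq.upper()
    hits = []
    for k in motif_sizes:
        # A motif of k bases followed by at least (min_units - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_units - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if k > 1 and len(set(motif)) == 1:
                continue  # e.g. "AA" is a mononucleotide run, not a dinucleotide repeat
            length = len(match.group(0))
            hits.append((match.start(), motif, length // k, length, length >= THRESHOLD_NT))
    return sorted(hits)


example = "CG" + "A" * 12 + "TC" + "GT" * 3 + "C"
for hit in find_repeats(example):
    print(hit)
```

In this toy sequence the 12-nucleotide A run is flagged as exceeding the threshold, while the 6-nucleotide GT repeat is not, mirroring the two modes of mutational behavior described in the abstract.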

    Causes and Consequences of Genome Expansion in Fungi

    Fungi display a large diversity in genome size and complexity, variation that is often considered to be adaptive. But because nonadaptive processes can also have important consequences on the features of genomes, we investigated the relationship of genetic drift and genome size in the phylum Ascomycota using multiple indicators of genetic drift. We detected a complex relationship between genetic drift and genome size in fungi: genetic drift is associated with genome expansion on broad evolutionary timescales, as hypothesized for other eukaryotes; but within subphyla over smaller timescales, the opposite trend is observed. Moreover, fungi and bacteria display similar patterns of genome degradation that are associated with initial effects of genetic drift. We conclude that changes in genome size within Ascomycota have occurred using two different routes: large-scale genome expansions are catalyzed by increasing drift as predicted by the mutation-hazard model of genome evolution and small-scale modifications in genome size are independent of drift

    The genome-wide determinants of human and chimpanzee microsatellite evolution

    Mutation rates of microsatellites vary greatly among loci. The causes of this heterogeneity remain largely enigmatic yet are crucial for understanding numerous human neurological diseases and genetic instability in cancer. In this first genome-wide study, the relative contributions of intrinsic features and regional genomic factors to the variation in mutability among orthologous human–chimpanzee microsatellites are investigated with resampling and regression techniques. As a result, we uncover the intricacies of microsatellite mutagenesis as follows. First, intrinsic features (repeat number, length, and motif size), which all influence the probability and rate of slippage, are the strongest predictors of mutability. Second, mutability increases nonuniformly with length, suggesting that processes additional to slippage, such as faulty repair, contribute to mutations. Third, mutability varies among microsatellites with different motif composition likely due to dissimilarities in secondary DNA structure formed by their slippage intermediates. Fourth, mutability of mononucleotide microsatellites is impacted by their location on sex chromosomes vs. autosomes and inside vs. outside of Alu repeats, the former confirming the importance of replication and the latter suggesting a role for gene conversion. Fifth, transcription status and location in a particular isochore do not influence microsatellite mutability. Sixth, compared with intrinsic features, regional genomic factors have only minor effects. Finally, our regression models explain ∼90% of variation in microsatellite mutability and can generate useful predictions for the studies of human diseases, forensics, and conservation genetics

    A matter of life or death: How microsatellites emerge in and vanish from the human genome

    Microsatellites—tandem repeats of short DNA motifs—are abundant in the human genome and have high mutation rates. While microsatellite instability is implicated in numerous genetic diseases, the molecular processes involved in their emergence and disappearance are still not well understood. Microsatellites are hypothesized to follow a life cycle, wherein they are born and expand into adulthood, until their degradation and death. Here we identified microsatellite births/deaths in human, chimpanzee, and orangutan genomes, using macaque and marmoset as outgroups. We inferred mutations causing births/deaths based on parsimony, and investigated local genomic environments affecting them. We also studied birth/death patterns within transposable elements (Alus and L1s), coding regions, and disease-associated loci. We observed that substitutions were the predominant cause for births of short microsatellites, while insertions and deletions were important for births of longer microsatellites. Substitutions were the cause for deaths of microsatellites of virtually all lengths. AT-rich L1 sequences exhibited elevated frequency of births/deaths over their entire length, while GC-rich Alus only in their 3′ poly(A) tails and middle A-stretches, with differences depending on transposable element integration timing. Births/deaths were strongly selected against in coding regions. Births/deaths occurred in genomic regions with high substitution rates, protomicrosatellite content, and L1 density, but low GC content and Alu density. The majority of the 17 disease-associated microsatellites examined are evolutionarily ancient (were acquired by the common ancestor of simians). Our genome-wide investigation of microsatellite life cycle has fundamental applications for predicting the susceptibility of birth/death of microsatellites, including many disease-causing loci

    High-dimensional linear state space models for dynamic microbial interaction networks.

    Medical researchers are increasingly interested in how the complex community of micro-organisms living on the human body affects human health. Key to this is understanding how the microbes interact with each other. Time-course studies of the human microbiome indicate that the composition of the microbiome changes over short time periods, primarily as a consequence of synergistic and antagonistic interactions of the members of the microbiome with each other and with the environment. Knowledge of the abundance of bacteria (the predominant members of the human microbiome) in such time-course studies, along with appropriate mathematical models, will allow us to identify key dynamic interaction networks within the microbiome. However, the high-dimensional nature of these data poses significant challenges to the development of such mathematical models. We propose a high-dimensional linear State Space Model (SSM) with a new Expectation-Regularization-Maximization (ERM) algorithm to construct a dynamic Microbial Interaction Network (MIN). System noise and measurement noise can be separately specified through SSMs. To deal with the high-dimensional parameter space of the SSMs, the proposed ERM algorithm employs the adaptive LASSO-based variable selection method so that the sparsity of MINs can be preserved. We performed simulation studies to evaluate the proposed ERM algorithm for variable selection. The proposed method is applied to identify the dynamic MIN from a time-course vaginal microbiome study of women. This method is amenable to future developments, which may include interactions between microbes and the environment.

    Genome Report: Whole Genome Sequence and Annotation of the Parasitoid Jewel Wasp Nasonia giraulti Laboratory Strain RV2X[u]

    Jewel wasps in the genus Nasonia are parasitoids with haplodiploid sex determination and rapid development, and they are easy to culture in the laboratory. They are excellent models for insect genetics, genomics, epigenetics, development, and evolution. Nasonia vitripennis (Nv) and N. giraulti (Ng) are closely related species that can be intercrossed, particularly after removal of the intracellular bacterium Wolbachia, making them a powerful tool for mapping and positionally cloning morphological, behavioral, expression and methylation phenotypes. The Nv reference genome was assembled using Sanger, PacBio and Nanopore approaches and annotated with extensive RNA-seq data. In contrast, the Ng genome is only available through low-coverage resequencing. Therefore, a de novo Ng assembly is urgently needed to advance this system. In this study, we report a high-quality Ng assembly using 10X Genomics linked-reads with 670X sequencing depth. The current assembly has a genome size of 259,040,977 bp in 3,160 scaffolds with 38.05% G-C and a 98.6% BUSCO completeness score. 97% of the RNA reads are perfectly aligned to the genome, indicating high quality in contiguity and completeness. A total of 14,777 genes are annotated in the Ng genome, and 72% of the annotated genes have a one-to-one ortholog in the Nv genome. We report 5 million Ng-Nv SNPs, which will facilitate mapping and population genomic studies in Nasonia. In addition, 42 Ng-specific genes were identified by comparison with the Nv genome and annotation. This is the first de novo assembly for this important species in the Nasonia model system, providing a useful new genomic toolkit.

    Genome and Ontogenetic-Based Transcriptomic Analyses of the Flesh Fly, Sarcophaga bullata

    The flesh fly, Sarcophaga bullata, is a widely-used model for examining the physiology of insect diapause, development, stress tolerance, neurobiology, and host-parasitoid interactions. Flies in this taxon are implicated in myiasis (larval infection of vertebrates) and feed on carrion, aspects that are important in forensic studies. Here we present the genome of S. bullata, along with developmental- and reproduction-based RNA-Seq analyses. We predict 15,768 protein coding genes, identify orthology in relation to closely related flies, and establish sex and developmental-specific gene sets based on our RNA-Seq analyses. Genomic sequences, predicted genes, and sequencing data sets have been deposited at the National Center for Biotechnology Information. Our results provide groundwork for genomic studies that will expand the flesh fly’s utility as a model system