21 research outputs found

    Gene expression in diapausing rotifer eggs in response to divergent environmental predictability regimes

    In unpredictable environments in which reliable cues for predicting environmental variation are lacking, a diversifying bet-hedging strategy for diapause exit is expected to evolve, whereby only a portion of diapausing forms will resume development at the first occurrence of suitable conditions. This study focused on diapause termination in the rotifer Brachionus plicatilis s.s., addressing the transcriptional profile of diapausing eggs from environments differing in the level of predictability and the relationship of such profiles with hatching patterns. RNA-Seq analyses revealed significant differences in gene expression between diapausing eggs produced in the laboratory under combinations of two contrasting selective regimes of environmental fluctuation (predictable vs unpredictable) and two different diapause conditions (passing or not passing through forced diapause). The results showed that the selective regime was more important than the diapause condition in driving differences in the transcriptome profile. Most of the differentially expressed genes were upregulated in the predictable regime and mostly associated with molecular functions involved in embryo morphological development and hatching readiness. This was in concordance with observations of earlier, higher, and more synchronous hatching in diapausing eggs produced under the predictable regime

    A genome-wide view of Caenorhabditis elegans base-substitution mutation processes

    Knowledge of mutation processes is central to understanding virtually all evolutionary phenomena and the underlying nature of genetic disorders and cancers. However, the limitations of standard molecular mutation detection methods have historically precluded a genome-wide understanding of mutation rates and spectra in the nuclear genomes of multicellular organisms. We applied two high-throughput DNA sequencing technologies to identify and characterize hundreds of spontaneously arising base-substitution mutations in 10 Caenorhabditis elegans mutation-accumulation (MA)-line nuclear genomes. C. elegans mutation rate estimates were similar to previous calculations based on smaller numbers of mutations. Mutations were distributed uniformly within and among chromosomes and were not associated with recombination rate variation in the MA lines, suggesting that intragenomic variation in genetic hitchhiking and/or background selection are primarily responsible for the chromosomal distribution patterns of polymorphic nucleotides in C. elegans natural populations. A strong mutational bias from G/C to A/T nucleotides was detected in the MA lines, implicating oxidative DNA damage as a major endogenous mutagenic force in C. elegans. The observed mutational bias also suggests that the C. elegans nuclear genome cannot be at equilibrium because of mutation alone. Transversions dominate the spectrum of spontaneous mutations observed here, whereas transitions dominate patterns of allegedly neutral polymorphism in natural populations of C. elegans and many other animal species; this observation challenges the assumption that natural patterns of molecular variation in noncoding regions of the nuclear genome accurately reflect underlying mutation processes
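
The transition/transversion and G/C to A/T comparisons described in this abstract can be illustrated with a minimal sketch. The function names and the toy mutation list are assumptions made for illustration only; they are not the authors' pipeline or data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a base substitution: {ref}>{alt}")
    # A transition keeps the chemical class (purine<->purine or
    # pyrimidine<->pyrimidine); anything else is a transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def is_gc_to_at(ref, alt):
    """True when a G or C reference base is replaced by A or T."""
    return ref.upper() in {"G", "C"} and alt.upper() in {"A", "T"}


def is_at_to_gc(ref, alt):
    """True when an A or T reference base is replaced by G or C."""
    return ref.upper() in {"A", "T"} and alt.upper() in {"G", "C"}


def summarize(mutations):
    """Count substitution classes in a list of (ref, alt) pairs."""
    ts = sum(1 for r, a in mutations if substitution_class(r, a) == "transition")
    return {
        "transitions": ts,
        "transversions": len(mutations) - ts,
        "gc_to_at": sum(1 for r, a in mutations if is_gc_to_at(r, a)),
        "at_to_gc": sum(1 for r, a in mutations if is_at_to_gc(r, a)),
    }


# Made-up example input, not data from the study.
toy = [("G", "A"), ("G", "T"), ("C", "A"), ("A", "G"), ("C", "G")]
summary = summarize(toy)
```

Comparing `gc_to_at` with `at_to_gc` in such a summary is one simple way to look for the kind of directional bias the abstract reports.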

    On the power and the systematic biases of the detection of chromosomal inversions by paired-end genome sequencing

    One of the most widely used techniques to study structural variation at a genome level is paired-end mapping (PEM). PEM has the advantage of being able to detect balanced events, such as inversions and translocations. However, inversions are still quite difficult to predict reliably, especially from high-throughput sequencing data. We simulated realistic PEM experiments with different combinations of read and library fragment lengths, including sequencing errors and meaningful base-qualities, to quantify and track down the origin of false positives and negatives along sequencing, mapping, and downstream analysis. We show that PEM is well suited to detecting a wide range of inversions, even with low coverage data. However, % of inversions located between segmental duplications are expected to go undetected by the most common sequencing strategies. In general, longer DNA libraries improve the detectability of inversions far better than increases in coverage depth or read length. Finally, we review the performance of three algorithms to detect inversions (SVDetect, GRIAL, and VariationHunter), identify common pitfalls, and reveal important differences in their breakpoint precisions. These results stress the importance of the sequencing strategy for the detection of structural variants, especially inversions, and offer guidelines for the design of future genome sequencing projects
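
As a rough illustration of how paired-end mapping flags inversions, the sketch below labels a single mapped read pair by the orientation and spacing of its ends. It assumes a forward-reverse library in which concordant pairs face each other; the function name, the labels, and the span thresholds are hypothetical and do not reproduce SVDetect, GRIAL, or VariationHunter.

```python
def classify_pair(chrom1, pos1, strand1, chrom2, pos2, strand2,
                  min_span=150, max_span=650):
    """Label one mapped read pair; strands are '+' or '-'.

    min_span and max_span are assumed bounds on the distance between the
    leftmost mapped positions of the two ends (a simplification of the
    true template length).
    """
    if chrom1 != chrom2:
        return "interchromosomal"
    if strand1 == strand2:
        # Both ends on the same strand: the signature expected when a pair
        # spans an inversion breakpoint in a forward-reverse library.
        return "inversion_candidate"
    # Order the two ends by mapped position.
    (left_pos, left_strand), (right_pos, _) = sorted(
        [(pos1, strand1), (pos2, strand2)]
    )
    if left_strand == "-":
        # Ends point away from each other instead of towards each other.
        return "everted"
    span = right_pos - left_pos
    if span < min_span:
        return "span_too_short"
    if span > max_span:
        return "span_too_long"
    return "concordant"
```

A real caller would also cluster several discordant pairs before reporting an event, and would need to handle the mismapping inside repeats that the abstract identifies as a main source of missed inversions.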

    Functional impact and evolution of a novel human polymorphic inversion that disrupts a gene and creates a fusion transcript

    Since the discovery of chromosomal inversions almost 100 years ago, how they are maintained in natural populations has been a highly debated issue. One of the hypotheses is that inversion breakpoints could affect genes and modify gene expression levels, although evidence of this came only from laboratory mutants. In humans, a few inversions have been shown to be associated with expression differences, but in all cases the molecular causes have remained elusive. Here, we have carried out a complete characterization of a new human polymorphic inversion and determined that it is specific to East Asian populations. In addition, we demonstrate that it disrupts the ZNF257 gene and, through the translocation of the first exon and regulatory sequences, creates a previously nonexistent fusion transcript, which together are associated with expression changes in several other genes. Finally, we investigate the potential evolutionary and phenotypic consequences of the inversion, and suggest that it is probably deleterious. This is therefore the first example of a natural polymorphic inversion that has position effects and creates a new chimeric gene, helping to answer an old question in evolutionary biology

    Population genetic analysis of bi-allelic structural variants from low-coverage sequence data with an expectation-maximization algorithm

    Background: Population genetics and association studies usually rely on a set of known variable sites that are then genotyped in subsequent samples, because it is easier to genotype than to discover the variation. This is also true for structural variation detected from sequence data. However, the genotypes at known variable sites can only be inferred with uncertainty from low coverage data. Thus, statistical approaches that infer genotype likelihoods, test hypotheses, and estimate population parameters without requiring accurate genotypes are becoming popular. Unfortunately, the current implementations of these methods are intended to analyse only single nucleotide and short indel variation, and they usually assume that the two alleles in a heterozygous individual are sampled with equal probability. This is generally false for structural variants detected with paired ends or split reads. Therefore, the population genetics of structural variants cannot be studied, unless a painstaking and potentially biased genotyping is performed first. Results: We present svgem, an expectation-maximization implementation to estimate allele and genotype frequencies, calculate genotype posterior probabilities, and test for Hardy-Weinberg equilibrium and for population differences, from the numbers of times the alleles are observed in each individual. Although applicable to single nucleotide variation, it aims at bi-allelic structural variation of any type, observed by either split reads or paired ends, with arbitrarily high allele sampling bias. We test svgem with simulated and real data from the 1000 Genomes Project. Conclusions: svgem makes it possible to use low-coverage sequencing data to study the population distribution of structural variants without having to know their genotypes. Furthermore, this advance allows the combined analysis of structural and nucleotide variation within the same genotype-free statistical framework, thus preventing biases introduced by genotype imputation
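
The general idea of estimating genotype frequencies by expectation-maximization from per-individual allele counts can be sketched as follows. This is a generic textbook-style EM written for illustration, not svgem's actual implementation; the binomial read model and the parameter names `het_alt_prob` (the probability of sampling the alternative allele in a heterozygote, 0.5 when there is no sampling bias) and `error` are assumptions.

```python
from math import comb


def genotype_likelihoods(n_ref, n_alt, het_alt_prob=0.5, error=0.01):
    """Binomial likelihood of the observed counts under each genotype.

    Genotypes are indexed by the number of alternative alleles (0, 1, 2).
    """
    n = n_ref + n_alt
    alt_probs = (error, het_alt_prob, 1.0 - error)
    return [comb(n, n_alt) * q ** n_alt * (1.0 - q) ** n_ref for q in alt_probs]


def em_genotype_frequencies(counts, het_alt_prob=0.5, error=0.01,
                            max_iter=500, tol=1e-10):
    """Estimate genotype and allele frequencies from (n_ref, n_alt) counts."""
    liks = [genotype_likelihoods(r, a, het_alt_prob, error) for r, a in counts]
    freqs = [1.0 / 3.0] * 3  # uninformative starting point
    for _ in range(max_iter):
        totals = [0.0, 0.0, 0.0]
        # E-step: posterior genotype probabilities for each individual.
        for lik in liks:
            weighted = [f * l for f, l in zip(freqs, lik)]
            norm = sum(weighted)
            for g in range(3):
                totals[g] += weighted[g] / norm
        # M-step: new frequencies are the mean posteriors.
        new_freqs = [t / len(liks) for t in totals]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    allele_freq = freqs[1] / 2.0 + freqs[2]
    return freqs, allele_freq


# Made-up counts: every individual shows 8 reference and 2 alternative reads.
biased_counts = [(8, 2)] * 20
freqs_biased, p_biased = em_genotype_frequencies(biased_counts, het_alt_prob=0.2)
```

With `het_alt_prob=0.2` the 8:2 counts are best explained by heterozygotes, whereas under the equal-sampling assumption the same data would look less clearly heterozygous, which is the kind of bias the abstract says standard tools ignore.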

    Inversion between the reference and the target genomes.

    The breakpoints (dashed lines) are located inside inverted repeats (red or orange arrows). Four pairs of reads that span the breakpoints are depicted in blue, with their sequenced ends in opposite orientations. Yellow bands indicate the correct mappings in the reference genome of ends located in unique sequences. The reads sequenced from a repeat are erroneously mapped to the alternative copy (pink bands), because concordant alignments are favored by the aligner. The mapped reads at the bottom are displayed in dark blue if correctly mapped or in light blue otherwise. The only discordant pair of reads that report the inversion is shown in green.

    Relationship between physical coverage and the expected sensitivity of different sequencing strategies to detect inversions.

    The expected sensitivity is based on the probability of correctly mapping paired ends across inversion breakpoints in four different sequence contexts. Inversions are assumed to be longer than the templates. The sequencing strategy is defined by the read length: dotted lines, 36 bp; dashed lines, 75 bp; solid lines, 150 bp; and by the template length: green, 250 bp; blue, 450 bp; purple, 2.5 kb; red, 10 kb; and black, 40 kb. Notice the different ranges of physical coverage among plots.
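
For readers unfamiliar with the term, the sketch below contrasts physical coverage with sequence coverage using their usual definitions (templates versus sequenced bases per genome position). The function names and example numbers are illustrative assumptions, and the spanning-pair estimate ignores mappability, which is the factor this figure is actually about.

```python
def physical_coverage(n_pairs, template_len, genome_len):
    """Average number of templates (fragments) covering a genome position."""
    return n_pairs * template_len / genome_len


def sequence_coverage(n_pairs, read_len, genome_len):
    """Average number of sequenced bases per position (two reads per pair)."""
    return 2 * n_pairs * read_len / genome_len


def expected_spanning_pairs(n_pairs, template_len, read_len, genome_len):
    """Expected pairs whose unsequenced insert contains a given breakpoint.

    Assumes templates start uniformly at random along the genome and that
    every read maps correctly.
    """
    inner = max(template_len - 2 * read_len, 0)
    return n_pairs * inner / genome_len


# Hypothetical run: one million pairs on a 100 Mb genome.
short_lib = expected_spanning_pairs(1_000_000, 450, 75, 100_000_000)
long_lib = expected_spanning_pairs(1_000_000, 2_500, 75, 100_000_000)
```

Under these simplifying assumptions, moving from a 450 bp to a 2.5 kb template raises the expected number of breakpoint-spanning pairs several-fold at the same number of reads, which is consistent with the emphasis on template length in the caption.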

    Precision of breakpoint prediction plotted against the length of the template.

    The average size of the predicted range of a breakpoint is represented separately for inversions smaller (left) or larger (right) than the template. Colors correspond to the programs used to predict the breakpoints: green, VariationHunter; blue, SVDetect; and red, GRIAL. The dashed lines correspond to the theoretical expected precisions, obtained either from equation 3 in reference [18] (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0061292#pone.0061292-Bashir1) for large inversions, or from the average difference between the inversion size and the template length for small inversions.

    Percentage of inversion breakpoints from each sequence context that are successfully detected by different programs.

    Results from SVDetect (SVD, upper row), VariationHunter (VH, middle row), or GRIAL (bottom row) are plotted against the template length used. Colors correspond to the length of the reads: green, 36 bp; blue, 75 bp; and red, 150 bp.