51 research outputs found

    Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

    Background: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However these methods are time-consuming and have potential drawbacks. High throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results: We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis). Conclusion: This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.

    Sympatric ecological speciation meets pyrosequencing: sampling the transcriptome of the apple maggot Rhagoletis pomonella

    Background: The full power of modern genetics has been applied to the study of speciation in only a small handful of genetic model species - all of which speciated allopatrically. Here we report the first large expressed sequence tag (EST) study of a candidate for ecological sympatric speciation, the apple maggot Rhagoletis pomonella, using massively parallel pyrosequencing on the Roche 454-FLX platform. To maximize transcript diversity we created and sequenced separate libraries from larvae, pupae, adult heads, and headless adult bodies. Results: We obtained 239,531 sequences which assembled into 24,373 contigs. A total of 6810 unique protein coding genes were identified among the contigs and long singletons, corresponding to 48% of all known Drosophila melanogaster protein-coding genes. Their distribution across GO classes suggests that we have obtained a representative sample of the transcriptome. Among these sequences are many candidates for potential R. pomonella speciation genes (or barrier genes) such as those controlling chemosensory and life-history timing processes. Furthermore, we identified important marker loci including more than 40,000 single nucleotide polymorphisms (SNPs) and over 100 microsatellites. An initial search for SNPs at which the apple and hawthorn host races differ suggested at least 75 loci warranting further work. We also determined that developmental expression differences remained even after normalization; transcripts expected to show different expression levels between larvae and pupae in D. melanogaster also did so in R. pomonella. Preliminary comparative analysis of transcript presences and absences revealed evidence of gene loss in Drosophila and gain in the higher dipteran clade Schizophora. Conclusions: These data provide a much needed resource for exploring mechanisms of divergence in this important model for sympatric ecological speciation. Our description of ESTs from a substantial portion of the R. pomonella transcriptome will facilitate future functional studies of candidate genes for olfaction and diapause-related life history timing, and will enable large scale expression studies. Similarly, the identification of new SNP and microsatellite markers will facilitate future population and quantitative genetic studies of divergence between the apple and hawthorn-infesting host races.

    Transient genome-wide interactions of the master transcription factor NLP7 initiate a rapid nitrogen-response cascade

    Dynamic reprogramming of gene regulatory networks (GRNs) enables organisms to rapidly respond to environmental perturbation. However, the underlying transient interactions between transcription factors (TFs) and genome-wide targets typically elude biochemical detection. Here, we capture both stable and transient TF-target interactions genome-wide within minutes after controlled TF nuclear import using time-series chromatin immunoprecipitation (ChIP-seq) and/or DNA adenine methyltransferase identification (DamID-seq). The transient TF-target interactions captured uncover the early mode-of-action of NIN-LIKE PROTEIN 7 (NLP7), a master regulator of the nitrogen signaling pathway in plants. These transient NLP7 targets captured in root cells using temporal TF perturbation account for 50% of NLP7-regulated genes not detectably bound by NLP7 in planta. Rapid and transient NLP7 binding activates early nitrogen response TFs, which we validate to amplify the NLP7-initiated transcriptional cascade. Our approaches to capture transient TF-target interactions genome-wide can be applied to validate dynamic GRN models for any pathway or organism of interest. Conventional methods cannot reveal transient interactions between transcription factors (TFs) and their targets. Here, Alvarez et al. capture both stable and transient TF-target interactions by time-series ChIP-seq and/or DamID-seq in a cell-based TF perturbation system and show that NLP7 acts as a master TF that initiates a rapid nitrogen-response cascade.

    A framework genetic map for Miscanthus sinensis from RNAseq-based markers shows recent tetraploidy

    Background: Miscanthus (subtribe Saccharinae, tribe Andropogoneae, family Poaceae) is a genus of temperate perennial C4 grasses whose high biomass production makes it, along with its close relatives sugarcane and sorghum, attractive as a biofuel feedstock. The base chromosome number of Miscanthus (x = 19) is different from that of other Saccharinae and approximately twice that of the related Sorghum bicolor (x = 10), suggesting large-scale duplications may have occurred in recent ancestors of Miscanthus. Owing to the complexity of the Miscanthus genome and the complications of self-incompatibility, a complete genetic map with a high density of markers has not yet been developed. Results: We used deep transcriptome sequencing (RNAseq) from two M. sinensis accessions to define 1536 single nucleotide variants (SNVs) for a GoldenGate™ genotyping array, and found that simple sequence repeat (SSR) markers defined in sugarcane are often informative in M. sinensis. A total of 658 SNP and 210 SSR markers were validated via segregation in a full sibling F1 mapping population. Using 221 progeny from this mapping population, we constructed a genetic map for M. sinensis that resolves into 19 linkage groups, the haploid chromosome number expected from cytological evidence. Comparative genomic analysis documents a genome-wide duplication in Miscanthus relative to Sorghum bicolor, with subsequent insertional fusion of a pair of chromosomes. The utility of the map is confirmed by the identification of two paralogous C4-pyruvate, phosphate dikinase (C4-PPDK) loci in Miscanthus, at positions syntenic to the single orthologous gene in Sorghum. Conclusions: The genus Miscanthus experienced an ancestral tetraploidy and chromosome fusion prior to its diversification, but after its divergence from the closely related sugarcane clade. The recent timing of this tetraploidy complicates discovery and mapping of genetic markers for Miscanthus species, since alleles and fixed differences between paralogs are comparable. These difficulties can be overcome by careful analysis of segregation patterns in a mapping population and genotyping of doubled haploids. The genetic map for Miscanthus will be useful in biological discovery and breeding efforts to improve this emerging biofuel crop, and also provide a valuable resource for understanding genomic responses to tetraploidy and chromosome fusion.

    Rapid Genotyping of Soybean Cultivars Using High Throughput Sequencing

    Soybean (Glycine max) breeding involves improving commercially grown varieties by introgressing important agronomic traits from poor-yielding accessions and/or wild relatives of soybean while minimizing the associated yield drag. Molecular markers associated with these traits are instrumental in increasing the efficiency of producing such crosses, and single nucleotide polymorphisms (SNPs) are particularly well suited for this task, owing to their high density in the non-genic regions and thus the increased likelihood of finding a marker tightly linked to a given trait. A rapid method to develop SNP markers that can differentiate specific loci between any two parents in soybean is thus highly desirable. In this study we investigate such a protocol for developing SNP markers between multiple soybean accessions and the reference Williams 82 genome. To restrict sampling frequency, reduced representation libraries (RRLs) of genomic DNA were generated by restriction digestion followed by library construction. We chose to sequence four accessions, Dowling (PI 548663), Dwight (PI 597386), Komata (PI 200492), and PI 594538A, for their agronomic importance, as well as Williams 82 as a control.

    Genome composition of Glycine max and sequence diversity among cultivated and exotic accessions

    Soybean is an economically important crop in large portions of the world. Incorporation of soybean into the food system in many direct and indirect ways has vastly increased the nutritional quality of low-cost and plant-based diets. Therefore, an enormous amount of effort has gone into increasing the yield and nutritional quality of soybeans through plant breeding over hundreds of years. Despite this economic and nutritional importance, the soybean genome was largely uncharacterized until 2004. The research described here deals with the application of novel sequencing technologies to elucidate the soybean genome composition as an initial step to understanding the organization of the genome. Three partially independent studies were performed to study soybean genome content and diversity. The first study applied 454 pyrosequencing to obtain a low-coverage survey that identified the repeat composition of the genome. The second study compiled data from numerous small RNA sequence datasets to follow the small RNA level regulation of soybean genes and the maintenance of genomic stability by siRNA-mediated heterochromatization. The third study applied a reduced representation sampling strategy to identify SNP markers in the non-repetitive regions of the genome that can distinguish between soybean accessions. The method developed in this study should be generally applicable to other lines of soybean or even to other crop plants that have a fully sequenced genome. These studies, along with others reported simultaneously, and those that will be conducted in the near future, together enhance our understanding of soybean and increase our ability to manipulate this important species to our advantage.

    The inheritance pattern of 24 nt siRNA clusters in Arabidopsis hybrids is influenced by proximity to transposable elements

    Hybrids often display increased size and growth, and thus are widely cultivated in agriculture and horticulture. Recent discoveries demonstrating the important regulatory roles of small RNAs have greatly improved our understanding of many basic biological questions, and could illuminate the molecular basis for the enhanced growth and size of hybrid plants. We profiled small RNAs by deep sequencing to characterize the inheritance patterns of small RNA levels in reciprocal hybrids of two Arabidopsis thaliana accessions, Columbia and Landsberg erecta. We find 24-nt siRNAs predominate among those small RNAs that are differentially expressed between the parents. Following hybridization, the transposable element (TE)-derived siRNAs are often inherited in an additive manner, whereas siRNAs associated with protein-coding genes are often down-regulated in hybrids to the levels observed for the parent with lower relative siRNA levels. Among the protein-coding genes that exhibit this pattern, genes that function in pathogen defense, abiotic stress tolerance, and secondary metabolism are significantly enriched. Small RNA clusters from protein-coding genes where a TE is present within one kilobase show a different predominant inheritance pattern (additive) from those that do not (low-parent dominance). Thus, down-regulation in the form of low-parent dominance is likely the default pattern of inheritance for genic siRNA, and a different inheritance mechanism for TE siRNA is suggested