146 research outputs found

    Do you cov me? Effect of coverage reduction on species identification and genome reconstruction in complex biological matrices by metagenome shotgun high-throughput sequencing

    Shotgun metagenomics sequencing is a powerful tool for the characterization of complex biological matrices, enabling analysis of prokaryotic and eukaryotic organisms and viruses in a single experiment, with the possibility of reconstructing de novo the whole metagenome or a set of genes of interest. One of the main factors limiting the use of shotgun metagenomics in wide-scale projects is the high cost associated with the approach. We set out to determine whether it is possible to use shallow shotgun metagenomics to characterize complex biological matrices while reducing costs. We measured the variation of several summary statistics while simulating a decrease in sequencing depth by randomly subsampling a number of reads. The main statistics compared were alpha diversity estimates, species abundance, and the ability to reconstruct the metagenome de novo in terms of length and completeness. Our results show that diversity indices of complex prokaryotic, eukaryotic and viral communities can be accurately estimated with 500,000 reads or fewer, although particularly complex samples may require 1,000,000 reads. In contrast, any task involving the reconstruction of the metagenome performed poorly, even with the largest simulated subsample (1,000,000 reads). The length of the reconstructed assembly was smaller than the length obtained with the full dataset, and the proportion of conserved genes identified in the metagenome was drastically reduced compared to the full sample. Shallow shotgun metagenomics can be a useful tool to describe the structure of complex matrices, but it is not adequate to reconstruct, even partially, the metagenome.
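
The subsampling idea in the abstract above can be illustrated with a minimal sketch. It assumes each read has already been assigned a taxon label and uses the Shannon index as one possible alpha diversity estimate. The abstract does not name the index or tools the authors used, so the function names `shannon_index` and `subsample_reads` and the toy data are hypothetical.

```python
import math
import random
from collections import Counter


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def subsample_reads(read_labels, n_reads, seed=0):
    """Draw n_reads labels at random without replacement (a simulated shallower run)."""
    rng = random.Random(seed)
    return rng.sample(read_labels, n_reads)


# Hypothetical toy data: each read is represented by the taxon it was assigned to.
reads = ["taxon_A"] * 6000 + ["taxon_B"] * 3000 + ["taxon_C"] * 1000

full_h = shannon_index(Counter(reads).values())
sub_h = shannon_index(Counter(subsample_reads(reads, 1000)).values())
print(f"full depth H' = {full_h:.3f}, subsample H' = {sub_h:.3f}")
```

Comparing `sub_h` with `full_h` across several subsample sizes and seeds is one simple way to see how quickly a diversity estimate stabilizes as depth decreases.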

    The miRNAome of globe artichoke: conserved and novel micro RNAs and target analysis

    Background: Plant microRNAs (miRNAs) are involved in post-transcriptional regulatory mechanisms of several processes, including the response to biotic and abiotic stress, often contributing to the adaptive response of the plant to adverse conditions. In addition to conserved miRNAs, found in a wide range of plant species, a number of novel species-specific miRNAs displaying lower levels of expression can be found. Due to their low abundance, non-conserved miRNAs are difficult to identify and isolate using conventional approaches. Conversely, deep sequencing of small RNA (sRNA) libraries can detect even poorly expressed miRNAs. No miRNAs from globe artichoke have been described to date. We analyzed the miRNAome from artichoke by deep sequencing four sRNA libraries obtained from NaCl-stressed and control leaves and roots. Results: Conserved and novel miRNAs were discovered using accepted criteria. The expression level of selected miRNAs was monitored by quantitative real-time PCR. Targets were predicted and validated for their cleavage site. A total of 122 artichoke miRNAs were identified, 98 (25 families) of which were conserved with other plant species, and 24 were novel. Some miRNAs were differentially expressed according to tissue or condition, the magnitude of variation after salt stress being more pronounced in roots. Target function was predicted by comparison to Arabidopsis proteins; the 43 targets (23 for novel miRNAs) identified included transcription factors and other genes, most of which are involved in the response to various stresses. An unusual cleaved transcript was detected for the miR393 target, transport inhibitor response 1. Conclusions: The miRNAome from artichoke, including novel miRNAs, was unveiled, providing useful information on the expression in different organs and conditions. New target genes were identified. We suggest that the generation of secondary short-interfering RNAs from the miR393 target can be a general rule in the plant kingdom.

    Metagenomic profiles of different types of Italian high-moisture Mozzarella cheese

    The microbiota of different types of Italian high-moisture Mozzarella cheese produced using cow or buffalo milk, acidified with natural or selected cultures, and sampled at the dairy or at the mass market, was evaluated using a Next Generation Sequencing approach, in order to identify possible drivers of the bacterial diversity. Cow Mozzarella and buffalo Mozzarella acidified with commercial cultures were dominated by Streptococcus thermophilus, while buffalo samples acidified with natural whey cultures showed similar prevalence of L. delbrueckii subsp. bulgaricus, L. helveticus and S. thermophilus. Moreover, several species of non-starter lactic acid bacteria were frequently detected. The diversity in cow Mozzarella microbiota was much higher than that of water buffalo samples. Cluster analysis clearly separated cow's cheeses from buffalo's ones, the former having a higher prevalence of psychrophilic taxa, and the latter of Lactobacillus and Streptococcus. A higher prevalence of psychrophilic species and potential spoilers was observed in samples collected at the mass retail, suggesting that longer exposures to cooling temperatures and longer production-to-consumption times could significantly affect microbiota diversity. Our results could help in detecting some kind of thermal abuse during the production or storage of mozzarella cheese

    Characterization of the Poplar Pan-Genome by Genome-Wide Identification of Structural Variation

    Many recent studies have emphasized the important role of structural variation (SV) in determining human genetic and phenotypic variation. In plants, studies aimed at elucidating the extent of SV are still in their infancy. Evidence has indicated a high presence and an active role of SV in driving plant genome evolution in different plant species. With the aim of characterizing the size and the composition of the poplar pan-genome, we performed a genome-wide analysis of structural variation in three intercrossable poplar species: Populus nigra, Populus deltoides, and Populus trichocarpa. We detected a total of 7,889 deletions and 10,586 insertions relative to the P. trichocarpa reference genome, covering respectively 33.2 Mb and 62.9 Mb of genomic sequence, and 3,230 genes affected by copy number variation (CNV). The majority of the detected variants are inter-specific, in agreement with a recent origin following separation of species. Insertions and deletions (INDELs) were preferentially located in low-gene density regions of the poplar genome and were, for the majority, associated with the activity of transposable elements. Genes affected by SV showed lower-than-average expression levels and higher levels of dN/dS, suggesting that they are subject to relaxed selective pressure or correspond to pseudogenes. Functional annotation of genes affected by INDELs showed over-representation of categories associated with transposable elements activity, while genes affected by genic CNVs showed enrichment in categories related to resistance to stress and pathogens. This study provides a genome-wide catalogue of SV and the first insight on functional and structural properties of the poplar pan-genome.

    Genomic tools for durum wheat breeding: de novo assembly of Svevo transcriptome and SNP discovery in elite germplasm

    BACKGROUND: The tetraploid durum wheat (Triticum turgidum L. ssp. durum Desf. Husnot) is an important crop which provides the raw material for pasta production and a valuable source of genetic diversity for breeding hexaploid wheat (Triticum aestivum L.). Future breeding efforts to enhance yield potential and climate resilience will increasingly rely on genomics-based approaches to identify and select beneficial alleles. A deeper characterisation of the molecular and functional diversity of the durum wheat transcriptome will be instrumental to more effectively harness its genetic diversity. RESULTS: We report on the de novo transcriptome assembly of durum wheat cultivar 'Svevo'. The transcriptome of four tissues/organs (shoots and roots at the seedling stage, reproductive organs and developing grains) was assembled de novo, yielding 180,108 contigs, with an N50 length of 1121 bp and a mean contig length of 883 bp. Alignment against the transcriptome of nine plant species identified 43% of transcripts with homology to at least one reference transcriptome. The functional annotation was completed by means of a combination of complementary software. The presence of differential expression between the A- and B-homoeolog copies of the durum wheat tetraploid genome was ascertained by phase reconstruction of polymorphic sites based on the T. urartu transcripts and inferring homoeolog-specific sequences. We observed greater expression divergence between A and B homoeologs in grains than in leaves and roots. The transcriptomes of 13 durum wheat cultivars spanning the breeding period from 1969 to 2005 were analysed for SNP diversity, leading to 95,358 non-rare, hemi-SNPs shared among two or more cultivars and 33,747 locus-specific (diploid inheritance) SNPs. CONCLUSIONS: Our study updates and expands the de novo transcriptome reference assembly available for durum wheat.
Out of 180,108 assembled transcripts, 13,636 were specific to the Svevo cultivar as compared to the only other reference transcriptome available for durum, thus contributing to the identification of the tetraploid wheat pan-transcriptome. Additionally, the analysis of 13 historically relevant hallmark varieties produced a SNP dataset that could successfully validate the genotyping in tetraploid wheat and provide a valuable resource for genomics-assisted breeding of both tetraploid and hexaploid wheats
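
The abstract above reports an N50 length and a mean contig length for the assembly. As a reminder of how these two summary statistics differ, here is a minimal sketch using the standard definition of N50. The function name `n50` and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50: the largest length L such that contigs of length >= L
    together cover at least half of the total assembled length."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy contig lengths in bp, not data from the study.
contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
mean_length = sum(contigs) / len(contigs)
print(n50(contigs), mean_length)
```

Because N50 weights contigs by their length, it is typically larger than the mean when an assembly contains many short contigs, as with the 1121 bp N50 and 883 bp mean reported above.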

    A knowledge base for Vitis vinifera functional analysis

    Vitis vinifera (Grapevine) is the most important fruit species in the modern world. Wine and table grapes sales contribute significantly to the economy of major wine producing countries. The most relevant goals in wine production concern quality and safety. In order to significantly improve the achievement of these objectives and to gain biological knowledge about cultivars, a genomic approach is the most reliable strategy. The recent grapevine genome sequencing offers the opportunity to study the potential roles of genes and microRNAs in fruit maturation and other physiological and pathological processes. Although several systems allowing the analysis of plant genomes have been reported, none of them has been designed specifically for the functional analysis of grapevine genomes of cultivars under environmental stress in connection with microRNA data

    Draft Genome Sequence of the Probiotic Yeast Kluyveromyces marxianus fragilis B0399

    Here, we report the draft genome sequence of Kluyveromyces marxianus fragilis B0399, the first yeast approved as a probiotic for human consumption not belonging to the genus Saccharomyces. The genome is composed of 8 chromosomes, with a total size of 11.44 Mb, including mitochondrial DNA.

    Physical mapping integrated with syntenic analysis to characterize the gene space of the long arm of wheat chromosome 1A

    Background: Bread wheat (Triticum aestivum L.) is one of the most important crops worldwide and its production faces pressing challenges, the solution of which demands genome information. However, the large, highly repetitive hexaploid wheat genome has been considered intractable to standard sequencing approaches. Therefore the International Wheat Genome Sequencing Consortium (IWGSC) proposes to map and sequence the genome on a chromosome-by-chromosome basis. Methodology/Principal Findings: We have constructed a physical map of the long arm of bread wheat chromosome 1A using chromosome-specific BAC libraries by High Information Content Fingerprinting (HICF). Two alternative methods (FPC and LTC) were used to assemble the fingerprints into a high-resolution physical map of the chromosome arm. A total of 365 molecular markers were added to the map, in addition to 1122 putative unique transcripts that were identified by microarray hybridization. The final map consists of 1180 FPC based or 583 LTC based contigs. Conclusions/Significance: The physical map presented here marks an important step forward in mapping of hexaploid bread wheat. The map is orders of magnitude more detailed than previously available maps of this chromosome, and the assignment of over a thousand putative expressed gene sequences to specific map locations will greatly assist future functional studies. This map will be an essential tool for future sequencing of and positional cloning within chromosome 1A

    A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm

    The genome of the allotetraploid species Coffea arabica L. was sequenced to assemble independently the two component subgenomes (putatively deriving from C. canephora and C. eugenioides) and to perform a genome-wide analysis of the genetic diversity in cultivated coffee germplasm and in wild populations growing in the center of origin of the species. We assembled a total length of 1.536 Gbp, 444 Mb and 527 Mb of which were assigned to the canephora and eugenioides subgenomes, respectively, and predicted 46,562 gene models, 21,254 and 22,888 of which were assigned to the canephora and to the eugenioides subgenome, respectively. Through a genome-wide SNP genotyping of 736 C. arabica accessions, we analyzed the genetic diversity in the species and its relationship with geographic distribution and historical records. We observed a weak population structure due to low-frequency derived alleles and highly negative values of Tajima's D, suggesting a recent and severe bottleneck, most likely resulting from a single event of polyploidization, not only for the cultivated germplasm but also for the entire species. This conclusion is strongly supported by forward simulations of mutation accumulation. However, PCA revealed a cline of genetic diversity reflecting a west-to-east geographical distribution from the center of origin in East Africa to the Arabian Peninsula. The extremely low levels of variation observed in the species, as a consequence of the polyploidization event, make the exploitation of diversity within the species for breeding purposes less interesting than in most crop species and stress the need for introgression of new variability from the diploid progenitors.
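
The abstract above interprets highly negative Tajima's D as a signal of an excess of low-frequency alleles. A minimal sketch of the textbook statistic follows, assuming a small set of aligned haplotype strings of equal length. The function name `tajimas_d` and the toy alignments are hypothetical, and this is not the pipeline used in the study.

```python
import math
from itertools import combinations


def tajimas_d(sequences):
    """Tajima's D for aligned, equal-length haplotypes.

    Returns None when there are no segregating sites. Requires at least
    four sequences, because the variance term is zero for n < 4.
    """
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences")
    # S: number of alignment columns with more than one allele.
    s = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if s == 0:
        return None
    # pi: mean number of pairwise differences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima's variance derivation.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignments: only singletons vs. intermediate-frequency variants.
d_rare = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA"])
d_common = tajimas_d(["AA", "AA", "TT", "TT"])
print(d_rare, d_common)
```

In the first toy alignment every variant is a singleton, so D is negative; in the second the variants sit at intermediate frequency, so D is positive.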
