Large Gap Size Paired-end Library Construction for Second Generation Sequencing
Fosmid or BAC end sequencing plays an important role in de novo assembly of large genomes such as those of fungi and plants. However, construction and Sanger sequencing of fosmid or BAC libraries are laborious and costly. The current 454 Paired-End (PE) Library and Illumina Jumping Library construction protocols are limited to gap sizes of approximately 20 kb and 8 kb, respectively. In an attempt to understand the limitations of constructing PE libraries with gaps greater than 30 kb, we purified 18, 28, 45, and 65 kb sheared DNA fragments from yeast and circularized the ends using the Cre-loxP approach described in the 454 PE Library protocol. As fragment size increased, we found a general trend of decreasing library quality in several areas. First, redundant reads and reads containing multiple loxP linkers increase as the average fragment size increases. Second, contamination by short-distance pairs (<10 kb) increases as the fragment size increases. Third, the chimeric rate increases with increasing fragment size. We have modified several steps to improve the quality of the long-span PE libraries. The modifications include (1) the use of a special PFGE program to reduce small-fragment contamination; (2) an increase in the amount of DNA in the circularization step and prior to PCR to reduce redundant reads; and (3) a decrease in fragment size in the double SPRI size selection to obtain a higher frequency of loxP linker-containing reads. With these modifications we have generated large gap size PE libraries of much better quality
Methane yield phenotypes linked to differential gene expression in the sheep rumen microbiome.
Ruminant livestock represent the single largest anthropogenic source of the potent greenhouse gas methane, which is generated by methanogenic archaea residing in ruminant digestive tracts. While differences between individual animals of the same breed in the amount of methane produced have been observed, the basis for this variation remains to be elucidated. To explore the mechanistic basis of this methane production, we measured methane yields from 22 sheep, which revealed that methane yields are a reproducible, quantitative trait. Deep metagenomic and metatranscriptomic sequencing demonstrated a similar abundance of methanogens and methanogenesis pathway genes in high and low methane emitters. However, transcription of methanogenesis pathway genes was substantially increased in sheep with high methane yields. These results identify a discrete set of rumen methanogens whose methanogenesis pathway transcription profiles correlate with methane yields and provide new targets for CH4 mitigation at the levels of microbiota composition and transcriptional regulation
Generation of Long Insert Pairs Using a Cre-LoxP Inverse PCR Approach
Large insert mate pair reads have a major impact on the overall success of de novo assembly and the discovery of inherited and acquired structural variants. The positional information of mate pair reads generally improves genome assembly by resolving repeat elements and/or ordering contigs. Currently available methods for building such libraries have one or more limitations, such as relatively small insert size, inability to distinguish the junction of the two ends, and/or low throughput. We developed a new approach, Cre-LoxP Inverse PCR Paired-End (CLIP-PE), which exploits the advantages of (1) the Cre-LoxP recombination system to efficiently circularize large DNA fragments, (2) inverse PCR to enrich for the desired products that contain both ends of the large DNA fragments, and (3) restriction enzymes to introduce a recognizable junction site between ligated fragment ends and to improve the self-ligation efficiency. We have successfully created CLIP-PE libraries up to 22 kb that are rich in informative read pairs and low in small fragment background. These libraries have demonstrated the ability to improve genome assemblies. The CLIP-PE methodology can be implemented with existing and future next-generation sequencing platforms
Critical Assessment of Metagenome Interpretation: A benchmark of metagenomics software
In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions
Metagenomic gene annotation by a homology-independent approach
Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive.
To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER with comparable accuracy, at 94.5% and 99.9%, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families.
As ~50% of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates.
In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation
Selection for efficiency at the transcriptional and translational levels are correlated across bacteria
Background: It has been shown that selection acts to remove the promoter's -10 and -35 consensus sequences in both coding and noncoding regions, implying that it is disadvantageous to maintain misplaced sites that can strongly bind σ70 and interfere with proper gene expression. Here we analyze 56 bacterial genomes and show that the numbers of nonconsensus, potential σ70 binding sites in both regulatory and non-regulatory noncoding regions deviate significantly from the random expectations when compensating for base composition and di- and tri-nucleotide bias in a majority of eubacteria. Not only do we expect that selection is maintaining high densities of potential σ70 binding sites in regulatory DNA, but also that there is selection against these sites in non-regulatory DNA. The often overlapping binding sites in regulatory DNA likely confer some subtle survival advantage, even though experimental evidence suggests only one or a few of these sites are actual transcription initiation sites. Remarkably, we find that the degree of selection against potential σ70 binding sites in non-regulatory DNA correlates positively with rate of growth, adaptive codon bias and number of tRNA genes. This is evidence that the efficiency needed for faster growing bacteria can only be achieved by reducing spurious RNA polymerase binding to false sites, and that transcription and translation efficiencies are both optimized at a genome-wide level to permit faster growth
Reframing 9/11 : film, popular culture and the "war on terror"
September 11th, 2001 remains a focal point of American consciousness, a site demanding ongoing excavation, a site at which to mark before and after "everything" changed. In ways both real and intangible, the entire sequence of events of that day continues to resonate in an endlessly proliferating aftermath of meanings that continue to evolve. Presenting a collection of analyses by an international body of scholars that examines America's recent history, this book focuses on popular culture as a profound discursive site of anxiety and discussion about 9/11 and demystifies the day's events in order to contextualize them into a historically grounded series of narratives that recognizes the complex relations of a globalized world. Essays in Reframing 9/11 share a collective drive to encourage new and original approaches for understanding the issues both within and beyond the official political rhetoric of the events of "The Global War on Terror" and issues of national security
Over/Under-Representation of sigma-70 promoter-like signals in Different Genomic Regions
MetaBAT: Metagenome Binning based on Abundance and Tetranucleotide frequence
Grouping large fragments assembled from shotgun metagenomic sequences to deconvolute complex microbial communities, or metagenome binning, enables the study of individual organisms and their interactions. Here we developed automated metagenome binning software, called MetaBAT, which integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency. On synthetic datasets MetaBAT on average achieves 98% precision and 90% recall at the strain level with 281 near complete unique genomes. Applying MetaBAT to a human gut microbiome data set we recovered 176 genome bins with 92% precision and 80% recall. Further analyses suggest MetaBAT is able to recover genome fragments missed in reference genomes up to 19%, while 53 genome bins are novel. In summary, we believe MetaBAT is a powerful tool to facilitate comprehensive understanding of complex microbial communities
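As a rough illustration of the tetranucleotide frequency signal mentioned in the MetaBAT abstract, the sketch below computes the relative frequency of every 4-mer in a sequence and compares two sequences with a plain Euclidean distance. This is not MetaBAT's implementation: the function names are hypothetical, and the Euclidean distance is a stand-in chosen for simplicity, not the empirical probabilistic distance the abstract describes.

```python
from collections import Counter
from itertools import product

# All 256 possible tetranucleotides over the DNA alphabet.
TETRAMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the relative frequency of each tetranucleotide in seq.

    Hypothetical helper for illustration. Windows containing
    characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in TETRAMERS)
    if total == 0:
        return {k: 0.0 for k in TETRAMERS}
    return {k: counts[k] / total for k in TETRAMERS}


def euclidean_distance(f1, f2):
    """Plain Euclidean distance between two frequency profiles.

    Assumption: a simple stand-in, not the distance used by MetaBAT.
    """
    return sum((f1[k] - f2[k]) ** 2 for k in TETRAMERS) ** 0.5


if __name__ == "__main__":
    a = tetranucleotide_frequencies("ACGTACGTACGTTTGACA")
    b = tetranucleotide_frequencies("GGGGCCCCGGGGCCCCAT")
    print(round(euclidean_distance(a, b), 3))
```

Fragments from the same genome tend to have similar 4-mer profiles, so a small distance between two contigs is one piece of evidence that they belong in the same bin; the abstract combines this signal with abundance.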