
38 research outputs found

Additional file 3: of Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer

Author: Alex Copeland (34696)
Alexandra Worden (3490049)
Ann Gregory (3317088)
Ashley Maitland (3509408)
Filipa dos Santos (3509420)
J. Ignacio-Espinoza (3509414)
Joshua Weitz (3509417)
Kurt LaButti (114830)
Lauren Chittick (3509405)
Matthew Sullivan (83956)
Sebastian Sudek (2508499)
Sergei Solonenko (3509411)
Tanja Woyke (111784)

Testing genetic drift, selection, and recombination. Table S7. FST values for genetic differentiation between coastal and offshore populations within each cyanophage lineage. The FST value for lineage II is most likely high due to the low number of representatives and lack of diversity among the offshore lineage II phages. Due to the low number of individuals isolated from the offshore site for some of the clusters, FST could not be calculated. Figure S4. (A) Quantitative host range analyses of 15 Synechococcus host strains against 138 cyanophage isolates testing the efficacy of infection. (B) Analysis of mean infectivity of coastal and upwelling phages in lineages I, II, IV, and VI reveals statistically different infectivity phenotypes at either site with T-test p < 0.05 (*). Statistical significance was not assessed for lineages III and V due to low sample size, nor on the original isolation host, WH7803 (†). Table S8. Corrected Rand Indices and Malia’s VI values to compare hierarchical clustering between the original host range matrix and a randomized host range matrix. The hierarchical clusters were split into different numbers of clusters (5, 10, 20, and 50) for the analyses. The analyses revealed low correspondence between clustering, indicating that the clustering we observe in the original shared genes matrix is not random and that there is some correlation with a biological signal. Table S9. Fisher’s exact test p-values and phi coefficients (for effect size) for genes found under positive selection in comparisons between phylogenomic lineages using the non-polarized McDonald-Kreitman test. The table also lists the corresponding protein clusters in the GOV population dataset [19]. Table S10. Genic versus intergenic recombination breakpoints for each lineage. (DOC 250 kb)

FigShare

The Fast Changing Landscape of Sequencing Technologies and Their Impact on Microbial Genome Assemblies and Annotation

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)
Publication date: 12/12/2012

Background: The emergence of next generation sequencing (NGS) has provided the means for rapid and high throughput sequencing and data generation at low cost, while concomitantly creating a new set of challenges. The number of available assembled microbial genomes continues to grow rapidly, and their quality reflects the quality of the sequencing technology used, but also of the analysis software employed for assembly and annotation.

Methodology/Principal Findings: In this work, we have explored the quality of the microbial draft genomes across various sequencing technologies. We have compared the draft and finished assemblies of 133 microbial genomes sequenced at the Department of Energy-Joint Genome Institute and finished at the Los Alamos National Laboratory using a variety of combinations of sequencing technologies, reflecting the transition of the institute from Sanger-based sequencing platforms to NGS platforms. The quality of the public assemblies and of the associated gene annotations was evaluated using various metrics. Results obtained with the different sequencing technologies, as well as their effects on downstream processes, were analyzed. Our results demonstrate that the Illumina HiSeq 2000 sequencing system, the primary sequencing technology currently used for de novo genome sequencing and assembly at JGI, has various advantages in terms of total sequence throughput and cost, but it also introduces challenges for the downstream analyses. In all cases, assembly results, although of high quality on average, need to be viewed critically, and potential sources of error in them should be considered prior to analysis.

Conclusion: These data follow the evolution of microbial sequencing and downstream processing at the JGI from draft genome sequences with large gaps corresponding to missing genes of significant biological role to assemblies with multiple small gaps (Illumina) and finally to assemblies that generate almost complete genomes (Illumina+PacBio).

Directory of Open Access Journals

PubMed Central

FigShare

Additional file 2: of Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer

Author: Alex Copeland (34696)
Alexandra Worden (3490049)
Ann Gregory (3317088)
Ashley Maitland (3509408)
Filipa dos Santos (3509420)
J. Ignacio-Espinoza (3509414)
Joshua Weitz (3509417)
Kurt LaButti (114830)
Lauren Chittick (3509405)
Matthew Sullivan (83956)
Sebastian Sudek (2508499)
Sergei Solonenko (3509411)
Tanja Woyke (111784)

Lineage information. Figure S2. TEM images of phage isolates from cyanophage (A) lineage I, (B) lineage II, (C) lineage III, (D) lineage IV, (E) lineage V, (F) lineage VI, and (G-J) singleton and duplicon populations confirm myovirus morphology. Table S2. List of 51 core protein clusters shared across all six phylogenetic lineages. Figure S3. Unrooted phylogenomic maximum likelihood tree of 27 concatenated protein sequences shared across published marine and non-marine T4-like phage genomes and the 142 cyanophage isolate genomes sequenced here. For simplicity, cyanophage isolate names in this tree were shortened from Syn7803* to just *. These analyses show that the 10 cyanophage populations observed here share similar evolutionary histories with other T4-like phages. Table S3. Average ANI of the 51 core genes within and between lineages. Table S4. AGDB groupings correspond with the phylogenetic lineages. Table S5. Average phylogenetic distances within and between lineages. Table S6. Corrected Rand Indices and Malia’s VI values to compare the row and column hierarchical clustering between the original ANI and Shared Gene matrix and a randomized ANI and Shared Gene matrix, respectively. The hierarchical clusters were split into different numbers of clusters (5, 10, 20, and 50) for the analyses. The analyses revealed low correspondence between clustering, indicating that the clustering we observe in the original matrices is not random. (DOC 4227 kb)

FigShare

Additional file 1: Table S1. of High quality permanent draft genome sequence of Phaseolibacter flectens ATCC 12775T, a plant pathogen of French bean pods

Author: Alex Copeland (34696)
Alla Lapidus (6321)
Hans-Peter Klenk (158173)
Ido Izhaki (183119)
Malka Halpern (183125)
Manoj Pillay (3333165)
Marcel Huntemann (34103)
Markus Göker (3481007)
Nikos Kyrpides (8859)
Tanja Woyke (111784)
TBK Reddy (3537440)
Victor Markowitz (610405)
Yana Aizenberg-Gershtein (430376)

Scaffolds and contigs of Genomic DNA for Phaseolibacter flectens ATCC 12775T (Topology; linear, Read depth; 1.00). (DOCX 24 kb)

FigShare

Additional file 1: of Genomic differentiation among wild cyanophages despite widespread horizontal gene transfer

Author: Alex Copeland (34696)
Alexandra Worden (3490049)
Ann Gregory (3317088)
Ashley Maitland (3509408)
Filipa dos Santos (3509420)
J. Ignacio-Espinoza (3509414)
Joshua Weitz (3509417)
Kurt LaButti (114830)
Lauren Chittick (3509405)
Matthew Sullivan (83956)
Sebastian Sudek (2508499)
Sergei Solonenko (3509411)
Tanja Woyke (111784)

Metadata. Table S1. Metadata from the coastal (H3) and mesotrophic (67-70) sites. Samples for the 16S rRNA amplicons were collected twice at 67-70, 8 days apart (with the 10 Oct. sample being taken on the same day as the viral sample). The mesotrophic station is often subject to upwelling. Figure S1. Sea surface temperatures (SST) of the region of the California Cooperative Oceanic Fisheries Investigations (CalCOFI) Line 67 ocean transect on 5 October 2009 (contoured from a single synoptic image, Aqua MODIS, NOAA) with the locations of the nearshore (yellow, H3 - coastal) and offshore (red, 67-70 - offshore mesotrophic) stations marked with stars. Gene marker (16S rRNA gene amplicon) analyses using the reference alignments of Sudek et al., 2015 revealed different Synechococcus communities at the two sites (for additional sampling details, see Additional file 2: Table S1). The Synechococcus community was analyzed twice at 67-70, one day after the coastal sampling and on the same day as the viral 67-70 sample collection. Proportions of different clades varied in the 67-70 Synechococcus amplicon data, but the same clades were present on both dates. (DOC 721 kb)

FigShare

The distribution of projects among the 12 sequencing methods used.

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)

Dark green indicates the sequencing methods for which there are more than 5 sequenced projects and which were used in downstream analysis.

FigShare

Methods used in this comparison.

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)

1. PE: paired-end reads.
2. LMP: long mate-paired reads.

FigShare

Correlation of the number of contigs with genome GC%, repeat content, and size.

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)

Data shown are the Kendall rank correlation coefficients. * = p-value < 0.05.

FigShare

Misassemblies as detected by low gene quality.

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)

Low quality genes are genes present in the finished genome that had a similarity (tBLASTn) to the draft genome but the alignment was either short (<50% of the gene length) or identity was <90%. Data is shown for the six sequencing methods with more than 5 projects.
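
The low-quality criterion described in this caption can be sketched as a small predicate. This is only an illustration of the two stated thresholds, not the authors' pipeline: the function name `is_low_quality` and its parameters are hypothetical, and it assumes the alignment length and gene length are given in the same units and that identity is expressed as a percentage.

```python
def is_low_quality(alignment_length: float, gene_length: float,
                   percent_identity: float) -> bool:
    """Flag a gene that has a hit in the draft genome as low quality.

    Hypothetical helper: a gene is flagged when its alignment covers
    less than 50% of the gene length OR its identity is below 90%.
    Both lengths are assumed to be in the same units.
    """
    if gene_length <= 0:
        raise ValueError("gene_length must be positive")
    short_alignment = alignment_length < 0.5 * gene_length
    low_identity = percent_identity < 90.0
    return short_alignment or low_identity
```

Under these assumptions, a gene aligned over 40% of its length at 99% identity is flagged, as is one aligned over 90% of its length at 85% identity.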

FigShare

Genes missed in draft assemblies.

Author: Alex Copeland (34696)
Alicia Clum (111778)
Alla Lapidus (6321)
Daniel J. Quest (111776)
Hans Peter Klenk (111788)
Konstantinos Mavromatis (13836)
Lynne Goodwin (111781)
Miriam L. Land (111773)
Nikos C. Kyrpides (13837)
Robert W. Cottingham (111792)
Tanja Woyke (111784)
Thomas S. Brettin (67333)

Data is shown for the sequencing methods with more than 5 projects. (a) Missed gene sequences, i.e., the number of genes in the finished genome whose nucleotide sequence is absent from the draft assembly. (b) Unrecognized genes, i.e., the number of genes whose nucleotide sequence is present in the draft assembly but that were not predicted by Prodigal (v2.5).
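
The distinction between categories (a) and (b) can be made concrete with a short sketch. The names below (`DraftStatus`, `classify_gene`) and the two boolean inputs are hypothetical; how sequence presence and gene prediction were actually determined is not described in this caption.

```python
from enum import Enum


class DraftStatus(Enum):
    MISSED_SEQUENCE = "missed gene sequence"   # category (a)
    UNRECOGNIZED = "unrecognized gene"         # category (b)
    RECOVERED = "recovered"                    # neither category


def classify_gene(sequence_in_draft: bool, predicted_in_draft: bool) -> DraftStatus:
    """Classify a finished-genome gene by its fate in the draft assembly.

    Hypothetical helper: sequence absence is checked first, so a gene
    whose nucleotide sequence is missing is always category (a).
    """
    if not sequence_in_draft:
        return DraftStatus.MISSED_SEQUENCE
    if not predicted_in_draft:
        return DraftStatus.UNRECOGNIZED
    return DraftStatus.RECOVERED
```

Counting the genes in each status per sequencing method would then give the two quantities the caption reports.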

FigShare