Amino Termini of Many Yeast Proteins Map to Downstream Start Codons
Comprehensive knowledge of proteome complexity is crucial to understanding cell function. Amino termini of yeast proteins were identified through peptide mass spectrometry on glutaraldehyde-treated cell lysates as well as a parallel assessment of publicly deposited spectra. An unexpectedly large fraction of detected amino-terminal peptides (35%) mapped to translation initiation at AUG codons downstream of the annotated start codon. Many of the implicated genes have suboptimal sequence contexts for translation initiation near their annotated AUG, and their ribosome profiles show elevated tag densities consistent with translation initiation at downstream AUGs as well as their annotated AUGs. These data suggest that a significant fraction of the yeast proteome derives from initiation at downstream AUGs, increasing significantly the repertoire of encoded proteins and their potential functions and cellular localizations.
American Gut Project fecal sOTU counts table
The Deblur sOTU counts table for the fecal samples used in the American Gut Project manuscript. The samples were trimmed to a common read length of 125nt, and processed by Deblur (Amir et al mSystems 2017). Blooms were removed (Amir et al mSystems 2017) and any sample with fewer than 1250 sequences was omitted. This table is not rarefied,
movie_s2.mp4
Placing changes in the microbiome in the context of the American Gut. We accumulated samples over sequencing runs to demonstrate the structural consistency in the data. We demonstrate that while the samples from the ICU dataset (https://www.ncbi.nlm.nih.gov/pubmed/27602409) fall within the American Gut samples, they do not fall close to most samples at any of the body sites. We then highlight samples from the United Kingdom, Australia, the United States and other countries to show that nationality does not overcome the variation in body site. We then highlight the utility of the American Gut in meta-analysis by reproducing results from (https://www.ncbi.nlm.nih.gov/pubmed/20668239) and (https://www.ncbi.nlm.nih.gov/pubmed/23861384), using the AGP dataset as the context for dynamic microbiome changes instead of the HMP dataset. We show rapid, complete recovery of C. diff patients following fecal material transplantation and also contextualize the change in an infant gut over time until it settles into an adult state. This demonstrates the power of the American Gut dataset, both as a cohesive study and as a context for other investigations.
ag_tree.tre
The SEPP (Mirarab et al Pac Symp Biocomput 2012) fragment insertion tree used for phylogenetic analyses
Unweighted UniFrac distances
The unweighted UniFrac distance (Lozupone and Knight AEM 2005) matrix of the 9511 fecal samples used in the American Gut paper. UniFrac was computed using Striped UniFrac (https://github.com/biocore/unifrac). Prior to execution of UniFrac, Deblur (Amir et al mSystems 2017) was run on the samples, all bloom sOTUs were removed (Amir et al mSystems 2017), and samples were rarefied to a depth of 1250 reads (Weiss et al Microbiome 2017). For the phylogeny, fragments were inserted using SEPP (Mirarab et al Pac Symp Biocomput 2012) into the Greengenes 13_5 99% OTU tree (McDonald et al ISME 2012)
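The rarefaction step mentioned above (subsampling every sample to 1250 reads) can be illustrated with a minimal sketch. This is not the code used to produce the matrix; the function name `rarefy`, the dictionary layout, and the sOTU labels are hypothetical and chosen only for illustration.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample a {feature: count} dict to exactly `depth` reads.

    Reads are drawn without replacement. Returns None when the sample
    holds fewer than `depth` reads, so the caller can drop it.
    """
    total = sum(counts.values())
    if total < depth:
        return None
    # Expand to one list entry per read, then draw `depth` of them.
    reads = [feature for feature, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    rarefied = {}
    for feature in rng.sample(reads, depth):
        rarefied[feature] = rarefied.get(feature, 0) + 1
    return rarefied


# Hypothetical sample with 2503 reads spread over three sOTUs.
sample = {"sOTU_a": 2000, "sOTU_b": 500, "sOTU_c": 3}
rarefied_sample = rarefy(sample, depth=1250)
too_shallow = rarefy({"sOTU_a": 100}, depth=1250)
```

Expanding to one entry per read is simple but memory-hungry; for deep samples a multivariate hypergeometric draw would be a more economical choice.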
movie_s1.mp4
Longitudinal samples from a large bowel resection. We place longitudinal samples collected prior to and following a large bowel resection in the context of samples from the AGP, the Earth Microbiome Project (https://www.ncbi.nlm.nih.gov/pubmed/29088705), intensive care unit patients (https://www.ncbi.nlm.nih.gov/pubmed/27602409), "extreme" diet samples from (https://www.ncbi.nlm.nih.gov/pubmed/24336217), and samples from the Hadza hunter-gatherers (https://www.ncbi.nlm.nih.gov/pubmed/28839072). Unweighted UniFrac was computed on this sample set, and principal coordinates were assessed. Using EMPeror (https://www.ncbi.nlm.nih.gov/pubmed/24280061), we then animate the plot by connecting successive data points in the gut resection time series while rotating the data frame. We first show the extent of change in the microbial community, and how the samples immediately following surgery resemble fecal samples from ICU patients. In the background of the animation, a black line connects a plant rhizosphere sample to a marine sediment sample, which have the same unweighted UniFrac distance (0.78) as the longitudinal samples immediately preceding and immediately following surgery.
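To make a distance such as 0.78 easier to interpret, here is a toy sketch of how unweighted UniFrac can be computed for two samples: the branch length covered by only one sample divided by the branch length covered by either. This is an illustration on a hypothetical four-branch tree, not the Striped UniFrac implementation; the `parent` mapping, the node names, and the branch lengths are all assumptions.

```python
def covered_branches(tips, parent):
    """Collect every node whose branch to its parent lies on a path
    from an observed tip up to the root."""
    covered = set()
    for node in tips:
        # Stop at the root (absent from `parent`) or at a node already seen.
        while node in parent and node not in covered:
            covered.add(node)
            node = parent[node][0]
    return covered


def unweighted_unifrac(tips_a, tips_b, parent):
    """Branch length unique to one sample over branch length in either."""
    a = covered_branches(tips_a, parent)
    b = covered_branches(tips_b, parent)
    total = sum(parent[n][1] for n in a | b)
    unique = sum(parent[n][1] for n in a ^ b)
    return unique / total if total else 0.0


# Hypothetical tree: node -> (parent, branch length to parent).
parent = {
    "t1": ("X", 1.0),
    "t2": ("X", 1.0),
    "X": ("root", 0.5),
    "t3": ("root", 2.0),
}
d_overlap = unweighted_unifrac({"t1", "t2"}, {"t2", "t3"}, parent)
d_same = unweighted_unifrac({"t1", "t2"}, {"t1", "t2"}, parent)
d_disjoint = unweighted_unifrac({"t1"}, {"t3"}, parent)
```

Because only presence or absence of each sOTU matters, two samples that share no branches score 1 and identical samples score 0, regardless of abundance.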
American Gut Project fecal sOTU relative abundance table
The Deblur sOTU relative abundance table for the fecal samples used in the American Gut Project manuscript. The samples were trimmed to a common read length of 125nt, and processed by Deblur (Amir et al mSystems 2017). Blooms were removed (Amir et al mSystems 2017) and any sample with fewer than 1250 sequences was omitted. This table is not rarefied, and is normalized to 1
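As a rough illustration of the two steps described above (omitting samples with fewer than 1250 sequences, then normalizing each sample to sum to 1), consider the following sketch. The nested-dictionary layout, the function name `to_relative_abundance`, and the sample and sOTU labels are hypothetical; the published table was not necessarily produced this way.

```python
def to_relative_abundance(table, min_reads=1250):
    """Drop shallow samples and scale each remaining sample to sum to 1.

    `table` maps sample ID -> {sOTU: read count}.
    """
    result = {}
    for sample_id, counts in table.items():
        total = sum(counts.values())
        if total < min_reads:
            continue  # sample omitted: too few sequences
        result[sample_id] = {sotu: n / total for sotu, n in counts.items()}
    return result


# Hypothetical counts: s1 is deep enough, s2 is not.
counts_table = {
    "s1": {"sOTU_a": 1000, "sOTU_b": 1000},
    "s2": {"sOTU_a": 10},
}
relative = to_relative_abundance(counts_table)
```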
American Gut collated alpha diversities
Collated alpha diversity values for the 9511 fecal samples used in the American Gut Project manuscript
Full American Gut Project mapping file
The full American Gut Project mapping file, including non-fecal samples