22 research outputs found
The Drosophila melanogaster PeptideAtlas facilitates the use of peptide data for improved fly proteomics and genome annotation
Background: Crucial foundations of any quantitative systems biology experiment are correct genome and proteome annotations. Protein databases compiled from high quality empirical protein identifications that are in turn based on correct gene models increase the correctness, sensitivity, and quantitative accuracy of systems biology genome-scale experiments. Results: In this manuscript, we present the Drosophila melanogaster PeptideAtlas, a fly proteomics and genomics resource of unsurpassed depth. Based on peptide mass spectrometry data collected in our laboratory, the portal http://www.drosophila-peptideatlas.org allows querying fly protein data observed with respect to gene model confirmation and splice site verification as well as for the identification of proteotypic peptides suited for targeted proteomics studies. Additionally, the database provides consensus mass spectra for observed peptides along with qualitative and quantitative information about the number of observations of a particular peptide and the sample(s) in which it was observed. Conclusion: PeptideAtlas is an open access database for the Drosophila community that has several features and applications that support (1) reduction of the complexity inherently associated with performing targeted proteomic studies, (2) designing and accelerating shotgun proteomics experiments, (3) confirming or questioning gene models, and (4) adjusting gene models such that they are in line with observed Drosophila peptides. While the database consists of proteomic data, it is not required that the user is a proteomics expert.
Spectral Libraries for SWATH-MS Assays for Drosophila melanogaster and Solanum lycopersicum.
Quantitative proteomics methods have emerged as powerful tools for measuring protein expression changes at the proteome level. Using MS-based approaches, it is now possible to routinely quantify thousands of proteins. However, prefractionation of the samples at the protein or peptide level is usually necessary to go deep into the proteome, increasing both MS analysis time and technical variability. Recently, a new MS acquisition method named SWATH was introduced with the potential to provide good coverage of the proteome as well as good measurement precision without prior sample fractionation. In contrast to shotgun-based MS, however, a library containing experimentally acquired spectra is necessary for the bioinformatics analysis of SWATH data. In this study, spectral libraries are built for two widely used models for studying crop ripening and animal embryogenesis, Solanum lycopersicum (tomato) and Drosophila melanogaster, respectively. The spectral libraries comprise fragments for 5197 and 6040 proteins for S. lycopersicum and D. melanogaster, respectively, and allow reproducible quantification for thousands of peptides per MS analysis. The spectral libraries and all MS data are available in the MassIVE repository with the dataset identifiers MSV000081074 and MSV000081075 and the PRIDE repository with the dataset identifiers PXD006493 and PXD006495.
Proteomics Databases and Websites
The information avalanche (overload or expansion) in various scientific fields is a recent problem arising from a number of factors considered necessary to facilitate the recording and registration of information. The biological sciences and their diverse fields, such as proteomics, are not immune to this phenomenon and may even be its herald. Meanwhile, time, the most valued human concern, is confronted with a huge mass of information. Therefore, some efforts have been made to maintain access to information and ease its understanding in several fields. Bioinformatics is an outcome of this concern and these efforts. Interestingly, proteomics, the study of the collection of proteins in living things, accounts for a large portion of bioinformatics. Consequently, a careful overview of proteomics-related databases (DBs) and websites can help investigators not only to cope with the growing archive of databases but also to estimate the volume of facilities needed. Furthermore, enriching the DBs and related websites must be a priority for researchers. Herein, by covering the major proteomics-related databases and websites, we present a comprehensive classification to simplify and clarify their understanding and applications.
Evidence of abundant stop codon readthrough in Drosophila and other Metazoa
While translational stop codon readthrough is often used by viral genomes, it has been observed for only a handful of eukaryotic genes. We previously used comparative genomics evidence to recognize protein-coding regions in 12 species of Drosophila and showed that for 149 genes, the open reading frame following the stop codon has a protein-coding conservation signature, hinting that stop codon readthrough might be common in Drosophila. We return to this observation armed with deep RNA sequence data from the modENCODE project, an improved higher-resolution comparative genomics metric for detecting protein-coding regions, comparative sequence information from additional species, and directed experimental evidence. We report an expanded set of 283 readthrough candidates, including 16 double-readthrough candidates; these were manually curated to rule out alternatives such as A-to-I editing, alternative splicing, dicistronic translation, and selenocysteine incorporation. We report experimental evidence of translation using GFP tagging and mass spectrometry for several readthrough regions. We find that the set of readthrough candidates differs from other genes in length, composition, conservation, stop codon context, and in some cases, conserved stem–loops, providing clues about readthrough regulation and potential mechanisms. Lastly, we expand our studies beyond Drosophila and find evidence of abundant readthrough in several other insect species and one crustacean, and several readthrough candidates in nematode and human, suggesting that functionally important translational stop codon readthrough is significantly more prevalent in Metazoa than previously recognized. Funding: National Institutes of Health (U.S.) (U54 HG00455-01); National Science Foundation (U.S.) (CAREER 0644282); Alfred P. Sloan Foundation.
Identification and Functional Characterization of N-Terminally Acetylated Proteins in Drosophila melanogaster
A new study reveals a functional rule for N-terminal acetylation in higher eukaryotes called the (X)PX rule and describes a generic method that prevents this modification to allow the study of N-terminal acetylation in any given protein.
Genetic and Proteomic Evidence for Roles of Drosophila SUMO in Cell Cycle Control, Ras Signaling, and Early Pattern Formation
SUMO is a protein modifier that is vital for multicellular development. Here we present the first system-wide analysis, combining multiple approaches, to correlate the sumoylated proteome (SUMO-ome) in a multicellular organism with the developmental roles of SUMO. Using mass-spectrometry-based protein identification, we found over 140 largely novel SUMO conjugates in the early Drosophila embryo. Enriched functional groups include proteins involved in Ras signaling, cell cycle, and pattern formation. In support of the functional significance of these findings, sumo germline clone embryos exhibited phenotypes indicative of defects in these same three processes. Our cell culture and immunolocalization studies further substantiate roles for SUMO in Ras signaling and cell cycle regulation. For example, we found that SUMO is required for efficient Ras-mediated MAP kinase activation upstream or at the level of Ras activation. We further found that SUMO is dynamically localized during mitosis to the condensed chromosomes, and later also to the midbody. Polo kinase, a SUMO substrate found in our screen, partially colocalizes with SUMO at both sites. These studies show that SUMO coordinates multiple regulatory processes during oogenesis and early embryogenesis. In addition, our database of sumoylated proteins provides a valuable resource for those studying the roles of SUMO in development
Desarrollo de herramientas bioinformáticas para estudios de proteómica a gran escala de "Candida albicans" (Development of bioinformatics tools for large-scale proteomics studies of "Candida albicans")
The concept of proteomics, coined by analogy with genomics, was first used by Marc Wilkins in the mid-1990s to describe the complete set of proteins expressed by the genes of a cell, tissue, or organism. Earlier, in the late 1980s, the development of soft ionization techniques such as electrospray ionization (ESI) and soft laser desorption (SLD) made it possible to ionize large biomolecules such as peptides and proteins while keeping them relatively intact. This laid the foundations of mass spectrometry applied to proteomics. In shotgun proteomics, the first step of the experiment generally consists of digesting the proteins in the sample into peptides with a proteolytic enzyme such as trypsin. This notably increases throughput, in terms of the number of proteins that can be identified in a single experiment, compared with gel-based experiments. However, it comes at the cost of a highly complex peptide mixture and the added problem of inferring the proteins of origin. The peptides are separated by liquid chromatography and ionized, then enter the mass spectrometer, where they are separated according to their mass-to-charge ratio (m/z), and the values obtained are recorded in an MS1 spectrum. In tandem mass spectrometry (MS/MS), the most intense peptides are selected for fragmentation, generating MS/MS spectra: collections of m/z and intensity values for each precursor and its fragments...
A Proteomic View of an Important Human Pathogen – Towards the Quantification of the Entire Staphylococcus aureus Proteome
The genome sequence is the “blue-print of life,” but proteomics provides the link to the actual physiology of living cells. Because of their low complexity bacteria are excellent model systems to identify the entire protein assembly of a living organism. Here we show that the majority of proteins expressed in growing and non-growing cells of the human pathogen Staphylococcus aureus can be identified and even quantified by a metabolic labeling proteomic approach. S. aureus has been selected as model for this proteomic study, because it poses a major risk to our health care system by combining high pathogenicity with an increasing frequency of multiple antibiotic resistance, thus requiring the development of new anti-staphylococcal therapy strategies. Since such strategies will likely have to target extracellular and surface-exposed virulence factors as well as staphylococcal survival and adaptation capabilities, we decided to combine four subproteomic fractions: cytosolic proteins, membrane-bound proteins, cell surface-associated and extracellular proteins, to comprehensively cover the entire proteome of S. aureus. This quantitative proteomics approach integrating data ranging from gene expression to subcellular localization in growing and non-growing cells is a proof of principle for whole-cell physiological proteomics that can now be extended to address physiological questions in infection-relevant settings. Importantly, with more than 1700 identified proteins (and 1450 quantified proteins) corresponding to a coverage of about three-quarters of the expressed proteins, our model study represents the most comprehensive quantification of a bacterial proteome reported to date. It thus paves the way towards a new level in understanding of cell physiology and pathophysiology of S. aureus and related pathogenic bacteria, opening new avenues for infection-related research on this crucial pathogen
High-throughput assessment of small open reading frame translation in Drosophila melanogaster
Hundreds of thousands of putative small ORF (smORF) sequences are present in eukaryotic genomes, potentially coding for peptides of fewer than 100 amino acids. smORFs have been deemed non-coding on the basis of their high numbers and their small size, which makes it extremely challenging to assess their functionality both bioinformatically and biochemically. The recently developed Ribo-Seq technique, which is the deep sequencing of ribosome footprints, has generated significant controversy by showing extensive translation of smORFs outside of annotated protein coding regions, including putative non-coding RNAs. Our lab adapted the Ribo-Seq technique by combining it with polysome fractionation in order to assess smORF translation in Drosophila S2 cells. This thesis provides a high-throughput assessment of smORF translation in Drosophila melanogaster, firstly by implementing complementary techniques such as transfection-tagging and mass spectrometry methods in order to provide an independent corroboration of the S2 cell data (Chapter 3). Secondly, in order to expand the catalogue of smORFs that are translated, I significantly improve upon the yield and sequencing efficiency of the Poly-Ribo-Seq protocol while adapting it to Drosophila embryos and then implementing it across embryogenesis divided into Early, Mid and Late stages (Chapter 4). Currently, there is still a lot of debate in the field with regard to Ribo-Seq data analysis, and various computational metrics have been developed aimed at discerning ‘real’ translation events from background noise. Chapter 5 explores some of the metrics developed and establishes a translation cut-off suitable for designating small ORFs as translated. Altogether, the improvements introduced to the protocol and my data analysis show the translation of 500 annotated smORFs, 500 smORFs in long non-coding RNAs and 5,000 uORFs, of which only one-third of each type of smORF has previous evidence of translation. These findings strengthen the establishment of smORFs as a distinct class of genes that significantly expand the protein coding complement of the genome.