12,923 research outputs found

A parallel algorithm for de novo peptide sequencing

Author: Elena Lodi
Elisa Mori
Sara Brunetti
Sonia Campa
Publication date: 08/05/2007

Protein identification is a main problem in proteomics, the large-scale analysis of proteins. Tandem mass spectrometry (MS/MS) provides an important tool to handle the protein identification problem. Indeed, the spectrometer is capable of ionizing a mixture of peptides, essentially several copies of the same unknown peptide, dissociating every molecule into two fragments called complementary ions, and measuring the mass/charge ratios of the peptides and of their fragments. These measures are visualized as mass peaks in a mass spectrum. There are two fundamental approaches to interpret the spectra. The first approach is to search in a database to find the peptides that match the MS/MS spectra. This database search approach is effective for known proteins, but does not permit the detection of novel proteins. This second task can be dealt with by de novo sequencing, which computes the amino acid sequence of the peptides directly from their MS/MS spectra. In the de novo sequencing problem one knows the peptide mas

Springer - Publisher Connector

Open Access Repository

Protein Sequencing with an Adaptive Genetic Algorithm from Tandem Mass Spectrometry

Author: Boisson Jean-Charles
Jourdan Laetitia
Rolando Christian
Talbi El-Ghazali
Publication date: 06/02/2008

In proteomics, only the de novo peptide sequencing approach allows a partial amino acid sequence of a peptide to be found from an MS/MS spectrum. In this article, preliminary work is presented to discover a complete protein sequence from spectral data (MS and MS/MS spectra). For the moment, our approach only uses MS spectra. A Genetic Algorithm (GA) has been designed with a new evaluation function which works directly with a complete MS spectrum as input and not with a mass list like the other methods using this kind of data. Thus the monoisotopic peak extraction step, which needs human intervention, is eliminated. The goal of this approach is to discover the sequence of unknown proteins and to allow a better understanding of the differences between experimental proteins and proteins from databases

arXiv.org e-Print Archive

CiteSeerX

The impact of sequence database choice on metaproteomic results in gut microbiota studies

Author: Addis Maria Filippa
Deligios Massimo
Fraumene Cristina
Manghina Valeria
Martens Lennart
Muth Thilo
Pagnozzi Daniela
Palomba Antonio
Rapp Erdmann
Tanca Alessandro
Uzzau Sergio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: Elucidating the role of gut microbiota in physiological and pathological processes has recently emerged as a key research aim in life sciences. In this respect, metaproteomics, the study of the whole protein complement of a microbial community, can provide a unique contribution by revealing which functions are actually being expressed by specific microbial taxa. However, its wide application to gut microbiota research has been hindered by challenges in data analysis, especially related to the choice of the proper sequence databases for protein identification. Results: Here, we present a systematic investigation of variables concerning database construction and annotation and evaluate their impact on human and mouse gut metaproteomic results. We found that both publicly available and experimental metagenomic databases lead to the identification of unique peptide assortments, suggesting parallel database searches as a means to gain more complete information. In particular, the contribution of experimental metagenomic databases was revealed to be mandatory when dealing with mouse samples. Moreover, the use of a "merged" database, containing all metagenomic sequences from the population under study, was found to be generally preferable over the use of sample-matched databases. We also observed that taxonomic and functional results are strongly database-dependent, in particular when analyzing the mouse gut microbiota. As a striking example, the Firmicutes/Bacteroidetes ratio varied up to tenfold depending on the database used. Finally, assembling reads into longer contigs provided significant advantages in terms of functional annotation yields. Conclusions: This study contributes to identifying host- and database-specific biases which need to be taken into account in a metaproteomic experiment, providing meaningful insights on how to design gut microbiota studies and to perform metaproteomic data analysis. In particular, the use of multiple databases and annotation tools has to be encouraged, even though this requires appropriate bioinformatic resources

AIR Universita degli studi di Milano

Ghent University Academic Bibliography

PubMed Central

MPG.PuRe

PARPST: a PARallel algorithm to find peptide sequence tags

Author: Brunetti Sara
Lodi Elena
Mori Elisa
Stella Maria
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: Protein identification is one of the most challenging problems in proteomics. Tandem mass spectrometry provides an important tool to handle the protein identification problem. Results: We developed a work-efficient parallel algorithm for the peptide sequence tag problem. The algorithm runs on the concurrent-read, exclusive-write PRAM in O(n) time using log n processors, where n is the number of mass peaks in the spectrum. The algorithm is able to find all the sequence tags having score greater than a parameter or all the sequence tags of maximum length. Our tests on 1507 spectra in the Open Proteomics Database showed that our algorithm is efficient and effective, since it achieves results comparable to those of other methods. Conclusions: The proposed algorithm can be used to speed up the database searching or to identify post-translational modifications, comparing the homology of the sequence tags found with the sequences in the biological database

Crossref

Archivio della Ricerca - Università degli Studi di Siena

Springer - Publisher Connector

PubMed Central
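
As a rough illustration of the peptide sequence tag problem named in the PARPST abstract above, the sketch below finds all maximum-length tags in a list of peak masses. It is a plain sequential dynamic program, not the authors' PRAM algorithm, and it makes several assumptions that are not in the abstract: the function name longest_tags, the mass tolerance of 0.02 Da, a residue table restricted to four amino acids, and the absence of any scoring.

```python
from typing import Dict, List

# Monoisotopic residue masses (Da) for a small illustrative subset of amino acids.
# A real tool would use the full table; this subset is an assumption for brevity.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "V": 99.06841,
}


def longest_tags(peaks: List[float], tol: float = 0.02) -> List[str]:
    """Return all maximum-length sequence tags readable from the peak list."""
    peaks = sorted(peaks)
    n = len(peaks)
    # best[j] holds the longest tags ending at peak j; all have equal length.
    best: List[List[str]] = [[""] for _ in range(n)]
    for j in range(n):
        for i in range(j):
            gap = peaks[j] - peaks[i]
            for aa, mass in RESIDUE_MASS.items():
                if abs(gap - mass) > tol:
                    continue
                # The gap between peaks i and j matches residue aa: extend tags.
                candidates = [t + aa for t in best[i]]
                if len(candidates[0]) > len(best[j][0]):
                    best[j] = candidates
                elif len(candidates[0]) == len(best[j][0]):
                    best[j] += [c for c in candidates if c not in best[j]]
    longest = max((len(tags[0]) for tags in best), default=0)
    if longest == 0:
        return []
    return sorted({t for tags in best for t in tags if len(t) == longest})


if __name__ == "__main__":
    # Peaks spaced by the masses of G, A and S in turn.
    print(longest_tags([100.0, 157.02146, 228.05857, 315.0906]))
```

Peaks are treated as vertices and residue-sized gaps as edges, so a tag is a path in a directed acyclic graph ordered by mass; the double loop over peak pairs is quadratic in the number of peaks.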

Coordinated RNA-Seq and peptidomics identify neuropeptides and G-protein coupled receptors (GPCRs) in the large pine weevil Hylobius abietis, a major forestry pest

Author: Davies Shireen-Anne
Dow Julian A.T.
Inward Daegan J.G.
Marley Richard
Pandit Aniruddha A.
Predel Reinhard
Ragionieri Lapo
Yeoh Joseph G.C.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Hylobius abietis (Linnaeus), or large pine weevil (Coleoptera, Curculionidae), is a pest of European coniferous forests. In order to gain understanding of the functional physiology of this species, we have assembled a de novo transcriptome of H. abietis, from sequence data obtained by Next Generation Sequencing. In particular, we have identified genes encoding neuropeptides, peptide hormones and their putative G-protein coupled receptors (GPCRs) to gain insights into neuropeptide-modulated processes. The transcriptome was assembled de novo from pooled paired-end sequence reads obtained from RNA from whole adults, gut and central nervous system tissue samples. Data analysis was performed on the transcripts obtained from the assembly, including annotation, gene ontology and functional assignment, as well as transcriptome completeness assessment and KEGG pathway analysis. Pipelines were created using bioinformatics tools and techniques for prediction and identification of neuropeptides and neuropeptide receptors. Peptidomic analysis was also carried out using a combination of MALDI-TOF and Q-Exactive Orbitrap mass spectrometry to confirm the identified neuropeptides. 41 putative neuropeptide families were identified in H. abietis, including Adipokinetic hormone (AKH), CAPA and DH31. Neuropeptide F, which has not yet been identified in the model beetle T. castaneum, was identified. Additionally, 24 putative neuropeptide and 9 leucine-rich repeat containing G protein coupled receptor-encoding transcripts were determined using both alignment and non-alignment methods. This information, submitted to the NCBI sequence read archive repository (SRA accession: SRP133355), can now be used to inform understanding of neuropeptide-modulated physiology and behaviour in H. abietis, and to develop specific neuropeptide-based tools for H. abietis control

Kölner UniversitätsPublikationsServer

Enlighten

Counting approximately-shortest paths in directed acyclic graphs

Author: A.Z. Broder
B. Lu
C. Burge
D. Naor
D. Štefankovič
J.M. Buhmann
L.G. Valiant
M. Dyer
M. Jerrum
T. Chen
Publication date: 01/01/2013

Given a directed acyclic graph with positive edge-weights, two vertices s and t, and a threshold-weight L, we present a fully-polynomial time approximation-scheme for the problem of counting the s-t paths of length at most L. We extend the algorithm for the case of two (or more) instances of the same problem. That is, given two graphs that have the same vertices and edges and differ only in edge-weights, and given two threshold-weights L_1 and L_2, we show how to approximately count the s-t paths that have length at most L_1 in the first graph and length at most L_2 in the second graph. We believe that our algorithms should find application in counting approximate solutions of related optimization problems, where finding an (optimum) solution can be reduced to the computation of a shortest path in a purpose-built auxiliary graph

arXiv.org e-Print Archive

Maastricht University Research Portal

Crossref
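
To make the counting problem in the abstract above concrete, here is a minimal sketch that counts exactly the s-t paths of length at most L in a directed acyclic graph. It is not the approximation scheme the abstract describes: it is a simple pseudo-polynomial dynamic program, and it assumes positive integer edge weights and a caller-supplied topological order. The function name count_paths_within and its parameters are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable, int]  # (tail, head, positive integer weight)


def count_paths_within(
    order: List[Hashable],
    edges: List[Edge],
    s: Hashable,
    t: Hashable,
    limit: int,
) -> int:
    """Count s-t paths whose total weight is at most `limit`.

    `order` must be a topological order of the vertices (assumed, not checked).
    """
    out: Dict[Hashable, List[Tuple[Hashable, int]]] = defaultdict(list)
    for u, v, w in edges:
        out[u].append((v, w))

    # counts[v][d] = number of s-v paths of total weight exactly d (d <= limit).
    counts: Dict[Hashable, Dict[int, int]] = {v: defaultdict(int) for v in order}
    counts[s][0] = 1

    # In topological order, counts[u] is final by the time u is expanded.
    for u in order:
        for d, c in list(counts[u].items()):
            for v, w in out[u]:
                if d + w <= limit:
                    counts[v][d + w] += c
    return sum(counts[t].values())


if __name__ == "__main__":
    edges = [("s", "a", 1), ("s", "b", 2), ("a", "b", 1), ("a", "t", 3), ("b", "t", 1)]
    print(count_paths_within(["s", "a", "b", "t"], edges, "s", "t", 3))
```

The table size grows with the threshold L, so this exact method is only practical for small integer thresholds.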

Protein Sequencing with an Adaptive Genetic Algorithm from Tandem Mass Spectrometry

Author: Boisson Jean-Charles
Jourdan Laetitia
Rolando Christian
Talbi El-Ghazali
Publication venue: HAL CCSD
Publication date: 16/07/2006

In proteomics, only the de novo peptide sequencing approach allows a partial amino acid sequence of a peptide to be found from an MS/MS spectrum. In this article, preliminary work is presented to discover a complete protein sequence from spectral data (MS and MS/MS spectra). For the moment, our approach only uses MS spectra. A Genetic Algorithm (GA) has been designed with a new evaluation function which works directly with a complete MS spectrum as input and not with a mass list like the other methods using this kind of data. Thus the monoisotopic peak extraction step, which needs human intervention, is eliminated. The goal of this approach is to discover the sequence of unknown proteins and to allow a better understanding of the differences between experimental proteins and proteins from databases

HAL - Lille 3

INRIA a CCSD electronic archive server