8,701 research outputs found

    GA-Novo: De Novo Peptide Sequencing via Tandem Mass Spectrometry using Genetic Algorithm

    Proteomics is the large-scale analysis of proteins. The common method for identifying proteins and characterising their amino acid sequences is to digest the proteins into peptides, analyse the peptides using mass spectrometry, and assign the resulting tandem mass spectra (MS/MS) to peptides using database search tools. However, database search algorithms depend heavily on a reference protein database and cannot identify peptides and proteins that are absent from it. De novo sequencing algorithms have therefore been developed to overcome this problem by reconstructing the peptide sequence directly from an MS/MS spectrum, without using any protein database. Current de novo sequencing algorithms often fail to construct completely matched sequences and produce only partial matches. In this study, we propose a genetic-algorithm-based method, GA-Novo, to solve the complex optimisation task of de novo peptide sequencing, with the aim of constructing full-length sequences. Given an MS/MS spectrum, GA-Novo optimises amino acid sequences to best fit the input spectrum. On the testing dataset, GA-Novo outperforms PEAKS, the most commonly used software for this task, constructing 8% more fully matched peptide sequences and achieving 4% higher recall on partially matched sequences.
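
The idea of evolving candidate sequences to fit a spectrum can be illustrated with a toy sketch. This is not GA-Novo's actual method: the fixed peptide length, the eight-residue alphabet, the peak-counting fitness, and the names `fragment_masses`, `fitness` and `evolve` are all assumptions made for illustration, and a real scorer would also use peak intensities, noise handling and the precursor mass.

```python
import random

# Monoisotopic residue masses (Da) for a small illustrative alphabet (assumption:
# a real tool would cover all 20 amino acids and common modifications).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
}
PROTON = 1.00728
WATER = 18.01056


def fragment_masses(peptide):
    """Singly charged b- and y-ion m/z values for a peptide string."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    total = sum(masses)
    ions = []
    prefix = 0.0
    for m in masses[:-1]:
        prefix += m
        ions.append(prefix + PROTON)                  # b ion (N-terminal fragment)
        ions.append(total - prefix + WATER + PROTON)  # y ion (C-terminal fragment)
    return ions


def fitness(peptide, peaks, tol=0.02):
    """Count observed peaks that lie within tol of some theoretical fragment."""
    theo = fragment_masses(peptide)
    return sum(any(abs(p - t) <= tol for t in theo) for p in peaks)


def evolve(peaks, length, pop_size=60, generations=80, seed=0):
    """Toy genetic algorithm: truncation selection, one-point crossover, point mutation."""
    rng = random.Random(seed)
    alphabet = sorted(RESIDUE_MASS)
    pop = ["".join(rng.choice(alphabet) for _ in range(length))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: fitness(s, peaks), reverse=True)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, length)       # one-point crossover
            child = list(a[:cut] + b[cut:])
            if rng.random() < 0.3:               # occasional point mutation
                child[rng.randrange(length)] = rng.choice(alphabet)
            children.append("".join(child))
        pop = parents + children
    return max(pop, key=lambda s: fitness(s, peaks))
```

For example, `evolve(fragment_masses("GASPVTLK"), 8)` searches for an 8-residue sequence whose theoretical fragments explain the synthetic peak list; the result is not guaranteed to equal the target, which mirrors the partial matches the abstract mentions.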

    Using evolutionary algorithms and machine learning to explore sequence space for the discovery of antimicrobial peptides

    We present a proof-of-concept methodology for efficiently optimizing a chemical trait by using an artificial evolutionary workflow. We demonstrate this by optimizing the efficacy of antimicrobial peptides (AMPs). In particular, we used a closed-loop approach that combines a genetic algorithm, machine learning, and in vitro evaluation to improve the antimicrobial activity of peptides against Escherichia coli. Starting with a 13-mer natural AMP, we identified 44 highly potent peptides, achieving up to a ca. 160-fold increase in antimicrobial activity within just three rounds of experiments. During these experiments, the conformation of the selected peptides changed from a random coil to an α-helical form. This strategy not only establishes the potential of in vitro molecule evolution using an algorithmic genetic system but also accelerates the discovery of antimicrobial peptides and other functional molecules within a relatively small number of experiments, allowing the exploration of broad sequence and structural space.

    Computational protein design with backbone plasticity

    The computational algorithms used in the design of artificial proteins have become increasingly sophisticated in recent years, producing a series of remarkable successes. The most dramatic of these is the de novo design of artificial enzymes. The majority of these designs have reused naturally occurring protein structures as “scaffolds” onto which novel functionality can be grafted without having to redesign the backbone structure. The incorporation of backbone flexibility into protein design is a much more computationally challenging problem due to the greatly increased search space, but it promises to remove the limitations of reusing natural protein scaffolds. In this review, we outline the principles of computational protein design methods and discuss recent efforts to consider backbone plasticity in the design process.

    TRAPID: an efficient online tool for the functional and comparative analysis of de novo RNA-Seq transcriptomes

    Transcriptome analysis through next-generation sequencing technologies allows the generation of detailed gene catalogs for non-model species, at the cost of new challenges with regard to computational requirements and bioinformatics expertise. Here, we present TRAPID, an online tool for the fast and efficient processing of assembled RNA-Seq transcriptome data, developed to mitigate these challenges. TRAPID offers high-throughput open reading frame detection and frameshift correction, and includes a functional, comparative and phylogenetic toolbox that makes use of 175 reference proteomes. Benchmarking and comparison against state-of-the-art transcript analysis tools reveal the efficiency and unique features of the TRAPID system.
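
Open reading frame detection on an assembled transcript can be sketched as a six-frame scan for the longest ATG-to-stop stretch. This is a generic illustration, not TRAPID's implementation: the function names `reverse_complement` and `longest_orf` are hypothetical, the standard stop codons are assumed, and frameshift correction and partial ORFs without a start or stop codon are not handled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string containing only A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf(seq):
    """Return (strand, start, end, orf) for the longest ATG..stop ORF over six frames.

    Coordinates are 0-based, end-exclusive, and relative to the scanned strand.
    If no complete ORF exists, the result is ("+", 0, 0, "").
    """
    best = ("+", 0, 0, "")
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first in-frame start codon
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start > len(best[3]):
                        best = (strand, start, i + 3, s[start:i + 3])
                    start = None                   # look for the next ORF in this frame
    return best
```

Calling `longest_orf` on each transcript in turn gives a simple baseline; a production pipeline would typically also use reference proteins to choose the frame.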