Flexible protein folding by ant colony optimization
Protein structure prediction is one of the most challenging topics in bioinformatics.
Because the structure of a protein is closely related to its function,
predicting how a protein folds is a valuable step toward inferring that function.
This chapter proposes a flexible ant colony (FAC) algorithm for solving
protein folding problems (PFPs) based on the hydrophobic-polar (HP) square lattice
model. Unlike previous ant algorithms for PFPs, the proposed algorithm
places pheromones on the arcs connecting adjacent squares in the lattice.
This pheromone placement is similar to the one used for the traveling salesman
problem (TSP), where pheromones are deposited on the arcs connecting the cities.
Moreover, the combination of effective heuristic and pheromone strategies greatly
enhances the performance of the algorithm, so that it achieves good
results without local search methods. Tests on benchmark two-dimensional
hydrophobic-polar (2D-HP) protein sequences show that the proposed
algorithm is competitive with other well-known methods
for solving the same protein folding problems.
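The objective that a search method optimizes in the 2D-HP square lattice model can be illustrated with a short sketch. It is not the FAC algorithm from the chapter; it only scores a given fold using the standard HP energy, which counts -1 for each pair of H residues that are lattice neighbours but not consecutive in the chain. The function name `hp_energy` and the coordinate representation are assumptions made for illustration.

```python
def hp_energy(sequence, coords):
    """Score a fold on the 2D square lattice under the standard HP model.

    sequence: string of 'H' (hydrophobic) and 'P' (polar) residues.
    coords:   list of (x, y) lattice positions, one per residue (hypothetical format).
    Returns minus the number of non-consecutive H-H lattice contacts.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("fold is not self-avoiding")
    # Each chain step must move to an adjacent lattice square.
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    index_at = {pos: i for i, pos in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


# A four-residue chain folded around a unit square: the two end residues touch.
square_fold = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", square_fold))
```

A lower (more negative) energy indicates a more compact hydrophobic core, which is what an ant colony or any other search procedure would try to minimize over self-avoiding folds.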
Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome
The article presents an application of Hidden Markov Models (HMMs) for
pattern recognition on genome sequences. We apply HMMs to identify genes
encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma
brucei (T. brucei) and other African trypanosomes. These parasitic protozoa are the
causative agents of sleeping sickness and of several diseases in domestic and wild
animals. The parasites have a peculiar strategy for evading the host's immune
system, which consists of periodically changing their predominant cellular
surface protein (VSG). The motivation for using pattern recognition methods to
identify these genes, instead of traditional homology-based ones, is that the
level of sequence identity (amino acid and DNA sequence) among these genes
is often below what is considered reliable for those methods. Among pattern
recognition approaches, HMMs are particularly suitable for this problem
because they handle the determination of gene boundaries more naturally. We
evaluate the performance of the model using different numbers of states in the
Markov model, as well as several performance metrics. The model is applied
to public genomic data. Our empirical results show that the VSG genes of T.
brucei can be reliably identified (high sensitivity and a low rate of false
positives) using HMMs.
Comment: Accepted article in July 2015 in Pattern Analysis and Applications,
Springer. The article contains 23 pages, 4 figures, 8 tables and 51
references.
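How an HMM locates gene boundaries can be illustrated with a toy Viterbi decoder. This is a minimal sketch under stated assumptions, not the model from the article: the two states (`N` for non-gene, `G` for gene), the GC-rich emission probabilities, and the transition probabilities are all hypothetical values chosen only to make the example readable.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state path for the observations (log-space Viterbi)."""
    # best[t][s] is the log-probability of the best path ending in state s at time t.
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
             for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit_p[s][observations[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical two-state model: 'G' (gene) is GC-rich, 'N' (non-gene) is AT-rich.
states = ("N", "G")
start_p = {"N": 0.5, "G": 0.5}
trans_p = {"N": {"N": 0.9, "G": 0.1}, "G": {"N": 0.1, "G": 0.9}}
emit_p = {
    "N": {"A": 0.4, "T": 0.4, "C": 0.1, "G": 0.1},
    "G": {"A": 0.1, "T": 0.1, "C": 0.4, "G": 0.4},
}

path = viterbi("AATTGCGCGCTTAA", states, start_p, trans_p, emit_p)
print("".join(path))  # NNNNGGGGGGNNNN
```

The positions where the decoded path switches between `N` and `G` are the inferred region boundaries, which is the property the abstract refers to when it says HMMs handle gene edges naturally. A realistic model would use more states and parameters estimated from annotated sequences.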
The antigenic index: a novel algorithm for predicting antigenic determinants
In this paper, we introduce a computer algorithm which can
be used to predict the topological features of a protein directly
from its primary amino acid sequence. The computer program
generates values for surface accessibility parameters and combines
these values with those obtained for regional backbone
flexibility and predicted secondary structure. The output of this
algorithm, the antigenic index, is used to create a linear surface
contour profile of the protein. Because most, if not all,
antigenic sites are located within surface exposed regions of
a protein, the program offers a reliable means of predicting
potential antigenic determinants. We have tested the ability of
this program to generate accurate surface contour profiles and
predict antigenic sites from the linear amino acid sequences
of well-characterized proteins and found a strong correlation
between the predictions of the antigenic index and known structural
and biological data.
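The idea of combining several per-residue predictions into one linear profile can be sketched as follows. This is an illustration only: the weights, the threshold, the minimum region length, and the function names `combined_index` and `candidate_regions` are hypothetical and are not the published antigenic index parameters. The input score tracks are assumed to have been computed elsewhere.

```python
def combined_index(accessibility, flexibility, structure, weights=(0.5, 0.25, 0.25)):
    """Weighted per-residue combination of three score tracks.

    The default weights are placeholders for illustration, not published values.
    """
    if not (len(accessibility) == len(flexibility) == len(structure)):
        raise ValueError("all score tracks must cover the same residues")
    w_acc, w_flex, w_struct = weights
    return [w_acc * a + w_flex * f + w_struct * s
            for a, f, s in zip(accessibility, flexibility, structure)]


def candidate_regions(profile, threshold, min_length=3):
    """Return (start, end) pairs, end exclusive, where the profile stays >= threshold."""
    regions, start = [], None
    for i, value in enumerate(profile):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(profile) - start >= min_length:
        regions.append((start, len(profile)))
    return regions


# Made-up scores for a seven-residue stretch.
profile = combined_index(
    accessibility=[0, 1, 1, 1, 0, 1, 1],
    flexibility=[0, 1, 1, 1, 0, 1, 1],
    structure=[0, 1, 1, 1, 0, 1, 1],
)
print(candidate_regions(profile, threshold=1.0))
```

Runs of residues whose combined score stays above the threshold are reported as candidate surface-exposed regions; short runs are discarded by `min_length`.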
Seeing the Forest for the Trees: Using the Gene Ontology to Restructure Hierarchical Clustering
Motivation: There is a growing interest in improving the cluster analysis of expression data by incorporating into it prior knowledge, such as the Gene Ontology (GO) annotations of genes, in order to improve the biological relevance of the clusters that are subjected to subsequent scrutiny. The structure of the GO is another source of background knowledge that can be exploited through the use of semantic similarity.
Results: We propose here a novel algorithm that integrates semantic similarities (derived from the ontology structure) into the procedure of deriving clusters from the dendrogram constructed during expression-based hierarchical clustering. Our approach can handle the multiple annotations, from different levels of the GO hierarchy, which most genes have. Moreover, it treats annotated and unannotated genes in a uniform manner. Consequently, the clusters obtained by our algorithm are characterized by significantly enriched annotations. In both cross-validation tests and when using an external index such as protein–protein interactions, our algorithm performs better than previous approaches. When applied to human cancer expression data, our algorithm identifies, among others, clusters of genes related to immune response and glucose metabolism. These clusters are also supported by protein–protein interaction data.
Contact: [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
Funding: Lynne and William Frankel Center for Computer Science; Paul Ivanier Center for Robotics Research and Production; National Institutes of Health (R01 HG003367-01A1).
Review of Immunoinformatic approaches to in-silico B-cell epitope prediction
In this paper, the current state of in silico B-cell epitope prediction is discussed. Recommendations for improving some of the approaches encountered are outlined, and an entirely novel technique is presented that uses molecular mechanics for epitope classification, evaluation and prediction.
Multiomics modeling of the immunome, transcriptome, microbiome, proteome and metabolome adaptations during human pregnancy.
Motivation: Multiple biological clocks govern a healthy pregnancy. These biological mechanisms produce immunologic, metabolomic, proteomic, genomic and microbiomic adaptations during the course of pregnancy. Modeling the chronology of these adaptations during full-term pregnancy provides the frameworks for future studies examining deviations implicated in pregnancy-related pathologies including preterm birth and preeclampsia.
Results: We performed a multiomics analysis of 51 samples from 17 pregnant women, delivering at term. The datasets included measurements from the immunome, transcriptome, microbiome, proteome and metabolome of samples obtained simultaneously from the same patients. Multivariate predictive modeling using the Elastic Net (EN) algorithm was used to measure the ability of each dataset to predict gestational age. Using stacked generalization, these datasets were combined into a single model. This model not only significantly increased predictive power by combining all datasets, but also revealed novel interactions between different biological modalities. Future work includes expansion of the cohort to preterm-enriched populations and in vivo analysis of immune-modulating interventions based on the mechanisms identified.
Availability and implementation: Datasets and scripts for reproduction of results are available through: https://nalab.stanford.edu/multiomics-pregnancy/.
Supplementary information: Supplementary data are available at Bioinformatics online.