Is protein folding problem really a NP-complete one ? First investigations
Determining the 3D conformation of proteins is necessary to understand
their functions or interactions with other molecules. It is commonly accepted
that, when proteins fold from their primary linear structures to their final 3D
conformations, they tend to choose the ones that minimize their free energy. To
find the 3D conformation of a protein knowing its amino acid sequence,
bioinformaticians use various models of different resolutions and artificial
intelligence tools, as the protein folding prediction problem is an NP-complete
one. More precisely, determining the backbone structure of the protein with
the low-resolution models (2D HP square and 3D HP cubic), by finding the
conformation that minimizes free energy, is intractable to solve exactly. Both the proof
of NP-completeness and the 2D prediction consider that acceptable conformations
have to satisfy a self-avoiding walk (SAW) requirement, as two different amino
acids cannot occupy the same position in the lattice. It is shown in this
document that the SAW requirement considered when proving NP-completeness is
different from the SAW requirement used in various prediction programs, and
that they are different from the real biological requirement. Indeed, the proof
of NP-completeness and the in silico predictions consider conformations that
are not possible in practice. Consequences of this fact are investigated in
this research work.
Comment: Submitted to Journal of Bioinformatics and Computational Biology,
under review
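The SAW condition mentioned in the abstract above (no two amino acids on the same lattice site) can be illustrated with a minimal sketch for the 2D square lattice. The move alphabet `U`/`D`/`L`/`R` and the function names are assumptions made for this example, not taken from the paper. The check covers only the final conformation and does not model any of the distinct SAW requirements the paper contrasts.

```python
from typing import List, Tuple

# Unit moves on the 2D square lattice (the labels are an assumption of this sketch).
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def walk_positions(moves: str) -> List[Tuple[int, int]]:
    """Return the lattice sites visited by a chain that starts at the origin."""
    x, y = 0, 0
    positions = [(x, y)]
    for move in moves:
        dx, dy = MOVES[move]
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions


def is_self_avoiding(moves: str) -> bool:
    """True if no lattice site is occupied by more than one residue."""
    positions = walk_positions(moves)
    return len(set(positions)) == len(positions)
```

For instance, the move string `"RULD"` returns to the origin on its last step, so `is_self_avoiding("RULD")` is false, while `"RRU"` never revisits a site.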
The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases
One of the most intriguing groups of enzymes, the feruloyl esterases (FAEs), is ubiquitous in both simple and complex organisms. FAEs have gained importance in the biofuel, medicine and food industries due to their capability of acting on a large range of substrates for cleaving ester bonds and synthesizing high-added-value molecules through esterification and transesterification reactions. During the past two decades, extensive studies have been carried out on the production and partial characterization of FAEs from fungi, while much less is known about FAEs of bacterial or plant origin. Initial classification studies on FAEs were restricted to sequence similarity and substrate specificity on just four model substrates and considered only a handful of FAEs belonging to the fungal kingdom. This study centers on the descriptor-based classification and structural analysis of experimentally verified and putative FAEs; nevertheless, the framework presented here is applicable to every poorly characterized enzyme family. 365 FAE-related sequences of fungal, bacterial and plant origin were collected and clustered using Self Organizing Maps followed by k-means clustering into distinct groups based on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequence. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training sequences and all the sequences of the blind test. The underlying functionality of the 12 proposed FAE families was validated against a combination of prediction tools and published experimental data. Another important aspect of the present work involves the development of pharmacophore models for the new FAE families for which sufficient information on known substrates existed. Knowing the pharmacophoric features of a small molecule that are essential for binding to the members of a certain family opens a window of opportunities for tailored applications of FAEs.
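As an illustration of the simplest kind of sequence-derived descriptor named in the abstract above, the following sketch computes an amino acid composition vector: the fraction of each of the 20 standard residues in a sequence. The function name, the residue ordering and the handling of non-standard characters are assumptions of this example. The study's actual descriptor set, which also includes physico-chemical descriptors, is not specified here.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in alphabetical one-letter order
# (the ordering is an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard residue, in AMINO_ACIDS order.

    Characters outside the 20 standard residues are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

A vector of this form, one per sequence, is the kind of fixed-length input that clustering or classification methods such as those named in the abstract can consume.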
Fast Computation of the Fitness Function for Protein Folding Prediction in a 2D Hydrophobic-Hydrophilic Model
Protein Folding Prediction (PFP) is essentially an energy minimization problem formalised by the definition of a fitness function. Several PFP models have been proposed including the Hydrophobic-Hydrophilic (HP) model, which is widely used as a test-bed for evaluating new algorithms. The calculation of the fitness is the major computational task in determining the native conformation of a protein in the HP model and this paper presents a new efficient search algorithm (ESA) for deriving the fitness value requiring only O(n) complexity in contrast to the full search approach, which takes O(n^2). The improved efficiency of ESA is achieved by exploiting some intrinsic properties of the HP model, with a resulting reduction of more than 50% in the overall time complexity when compared with the previously reported Caching Approach, with the added benefit that the additional space complexity is linear instead of quadratic.
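The gap between a quadratic and a linear fitness computation can be sketched as follows for the 2D HP model, where fitness is usually taken as the number of H-H pairs that are lattice neighbours but not consecutive in the chain. This is not the paper's ESA or the Caching Approach, neither of which is described in the abstract. The second function is a generic hash-lookup alternative with expected linear time, and all names are assumptions of this example. Both functions assume the conformation is already a valid self-avoiding walk.

```python
from typing import List, Tuple

Position = Tuple[int, int]


def hp_contacts_full(sequence: str, positions: List[Position]) -> int:
    """Full search: test every non-consecutive residue pair, O(n^2)."""
    n = len(sequence)
    contacts = 0
    for i in range(n):
        for j in range(i + 2, n):  # j = i + 1 is a chain neighbour, skipped
            if sequence[i] == "H" and sequence[j] == "H":
                (xi, yi), (xj, yj) = positions[i], positions[j]
                if abs(xi - xj) + abs(yi - yj) == 1:
                    contacts += 1
    return contacts


def hp_contacts_lookup(sequence: str, positions: List[Position]) -> int:
    """Hash lookup: inspect a constant number of neighbours per residue."""
    index_at = {pos: i for i, pos in enumerate(positions)}
    contacts = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each lattice edge exactly once.
        for neighbour in ((x + 1, y), (x, y + 1)):
            j = index_at.get(neighbour)
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                contacts += 1
    return contacts
```

The two functions return the same count for any valid conformation. The lookup version trades a dictionary of size n for the inner loop of the full search, which is one simple way to obtain linear extra space.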