Frog: a FRee Online druG 3D conformation generator
In silico screening methods based on the 3D structures of the ligands or of the proteins have become an essential tool to facilitate the drug discovery process. To carry out such screening, 3D structures of the small chemical compounds must first be generated. In addition, for ligand-based screening computations or hierarchical structure-based screening projects involving a rigid-body docking step, it is necessary to generate multi-conformer 3D models for each input ligand to increase the efficiency of the search. However, most academic or commercial compound collections are delivered in 1D SMILES (simplified molecular input line entry system) format or in 2D SDF (structure data file) format, highlighting the need for free 1D/2D to 3D structure generators. Frog is an on-line service aimed at generating 3D conformations for drug-like compounds starting from their 1D or 2D descriptions. Given the atomic constitution of the molecules and connectivity information, Frog can identify the different unambiguous isomers corresponding to each compound, and generate single or multiple low-to-medium energy 3D conformations, using an assembly process that does not presently consider ring flexibility. Tests show that Frog is able to generate bioactive conformations close to those observed in crystallographic complexes. Frog can be accessed at http://bioserv.rpbs.jussieu.fr/Frog.html
RPBS: a web resource for structural bioinformatics
RPBS (Ressource Parisienne en Bioinformatique Structurale) is a resource dedicated primarily to structural bioinformatics. It is the result of a joint effort by several teams to set up an interface that offers original and powerful methods in the field. As an illustration, we focus here on three such methods uniquely available at RPBS: AUTOMAT for sequence databank scanning, YAKUSA for structure databank scanning and WLOOP for homology loop modelling. The RPBS server can be accessed at and the specific services at
Inference of Co-Evolving Site Pairs: an Excellent Predictor of Contact Residue Pairs in Protein 3D structures
Residue-residue interactions that fold a protein into a unique
three-dimensional structure and make it play a specific function impose
structural and functional constraints on each residue site. Selective
constraints on residue sites are recorded in amino acid orders in homologous
sequences and also in the evolutionary trace of amino acid substitutions. A
challenge is to extract direct dependences between residue sites by removing
indirect dependences through other residues within a protein or even through
other molecules. Recent attempts of disentangling direct from indirect
dependences of amino acid types between residue positions in multiple sequence
alignments have revealed that the strength of inferred residue pair couplings
is an excellent predictor of residue-residue proximity in folded structures.
Here, we report an alternative attempt of inferring co-evolving site pairs from
concurrent and compensatory substitutions between sites in each branch of a
phylogenetic tree. First, branch lengths of a phylogenetic tree inferred by the
neighbor-joining method are optimized as well as other parameters by maximizing
a likelihood of the tree in a mechanistic codon substitution model. Mean
changes of quantities, which are characteristic of concurrent and compensatory
substitutions, accompanied by substitutions at each site in each branch of the
tree are estimated with the likelihood of each substitution. Partial
correlation coefficients of the characteristic changes along branches between
sites are calculated and used to rank co-evolving site pairs. Accuracy of
contact prediction based on the present co-evolution score is comparable to
that achieved by a maximum entropy model of protein sequences for 15 protein
families taken from the Pfam release 26.0. Besides, this excellent accuracy
indicates that compensatory substitutions are significant in protein evolution.
Comment: 17 pages, 4 figures, and 4 tables with supplementary information of 5
figures
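To illustrate why partial correlation helps separate direct from indirect dependences, here is a minimal Python sketch of the first-order, three-variable case. It is not the authors' implementation: the function names `pearson` and `partial_corr` and the toy data are assumptions made for illustration, and the abstract's method works on characteristic changes along tree branches across many sites, not on three short lists.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z
    (first-order partial correlation; hypothetical helper)."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Toy data (assumed): x and y both follow z, plus mutually unrelated noise.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 2, 5, 6, 5, 6, 9]
y = [2, 3, 2, 3, 4, 5, 8, 9]

print(pearson(x, y))          # strong raw correlation, induced by z
print(partial_corr(x, y, z))  # close to zero once z is controlled for
```

In this toy case the raw correlation between x and y is high only because both depend on z; controlling for z removes it, which is the sense in which an indirect dependence is filtered out.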
Improving Internal Peptide Dynamics in the Coarse-Grained MARTINI Model: Toward Large-Scale Simulations of Amyloid- and Elastin-like Peptides
We present an extension of the coarse-grained MARTINI model for
proteins and apply this extension to amyloid- and elastin-like peptides.
Atomistic simulations of tetrapeptides, octapeptides, and longer peptides
in solution are used as a reference to parametrize a set of pseudodihedral
potentials that describe the internal flexibility of MARTINI peptides.
We assess the performance of the resulting model in reproducing various
structural properties computed from atomistic trajectories of peptides
in water. The addition of new dihedral angle potentials improves agreement
with the contact maps computed from atomistic simulations significantly.
We also address the question of which parameters derived from atomistic
trajectories are transferable between different lengths of peptides.
The modified coarse-grained model shows reasonable transferability
of parameters for the amyloid- and elastin-like peptides. In addition,
the improved coarse-grained model is also applied to investigate the
self-assembly of β-sheet forming peptides on the microsecond
time scale. The octapeptides SNNFGAIL and (GV)4 are used
to examine peptide aggregation in different environments, in water,
and at the water–octane interface. At the interface, peptide
adsorption occurs rapidly, and peptides spontaneously aggregate in
favor of stretched conformers resembling β-strands.
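A pseudodihedral potential acts on the torsion angle defined by four consecutive coarse-grained beads. The sketch below shows one common way to compute such an angle from four 3D positions; the function name `dihedral_deg` and the sample coordinates are assumptions, and the abstract does not specify which beads or which functional form the parametrized potentials use.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Torsion angle in degrees, in [-180, 180], for four points.

    Uses the atan2 form, which stays numerically stable near 0 and 180 degrees.
    """
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical bead positions: a fully stretched (trans) arrangement.
stretched = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
print(abs(stretched))  # magnitude of the torsion angle, in degrees
```

Collecting this angle over an atomistic trajectory mapped to beads would give the kind of distribution against which a dihedral potential can be fitted.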
Modern modelling techniques are data hungry: a simulation study for predicting dichotomous endpoints
BACKGROUND: Modern modelling techniques may potentially provide more accurate predictions of binary outcomes than classical techniques. We aimed to study the predictive performance of different modelling techniques in relation to the effective sample size (“data hungriness”). METHODS: We performed simulation studies based on three clinical cohorts: 1282 patients with head and neck cancer (with 46.9% 5 year survival), 1731 patients with traumatic brain injury (22.3% 6 month mortality) and 3181 patients with minor head injury (7.6% with CT scan abnormalities). We compared three relatively modern modelling techniques, support vector machines (SVM), neural nets (NN), and random forests (RF), and two classical techniques, logistic regression (LR) and classification and regression trees (CART). We created three large artificial databases with 20 fold, 10 fold and 6 fold replication of subjects, where we generated dichotomous outcomes according to different underlying models. We applied each modelling technique to increasingly large development parts (100 repetitions). The area under the ROC-curve (AUC) indicated the performance of each model in the development part and in an independent validation part. Data hungriness was defined by plateauing of AUC and small optimism (difference between the mean apparent AUC and the mean validated AUC <0.01). RESULTS: We found that a stable AUC was reached by LR at approximately 20 to 50 events per variable, followed by CART, SVM, NN and RF models. Optimism decreased with increasing sample sizes, with the same ranking of techniques. The RF, SVM and NN models showed instability and a high optimism even with >200 events per variable. CONCLUSIONS: Modern modelling techniques such as SVM, NN and RF may need over 10 times as many events per variable to achieve a stable AUC and a small optimism as classical modelling techniques such as LR.
This implies that such modern techniques should only be used in medical prediction problems if very large data sets are available. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/1471-2288-14-137) contains supplementary material, which is available to authorized users
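The quantities used in this abstract (AUC, optimism, events per variable) can be sketched in a few lines of Python. This is a hedged illustration only: the function names and sample numbers are assumptions, the pairwise AUC computation is one standard definition and not necessarily the one used in the study, and counting events as outcomes coded 1 is an assumed convention.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (event, non-event)
    pairs in which the event has the higher score; ties count one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def optimism(apparent_aucs, validated_aucs):
    """Mean apparent AUC minus mean validated AUC over repetitions."""
    mean = lambda v: sum(v) / len(v)
    return mean(apparent_aucs) - mean(validated_aucs)

def events_per_variable(labels, n_predictors):
    """Number of events (labels equal to 1, an assumed convention)
    divided by the number of candidate predictors."""
    return sum(labels) / n_predictors

# Illustrative numbers only, not taken from the study.
print(auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
print(optimism([0.80, 0.82], [0.75, 0.77]) < 0.01)
print(events_per_variable([1, 0, 1, 1, 0, 0, 1, 0], 2))
```

Under the abstract's definition, a technique would count as no longer data hungry at a given sample size once the validated AUC has plateaued and the optimism falls below 0.01.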
SCit: web tools for protein side chain conformation analysis
SCit is a web server providing services for protein side chain conformation analysis and side chain positioning. Specific services use the dependence of the side chain conformations on the local backbone conformation, which is described using a structural alphabet that describes the conformation of fragments of four-residue length in a limited library of structural prototypes. Based on this concept, SCit uses sets of rotameric conformations dependent on the local backbone conformation of each protein for side chain positioning and the identification of side chains with unlikely conformations. The SCit web server is accessible at http://bioserv.rpbs.jussieu.fr/SCit