Classifying the World Anti-Doping Agency's 2005 prohibited list using the Chemistry Development Kit fingerprint
Presented at CompLife 2006, Cambridge, 27-29 September 2006. We used the freely available Chemistry Development Kit (CDK) fingerprint to classify 5235 representative molecules taken from ten banned classes in the 2005 World Anti-Doping Agency’s (WADA) prohibited list, including molecules taken from the corresponding activity classes in the MDL Drug Data Report (MDDR). We used both Random Forest and k-Nearest Neighbours (kNN) algorithms to generate classifiers. The kNN classifiers with k = 1 gave a very slightly better Matthews Correlation Coefficient than the Random Forest classifiers; the latter, however, predicted fewer false positives. The performance of kNN classifiers tended to decline with increasing k. The performance of the CDK fingerprint is essentially equivalent to that of Unity 2D. Our results suggest that it will be possible to use freely available chemoinformatics tools to aid the fight against drugs in sport, while minimising the risk of wrongfully penalising innocent athletes. EPSRC
Unileve
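The classification setup described in the abstract above (kNN over fingerprints, scored by the Matthews Correlation Coefficient) can be sketched roughly as follows. This is a minimal illustration and not the authors' code: fingerprints are represented as plain Python sets of on-bit indices rather than real CDK fingerprints, Tanimoto similarity is assumed as the similarity measure (the abstract does not name one), the MCC is computed one-vs-rest for a single class, and all function names and toy data are hypothetical.

```python
from collections import Counter
from math import sqrt


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_predict(query, train, k=1):
    """Majority vote among the k training fingerprints most similar to query.

    train is a list of (fingerprint_set, class_label) pairs.
    """
    ranked = sorted(train, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def mcc(y_true, y_pred, positive):
    """Matthews Correlation Coefficient, treating `positive` vs. all other labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: made-up bit sets and class names, for illustration only.
train = [
    ({1, 2, 3}, "steroid"),
    ({1, 2, 4}, "steroid"),
    ({7, 8, 9}, "stimulant"),
]
predicted = knn_predict({1, 2, 3, 5}, train, k=1)
```

With k = 1 the prediction is simply the label of the single nearest neighbour, which is the setting the abstract reports as performing best by MCC.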
Full "Laplacianised" posterior naive Bayesian algorithm
BACKGROUND: In the last decade the standard Naive Bayes (SNB) algorithm has been widely employed in multi-class classification problems in cheminformatics. This popularity is mainly due to the fact that the algorithm is simple to implement and in many cases yields respectable classification results. Using clever heuristic arguments “anchored” by insightful cheminformatics knowledge, Xia et al. have simplified the SNB algorithm further and termed it the Laplacian Corrected Modified Naive Bayes (LCMNB) approach, which has been widely used in cheminformatics since its publication. In this note we mathematically illustrate the conditions under which Xia et al.’s simplification holds. It is our hope that this clarification will help Naive Bayes practitioners decide when it is appropriate to employ the LCMNB algorithm to classify large chemical datasets. RESULTS: A general formulation that subsumes the simplified Naive Bayes version is presented. Unlike the widely used NB method, the Standard Naive Bayes description presented in this work is discriminative (not generative) in nature, which may lead to further applications of the SNB method. CONCLUSIONS: Starting from a standard Naive Bayes (SNB) algorithm, we have derived mathematically the relationship between Xia et al.’s ingenious but heuristic algorithm and the SNB approach. We have also demonstrated the conditions under which Xia et al.’s crucial assumptions hold. We therefore hope that the new insight and recommendations provided will be useful to the cheminformatics community.
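For readers unfamiliar with Laplacian-corrected naive Bayes scoring, the sketch below shows one commonly quoted form of the heuristic. It is an assumption here, since the abstract does not state the formula: each feature F_i seen in N_i training samples, A_i of them active, contributes a weight log((A_i + 1) / (N_i · P + 1)), where P is the overall fraction of actives, and a molecule's score is the sum of the weights of its features. The function names and toy data are hypothetical, and the code is not taken from Xia et al. or from this paper.

```python
from collections import Counter
from math import log


def train_weights(samples):
    """Per-feature log weights from (feature_set, is_active) training pairs.

    Assumed form: weight = log((A_i + 1) / (N_i * P + 1)), with P the base
    rate of actives. The "+ 1" terms pull rarely seen features towards 0.
    """
    if not samples:
        raise ValueError("need at least one training sample")
    base_rate = sum(1 for _, active in samples if active) / len(samples)
    n_with_feature = Counter()
    n_active_with_feature = Counter()
    for features, active in samples:
        for f in features:
            n_with_feature[f] += 1
            if active:
                n_active_with_feature[f] += 1
    return {
        f: log((n_active_with_feature[f] + 1) / (n_with_feature[f] * base_rate + 1))
        for f in n_with_feature
    }


def score(features, weights):
    """Sum of feature weights; features unseen in training contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)


# Toy data: made-up feature names, for illustration only.
samples = [
    ({"a", "b"}, True),
    ({"a"}, True),
    ({"b", "c"}, False),
    ({"c"}, False),
]
weights = train_weights(samples)
```

A feature seen only in actives receives a positive weight, one seen only in inactives a negative weight, and one split in proportion to the base rate a weight of zero.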
Molecular Dynamics of Mesophilic-Like Mutants of a Cold-Adapted Enzyme: Insights into Distal Effects Induced by the Mutations
Networks and clusters of intramolecular interactions, as well as their “communication” across the three-dimensional architecture, have a prominent role in determining protein stability and function. Special attention has been dedicated to their role in thermal adaptation. In the present contribution, seven previously experimentally characterized mutants of a cold-adapted α-amylase, featuring mesophilic-like behavior, have been investigated by multiple molecular dynamics simulations, essential dynamics and analyses of correlated motions and electrostatic interactions. Our data elucidate the molecular mechanisms underlying the ability of single and multiple mutations to globally modulate dynamic properties of the cold-adapted α-amylase, including both local and complex unpredictable distal effects. Our investigation also shows, in agreement with the experimental data, that the conversion of the cold-adapted enzyme into a warm-adapted variant cannot be completely achieved by the introduction of a few mutations, and provides the rationale behind these effects. Moreover, pivotal residues, which are likely to mediate the effects induced by the mutations, have been identified from our analyses, as well as a group of suitable candidates for protein engineering. In fact, a subset of the residues identified here (such as an isoleucine, or networks of mesophilic-like salt bridges in the proximity of the catalytic site) should be considered in experimental studies to modify the features of the cold-adapted enzyme more efficiently.
CMLSnap: Animated reaction mechanisms
Reactions with many steps can be represented by a single XML-based table of the atoms, bonds and electrons. For each step the complete Chemical Markup Language representation of all components is given. These snapshots can then be combined to give an animated description of the complete reaction, both in "2D" chemical structure diagrams and in three dimensions. Here we demonstrate the method's power with enzymatic reactions. Preprint submitted to the Internet Journal of Chemistry and archived as a PRE-REFEREED PREPRINT under the Journal's ROMEO-GREEN policy. The manuscript is an HTML + SVG hyperdocument of many components. The main paper is deposited but the hyperlinks have not been added so will appear broken. A major theme of the article is the animation of reactions using SVG and for this the reader should view the individual document components. (As a last resort they may download the ZIP file, unpack it, and view it in a modern browser). If they do not have SVG they should install a plugin, e.g. from http://www.adobe.com/svg. The MAIN PAPER is paper.html THE MATERIAL IS COPYRIGHT AND MAY NOT CURRENTLY BE ALTERED OR REDISTRIBUTED. For more information, and animated demos, see http://wwmm.ch.cam.ac.uk/moin/CmlSna
Anisotropic repulsion potentials for cyanuric chloride (C3N3Cl3) and their application to modeling the crystal structures of azaaromatic chlorides
A series of nonempirical intermolecular potentials has been developed for the cyanuric chloride dimer, using the overlap model to determine the anisotropy of the repulsive wall around each atom. Calibration against intermolecular perturbation theory calculations enables the penetration and charge-transfer energy to be explicitly included with the exchange-repulsion to give a simple repulsion model in an anisotropic atom-atom form. These model repulsion potentials are used in conjunction with an atomic multipole electrostatic model and an atom-atom dispersion model to give nonempirical potential models, which are tested for their ability to reproduce the crystal structure of cyanuric chloride. The best nonempirical potential is successfully used to construct a simpler transferable model for closely related azaaromatic chlorides. The nonempirical potential reproduces the experimental space group of cyanuric chloride, unlike some empirically fitted repulsion-dispersion potentials. This first nonempirical repulsion potential to model the polar flattening of Cl atoms also reproduces the N···Cl and Cl···Cl interactions in other crystal structures.
Ligand and structure-based methodologies for the prediction of the activity of G protein-coupled receptor ligands
Accurate in silico models for the quantitative prediction of the activity of G protein-coupled receptor (GPCR) ligands would greatly facilitate the process of drug discovery and development. Several methodologies have been developed based on the properties of the ligands, the direct study of the receptor-ligand interactions, or a combination of both approaches. Ligand-based three-dimensional quantitative structure-activity relationships (3D-QSAR) techniques, which do not require knowledge of the receptor structure, were historically the first to be applied to the prediction of the activity of GPCR ligands. They are generally endowed with robustness and good ranking ability; however, they are highly dependent on training sets. Structure-based techniques generally do not provide the level of accuracy necessary to yield meaningful rankings when applied to GPCR homology models. However, they are essentially independent of training sets and have a sufficient level of accuracy to allow an effective discrimination between binders and nonbinders, thus qualifying as viable lead discovery tools. The combination of ligand and structure-based methodologies in the form of receptor-based 3D-QSAR and ligand and structure-based consensus models results in robust and accurate quantitative predictions. The contribution of the structure-based component to these combined approaches is expected to become more substantial and effective in the future, as more sophisticated scoring functions are developed and more detailed structural information on GPCRs is gathered.