Computer Aided Aroma Design. I. Molecular knowledge framework
Computer Aided Aroma Design (CAAD) is likely to become a hot issue, as the REACH EC document targets many aroma compounds for substitution. The two crucial steps in computer aided molecular design (CAMD) are the generation of candidate molecules and the estimation of their properties. Both can be difficult when complex molecular structures such as odorants are sought, because odour quality is definitely subjective and odour intensity is partly subjective, as stated in Rossitier’s review (1996). In part I, provided that classification rules like those presented in part II exist to assess the odour quality, the CAAD methodology presented proceeds with a multilevel approach matched by a versatile and novel molecular framework. It can distinguish the very small differences in chemical structure, as in isomers, that are responsible for different odour quality and intensity. In addition, its chemical graph concepts are well suited to the genetic algorithm sampling techniques used for efficient screening of large molecules such as aromas. Finally, an input/output XML format based on the aggregation of CML and ThermoML makes it possible to store the molecular classes as well as any subjective or objective property values computed during the CAAD process.
Automated Identification and Classification of Stereochemistry: Chirality and Double Bond Stereoisomerism
Stereoisomers have the same molecular formula and the same atom connectivity
and their existence can be related to the presence of different
three-dimensional arrangements. Stereoisomerism is of great importance in many
different fields since the molecular properties and biological effects of the
stereoisomers are often significantly different. Most drugs, for example, are
composed of a single stereoisomer of a compound: one stereoisomer may have
therapeutic effects on the body while another may be toxic. A challenging
task is the automatic detection of stereoisomers using line input
specifications such as SMILES or InChI since it requires information about
group theory (to distinguish stereoisomers using mathematical information about
their symmetry), topology and geometry of the molecule. There are several
software packages that include modules to handle stereochemistry, especially
the ones to name a chemical structure and/or view, edit and generate chemical
structure diagrams. However, there is a lack of software capable of
automatically analyzing a molecule represented as a graph and generating a
classification of the type of isomerism present at a given atom or bond.
Considering the importance of stereoisomerism when comparing chemical
structures, this report describes a computer program for analyzing and
processing steric information contained in a chemical structure represented as
a molecular graph and providing as output a binary classification of the isomer
type based on the recommended conventions. Due to the complexity of the
underlying issue, specification of stereochemical information is currently
limited to explicit stereochemistry and to the two most common types of
stereochemistry caused by asymmetry around carbon atoms: chiral atom and double
bond. A Webtool to automatically identify and classify stereochemistry is
available at http://nams.lasige.di.fc.ul.pt/tools.ph
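The graph-based detection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' program: the function names, the input layout (a list of element symbols plus `(atom, atom, bond_order)` tuples with explicit hydrogens) and the use of a Morgan-style label refinement to decide whether two substituents differ are all assumptions made for illustration. The sketch only flags candidate stereocentres and candidate stereo double bonds around carbon; it does not assign R/S or E/Z descriptors, it will also flag double bonds in small rings, and the refinement can occasionally fail to separate atoms that are not truly equivalent.

```python
from collections import defaultdict


def _compress(signatures):
    """Map arbitrary sortable signatures to small integer class labels."""
    rank = {s: r for r, s in enumerate(sorted(set(signatures.values())))}
    return {i: rank[s] for i, s in signatures.items()}


def symmetry_classes(elements, bonds):
    """Morgan-style refinement: atoms that end with equal labels are treated as equivalent."""
    nbrs = defaultdict(list)
    for a, b, order in bonds:
        nbrs[a].append((b, order))
        nbrs[b].append((a, order))
    labels = _compress({i: el for i, el in enumerate(elements)})
    while True:
        # New signature = own label plus the sorted (bond order, label) pairs of the neighbours.
        sigs = {i: (labels[i], tuple(sorted((o, labels[j]) for j, o in nbrs[i])))
                for i in labels}
        new = _compress(sigs)
        if len(set(new.values())) == len(set(labels.values())):
            return new, nbrs  # partition is stable
        labels = new


def stereo_candidates(elements, bonds):
    """Return (candidate stereocentres, candidate stereo double bonds) for carbon atoms."""
    labels, nbrs = symmetry_classes(elements, bonds)
    centres, double_bonds = [], []
    for i, el in enumerate(elements):
        subs = nbrs[i]
        # Tetrahedral carbon with four single bonds to four non-equivalent substituents.
        if el == "C" and len(subs) == 4 and all(o == 1 for _, o in subs):
            if len({labels[j] for j, _ in subs}) == 4:
                centres.append(i)
    for a, b, order in bonds:
        if order == 2 and elements[a] == elements[b] == "C":
            ends_ok = True
            for end, other in ((a, b), (b, a)):
                subs = [labels[j] for j, _ in nbrs[end] if j != other]
                # Each end needs two substituents that are not equivalent to each other.
                ends_ok &= len(subs) == 2 and len(set(subs)) == 2
            if ends_ok:
                double_bonds.append((a, b))
    return centres, double_bonds


# Bromochlorofluoromethane: C bonded to H, F, Cl, Br.
chfclbr = (["C", "H", "F", "Cl", "Br"],
           [(0, 1, 1), (0, 2, 1), (0, 3, 1), (0, 4, 1)])

# But-2-ene with explicit hydrogens: CH3-CH=CH-CH3.
butene = (["C", "C", "C", "C"] + ["H"] * 8,
          [(0, 1, 1), (1, 2, 2), (2, 3, 1),
           (0, 4, 1), (0, 5, 1), (0, 6, 1), (1, 7, 1),
           (2, 8, 1), (3, 9, 1), (3, 10, 1), (3, 11, 1)])

# Propene: CH2=CH-CH3 (one end of the double bond carries two hydrogens).
propene = (["C", "C", "C"] + ["H"] * 6,
           [(0, 1, 2), (1, 2, 1), (0, 3, 1), (0, 4, 1),
            (1, 5, 1), (2, 6, 1), (2, 7, 1), (2, 8, 1)])

print(stereo_candidates(*chfclbr))
print(stereo_candidates(*butene))
print(stereo_candidates(*propene))
```

In this sketch the halomethane yields one candidate centre, but-2-ene yields one candidate double bond, and propene yields neither, because one end of its double bond carries two equivalent hydrogens.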
A real-time proximity querying algorithm for haptic-based molecular docking
Intermolecular binding underlies every metabolic and regulatory process of the cell, as well as the therapeutic and pharmacological properties of drugs. Molecular docking systems model and simulate these interactions in silico and allow us to study the binding process. Haptic-based docking provides an immersive virtual docking environment where the user can interact with and guide the molecules to their binding pose. Moreover, it allows human perception, intuition and knowledge to assist and accelerate the docking process, and reduces incorrect binding poses. Crucial for interactive docking is the real-time calculation of interaction forces. For smooth and accurate haptic exploration and manipulation, force-feedback cues have to be updated at a rate of 1 kHz. Hence, force calculations must be performed within 1 ms. To achieve this, modern haptic-based docking approaches often utilize pre-computed force grids and linear interpolation. However, such grids are time-consuming to pre-compute (especially for large molecules), memory hungry, can induce rough force transitions at cell boundaries and cannot be applied to flexible docking. Here we propose an efficient proximity querying method for computing intermolecular forces in real time. Our motivation is the eventual development of a haptic-based docking solution that can model molecular flexibility. Uniquely in a haptics application, we use octrees to decompose the 3D search space in order to identify the set of interacting atoms within a cut-off distance. Force calculations are then performed on this set in real time. The implementation constructs the trees dynamically, and computes the interaction forces of large molecular structures (i.e. consisting of thousands of atoms) within haptic refresh rates. We have implemented this method in an immersive, haptic-based, rigid-body, molecular docking application called Haptimol_RD.
The user can use the haptic device to orientate the molecules in space, sense the interaction forces on the device, and guide the molecules to their binding pose. Haptimol_RD is designed to run on consumer-level hardware, i.e. there is no need for specialized/proprietary hardware.
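The octree cut-off query mentioned in this abstract can be sketched as follows. This is an illustrative point octree, not the Haptimol_RD implementation: the class name `Octree`, the helper `build_octree`, the leaf capacity, the depth limit and the random test coordinates are all assumptions. It covers only the neighbour search (finding the atoms within a cut-off of a probe position); the force evaluation on the returned set is left out.

```python
import math
import random


class Octree:
    """Point octree over atom coordinates; a leaf holds at most `capacity` atoms."""

    def __init__(self, center, half, capacity=8, depth=0, max_depth=10):
        self.center, self.half = center, half
        self.capacity, self.depth, self.max_depth = capacity, depth, max_depth
        self.points = []      # (atom index, (x, y, z)) pairs stored in a leaf
        self.children = None  # eight sub-cubes once the node has been split

    def _child_for(self, p):
        cx, cy, cz = self.center
        i = (p[0] >= cx) | ((p[1] >= cy) << 1) | ((p[2] >= cz) << 2)
        return self.children[i]

    def insert(self, idx, p):
        if self.children is not None:
            self._child_for(p).insert(idx, p)
            return
        self.points.append((idx, p))
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _split(self):
        h = self.half / 2
        cx, cy, cz = self.center
        self.children = [
            Octree((cx + (h if i & 1 else -h),
                    cy + (h if i & 2 else -h),
                    cz + (h if i & 4 else -h)),
                   h, self.capacity, self.depth + 1, self.max_depth)
            for i in range(8)
        ]
        pts, self.points = self.points, []
        for idx, p in pts:
            self._child_for(p).insert(idx, p)

    def query(self, q, cutoff, out):
        """Append to `out` the indices of all atoms within `cutoff` of point q."""
        # Skip any cube whose closest point to q lies beyond the cut-off.
        d2 = sum(max(abs(q[k] - self.center[k]) - self.half, 0.0) ** 2
                 for k in range(3))
        if d2 > cutoff * cutoff:
            return out
        if self.children is not None:
            for child in self.children:
                child.query(q, cutoff, out)
        else:
            out.extend(i for i, p in self.points if math.dist(p, q) <= cutoff)
        return out


def build_octree(coords, capacity=8):
    """Build a tree whose root cube encloses every coordinate."""
    lo = [min(c[k] for c in coords) for k in range(3)]
    hi = [max(c[k] for c in coords) for k in range(3)]
    center = tuple((lo[k] + hi[k]) / 2 for k in range(3))
    half = max(hi[k] - lo[k] for k in range(3)) / 2 + 1e-9
    tree = Octree(center, half, capacity)
    for i, c in enumerate(coords):
        tree.insert(i, c)
    return tree


# Hypothetical "receptor": 2000 random atom positions in a 50 x 50 x 50 box.
random.seed(0)
receptor = [tuple(random.uniform(0, 50) for _ in range(3)) for _ in range(2000)]
tree = build_octree(receptor)

probe = (25.0, 25.0, 25.0)  # position of one ligand atom
near = sorted(tree.query(probe, 8.0, []))
brute = [i for i, p in enumerate(receptor) if math.dist(p, probe) <= 8.0]
print(len(near), near == brute)
```

The brute-force comparison at the end checks that pruning whole cubes never drops an atom inside the cut-off. The benefit of the tree is that most cubes are rejected without visiting their atoms.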
Computer Aided Aroma Design. II. Quantitative structure-odour relationship
Computer Aided Aroma Design (CAAD) is likely to become a hot issue, as the REACH EC document targets many aroma compounds for substitution. The two crucial steps in computer aided molecular design (CAMD) are the generation of candidate molecules and the estimation of their properties. Both can be difficult when complex molecular structures such as odorants are sought, because odour quality is definitely subjective and odour intensity is partly subjective, as stated in Rossitier’s review (1996). The CAAD methodology and a novel molecular framework were presented in part I. Part II focuses on a classification methodology to characterize the odour quality of molecules based on Structure – Odour Relation (SOR). Using 2D and 3D molecular descriptors, Linear Discriminant Analysis (LDA) and an Artificial Neural Network are compared, with the comparison favouring LDA. The classification into balsamic / non-balsamic quality was solved satisfactorily. The classification among five sub-notes of the balsamic quality was less successful, partly due to the selection of Aldrich’s Catalog as the reference classification. For this second case, it is shown that the sweet sub-note considered in Aldrich’s Catalog is not a relevant sub-note, confirming the alternative and popular classification of Jaubert et al. (1995), the field of odours.
Applying forces to elastic network models of large biomolecules using a haptic feedback device
Elastic network models of biomolecules have proved to be relatively good at predicting global conformational changes, particularly in large systems. Software that facilitates rapid and intuitive exploration of conformational change in elastic network models of large biomolecules in response to externally applied forces would therefore be of considerable use, particularly if the forces mimic those that arise in the interaction with a functional ligand. We have developed software that enables a user to apply forces to individual atoms of an elastic network model of a biomolecule through a haptic feedback device or a mouse. With a haptic feedback device the user feels the response to the applied force whilst seeing the biomolecule deform on the screen. Prior to the interactive session, normal mode analysis is performed, or pre-calculated normal mode eigenvalues and eigenvectors are loaded. For large molecules this allows the memory and number of calculations to be reduced by employing the idea of the important subspace, a relatively small space spanned by the M lowest-frequency normal mode eigenvectors, within which a large proportion of the total fluctuation occurs. Using this approach it was possible to study GroEL on a standard PC: although only 2.3% of the total number of eigenvectors could be used, they accounted for 50% of the total fluctuation. User testing has shown that the haptic version allows for much more rapid and intuitive exploration of the molecule than the mouse version.
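The "important subspace" idea can be made concrete with a small sketch. It assumes the standard harmonic result that each non-zero normal mode contributes to the mean-square fluctuation in proportion to the inverse of its eigenvalue; the abstract does not state the exact definition the authors used, so the formula, the function names and the toy eigenvalue spectrum below are illustrative assumptions, not values from the paper.

```python
def _inverse_nonzero(eigenvalues, zero_tol):
    """Inverse eigenvalues of the non-zero modes, lowest frequency first."""
    nonzero = sorted(v for v in eigenvalues if v > zero_tol)
    return [1.0 / v for v in nonzero]


def fluctuation_fraction(eigenvalues, n_modes, zero_tol=1e-8):
    """Fraction of total fluctuation carried by the n_modes lowest non-zero modes."""
    inv = _inverse_nonzero(eigenvalues, zero_tol)
    return sum(inv[:n_modes]) / sum(inv)


def modes_needed(eigenvalues, target, zero_tol=1e-8):
    """Smallest M whose lowest-frequency modes reach the target fraction."""
    inv = _inverse_nonzero(eigenvalues, zero_tol)
    total = sum(inv)
    running = 0.0
    for m, weight in enumerate(inv, start=1):
        running += weight
        if running >= target * total:
            return m
    return len(inv)


# Toy spectrum: six zero modes (rigid-body motion) plus six internal modes.
toy_eigenvalues = [0.0] * 6 + [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]

print(fluctuation_fraction(toy_eigenvalues, 1))
print(modes_needed(toy_eigenvalues, 0.5))
print(modes_needed(toy_eigenvalues, 0.9))
```

Because the weights fall off as 1/eigenvalue, a handful of soft modes can dominate the total, which is why a small M can capture a large share of the fluctuation.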
Representation and use of chemistry in the global electronic age.
We present an overview of the current state of public semantic chemistry and propose new approaches at a strategic and a detailed level. We show by example how a model for a Chemical Semantic Web can be constructed using machine-processed data and information from journal articles. This manuscript addresses questions of robotic access to data and its automatic re-use, including the role of Open Access archival of data. This is a pre-refereed preprint allowed by the publisher's (Royal Soc. Chemistry) Green policy. The author's preferred manuscript is an HTML hyperdocument with ca. 20 links to images, some of which are JPEGs and some of which are SVG (scalable vector graphics), including animations. There are also links to molecules in CML, for which the Jmol viewer is recommended. We suggest that readers who wish to see the full glory of the manuscript download the zipped version and unpack it on their machine. We also supply PDF and DOC (Word) versions, which obviously cannot show the animations but may be the best place to start, particularly for those more interested in the text.
Deep Learning of Atomically Resolved Scanning Transmission Electron Microscopy Images: Chemical Identification and Tracking Local Transformations
Recent advances in scanning transmission electron and scanning probe
microscopies have opened exciting opportunities for probing materials'
structural parameters and various functional properties in real space with
angstrom-level precision. This progress has been accompanied by an exponential
increase in the size and quality of datasets produced by microscopic and
spectroscopic experimental techniques. These developments necessitate adequate
methods for extracting relevant physical and chemical information from the
large datasets, for which a priori information on the structures of various
atomic configurations and lattice defects is limited or absent. Here we
demonstrate an application of deep neural networks to extract information from
atomically resolved images including location of the atomic species and type of
defects. We develop a 'weakly-supervised' approach that uses information on the
coordinates of all atomic species in the image, extracted via a deep neural
network, to identify a rich variety of defects that are not part of an initial
training set. We further apply our approach to interpret complex atomic and
defect transformations, including switching between different coordinations of
silicon dopants in graphene as a function of time, the formation of a peculiar
silicon dimer with mixed 3-fold and 4-fold coordination, and the motion of a
molecular 'rotor'. This deep-learning-based approach resembles the logic of a
human operator but can be scaled, leading to a significant shift in the way
information is extracted and analyzed from raw experimental data.
Quantitative toxicity prediction using topology based multi-task deep neural networks
The understanding of toxicity is of paramount importance to human health and
environmental protection. Quantitative toxicity analysis has become a new
standard in the field. This work introduces element specific persistent
homology (ESPH), an algebraic topology approach, for quantitative toxicity
prediction. ESPH retains crucial chemical information during the topological
abstraction of geometric complexity and provides a representation of small
molecules that cannot be obtained by any other method. To investigate the
representability and predictive power of ESPH for small molecules, ancillary
descriptors have also been developed based on physical models. Topological and
physical descriptors are paired with advanced machine learning algorithms, such
as deep neural network (DNN), random forest (RF) and gradient boosting decision
tree (GBDT), to facilitate their applications to quantitative toxicity
predictions. A topology based multi-task strategy is proposed to take
advantage of the availability of large data sets while dealing with small data
sets. Four benchmark toxicity data sets that involve quantitative measurements
are used to validate the proposed approaches. Extensive numerical studies
indicate that the proposed topological learning methods are able to outperform
the state-of-the-art methods in the literature for quantitative toxicity
analysis. Our online server for computing element-specific topological
descriptors (ESTDs) is available at http://weilab.math.msu.edu/TopTox/
Comment: arXiv admin note: substantial text overlap with arXiv:1703.1095