
    Computer Aided Aroma Design. II. Quantitative structure-odour relationship

    Computer Aided Aroma Design (CAAD) is likely to become a hot issue as the REACH EC document targets many aroma compounds for substitution. The two crucial steps in CAMD are the generation of candidate molecules and the estimation of properties, which can be difficult when complex molecular structures like odours are sought and when their odour quality is definitely subjective or their odour intensity is partly subjective, as stated in Rossitier’s review (1996). The CAAD methodology and a novel molecular framework were presented in part I. Part II focuses on a classification methodology to characterize the odour quality of molecules based on Structure – Odour Relation (SOR). Using 2D and 3D molecular descriptors, Linear Discriminant Analysis (LDA) and an Artificial Neural Network are compared, and the comparison favours LDA. The classification into balsamic / non balsamic quality was satisfactorily solved. The classification among five sub notes of the balsamic quality was less successful, partly due to the selection of the Aldrich’s Catalog as the reference classification. For the second case, it is shown that the sweet sub note considered in Aldrich’s Catalog is not a relevant sub note, confirming the alternative and popular classification of Jaubert et al., (1995), the field of odours
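
The two-class step described in this abstract (balsamic versus non balsamic) can be illustrated with a minimal Fisher linear discriminant. This is only a sketch under stated assumptions: the two descriptor values per molecule, the data points, and the function names `fit_lda_2d` and `predict` are all hypothetical and are not taken from the paper, which uses many 2D and 3D descriptors.

```python
def fit_lda_2d(class_a, class_b):
    """Fit a two-class Fisher discriminant on 2-descriptor samples.

    Returns (w, threshold): a sample x belongs to class_a when
    w[0]*x[0] + w[1]*x[1] > threshold.
    """
    def mean(rows):
        n = len(rows)
        return [sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n]

    mean_a, mean_b = mean(class_a), mean(class_b)

    # Pooled within-class scatter matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class_a, mean_a), (class_b, mean_b)):
        for r in rows:
            d0, d1 = r[0] - m[0], r[1] - m[1]
            s[0][0] += d0 * d0
            s[0][1] += d0 * d1
            s[1][0] += d1 * d0
            s[1][1] += d1 * d1

    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")

    # Discriminant direction: w = S^-1 (mean_a - mean_b).
    diff = [mean_a[0] - mean_b[0], mean_a[1] - mean_b[1]]
    w = [
        (s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
        (-s[1][0] * diff[0] + s[0][0] * diff[1]) / det,
    ]

    # Decision threshold: projection of the midpoint between the class means.
    mid = [(mean_a[0] + mean_b[0]) / 2, (mean_a[1] + mean_b[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(w, threshold, x):
    """Label a sample by the side of the threshold its projection falls on."""
    score = w[0] * x[0] + w[1] * x[1]
    return "balsamic" if score > threshold else "non-balsamic"


# Hypothetical descriptor values, for illustration only.
balsamic = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.9)]
non_balsamic = [(5.0, 6.0), (5.5, 5.2), (6.0, 6.8), (5.2, 6.1)]

w, threshold = fit_lda_2d(balsamic, non_balsamic)
print(predict(w, threshold, (1.3, 2.1)))
print(predict(w, threshold, (5.8, 6.0)))
```

With real descriptor sets a library implementation would normally be preferred; the hand-written version is shown only to make the projection and threshold explicit.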

    Dynamics of Bound and Free Water in an Aqueous Micellar Solution : Analysis of the Lifetime and Vibrational Frequencies of Hydrogen Bonds at a Complex Interface

    In order to understand the nature and dynamics of interfacial water molecules on the surface of complex systems, large scale molecular dynamics simulations of an aqueous micelle of cesium perfluorooctanoate surfactant molecules have been carried out. The lifetime and the intermolecular vibrational frequencies of the hydrogen bonds that the water molecules form with the polar headgroups of the surfactants are calculated. Our earlier classification of the interfacial water molecules, based on structural and energetic considerations, into bound and free type is further validated by their dynamics. Lifetime correlation functions of the water-surfactant hydrogen bonds show the long lived nature of the bound water species. The water molecules that are singly hydrogen bonded to the surfactants have longer lifetimes than those that form two such hydrogen bonds. The free water molecules that do not form any such hydrogen bonds behave similarly to bulk water in their reorientational dynamics. A few water molecules that form two such hydrogen bonds are orientationally locked in for durations on the order of a few hundred picoseconds. The intermolecular vibrational frequencies of these interfacial water molecules show a significant blue shift in the librational band, as well as a similar shift in the near neighbor bending modes, relative to water molecules in bulk. These blue shifts suggest an increase in rigidity in the structure around interfacial water molecules. This is in good agreement with recent incoherent, inelastic neutron scattering data on macromolecular solutions. The results of the present simulations should be relevant to the understanding of dynamics of water near any hydrophilic surface.
    Comment: 36 Pages including 7 Figures; Submitted to Phys. Rev.
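
A hydrogen-bond lifetime correlation function of the kind mentioned in this abstract can be sketched as follows. The abstract does not give the exact definition the authors used, so this is an assumption: the sketch uses the two forms common in the simulation literature, the intermittent function C(t) = <h(0)h(t)>/<h> and the continuous function S(t) = <h(0)H(t)>/<h>, where h is 1 while a given bond exists and H(t) is 1 only if the bond survived without interruption up to time t. The function names and the toy input are hypothetical.

```python
def _mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def intermittent_correlation(series, lag):
    """C(lag): the bond exists at a time origin and again `lag` frames later.

    `series` holds one 0/1 list per water-surfactant pair, all of equal
    length, with at least one frame in which a bond exists.
    """
    n_frames = len(series[0])
    numerator = _mean(
        h[t0] * h[t0 + lag] for h in series for t0 in range(n_frames - lag)
    )
    return numerator / _mean(x for h in series for x in h)


def continuous_correlation(series, lag):
    """S(lag): the bond exists at every frame from the origin to origin + lag."""
    n_frames = len(series[0])
    numerator = _mean(
        int(all(h[t0:t0 + lag + 1]))
        for h in series
        for t0 in range(n_frames - lag)
    )
    return numerator / _mean(x for h in series for x in h)


# Toy trajectory: one pair, bonded in frames 0, 1 and 3.
toy = [[1, 1, 0, 1]]
print(intermittent_correlation(toy, 2))  # the bond re-forms, so this is nonzero
print(continuous_correlation(toy, 2))    # the bond broke at frame 2, so this is zero
```

The difference between the two values for the same lag shows why the choice of definition matters: the intermittent form counts bonds that break and re-form, while the continuous form measures uninterrupted lifetimes.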

    Computer Aided Aroma Design. I. Molecular knowledge framework

    Computer Aided Aroma Design (CAAD) is likely to become a hot issue as the REACH EC document targets many aroma compounds for substitution. The two crucial steps in CAMD are the generation of candidate molecules and the estimation of properties, which can be difficult when complex molecular structures like odours are sought and when their odour quality is definitely subjective whereas their odour intensity is partly subjective, as stated in Rossitier’s review (1996). In part I, provided that classification rules like those presented in part II exist to assess the odour quality, the CAAD methodology presented proceeds with a multilevel approach matched by a versatile and novel molecular framework. It can distinguish the very small chemical structure differences, as in isomers, that are responsible for different odour quality and intensity. In addition, its chemical graph concepts are well suited to the genetic algorithm sampling techniques used for efficient screening of large molecules such as aroma compounds. Finally, an input/output XML format based on the aggregation of CML and ThermoML makes it possible to store not only the molecular classes but also any subjective or objective property values computed during the CAAD process

    Towards automatic classification within the ChEBI ontology

    *Background*
Appearing in a wide variety of contexts, biochemical 'small molecules' are a core element of biomedical data. Chemical ontologies, which provide stable identifiers and a shared vocabulary for use in referring to such biochemical small molecules, are crucial to enable the interoperation of such data. One such chemical ontology is ChEBI (Chemical Entities of Biological Interest), a candidate member ontology of the OBO Foundry. ChEBI is a publicly available, manually annotated database of chemical entities and contains around 18000 annotated entities as of the last release (May 2009). ChEBI provides stable unique identifiers for chemical entities; a controlled vocabulary in the form of recommended names (which are unique and unambiguous), common synonyms, and systematic chemical names; cross-references to other databases; and a structural and role-based classification within the ontology. ChEBI is widely used for annotation of chemicals within biological databases, text-mining, and data integration. ChEBI can be accessed online at http://www.ebi.ac.uk/chebi/ and the full dataset is available for download in various formats including SDF and OBO.

*Automated Classification*
The selection of chemical entities for inclusion in the ChEBI database is user-driven. As the use of ChEBI has grown, so too has the backlog of user-requested entries. Inevitably, the annotation backlog creates a bottleneck, and to speed up the annotation process, ChEBI has recently released a submission tool which allows community submissions of chemical entities, groups, and classes. However, classification of chemical entities within the ontology is a difficult and niche activity, and it is unlikely that the community as a whole will be able or willing to correctly and consistently classify each submitted entity, creating required classes where they are missing. As a result, it is likely that while the size of the database grows, the ontological classification will become less sophisticated, unless the classification of new entities is assisted computationally. In addition, the ChEBI database is expecting substantial size growth in the next year, so automatic classification, which has up till now not been possible, is urgently required. Automatic classification would also enable the ChEBI ontology classes to be applied to other compound databases such as PubChem. 

*Description Logic Reasoning*
Description logic based reasoning technology is a prime candidate for development of such an automatic classification system as it allows the rules of the classification system to be encoded within the knowledgebase. Already at 18000 entities, ChEBI is a fair size for a real-world application of description logic reasoning technology, and as the ontology is enhanced with a richer density of asserted relationships, the classification will become more complex and challenging. We have successfully tested a description logic-based classification of chemical entities based on specified structural properties using the hypertableaux-based HermiT reasoner, and found it to be sufficiently efficient to be feasible for use in a production environment on a database of the size that ChEBI is now. However, much work still remains to enrich the ChEBI knowledgebase itself with the properties needed to provide the formal class definitions for use in the automated classification, and to assess the efficiency of the available description logic reasoning technology on a database the size of ChEBI's forecast future growth.
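
The idea of encoding classification rules in the knowledgebase can be illustrated with a deliberately simplified toy. This sketch is not description logic reasoning and does not use HermiT or the ChEBI data: it treats each class as a set of required structural properties and assigns an entity to every class whose requirements it meets. The class definitions, property names and the `classify` function are hypothetical.

```python
def classify(entity_properties, class_definitions):
    """Return, in sorted order, every class whose required properties
    are all present on the entity."""
    return sorted(
        name
        for name, required in class_definitions.items()
        if required <= entity_properties
    )


# Hypothetical class definitions: class name -> required structural properties.
CLASS_DEFINITIONS = {
    "carboxylic acid": frozenset({"has_carboxy_group"}),
    "aromatic compound": frozenset({"has_aromatic_ring"}),
    "aromatic carboxylic acid": frozenset(
        {"has_carboxy_group", "has_aromatic_ring"}
    ),
}

# Benzoic acid carries both a carboxy group and an aromatic ring.
benzoic_acid = {"has_carboxy_group", "has_aromatic_ring"}
print(classify(benzoic_acid, CLASS_DEFINITIONS))
```

A real reasoner also infers the subclass hierarchy between the classes themselves and handles far richer axioms than conjunctions of properties, which is where the efficiency concerns raised in the abstract come from.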

*Acknowledgements*
ChEBI is funded by the European Commission under SLING, grant agreement number 226073 (Integrating Activity) within Research Infrastructures of the FP7 Capacities Specific Programme, and by the BBSRC, grant agreement number BB/G022747/1 within the “Bioinformatics and biological resources” fund

    A Fuzzy Classification Framework to Identify Equivalent Atoms in Complex Materials and Molecules

    The nature of an atom in a bonded structure -- such as in molecules, in nanoparticles or solids, at surfaces or interfaces -- depends on its local atomic environment. In atomic-scale modeling and simulation, identifying groups of atoms with equivalent environments is a frequent task, to gain an understanding of the material function, to interpret experimental results or to simply restrict demanding first-principles calculations. While routine, this task can often be challenging for complex molecules or non-ideal materials with breaks of symmetries or long-range order. To automate this task, we here present a general machine-learning framework to identify groups of (nearly) equivalent atoms. The initial classification rests on the representation of the local atomic environment through a high-dimensional smooth overlap of atomic positions (SOAP) vector. Recognizing that thermal vibrations, among other effects, may lead to deviations from ideal positions, we then achieve a fuzzy classification by mean-shift clustering within a low-dimensional embedded representation of the SOAP points as obtained through multidimensional scaling. The performance of this classification framework is demonstrated for simple aromatic molecules and crystalline Pd surface examples.
    Comment: Accepted manuscript in Journal of Chemical Physics. Repositories of the package (DECAF): DOI:10.17617/3.U7VKBM or https://gitlab.mpcdf.mpg.de/klai/deca
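
The clustering step named in this abstract, mean-shift in a low-dimensional embedding, can be sketched in a few lines. This is not the DECAF implementation: it assumes a flat kernel, takes already-embedded 2D points as input (the SOAP and multidimensional scaling steps are skipped), and the bandwidth, the merge rule, the sample points and the function name `mean_shift` are hypothetical choices made for the illustration.

```python
import math


def mean_shift(points, bandwidth, max_iter=100, tol=1e-6):
    """Flat-kernel mean-shift: returns (labels, centres).

    Each point is moved repeatedly to the mean of all input points within
    `bandwidth` of its current position until it stops moving; points that
    converge to nearby modes receive the same label.
    """
    modes = []
    for p in points:
        x = list(p)
        for _ in range(max_iter):
            neighbours = [q for q in points if math.dist(x, q) <= bandwidth]
            new_x = [sum(c) / len(neighbours) for c in zip(*neighbours)]
            shift = math.dist(x, new_x)
            x = new_x
            if shift < tol:
                break
        modes.append(x)

    # Merge modes that converged to (almost) the same location.
    centres, labels = [], []
    for m in modes:
        for i, c in enumerate(centres):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(i)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return labels, centres


# Hypothetical embedded environments: two groups of nearly equivalent atoms,
# slightly scattered as thermal displacements would scatter them.
embedded = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels, centres = mean_shift(embedded, bandwidth=1.0)
print(labels)
```

Because the number of clusters is not fixed in advance, the bandwidth is the parameter that decides how much deviation still counts as "nearly equivalent".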

    The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases

    One of the most intriguing groups of enzymes, the feruloyl esterases (FAEs), is ubiquitous in both simple and complex organisms. FAEs have gained importance in biofuel, medicine and food industries due to their capability of acting on a large range of substrates for cleaving ester bonds and synthesizing high-added value molecules through esterification and transesterification reactions. During the past two decades extensive studies have been carried out on the production and partial characterization of FAEs from fungi, while much less is known about FAEs of bacterial or plant origin. Initial classification studies on FAEs were restricted to sequence similarity and substrate specificity on just four model substrates and considered only a handful of FAEs belonging to the fungal kingdom. This study centers on the descriptor-based classification and structural analysis of experimentally verified and putative FAEs; nevertheless, the framework presented here is applicable to every poorly characterized enzyme family. 365 FAE-related sequences of fungal, bacterial and plant origin were collected and clustered into distinct groups using Self Organizing Maps followed by k-means clustering, based on amino acid composition and physico-chemical composition descriptors derived from the respective amino acid sequence. A Support Vector Machine model was subsequently constructed for the classification of new FAEs into the pre-assigned clusters. The model successfully recognized 98.2% of the training sequences and all the sequences of the blind test. The underlying functionality of the 12 proposed FAE families was validated against a combination of prediction tools and published experimental data. Another important aspect of the present work involves the development of pharmacophore models for the new FAE families, for which sufficient information on known substrates existed.
Knowing the pharmacophoric features of a small molecule that are essential for binding to the members of a certain family opens a window of opportunities for tailored applications of FAEs
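
The amino acid composition descriptor mentioned in this abstract is simple enough to show directly. The sketch below is an assumption about its usual form (the fraction of each of the 20 standard residues in a sequence); the abstract does not state how the authors computed it, and the function name and the example sequence are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids, in a fixed order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element vector with the fraction of each standard residue.

    Characters outside the 20 standard one-letter codes are ignored.
    """
    counts = Counter(
        residue for residue in sequence.upper() if residue in AMINO_ACIDS
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical four-residue fragment, for illustration only.
vector = amino_acid_composition("AACD")
print(dict(zip(AMINO_ACIDS, vector)))
```

Vectors of this kind have a fixed length regardless of sequence length, which is what makes them usable as input to clustering and to a classifier such as a Support Vector Machine.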

    From Fischer projections to quantum mechanics of tetrahedral molecules: new perspectives in chirality

    The algebraic structure of central molecular chirality can be achieved starting from the geometrical representation of bonds of tetrahedral molecules, as complex numbers in polar form, and the empirical Fischer projections used in organic chemistry. A general orthogonal O(4) algebra is derived from which we obtain a chirality index related to the classification of a molecule as achiral, diastereoisomer or enantiomer. Consequently, the chiral features of tetrahedral chains can be predicted by means of a molecular Aufbau. Moreover, a consistent Schrödinger equation is developed, whose solutions are the bonds of tetrahedral molecules in complex number representation. Starting from this result, the O(4) algebra can be considered as a quantum chiral algebra. It is shown that the operators of such an algebra preserve the parity of the whole system.
    Comment: 33 pages, to appear in Adv. Quantum Che

    Deciphering complex metabolite mixtures by unsupervised and supervised substructure discovery and semi-automated annotation from MS/MS spectra

    Complex metabolite mixtures are challenging to unravel. Mass spectrometry (MS) is a widely used and sensitive technique to obtain structural information on complex mixtures. However, just knowing the molecular masses of the mixture’s constituents is almost always insufficient for confident assignment of the associated chemical structures. Structural information can be augmented through MS fragmentation experiments whereby detected metabolites are fragmented giving rise to MS/MS spectra. However, how can we maximize the structural information we gain from fragmentation spectra? We recently proposed a substructure-based strategy to enhance metabolite annotation for complex mixtures by considering metabolites as the sum of (bio)chemically relevant moieties that we can detect through mass spectrometry fragmentation approaches. Our MS2LDA tool allows us to discover - unsupervised - groups of mass fragments and/or neutral losses termed Mass2Motifs that often correspond to substructures. After manual annotation, these Mass2Motifs can be used in subsequent MS2LDA analyses of new datasets, thereby providing structural annotations for many molecules that are not present in spectral databases. Here, we describe how additional strategies, taking advantage of i) combinatorial in-silico matching of experimental mass features to substructures of candidate molecules, and ii) automated machine learning classification of molecules, can facilitate semi-automated annotation of substructures. We show how our approach accelerates the Mass2Motif annotation process and therefore broadens the chemical space spanned by characterized motifs. Our machine learning model used to classify fragmentation spectra learns the relationships between fragment spectra and chemical features. Classification prediction on these features can be aggregated for all molecules that contribute to a particular Mass2Motif and guide Mass2Motif annotations. 
To make annotated Mass2Motifs available to the community, we also present motifDB: an open database of Mass2Motifs that can be browsed and accessed programmatically through an Application Programming Interface (API). MotifDB is integrated within ms2lda.org, allowing users to efficiently search for characterized motifs in their own experiments. We expect that with an increasing number of Mass2Motif annotations available through a growing database we can more quickly gain insight into the constituents of complex mixtures. That will allow prioritization towards novel or unexpected chemistries and faster recognition of known biochemical building blocks
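
The "mass fragments and/or neutral losses" that make up a Mass2Motif can be illustrated by turning one MS/MS spectrum into a list of features. This is a sketch under assumptions and not the MS2LDA code: a neutral loss is taken as the precursor m/z minus the fragment m/z, features are binned by rounding to two decimals, and the function name, the binning choice and the example m/z values are hypothetical.

```python
def spectrum_to_features(precursor_mz, fragment_mzs, decimals=2):
    """Convert one MS/MS spectrum into fragment and neutral-loss features.

    Each fragment yields a "fragment_<m/z>" feature and, when it is lighter
    than the precursor, a "loss_<precursor - fragment>" feature. Values are
    rounded to `decimals` places so that near-identical masses from
    different spectra map to the same feature.
    """
    features = []
    for mz in fragment_mzs:
        features.append(f"fragment_{mz:.{decimals}f}")
        loss = precursor_mz - mz
        if loss > 0:
            features.append(f"loss_{loss:.{decimals}f}")
    return features


# Hypothetical spectrum: one precursor and two fragment peaks.
features = spectrum_to_features(180.063, [162.052, 89.024])
print(features)
```

Once every spectrum is expressed as a bag of such features, recurring groups of features across spectra can be searched for in the same way that recurring groups of words are searched for across text documents.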