22 research outputs found

    Encoding, Storing and Searching of Analytical Properties and Assigned Metabolite Structures

    Information about metabolites and other small organic molecules is of crucial importance in many different fields of the natural sciences. For example, these molecules play a decisive role in metabolic networks, and knowledge of their properties helps in understanding complex biological processes and entire biological systems. Because biological and chemical laboratories generate data describing these molecules every day, a comprehensive and continuously growing body of data exists. Complex software systems and data formats are needed to enable scientists to process, exchange, archive, and search this information while preserving its semantic relationships. The aim of this project was to develop applications and algorithms that can be used for the efficient encoding, collection, normalization, and analysis of molecular data. These are intended to support scientists in structure elucidation, dereplication, the analysis of molecular interactions, and the publication of the knowledge gained. Because directly describing the structure and function of an unknown compound is very difficult and laborious, this is mainly achieved indirectly, by means of descriptive properties, which are then used to predict structural and functional characteristics. In this context, program modules were developed that allow the visualization of structural and spectroscopic data, the structured display and editing of metadata and properties, and the import and export of various data formats. These modules were extended with methods that make it possible to analyze the information obtained further and to assign structural and spectroscopic data to one another. In addition, a system was developed for the structured archiving and management of large amounts of molecular data and spectroscopic information, preserving the semantic relationships, both in the file system and in databases. To guarantee lossless storage, an open and standardized data format (CMLSpect) was defined. It extends the existing CML (Chemical Markup Language) vocabulary and thus allows linked structural and spectroscopic data to be handled easily. The applications developed were integrated into the Bioclipse system for bio- and cheminformatics and thereby offer users a high-quality user interface and developers an easily extensible, modular program architecture.

    Snap2Diverse: Coordinating Information Visualizations and Virtual Environments

    The field of Information Visualization is concerned with improving how users perceive, understand, and interact with visual representations of data sets. Immersive Virtual Environments (VEs) excel at providing researchers and designers a greater comprehension of the spatial features and relations of their data, models, and scenes. This project addresses the intersection of these two fields, where information is visualized in a virtual environment. Specifically, we are interested in visualizing abstract information in relation to spatial information in the context of a virtual environment. We describe a set of design issues for this type of integrated visualization and demonstrate a coordinated, multiple-views system supporting 2D and 3D visualization tasks such as overview, navigation, details-on-demand, and brushing-and-linking selection. Software architecture issues are discussed with details of our implementation applied to the domain of chemical information and visualization. Lastly, we subject our system to an informal usability evaluation and identify usability issues with interaction and navigation that may guide future work in these situations.

    Computer Aided Aroma Design. I. Molecular knowledge framework

    Computer Aided Aroma Design (CAAD) is likely to become a hot issue, as the REACH EC document targets many aroma compounds for substitution. The two crucial steps in computer aided molecular design (CAMD) are the generation of candidate molecules and the estimation of their properties, which can be difficult when complex molecular structures such as odours are sought, and when odour quality is definitely subjective and odour intensity partly subjective, as stated in Rossiter’s review (1996). In part I, provided that classification rules like those presented in part II exist to assess odour quality, the CAAD methodology proceeds with a multilevel approach matched by a versatile and novel molecular framework. The framework can distinguish the very small differences in chemical structure, such as those between isomers, that are responsible for different odour quality and intensity. In addition, its chemical graph concepts are well suited to the genetic algorithm sampling techniques used for efficient screening of large molecules such as aroma compounds. Finally, an input/output XML format based on the aggregation of CML and ThermoML makes it possible to store not only the molecular classes but also any subjective or objective property values computed during the CAAD process.

    Polyfilling Accessible Chemistry Diagrams

    Mining chemical information from Open patents

    RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
    Abstract: Linked Open Data presents an opportunity to vastly improve the quality of science in all fields by increasing the availability and usability of the data upon which it is based. In the chemical field, there is a huge amount of information available in the published literature, the vast majority of which is not available in machine-understandable formats. PatentEye, a prototype system for the extraction and semantification of chemical reactions from the patent literature, has been implemented and is discussed. A total of 4444 reactions were extracted from 667 patent documents that comprised 10 weeks' worth of publications from the European Patent Office (EPO), with a precision of 78% and recall of 64% with regard to determining the identity and amount of reactants employed, and an accuracy of 92% with regard to product identification. NMR spectra reported as product characterisation data are additionally captured.
    Peer reviewed.

    Open Babel: An open chemical toolbox

    Background: A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license fro

    Semantic Web Markup Languages: Practical Aspects: [study guide for the "Electronic Educational Resources" program] (Языки разметки семантического веба: практические аспекты)

    The main aim of this guide is to describe specialized markup languages built on XML. It gives a brief description of XML, DTD, XML Schema, XML Namespace, and XSL, with examples illustrating the purpose and features of these technologies. It shows how to begin designing one's own XML-based markup language. An introduction to the existing specialized markup languages described in the guide is intended to help the reader navigate the ever-growing set of semantic web markup languages. The guide is intended for researchers, lecturers, postgraduate students, and students specializing in the natural sciences.

    Computer aided framework for designing bio-based commodity molecules with enhanced properties

    We investigate the use of a computer aided molecular design (CAMD) approach for enhancing the properties of existing molecules by modifying their chemical structure to match target property values. Tailoring molecules requires aggregating knowledge disseminated across the whole chemical enterprise hierarchy, from the manager level to the chemists and chemical engineers, who have different backgrounds and different perceptions of what the ideal molecule would be. We therefore propose a framework that allows the search to match all requirements while capitalizing on this knowledge spread among actors with different backgrounds, with the help of SBVR (Semantics of Business Vocabulary and Rules) and OCL (Object Constraint Language). In the context of using biomass as the feedstock, we discuss the coupling of CAMD tools with computer aided organic synthesis tools so as to propose enhanced bio-sourced molecule candidates that could be synthesized through eco-friendly pathways. Finally, we evaluate the sustainability of the molecules and of the whole decision process. Specific applications of bio-sourced molecules are presented: a case of typical derivatives of chemical platform molecules derived from itaconic acid, intended to substitute the solvents N-methyl-2-pyrrolidone (NMP) or dimethylformamide (DMF), and a case of lipid derivatives to be used as biolubricants.