21 research outputs found

    Fast 3D shape screening of large chemical databases through alignment-recycling

    Abstract

    Background: Large chemical databases require fast, efficient, and simple ways of looking for similar structures. Although such tasks are now fairly well resolved for graph-based similarity queries, they remain an issue for 3D approaches, particularly for those based on 3D shape overlays. Inspired by a recent technique developed to compare molecular shapes, we designed a hybrid methodology, alignment-recycling, that enables efficient retrieval and alignment of structures with similar 3D shapes.

    Results: Using a dataset of more than one million PubChem compounds of limited size (< 28 heavy atoms) and flexibility (< 6 rotatable bonds), we obtained a set of a few thousand diverse structures covering entirely the 3D shape space of the conformers of the dataset. Transformation matrices gathered from the overlays between these diverse structures and the 3D conformer dataset allowed us to drastically (100-fold) reduce the CPU time required for shape overlay. The alignment-recycling heuristic produces results consistent with de novo alignment calculation, with better than 80% hit list overlap on average.

    Conclusion: Overlay-based 3D methods are computationally demanding when searching large databases. Alignment-recycling reduces the CPU time to perform shape similarity searches by breaking the alignment problem into three steps: selection of diverse shapes to describe the database shape-space; overlay of the database conformers to the diverse shapes; and non-optimized overlay of query and database conformers using common reference shapes. The precomputation, required by the first two steps, is a significant cost of the method; however, once performed, querying is two orders of magnitude faster. Extensions and variations of this methodology, for example, to handle more flexible and larger small-molecules are discussed.
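    The third step described in the abstract, reusing stored transformation matrices instead of computing a fresh overlay, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 4x4 homogeneous-matrix representation, and the toy coordinates are all assumptions made for illustration. The sketch assumes each database conformer and the query have already been overlaid onto the same reference shape, so that composing the two stored transforms places the database conformer in the query's frame without further optimization.

```python
import numpy as np

def rigid_transform(angle_z, translation):
    """Build a 4x4 homogeneous matrix: rotation about z, then translation."""
    c, s = np.cos(angle_z), np.sin(angle_z)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def apply_transform(T, coords):
    """Apply a 4x4 homogeneous transform to an (N, 3) coordinate array."""
    return coords @ T[:3, :3].T + T[:3, 3]

def recycled_overlay(T_query_to_ref, T_db_to_ref):
    """Compose stored overlays: database frame -> reference frame -> query frame."""
    return np.linalg.inv(T_query_to_ref) @ T_db_to_ref

# Toy data (hypothetical): a reference shape and two stored overlays onto it.
reference = np.array([[0.0, 0.0, 0.0],
                      [1.5, 0.0, 0.0],
                      [1.5, 1.2, 0.0],
                      [0.0, 1.2, 0.8]])
T_db_to_ref = rigid_transform(0.7, [1.0, -2.0, 0.5])      # precomputed once
T_query_to_ref = rigid_transform(-1.1, [0.3, 0.4, -2.0])  # computed per query

# Here both molecules are exact copies of the reference in different frames,
# so the recycled overlay is exact; real shapes would only match approximately.
db_conformer = apply_transform(np.linalg.inv(T_db_to_ref), reference)
query = apply_transform(np.linalg.inv(T_query_to_ref), reference)

T = recycled_overlay(T_query_to_ref, T_db_to_ref)
overlaid = apply_transform(T, db_conformer)
max_deviation = float(np.abs(overlaid - query).max())
```

    In this toy case the deviation is zero up to rounding because every molecule is identical to the reference. With real conformers the composed transform gives only an approximate, non-optimized overlay, which is consistent with the abstract reporting hit-list overlap with de novo alignment rather than identical results.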

    Trends in the development of digital agriculture: a review of international practices

    In the modern world, information and communication technologies play an important role in the development of agricultural production, influencing the social, economic and political life of society and the state as a whole. The introduction of such technologies makes it possible to improve the quality of products and services, and to increase the export of agricultural and food products. Existing agricultural technologies make it possible to analyze and process large amounts of information, combine various information resources on one platform, control and reduce production risks, meet the information needs of a wide range of stakeholders, from the state to the end consumer, and guarantee security in cyberspace. An important role in the digitalization of agriculture is played by the resource potential of people employed in agriculture. Particular attention is paid to the development of scientific centers and training courses where modern high-precision agricultural technologies are studied in depth. The authors of the article examined the trends in the development of digital agriculture in the countries of Europe and Central Asia, where agricultural production is the fundamental basis of state policy.

    Using the Structured Product Labeling format to index versatile chemical data

    Presentation at American Chemical Society Spring Meeting April 201

    Reaction SPL - extension of a public document markup standard to chemical reactions

    There are numerous formats and data models for describing reaction-related data. However, each offers only a limited coverage of the multitude of information that can be of interest to a broad user base in the context of chemical reactions. Structured Product Labeling (SPL) is a robust yet fairly light public XML document standard. It uses a highly generic but usefully refinable data schema, which is, like a language, highly expressive. We are therefore presenting an extension of SPL to chemical reactions (“Reaction SPL”). This extension is designed to support chemical manufacturing processes, which include at a minimum the chemical reaction and the procedures and conditions to run it. We provide an overview of the SPL reaction specification structures followed by some examples of documents with reaction data: predicted single-step reactions, a two-step synthesis, an enzymatic reaction, an example of how to represent a reaction center, a patent, and a fully annotated reaction with by-products. Special attention is given to a mechanism for atom-atom mapping of reactions as well as to the possibility of integrating Reaction SPL with laboratory automation equipment, in particular automated synthesis devices.

    Fast 3D shape screening of large chemical databases through alignment-recycling-4

    Copyright information: Taken from "Fast 3D shape screening of large chemical databases through alignment-recycling", http://journal.chemistrycentral.com/content/1/1/12. Chemistry Central Journal 2007;1:12. Published online 6 Jun 2007. PMCID: PMC1994057.
    nimoto equal to 0.73. The quality of the correlation improves as the Transform-Tanimoto threshold is decreased. Isocontours represent the distribution of alignments for each 0.01 ROCS shape Tanimoto interval. Distribution is successively partitioned at first percentile, first decile, first quartile, median, last quartile, last decile and last percentile. The scale on the side of each plot is proportional to the number of alignments in each partition.

    Fast 3D shape screening of large chemical databases through alignment-recycling-6

    Copyright information: Taken from "Fast 3D shape screening of large chemical databases through alignment-recycling", http://journal.chemistrycentral.com/content/1/1/12. Chemistry Central Journal 2007;1:12. Published online 6 Jun 2007. PMCID: PMC1994057.
    ent (dark blue) and alignment (cyan). Right side: Same as left side but with query removed. shape Tanimoto 0.01 worse than ROCS. shape Tanimoto 0.03 worse than ROCS. shape Tanimoto 0.05 worse than ROCS.

    Fast 3D shape screening of large chemical databases through alignment-recycling-5

    Copyright information: Taken from "Fast 3D shape screening of large chemical databases through alignment-recycling", http://journal.chemistrycentral.com/content/1/1/12. Chemistry Central Journal 2007;1:12. Published online 6 Jun 2007. PMCID: PMC1994057.
    ROCS alignments for each 0.01 ROCS shape Tanimoto interval. performs better than ROCS when the difference is above 0, and vice-versa. Distribution is successively partitioned at first percentile, first decile, first quartile, median, last quartile, last decile and last percentile. The scale on the side of each plot is proportional to the number of alignments in each partition. Points A, B and C correspond to the examples shown in

    Selection of study compound subset from the entire PubChem Compound database

    Copyright information: Taken from "Fast 3D shape screening of large chemical databases through alignment-recycling", http://journal.chemistrycentral.com/content/1/1/12. Chemistry Central Journal 2007;1:12. Published online 6 Jun 2007. PMCID: PMC1994057.

    Fast 3D shape screening of large chemical databases through alignment-recycling-7

    Copyright information: Taken from "Fast 3D shape screening of large chemical databases through alignment-recycling", http://journal.chemistrycentral.com/content/1/1/12. Chemistry Central Journal 2007;1:12. Published online 6 Jun 2007. PMCID: PMC1994057.
    0 and 0.85 shape Tanimoto cut-offs. Crosses represent the mean of each distribution. retrieves fewer compounds than ROCS using the same cut-off.