
1,254 research outputs found

PubChem atom environments

Publication venue: Springer
Publication date: 19/08/2015

The octet rule in chemical space: Generating virtual molecules

Author: Hamaekers Jan
Israels Rafel
Maaß Astrid
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

We present a generator of virtual molecules that selects valid chemistry on the basis of the octet rule. We also introduce a mesomer group key that allows fast detection of duplicates among the generated structures. Compared to existing approaches, our model is simpler and faster, generates new chemistry, and avoids invalid chemistry. Its versatility is illustrated by the correct generation of molecules containing third-row elements and a surprisingly adept handling of complex boron chemistry. Without any empirical parameters, our model is designed to be valid also in unexplored regions of chemical space. One first unexpected finding is the high prevalence of dipolar structures among generated molecules. Comment: 24 pages, 10 figures
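As an illustration of what an octet-rule validity test can look like, here is a minimal sketch. It is not the authors' generator: the function name `satisfies_octet`, the small valence table, and the simple electron-counting scheme are assumptions made for this example, and the abstract does not say how the published model treats boron or third-row elements.

```python
# Valence electron counts of a few neutral main-group atoms (illustrative subset).
VALENCE_ELECTRONS = {
    "H": 1, "B": 3, "C": 4, "N": 5, "O": 6, "F": 7, "P": 5, "S": 6, "Cl": 7,
}


def satisfies_octet(element: str, bond_order_sum: int, formal_charge: int = 0) -> bool:
    """Return True if the atom ends up with a full shell (2 for H, 8 otherwise).

    Hypothetical helper: counts nonbonding electrons plus two electrons per
    unit of bond order and compares the total with the target shell size.
    """
    target = 2 if element == "H" else 8
    # Electrons the atom owns after accounting for its formal charge.
    owned = VALENCE_ELECTRONS[element] - formal_charge
    # Each unit of bond order uses one of the atom's own electrons.
    nonbonding = owned - bond_order_sum
    if nonbonding < 0:
        return False  # more bonds than available electrons
    return nonbonding + 2 * bond_order_sum == target


# Carbon in methane, nitrogen in ammonium, boron in borane.
print(satisfies_octet("C", 4))
print(satisfies_octet("N", 4, formal_charge=1))
print(satisfies_octet("B", 3))
```

In this simplified count, neutral three-coordinate boron fails the test while the borohydride-like case (four bonds, formal charge -1) passes, which hints at why boron chemistry needs special care in any octet-based scheme.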

arXiv.org e-Print Archive

Fraunhofer-ePrints

Recommended from our members

Robocrystallographer: Automated crystal structure text descriptions and analysis

Author: Ganose AM
Jain A
Publication venue: eScholarship, University of California
Publication date: 01/09/2019

Our ability to describe crystal structure features is of crucial importance when attempting to understand structure-property relationships in the solid state. In this paper, the authors introduce robocrystallographer, an open-source toolkit for analyzing crystal structures. This package combines new and existing open-source analysis tools to provide structural information, including the local coordination and polyhedral type, polyhedral connectivity, octahedral tilt angles, component-dimensionality, and molecule-within-crystal and fuzzy prototype identification. Using this information, robocrystallographer can generate text-based descriptions of crystal structures that resemble descriptions written by human crystallographers. The authors use robocrystallographer to investigate the dimensionalities of all compounds in the Materials Project database and highlight its potential in machine learning studies.

eScholarship - University of California

Communication and re-use of chemical information in bioscience.

Author: Mitchell John BO
Murray-Rust Peter
Rzepa Henry S
Publication venue: BMC Bioinformatics
Publication date: 18/07/2005

The current methods of publishing chemical information in bioscience articles are analysed. Using 3 papers as use-cases, it is shown that conventional methods using human procedures, including cut-and-paste, are time-consuming and introduce errors. The meaning of chemical terms and the identity of compounds are often ambiguous. Valuable experimental data such as spectra and computational results are almost always omitted. We describe an Open XML architecture at proof-of-concept which addresses these concerns. Compounds are identified through explicit connection tables or links to persistent Open resources such as PubChem. It is argued that if publishers adopt these tools and protocols, then the quality and quantity of chemical information available to bioscientists will increase and the authors, publishers and readers will find the process cost-effective. An article submitted to BiomedCentral Bioinformatics, created on request with their Publicon system. The transformed manuscript is archived as PDF. Although it has been through the publisher's system, this is purely automatic and the contents are those of a pre-refereed preprint. The formatting is provided by the system and tables and figures appear at the end. An accompanying submission, http://www.dspace.cam.ac.uk/handle/1810/34580, describes the rationale and cultural aspects of publishing, abstracting and aggregating chemical information. BMC is an Open Access publisher and we emphasize that all content is re-usable under Creative Commons License.

Springer - Publisher Connector

PubMed Central

Spiral - Imperial College Digital Repository

Apollo (Cambridge)

University of St. Andrews - Pure

St Andrews Research Repository

Stereo-Aware Extension of HOSE Codes

Author: Johnson Sean R.
Kuhn Stefan
Publication venue: American Chemical Society (ACS)
Publication date: 01/04/2019

The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Descriptions of molecular environments have many applications in chemoinformatics, including chemical shift prediction. Hierarchically ordered spherical environment (HOSE) codes are the most popular such descriptions. We developed a method to extend these with stereochemistry information. It enables distinguishing atoms which would be considered identical in traditional HOSE codes. The use of our method is demonstrated by chemical shift predictions for molecules in the nmrshiftdb2 database. We give a full specification and an implementation.
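The "spherical environment" idea can be sketched as a breadth-first walk that groups atoms by their bond distance from a centre atom. This is only an illustration of the sphere concept, not a HOSE code generator: real HOSE codes also rank substituents, encode bond types and ring closures, and (in the extension described above) stereochemistry. The function name `environment_spheres` and the adjacency-dict molecule representation are assumptions for this example.

```python
from typing import Dict, List


def environment_spheres(
    adjacency: Dict[int, List[int]],
    elements: Dict[int, str],
    center: int,
    max_spheres: int,
) -> List[List[str]]:
    """Collect element symbols sphere by sphere around `center`.

    Sphere 1 holds the direct neighbours, sphere 2 their unvisited
    neighbours, and so on, up to `max_spheres`.
    """
    visited = {center}
    frontier = [center]
    spheres: List[List[str]] = []
    for _ in range(max_spheres):
        next_frontier: List[int] = []
        for atom in frontier:
            for neighbour in adjacency[atom]:
                if neighbour not in visited:
                    visited.add(neighbour)
                    next_frontier.append(neighbour)
        if not next_frontier:
            break  # the molecule has no atoms further out
        # Sorting gives a stable, order-independent listing per sphere.
        spheres.append(sorted(elements[i] for i in next_frontier))
        frontier = next_frontier
    return spheres


# Heavy-atom graph of ethanol: C0 - C1 - O2.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
elements = {0: "C", 1: "C", 2: "O"}
print(environment_spheres(adjacency, elements, center=0, max_spheres=3))
```

Two atoms with identical sphere listings would look the same to this simplified description, which is the kind of ambiguity that richer encodings such as stereo-aware codes are meant to resolve.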

Directory of Open Access Journals

De Montfort University Open Research Archive

Espaloma-0.3.0: Machine-learned molecular mechanics force field for the simulation of protein-ligand systems and beyond

Author: Chodera John D.
Henry Mike
MacDermott-Opeskin Hugo
Pulido Iván
Takaba Kenichiro
Wang Yuanqing
Publication date: 13/07/2023

Molecular mechanics (MM) force fields -- the models that characterize the energy landscape of molecular systems via simple pairwise and polynomial terms -- have traditionally relied on human expert-curated, inflexible, and poorly extensible discrete chemical parameter assignment rules, namely atom or valence types. Recently, there has been significant interest in using graph neural networks to replace this process, while enabling the parametrization scheme to be learned in an end-to-end differentiable manner directly from quantum chemical calculations or condensed-phase data. In this paper, we extend the Espaloma end-to-end differentiable force field construction approach by incorporating both energy and force fitting directly to quantum chemical data into the training process. Building on the OpenMM SPICE dataset, we curate a dataset containing chemical spaces highly relevant to the broad interest of biomolecular modeling, covering small molecules, proteins, and RNA. The resulting force field, espaloma 0.3.0, self-consistently parametrizes these diverse biomolecular species, accurately predicts quantum chemical energies and forces, and maintains stable quantum chemical energy-minimized geometries. Surprisingly, this simple approach produces highly accurate protein-ligand binding free energies when self-consistently parametrizing protein and ligand. This approach -- capable of fitting new force fields to large quantum chemical datasets in one GPU-day -- shows significant promise as a path forward for building systematically more accurate force fields that can be easily extended to new chemical domains of interest.
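To make "simple pairwise and polynomial terms" concrete, the sketch below evaluates two textbook MM energy terms: a harmonic bond stretch and a 12-6 Lennard-Jones pair interaction. These generic forms are assumptions chosen for illustration; the abstract does not state the exact functional forms or units used in espaloma 0.3.0, and the function names are hypothetical.

```python
def harmonic_bond_energy(r: float, r0: float, k: float) -> float:
    """Polynomial (quadratic) bond term: E = 0.5 * k * (r - r0)^2."""
    return 0.5 * k * (r - r0) ** 2


def lennard_jones_energy(r: float, sigma: float, epsilon: float) -> float:
    """Pairwise 12-6 term: E = 4 * epsilon * ((sigma/r)^12 - (sigma/r)^6)."""
    if r <= 0:
        raise ValueError("interatomic distance must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# A bond stretched 0.1 length units beyond equilibrium, and a pair at contact distance.
print(harmonic_bond_energy(r=1.1, r0=1.0, k=100.0))
print(lennard_jones_energy(r=1.0, sigma=1.0, epsilon=0.2))
```

In a conventional force field the parameters `k`, `r0`, `sigma` and `epsilon` come from atom-type lookup tables; the approach described above instead predicts such parameters from the molecular graph.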

arXiv.org e-Print Archive

PubChem3D: a new resource for scientists

Author: A Nicholls
AD Andricopulo
B Musafia
Bo Yu
D Hull
EE Bolton
Evan E Bolton
EW Sayers
F Fontaine
H Sun
ID Kuntz
J Bostrom
J Sadowski
JA Grant
JEJ Mills
Jian Zhang
Jie Chen
Jiyao Wang
JM Barnard
KJ Simmons
Lianyi Han
ML Mansfield
MS Lajiness
NA Meanwell
Paul A Thiessen
PCD Hawkins
RLM van Montfort
S Kim
Siqian He
Stephen H Bryant
Sunghwan Kim
TA Halgren
V Mohan
Vahan Simonyan
Wenyao Shi
X Chan
Yan Sun
YL Wang
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: PubChem is an open repository for small molecules and their experimental biological activity. PubChem integrates and provides search, retrieval, visualization, analysis, and programmatic access tools in an effort to maximize the utility of contributed information. There are many diverse chemical structures with similar biological efficacies against targets available in PubChem that are difficult to interrelate using traditional 2-D similarity methods. A new layer called PubChem3D is added to PubChem to assist in this analysis. Description: PubChem generates a 3-D conformer model description for 92.3% of all records in the PubChem Compound database (when considering the parent compound of salts). Each of these conformer models is sampled to remove redundancy, guaranteeing a minimum (non-hydrogen atom pair-wise) RMSD between conformers. A diverse conformer ordering gives a maximal description of the conformational diversity of a molecule when only a subset of available conformers is used. A pre-computed search per compound record gives immediate access to a set of 3-D similar compounds (called "Similar Conformers") in PubChem and their respective superpositions. Systematic augmentation of PubChem resources to include a 3-D layer provides users with new capabilities to search, subset, visualize, analyze, and download data. A series of retrospective studies help to demonstrate important connections between chemical structures and their biological function that are not obvious using 2-D similarity but are readily apparent by 3-D similarity. Conclusions: The addition of PubChem3D to the existing contents of PubChem is a considerable achievement, given the scope, scale, and the fact that the resource is publicly accessible and free. With the ability to uncover latent structure-activity relationships of chemical structures, while complementing 2-D similarity analysis approaches, PubChem3D represents a new resource for scientists to exploit when exploring the biological annotations in PubChem.
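The redundancy-removal step, which guarantees a minimum pair-wise RMSD between retained conformers, can be sketched as a greedy filter. This is a simplified illustration and not PubChem's actual procedure: it assumes the conformers are already superimposed, list only non-hydrogen atoms in the same order, and the names `rmsd` and `prune_conformers` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]
Conformer = Sequence[Coordinate]


def rmsd(a: Conformer, b: Conformer) -> float:
    """Root-mean-square deviation between matching atoms (no superposition)."""
    if len(a) != len(b) or not a:
        raise ValueError("conformers must be non-empty and have equal atom counts")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def prune_conformers(conformers: Sequence[Conformer], min_rmsd: float) -> List[Conformer]:
    """Keep a conformer only if it is at least `min_rmsd` from every kept one."""
    kept: List[Conformer] = []
    for conformer in conformers:
        if all(rmsd(conformer, other) >= min_rmsd for other in kept):
            kept.append(conformer)
    return kept


# Three two-atom "conformers"; the second is nearly identical to the first.
c1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
c2 = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)]
c3 = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
print(len(prune_conformers([c1, c2, c3], min_rmsd=0.5)))
```

The greedy order matters: conformers listed first are favoured, so a production pipeline would typically sort by energy or another priority before filtering.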

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

MATCH: An atom‐typing toolset for molecular mechanics force fields

Author: Bolton
Bone
Brooks
Brooks
Cornell
Downs
Downs
Gillet
Halgren
Halgren
Holliday
Holliday
Jorgensen
Klepeis
Kleywegt
Lee
MacKerell
Mackerell
Murphy
Oostenbrink
Reddy
Schreiber
Schuttelkopf
Tiernan
Vanommeslaeghe
Wang
Wang
Weiner
Welford
Publication venue: Wiley
Publication date: 15/01/2012

We introduce a toolset of program libraries collectively titled multipurpose atom‐typer for CHARMM (MATCH) for the automated assignment of atom types and force field parameters for molecular mechanics simulation of organic molecules. The toolset includes utilities for the conversion of multiple chemical structure file formats into a molecular graph. A general chemical pattern‐matching engine using this graph has been implemented whereby assignment of molecular mechanics atom types, charges, and force field parameters is achieved by comparison against a customizable list of chemical fragments. While initially designed to complement the CHARMM simulation package and force fields by generating the necessary input topology and atom‐type data files, MATCH can be expanded to any force field and program, and has core functionality that makes it extendable to other applications such as fragment‐based property prediction. In this work, we demonstrate the accurate construction of atomic parameters of molecules within each force field included in CHARMM36 through exhaustive cross validation studies illustrating that bond charge increment rules derived from one force field can be transferred to another. In addition, using leave‐one‐out substitution it is shown that it is also possible to substitute missing intra and intermolecular parameters with ones included in a force field to complete the parameterization of novel molecules. Finally, to demonstrate the robustness of MATCH and the coverage of chemical space offered by the recent CHARMM general force field (Vanommeslaeghe, et al., J Comput Chem 2010, 31, 671), one million molecules from the PubChem database of small molecules are typed, parameterized, and minimized. © 2011 Wiley Periodicals, Inc. J Comput Chem, 2011. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/88100/1/JCC_21963_sm_SuppInfo.pdf http://deepblue.lib.umich.edu/bitstream/2027.42/88100/2/21963_ftp.pd

Crossref

PubMed Central

Deep Blue Documents at the University of Michigan

Application and Development of Computational Methods for Ligand-Based Virtual Screening

Author: Heikamp Kathrin
Publication venue: Universitäts- und Landesbibliothek Bonn

The detection of novel active compounds that are able to modulate the biological function of a target is the primary goal of drug discovery. Different screening methods are available to identify hit compounds having the desired bioactivity in a large collection of molecules. As a computational method, virtual screening (VS) is used to search compound libraries in silico and identify those compounds that are likely to exhibit a specific activity. Ligand-based virtual screening (LBVS) is a subdiscipline that uses the information of one or more known active compounds in order to identify new hit compounds. Different LBVS methods exist, e.g. similarity searching and support vector machines (SVMs). In order to enable the application of these computational approaches, compounds have to be described numerically. Fingerprints derived from the two-dimensional compound structure, called 2D fingerprints, are among the most popular molecular descriptors available. This thesis covers the usage of 2D fingerprints in the context of LBVS. The first part focuses on a detailed analysis of 2D fingerprints. Their performance range against a wide range of pharmaceutical targets is globally estimated through fingerprint-based similarity searching. Additionally, mechanisms by which fingerprints are capable of detecting structurally diverse active compounds are identified. For this purpose, two different feature selection methods are applied to find those fingerprint features that are most relevant for the active compounds and distinguish them from other compounds. Then, 2D fingerprints are used in SVM calculations. The SVM methodology provides several opportunities to include additional information about the compounds in order to direct LBVS search calculations. In a first step, a variant of the SVM approach is applied to the multi-class prediction problem involving compounds that are active against several related targets. 
SVM linear combination is used to recover compounds with desired activity profiles and deprioritize compounds with other activities. Then, the SVM methodology is adopted for potency-directed VS. Compound potency is incorporated into the SVM approach through potency-oriented SVM linear combination and kernel function design to direct search calculations to the preferential detection of potent hit compounds. Next, SVM calculations are applied to address an intrinsic limitation of similarity-based methods, i.e., the presence of similar compounds having large differences in their potency. A specially designed SVM approach is introduced to predict compound pairs forming such activity cliffs. Finally, the impact of different training sets on the recall performance of SVM-based VS is analyzed and caveats are identified.
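Fingerprint-based similarity searching can be sketched with the Tanimoto coefficient, a common choice for comparing 2D fingerprints. The abstract does not name the similarity measure used in the thesis, so the coefficient, the set-of-on-bits fingerprint representation, and the function names below are assumptions for illustration; the bit positions and compound names are made up.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Rank library compounds by similarity to the query, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy fingerprints for a known active (query) and three library compounds.
query = {1, 2, 3}
library = {
    "compound_a": {2, 3, 4},
    "compound_b": {1, 2, 3},
    "compound_c": {7, 8},
}
print(similarity_search(query, library, top_k=2))
```

Ranking by a single known active, as here, is the simplest form of ligand-based virtual screening; the SVM variants described above instead learn from sets of active and inactive compounds.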

bonndoc – Der Publikationsserver der Universität Bonn