53 research outputs found

Kernel-based estimation of the applicability domain of QSAR models

Authors: A Jahn; A Zell; Georg Hinselmann; Nikolas Fechner
Publication venue: Springer Nature
Publication date: 04/05/2010

Springer - Publisher Connector

PubMed Central

Estimation of the applicability domain of kernel-based machine learning models for virtual screening

Authors: Fechner Nikolas; Hinselmann Georg; Jahn Andreas; Zell Andreas
Publication venue: BioMed Central
Publication date: 01/01/2010

Springer - Publisher Connector

PubMed Central

jCompoundMapper: An open source Java library and command-line tool for chemical fingerprints

Authors: Fechner Nikolas; Hinselmann Georg; Jahn Andreas; Rosenbaum Lars; Zell Andreas
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract
Background: The decomposition of a chemical graph is a convenient approach to encode information of the corresponding organic compound. While several commercial toolkits exist to encode molecules as so-called fingerprints, only a few open source implementations are available. The aim of this work is to introduce a library for exactly defined molecular decompositions, with a strong focus on the application of these features in machine learning and data mining. It provides several options such as search depth, distance cut-offs, and atom and pharmacophore typing. Furthermore, it provides the functionality to combine, to compare, or to export the fingerprints into several formats.
Results: We provide a Java 1.6 library for the decomposition of chemical graphs based on the open source Chemistry Development Kit toolkit. We reimplemented popular fingerprinting algorithms such as depth-first search fingerprints, extended connectivity fingerprints, autocorrelation fingerprints (e.g. CATS2D), radial fingerprints (e.g. Molprint2D), geometrical Molprint, atom pairs, and pharmacophore fingerprints. We also implemented custom fingerprints such as the all-shortest path fingerprint that only includes the subset of shortest paths from the full set of paths of the depth-first search fingerprint. As an application of jCompoundMapper, we provide a command-line executable binary. We measured the conversion speed and number of features for each encoding and described the composition of the features in detail. The quality of the encodings was tested using the default parametrizations in combination with a support vector machine on the Sutherland QSAR data sets. Additionally, we benchmarked the fingerprint encodings on the large-scale Ames toxicity benchmark using a large-scale linear support vector machine. The results were promising and could often compete with literature results. On the large Ames benchmark, for example, we obtained an AUC ROC performance of 0.87 with a reimplementation of the extended connectivity fingerprint. This result is comparable to the performance achieved by a non-linear support vector machine using state-of-the-art descriptors. On the Sutherland QSAR data set, the best fingerprint encodings showed a comparable or better performance on 5 of the 8 benchmarks when compared against the results of the best descriptors published in the paper of Sutherland et al.
Conclusions: jCompoundMapper is a library for chemical graph fingerprints with several tweaking possibilities and exporting options for open source data mining toolkits. The quality of the data mining results, the conversion speed, the LGPL software license, the command-line interface, and the exporters should be useful for many applications in cheminformatics like benchmarks against literature methods, comparison of data mining algorithms, similarity searching, and similarity-based data mining.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Efficient extraction of canonical spatial relationships using a recursive enumeration of k-subsets

Authors: A Jahn; Andreas Zell; Georg Hinselmann; Nikolas Fechner; P Mahé; T Rolfe
Publication venue: BioMed Central
Publication date: 01/01/2010

The spatial arrangement of a chemical compound plays an important role regarding the related properties or activities. A straightforward approach to encode the geometry is to enumerate pairwise spatial relationships between k substructures, like functional groups or subgraphs. This leads to a combinatorial explosion with th

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Automatic pharmacophore model generation using weighted substructure assignments

Authors: A Jahn; A Zell; Andreas Jahn; Georg Hinselmann; H Planatscher; JF Truchon; N Huang; Nikolas Fechner
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Optimal assignment methods for ligand-based virtual screening

Abstract
Background: Ligand-based virtual screening experiments are an important task in the early drug discovery stage. An ambitious aim in each experiment is to disclose active structures based on new scaffolds. To perform this "scaffold hopping" for individual problems and targets, a plethora of different similarity methods based on diverse techniques have been published in recent years. The optimal assignment approach on molecular graphs, a successful method in the field of quantitative structure-activity relationships, has not been tested as a ligand-based virtual screening method so far.
Results: We evaluated two already published and two new optimal assignment methods on various data sets. To emphasize the "scaffold-hopping" ability, we used the information of chemotype clustering analyses in our evaluation metrics. Comparisons with literature results show an improved early recognition performance and comparable results over the complete data set. A new method based on two different assignment steps shows an increased "scaffold-hopping" behavior together with a good early recognition performance.
Conclusion: The presented methods show a good combination of chemotype discovery and enrichment of active structures. Additionally, the optimal assignment on molecular graphs has the advantage that the mappings can be investigated and interpreted, allowing precise modifications of internal parameters of the similarity measure for specific targets. All methods have low computation times, which makes them applicable to screening large data sets.

CiteSeerX

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Cover art to "25 years of small molecule optimization at Novartis: A retrospective analysis of chemical series evolution"

Authors: Beckers Maximilian; Fechner Nikolas; Stiefl Nikolaus
Publication venue: American Chemical Society (ACS)
Publication date: 01/01/2022

In the internal Novartis compound databases, a set of ~3000 chemical series has been retrospectively reconstructed. Using the registration dates of the compounds, the evolution over time of structural properties, ADMET and target activities during optimization of the compounds has been analyzed, which revealed multiple trends. Furthermore, general properties of the chemical series and their inter-relations are investigated

The Novartis Repository

Automated Identification of Chemical Series: Classifying like a Medicinal Chemist

Authors: Franziska Kruger; Nikolas Fechner; Nikolaus Stiefl
Publication venue: American Chemical Society (ACS)
Publication date: 06/05/2020

Crossref

Molecular Descriptors

Authors: Georg Hinselmann; Jörg Wegner; Nikolas Fechner
Publication venue: Chapman and Hall/CRC
Publication date: 21/04/2010

Crossref