Discriminative Chemical Patterns: Automatic and Interactive Design
The classification of molecules with respect to their inhibiting,
activating, or toxicological potential constitutes a central aspect
in the field of cheminformatics. Often, a discriminative feature is
needed to distinguish two different molecule sets. Besides physicochemical
properties, substructures and chemical patterns belong to the descriptors
most frequently applied for this purpose. As a commonly used example
of this descriptor class, SMARTS strings represent a powerful concept
for the representation and processing of abstract chemical patterns.
While their usage facilitates a convenient way to apply previously
derived classification rules on new molecule sets, the manual generation
of useful SMARTS patterns remains a complex and time-consuming process.
Here, we introduce SMARTSminer, a new algorithm for the automatic
derivation of discriminative SMARTS patterns from preclassified molecule
sets. Based on a specially adapted subgraph mining algorithm, SMARTSminer
identifies structural features that are frequent in only one of the
given molecule classes. In comparison to elemental substructures,
it also supports the consideration of general and specific SMARTS
features. Furthermore, SMARTSminer is integrated into an interactive
pattern editor named SMARTSeditor. This allows for an intuitive visualization
on the basis of the SMARTSviewer concept as well as interactive adaptation
and further improvement of the generated patterns. Additionally, a
new molecular matching feature provides immediate feedback on a
pattern’s matching behavior across the molecule sets. We demonstrate
the utility of the SMARTSminer functionality and its integration into
the SMARTSeditor software in several different classification scenarios.
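
The idea of a feature that is "frequent in only one of the given molecule classes" can be illustrated with a minimal sketch. This is not the SMARTSminer algorithm: it performs no subgraph mining and does not handle SMARTS. Each molecule is simply reduced to a set of hypothetical substructure labels, and the function names, thresholds, and toy data are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple


def support(feature: str, molecules: List[FrozenSet[str]]) -> float:
    """Fraction of molecules in the set that contain the feature."""
    if not molecules:
        return 0.0
    return sum(feature in m for m in molecules) / len(molecules)


def discriminative_features(
    positives: List[FrozenSet[str]],
    negatives: List[FrozenSet[str]],
    min_support: float = 0.6,  # assumed threshold: frequent in the positive class
    max_support: float = 0.2,  # assumed threshold: rare in the negative class
) -> Dict[str, Tuple[float, float]]:
    """Return features frequent in `positives` and rare in `negatives`."""
    candidates = set().union(*positives) if positives else set()
    result = {}
    for feature in sorted(candidates):
        sp = support(feature, positives)
        sn = support(feature, negatives)
        if sp >= min_support and sn <= max_support:
            result[feature] = (sp, sn)
    return result


# Toy data: hypothetical substructure labels, not real molecules.
actives = [
    frozenset({"amide", "phenyl"}),
    frozenset({"amide", "nitro"}),
    frozenset({"amide", "phenyl", "nitro"}),
]
inactives = [
    frozenset({"phenyl"}),
    frozenset({"phenyl", "ester"}),
    frozenset({"nitro", "ester"}),
]

hits = discriminative_features(actives, inactives)
print(hits)
```

In this toy data only "amide" passes both thresholds; "phenyl" and "nitro" are frequent among the actives but also too common among the inactives.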
Fast Protein Binding Site Comparison via an Index-Based Screening Technology
We present TrixP, a new index-based method for fast protein
binding site comparison and function prediction. TrixP determines
binding site similarities based on the comparison of descriptors that
encode pharmacophoric and spatial features. To this end, it adopts the
efficient core components of TrixX, a structure-based virtual screening
technology for large compound libraries. TrixP extends this technology
with new components to allow screening of protein libraries.
TrixP accounts for the inherent flexibility of proteins by employing
a partial shape matching routine. After the identification of structures
with matching pharmacophoric features and geometric shape, TrixP superimposes
the binding sites and, finally, assesses their similarity according
to the fit of pharmacophoric properties. TrixP is able to find analogies
between closely and distantly related binding sites. Recovery rates
of 81.8% for similar binding site pairs, together with rejection rates
of 99.5% for dissimilar pairs on a test data set containing 1331 pairs,
confirm this ability. TrixP exclusively identifies members of the
same protein family in the top-ranking positions out of a library consisting
of 9802 binding sites. Furthermore, 30 predicted kinase binding sites
can almost perfectly be classified into their known subfamilies.
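
The recovery and rejection rates quoted above can be read as the fraction of truly similar pairs that are reported as similar and the fraction of truly dissimilar pairs that are reported as dissimilar. The sketch below shows that arithmetic under this reading; the function name and the ten toy pairs are hypothetical and are not taken from the TrixP evaluation.

```python
from typing import Sequence, Tuple


def recovery_and_rejection(
    is_similar: Sequence[bool], predicted_similar: Sequence[bool]
) -> Tuple[float, float]:
    """Return (recovery rate, rejection rate) for labeled binding site pairs.

    recovery  = correctly recovered similar pairs / all similar pairs
    rejection = correctly rejected dissimilar pairs / all dissimilar pairs
    """
    if len(is_similar) != len(predicted_similar):
        raise ValueError("label and prediction lists must have equal length")
    n_similar = sum(is_similar)
    n_dissimilar = len(is_similar) - n_similar
    if n_similar == 0 or n_dissimilar == 0:
        raise ValueError("need at least one similar and one dissimilar pair")
    recovered = sum(t and p for t, p in zip(is_similar, predicted_similar))
    rejected = sum(
        (not t) and (not p) for t, p in zip(is_similar, predicted_similar)
    )
    return recovered / n_similar, rejected / n_dissimilar


# Toy labels for ten hypothetical pairs (illustrative only).
truth = [True, True, True, True, False, False, False, False, False, False]
preds = [True, True, True, False, False, False, False, False, False, True]

recovery, rejection = recovery_and_rejection(truth, preds)
print(f"recovery={recovery:.3f} rejection={rejection:.3f}")
```

Reporting both rates matters because a method could reach a high recovery rate trivially by calling every pair similar; the rejection rate exposes that failure mode.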
Large-Scale Analysis of Hydrogen Bond Interaction Patterns in Protein–Ligand Interfaces
Protein–ligand interactions are the fundamental basis for
molecular design in pharmaceutical research, biocatalysis, and agrochemical
development. Hydrogen bonds in particular are known to have special geometric
requirements and therefore deserve a detailed analysis. In modeling
approaches, a more general description of hydrogen bond geometries,
using distance and directionality, is applied. A first study of their
geometries was performed based on 15 protein structures in 1982. Currently
there are about 95 000 protein–ligand structures available
in the PDB, providing a solid foundation for a new large-scale statistical
analysis. Here, we report a comprehensive investigation of geometric
and functional properties of hydrogen bonds. Out of 22 defined functional
groups, eight are fully in accordance with theoretical predictions
while 14 show variations from expected values. On the basis of these
results, we derived interaction geometries to improve current computational
models. It is expected that these observations will be useful in designing
new chemical structures for biological applications.
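
A description of hydrogen bond geometry "using distance and directionality" can be made concrete with a short sketch that computes the hydrogen-acceptor distance and the donor-hydrogen-acceptor angle from 3D coordinates. The cutoff values, function names, and coordinates below are illustrative assumptions and are not the geometries derived in the study.

```python
import math
from typing import Sequence

Point = Sequence[float]


def angle_deg(a: Point, vertex: Point, c: Point) -> float:
    """Angle a-vertex-c in degrees."""
    v1 = [a[i] - vertex[i] for i in range(3)]
    v2 = [c[i] - vertex[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_angle))


def looks_like_hbond(
    donor: Point,
    hydrogen: Point,
    acceptor: Point,
    max_distance: float = 2.5,  # assumed H...A cutoff in angstroms
    min_angle: float = 120.0,   # assumed D-H...A cutoff in degrees
) -> bool:
    """Simple distance-plus-directionality test with assumed cutoffs."""
    d = math.dist(hydrogen, acceptor)
    theta = angle_deg(donor, hydrogen, acceptor)
    return d <= max_distance and theta >= min_angle


# Hypothetical coordinates in angstroms (not from any PDB entry).
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
acceptor = (2.8, 0.5, 0.0)

h_a_distance = math.dist(hydrogen, acceptor)
d_h_a_angle = angle_deg(donor, hydrogen, acceptor)
print(round(h_a_distance, 2), round(d_h_a_angle, 1))
print(looks_like_hbond(donor, hydrogen, acceptor))
```

Fixed cutoffs like these are the kind of general description the abstract refers to; a statistical analysis per functional group would instead yield group-specific distance and angle distributions.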