To predict the metabolic fate of an arbitrary compound based solely on its
structure, it is useful to identify substructural ‘functional groups’ that are
biochemically reactive. These functional groups are the substructural elements that
can be removed and replaced to transform one compound into another. This problem
of identifying functional groups is related to the problem of classifying compounds.
The research presented here discusses the state of the art in biochemical databases
and how these sources may be applied to the problem of classifying compounds based
solely on structure. We describe a biochemical informatics system for processing
molecular data and show how 100 255 compositional (hasA) relationships are
inferred between 835 abstractions and 9500 metabolites from the KEGG Ligand
database. Specifically, we focus on the identification of amino acids and consider ways
in which the inference of biochemical ontologies for metabolites will be improved in
the future.