Identifying metabolites from protein identifiers with P2M

Abstract

The identification of metabolites from complex biological samples often involves matching experimental mass spectrometry data to signatures of compounds derived from massive chemical databases. However, misidentifications can occur because the sheer size of chemical space means that these databases contain many compounds with nearly identical structures. Prior knowledge of compounds that may be enzymatically consumed or produced by an organism can help reduce misidentifications by restricting initial database searches to compounds that are likely to be present in a biological system. While databases such as UniProt allow for the identification of small molecules that may be consumed or generated by enzymes encoded in an organism's genome, no tool currently exists for identifying SMILES strings of metabolites associated with protein identifiers and expanding R-containing substructures to fully defined, biologically relevant chemical structures. Here we present Proteome2Metabolome (P2M), a tool that performs these tasks by querying external databases behind a simple command-line interface. Beyond mass spectrometry-based applications, P2M can be used more generally to identify biologically relevant chemical structures likely to be observed in a biological system.
