    A text-mining system for extracting metabolic reactions from full-text articles

    Background: Biological text-mining research is increasingly focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway, metabolic pathways, has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein–protein interactions. Results: When evaluated on a set of manually curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein–protein interaction extraction task. Conclusions: We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein–protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.
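The idea of scoring permutations of entities by keyword presence and location can be illustrated with a toy sketch. This is not the authors' scoring function: the keyword stems, the direction words, the scoring rule, and the names `score` and `best_assignment` are all assumptions made up for illustration, and the sentence is assumed to be already tokenised with entity positions known.

```python
from itertools import permutations

# Hypothetical keyword stems and direction words, chosen only for illustration.
KEYWORD_STEMS = ("catalys", "catalyz", "convert", "produc")
DIRECTION_WORDS = {"to", "into"}

def score(tokens, enzyme, substrate, product):
    """Toy score for one (substrate, product) assignment of metabolite positions."""
    total = 0
    lo, hi = sorted((enzyme, substrate))
    # Presence/location: keyword stems lying between the enzyme and the substrate.
    total += sum(tok.lower().startswith(KEYWORD_STEMS) for tok in tokens[lo + 1:hi])
    # Location: a direction word between substrate and product, in reading order.
    if substrate < product:
        total += sum(tok.lower() in DIRECTION_WORDS
                     for tok in tokens[substrate + 1:product])
    return total

def best_assignment(tokens, enzyme, metabolites):
    """Return the highest-scoring (substrate, product) pair of token positions."""
    return max(permutations(metabolites, 2),
               key=lambda pair: score(tokens, enzyme, *pair))

tokens = "Hexokinase converts glucose to glucose-6-phosphate".split()
substrate, product = best_assignment(tokens, enzyme=0, metabolites=[2, 4])
print(tokens[substrate], "->", tokens[product])
```

In this toy sentence the assignment with glucose as substrate scores higher because the direction word "to" lies between it and the product in reading order.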

    Automated Identification and Classification of Stereochemistry: Chirality and Double Bond Stereoisomerism

    Stereoisomers have the same molecular formula and the same atom connectivity; they differ only in the three-dimensional arrangement of their atoms. Stereoisomerism is of great importance in many different fields since the molecular properties and biological effects of stereoisomers are often significantly different. Most drugs, for example, are composed of a single stereoisomer of a compound, and while one stereoisomer may have therapeutic effects on the body, another may be toxic. A challenging task is the automatic detection of stereoisomers using line input specifications such as SMILES or InChI, since it requires information about group theory (to distinguish stereoisomers using mathematical information about their symmetry), topology and geometry of the molecule. There are several software packages that include modules to handle stereochemistry, especially those that name a chemical structure and/or view, edit and generate chemical structure diagrams. However, there is a lack of software capable of automatically analyzing a molecule represented as a graph and generating a classification of the type of isomerism present at a given atom or bond. Considering the importance of stereoisomerism when comparing chemical structures, this report describes a computer program for analyzing and processing steric information contained in a chemical structure represented as a molecular graph and providing as output a binary classification of the isomer type based on the recommended conventions. Due to the complexity of the underlying issue, specification of stereochemical information is currently limited to explicit stereochemistry and to the two most common types of stereochemistry caused by asymmetry around carbon atoms: chiral atom and double bond. A Webtool to automatically identify and classify stereochemistry is available at http://nams.lasige.di.fc.ul.pt/tools.ph

    Backbone chemical shift assignments of human 14-3-3σ

    14-3-3 proteins are a group of seven dimeric adapter proteins that exert their biological function by interacting with hundreds of phosphorylated proteins, thus influencing their sub-cellular localization, activity or stability in the cell. Due to this remarkable interaction network, 14-3-3 proteins have been associated with several pathologies and the protein-protein interactions established with a number of partners are now considered promising drug targets. The activity of 14-3-3 proteins is often isoform specific and to our knowledge only one out of seven isoforms, 14-3-3ζ, has been assigned. Despite the availability of the crystal structures of all seven isoforms of 14-3-3, the additional NMR assignments of 14-3-3 proteins are important for both biological mechanism studies and chemical biology approaches. Herein, we present a robust backbone assignment of 14-3-3σ, which will allow advances in the discovery of potential therapeutic compounds. This assignment is now being applied to the discovery of both inhibitors and stabilizers of 14-3-3 protein-protein interactions.

    A near infrared line list for NH3: Analysis of a Kitt Peak spectrum after 35 years

    A Fourier Transform (FT) absorption spectrum of room temperature NH3 in the region 7400 - 8600 cm-1 is analysed using a variational line list and ground state energies determined using the MARVEL procedure. The spectrum was measured by Dr Catherine de Bergh in 1980 and is available from the Kitt Peak data center. The centers and intensities of 8468 ammonia lines were retrieved using a multiline fitting procedure. 2474 lines are assigned to 21 bands, providing 1692 experimental energies in the range 7000 - 9000 cm-1. The spectrum was assigned by the joint use of the BYTe variational line list and combination differences. The assignments and experimental energies presented in this work are the first for ammonia in the region 7400 - 8600 cm-1, considerably extending the range of known vibrationally excited states. Comment: 27 pages, 6 tables, 5 figures. Accepted for publication in Journal of Molecular Spectroscop

    Comprehensive comparison of in silico MS/MS fragmentation tools of the CASMI contest: database boosting is needed to achieve 93% accuracy.

    In mass spectrometry-based untargeted metabolomics, rarely more than 30% of the compounds are identified. Without the true identity of these molecules it is impossible to draw conclusions about the biological mechanisms, pathway relationships and provenance of compounds. The only way at present to address this discrepancy is to use in silico fragmentation software to identify unknown compounds by comparing and ranking theoretical MS/MS fragmentations from target structures to experimental tandem mass spectra (MS/MS). We compared the performance of four publicly available in silico fragmentation algorithms (MetFragCL, CFM-ID, MAGMa+ and MS-FINDER) that participated in the 2016 CASMI challenge. We found that optimizing the use of metadata, weighting factors and the manner of combining different tools eventually defined the ultimate outcomes of each method. We comprehensively analysed how outcomes of different tools could be combined and reached a final success rate of 93% for the training data, and 87% for the challenge data, using a combination of MAGMa+, CFM-ID and compound importance information along with MS/MS matching. Matching MS/MS spectra against the MS/MS libraries without using any in silico tool yielded 60% correct hits, showing that the use of in silico methods is still important
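Combining the outputs of several tools with weighting factors can be sketched as a weighted sum of normalised scores. This is only a minimal illustration under stated assumptions: the abstract does not give the weighting scheme, so the max-normalisation, the weights, the tool names `tool_a`/`tool_b`, and the function `combine_scores` are all hypothetical.

```python
def combine_scores(tool_scores, weights):
    """Rank candidates by a weighted sum of per-tool, max-normalised scores.

    tool_scores: {tool name: {candidate: raw score}} (hypothetical layout)
    weights:     {tool name: weight}; tools without a weight contribute 0
    """
    combined = {}
    for tool, scores in tool_scores.items():
        top = max(scores.values(), default=0.0)
        for candidate, raw in scores.items():
            # Scale each tool's scores to [0, 1] so the tools are comparable.
            normalised = raw / top if top > 0 else 0.0
            combined[candidate] = (combined.get(candidate, 0.0)
                                   + weights.get(tool, 0.0) * normalised)
    # Best candidate first.
    return sorted(combined, key=combined.get, reverse=True)

# Made-up scores for three candidate structures X, Y and Z.
tool_scores = {
    "tool_a": {"X": 10.0, "Y": 5.0},
    "tool_b": {"X": 0.2, "Y": 0.8, "Z": 0.4},
}
ranking = combine_scores(tool_scores, {"tool_a": 0.5, "tool_b": 0.5})
print(ranking)
```

A candidate missing from one tool's output simply receives no contribution from that tool, which is one of several possible design choices.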

    Structural and spectroscopic characterisation of C4 oxygenates relevant to structure/activity relationships of the hydrogenation of α,β-unsaturated carbonyls

    In the present work, we have investigated the conformational isomerism and calculated the vibrational spectra of the C4 oxygenates 3-butyne-2-one, 3-butene-2-one, 2-butanone and 2-butanol using density functional theory. The calculations are validated by comparison to structural data, where available, and to new experimental inelastic neutron scattering and infrared spectra of the compounds. We find that for 3-butene-2-one and 2-butanol the spectra show clear evidence for the presence of conformational isomerism, and this is supported by the calculations. Complete vibrational assignments for all four molecules are provided, supplying the essential information needed to generate structure/activity relationships for the sequential catalytic hydrogenation of 3-butyne-2-one to 2-butanol.

    Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures

    This paper compares several published methods for clustering chemical structures, using both graph- and fingerprint-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods which employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size and composition. Our experiments suggest that both graph- and fingerprint-based similarity measures can be used effectively for generating chemical clusterings; they also indicate that the CAST and Yin–Chen methods, proposed recently for the clustering of gene expression patterns, may prove effective for the clustering of 2D chemical structures.
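A fingerprint-based similarity measure and a simple clustering built on it can be sketched as follows. The abstract does not name the measure or reproduce the methods compared, so this sketch assumes the widely used Tanimoto coefficient on sets of "on" bit indices and a basic leader-style clustering; it is not the CAST or Yin–Chen method, and the fingerprints and threshold are made up.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints share nothing; treat their similarity as 0 by convention.
    return len(a & b) / union if union else 0.0

def leader_cluster(fingerprints, threshold):
    """Assign each fingerprint to the first cluster whose leader is similar enough.

    Returns a list of clusters, each a list of indices into `fingerprints`;
    the first index in each cluster is its leader.
    """
    clusters = []
    for i, fp in enumerate(fingerprints):
        for cluster in clusters:
            if tanimoto(fp, fingerprints[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no leader was close enough: start a new cluster
    return clusters

# Made-up fingerprints: the first two share three of five distinct bits.
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {7, 8, 9}]
clusters = leader_cluster(fps, threshold=0.5)
print(clusters)
```

The result of leader clustering depends on input order and on the threshold, which is the kind of adjustable parameter whose stability the paper reports testing.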