84 research outputs found

    Data mining for the identification of metabolic syndrome status

    Metabolic syndrome (MS) is a condition associated with metabolic abnormalities that are characterized by central obesity (e.g. waist circumference or body mass index), hypertension (e.g. systolic or diastolic blood pressure), hyperglycemia (e.g. fasting plasma glucose) and dyslipidemia (e.g. triglyceride and high-density lipoprotein cholesterol). It is also associated with the development of type 2 diabetes mellitus (DM) and cardiovascular disease (CVD). Therefore, the rapid identification of MS is required to prevent the occurrence of such diseases. Herein, we review the utilization of data mining approaches for MS identification. Furthermore, the concept of quantitative population-health relationship (QPHR) is also presented, which can be defined as the elucidation and understanding of the relationship that exists between health parameters and health status. QPHR modeling uses data mining techniques such as artificial neural network (ANN), support vector machine (SVM), principal component analysis (PCA), decision tree (DT), random forest (RF) and association analysis (AA) to construct predictive models for MS characterization. The DT method has been found to outperform other data mining techniques in the identification of MS status. Moreover, the AA technique has proved useful in the discovery of in-depth as well as frequently occurring health parameters that can be used for revealing the rules of MS development. This review presents the potential benefits of applying data mining as a rapid identification tool for classifying MS.

    Recognition of DNA Splice Junction via Machine Learning Approaches

    Successful recognition of splice junction sites of human DNA sequences was achieved via three machine learning approaches. Both unsupervised (Kohonen's Self-Organizing Map, KSOM) and supervised (Back-propagation Neural Network, BNN; and Support Vector Machine, SVM) machine learning techniques were used for the classification of sequences from the testing set into one of three categories: transition from exon to intron, transition from intron to exon, and no transition. The dataset used in this study comprises 1,424 DNA sequences obtained from the National Center for Biotechnology Information (NCBI). Performance of the machine learning approaches was assessed by constructing learning models from the 1,000 sequences of the training set and evaluating them on the 424 sequences of the testing set, which are unknown to the learning model. Each sequence is a window 32 nucleotides long, with regions comprising -15 to +15 nucleotides from the dinucleotide splice site. Since the nucleotides (A, C, G, and T) are represented by four-digit binary codes (e.g. 0001, 0010, 0100, and 1000), the number of descriptors increased from 32 to 128. The performance of the machine learning techniques in order of decreasing accuracy is SVM > BNN > KSOM, suggesting that SVM is a robust method for the identification of unknown splice sites. Although KSOM gave lower prediction accuracy than the two supervised methods, it is notable that it was able to make such predictions based only on knowledge of the input, whereas the supervised methods require that the output be known during training. It is expected that the Support Vector Machine method can provide a powerful computational tool for predicting the splice junction sites of uncharacterized DNA.
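
The descriptor expansion from 32 to 128 described in the abstract above can be illustrated with a minimal sketch. The abstract lists the four code words but does not say which nucleotide receives which code, so the mapping in `CODES` is an assumption, as are the function name and the example window.

```python
# Assumed mapping: the abstract gives the four code words (0001, 0010,
# 0100, 1000) but not which nucleotide each one belongs to.
CODES = {"A": "0001", "C": "0010", "G": "0100", "T": "1000"}


def encode_window(sequence):
    """Turn a nucleotide string into a flat list of 0/1 descriptors."""
    descriptors = []
    for base in sequence.upper():
        # Each nucleotide contributes four binary descriptors.
        descriptors.extend(int(bit) for bit in CODES[base])
    return descriptors


# Hypothetical 32-nucleotide window, not taken from the dataset.
window = "ACGT" * 8
vector = encode_window(window)
print(len(window), "nucleotides ->", len(vector), "descriptors")
```

A vector of this form is what a classifier such as an SVM or a back-propagation network would receive as input for each sequence window.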

    iBitter-Fuse: A Novel Sequence-Based Bitter Peptide Predictor by Fusing Multi-View Features.

    Accurate identification of bitter peptides is of great importance for better understanding their biochemical and biophysical properties. To date, machine learning-based methods have become effective approaches for identifying potential bitter peptides from large-scale protein datasets. Although a few machine learning-based predictors have been developed for identifying the bitterness of peptides, their prediction performance could be improved. In this study, we developed a new predictor (named iBitter-Fuse) for achieving more accurate identification of bitter peptides. In the proposed iBitter-Fuse, we have integrated a variety of feature encoding schemes to provide sufficient information from different aspects, namely compositional information and physicochemical properties. To enhance the predictive performance, a customized genetic algorithm utilizing self-assessment-report (GA-SAR) was employed for identifying informative features, and the optimal ones were then input into a support vector machine (SVM)-based classifier to develop the final model (iBitter-Fuse). Benchmarking experiments based on both 10-fold cross-validation and independent tests indicated that iBitter-Fuse was able to achieve more accurate performance than state-of-the-art methods. To facilitate the high-throughput identification of bitter peptides, the iBitter-Fuse web server was established and made freely available online. It is anticipated that iBitter-Fuse will be a useful tool for aiding the discovery and de novo design of bitter peptides.

    SCMTHP: A New Approach for Identifying and Characterizing of Tumor-Homing Peptides Using Estimated Propensity Scores of Amino Acids.

    Tumor-homing peptides (THPs) are small peptides that can recognize and bind cancer cells specifically. To gain a better understanding of THPs' functional mechanisms, the accurate identification and characterization of THPs is required. Although some computational methods for in silico THP identification have been proposed, a major drawback is their lack of model interpretability. In this study, we propose a new, simple and easily interpretable computational approach (called SCMTHP) for identifying and analyzing tumor-homing activities of peptides via the use of a scoring card method (SCM). To improve the predictability and interpretability of our predictor, we generated propensity scores of the 20 amino acids as THPs. Finally, informative physicochemical properties were used for providing insights on characteristics giving rise to the bioactivity of THPs via the use of SCMTHP-derived propensity scores. Benchmarking experiments on independent tests indicated that SCMTHP could achieve performance comparable to the state-of-the-art method, with accuracies of 0.827 and 0.798 on the Main and Small benchmark datasets, respectively. Furthermore, SCMTHP was found to outperform several well-known machine learning-based classifiers (e.g., decision tree, k-nearest neighbor, multi-layer perceptron, naive Bayes and partial least squares regression) as indicated by both 10-fold cross-validation and independent tests. Finally, the SCMTHP web server was established and made freely available online. SCMTHP is expected to be a useful tool for rapid and accurate identification of THPs and for providing a better understanding of THP biophysical and biochemical properties.
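
To make the idea of an interpretable propensity-score predictor concrete, here is a deliberately simplified sketch. It is not the SCMTHP implementation: the abstract does not give the scoring formula, so averaging per-residue scores and comparing against a threshold is an assumption, and the scores in `PROPENSITY`, the threshold, and the example peptides are all made up for illustration.

```python
# Made-up propensity scores for a handful of amino acids (illustrative only;
# these are NOT the SCMTHP-derived values).
PROPENSITY = {"A": 300, "C": 620, "G": 410, "K": 250, "R": 540, "W": 700}


def peptide_score(peptide, scores):
    """Average the per-residue propensity scores of a peptide (assumed rule)."""
    return sum(scores[residue] for residue in peptide) / len(peptide)


def is_predicted_thp(peptide, scores, threshold):
    """Label a peptide as tumor-homing when its score reaches the threshold."""
    return peptide_score(peptide, scores) >= threshold


# Hypothetical peptides and threshold.
for peptide in ("CRW", "AK"):
    score = peptide_score(peptide, PROPENSITY)
    print(peptide, score, is_predicted_thp(peptide, PROPENSITY, threshold=500))
```

The appeal of this style of model is that each prediction can be traced back to the individual residue scores, which is the kind of interpretability the abstract contrasts with black-box classifiers.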

    Aromatase inhibitory activity of 1,4-naphthoquinone derivatives and QSAR study

    A series of 2-amino(chloro)-3-chloro-1,4-naphthoquinone derivatives (1-11) were investigated for their aromatase inhibitory activities. 1,4-Naphthoquinones 1 and 4 were found to be the most potent compounds affording IC50 values 5.2 times lower than the reference drug, ketoconazole. A quantitative structure-activity relationship (QSAR) model provided good predictive performance (R2 CV = 0.9783 and RMSECV = 0.0748) and indicated mass (Mor04m and H8m), electronegativity (Mor08e), van der Waals volume (G1v) and structural information content index (SIC2) descriptors as key descriptors governing the activity. To investigate the effects of structural modifications on aromatase inhibitory activity, the model was employed to predict the activities of an additional set of 39 structurally modified compounds constructed in silico. The prediction suggested that the 2,3-disubstitution of 1,4-naphthoquinone ring with halogen atoms (i.e., Br, I and F) is the most effective modification for potent activity (1a, 1b and 1c). Importantly, compound 1b was predicted to be more potent than its parent compound 1 (11.90-fold) and the reference drug, letrozole (1.03-fold). The study suggests the 1,4-naphthoquinone derivatives as promising compounds to be further developed as a novel class of aromatase inhibitors

    Elucidating the Structure-Activity Relationships of the Vasorelaxation and Antioxidation Properties of Thionicotinic Acid Derivatives

    Nicotinic acid, known as vitamin B3, is an effective lipid-lowering drug and intense cutaneous vasodilator. This study reports the effect of 2-(1-adamantylthio)nicotinic acid (6) and its amide 7 and nitrile analog 8 on phenylephrine-induced contraction of rat thoracic aorta as well as their antioxidative activity. It was found that the tested thionicotinic acid analogs 6-8 exerted maximal vasorelaxation in a dose-dependent manner, but their effects were weaker than acetylcholine (ACh)-induced nitric oxide (NO) vasorelaxation. The vasorelaxations were apparently reduced in the presence of either NG-nitro-L-arginine methyl ester (L-NAME) or indomethacin (INDO). Synergistic effects were observed in the presence of L-NAME plus INDO, leading to loss of vasorelaxation for both ACh and the tested nicotinic acids. Complete loss of the vasorelaxation was noted upon removal of endothelial cells. This implies that the vasorelaxations are mediated partially by endothelium-induced NO and prostacyclin. The thionicotinic acid analogs all exhibited antioxidant properties in both 2,2-diphenyl-1-picrylhydrazyl (DPPH) and superoxide dismutase (SOD) assays. Significantly, the thionicotinic acid 6 is the most potent vasorelaxant, with an ED50 of 21.3 nM, and is the most potent antioxidant (as discerned from the DPPH assay). Molecular modeling was also used to provide mechanistic insights into the vasorelaxant and antioxidative activities. The findings reveal that the thionicotinic acid analogs are a novel class of vasorelaxant and antioxidant compounds which have potential to be further developed as promising therapeutics.

    The MicroRNA Interaction Network of Lipid Diseases

    Background: Dyslipidemia is one of the major forms of lipid disorder, characterized by increased triglycerides (TGs), increased low-density lipoprotein-cholesterol (LDL-C), and decreased high-density lipoprotein-cholesterol (HDL-C) levels in blood. Recently, microRNAs (miRNAs) have been reported to be involved in various biological processes, with potential use as biomarkers and in the diagnosis of various diseases. Computational approaches including text mining have been used recently to analyze abstracts from public databases to observe the relationships and associations between biological molecules, miRNAs, and disease phenotypes. Materials and Methods: In the present study, the significance of text-mined pair associations (miRNA-lipid disease) was estimated by one-sided Fisher's exact test. The top 20 significant miRNA-disease associations were visualized in Cytoscape. The CyTargetLinker plug-in for Cytoscape was used to extend the network and predict new miRNA target genes. The Biological Networks Gene Ontology (BiNGO) plug-in for Cytoscape was used to retrieve gene ontology (GO) annotations for the targeted genes. Results: We retrieved 227 miRNA-lipid disease associations including 148 miRNAs. Analysis of the top 20 significant miRNAs with CyTargetLinker provided defined, predicted and validated gene targets; further analysis of the targeted genes with BiNGO showed that they were significantly associated with lipid, cholesterol, apolipoprotein, and fatty acid GO terms. Conclusion: We are the first to provide a reliable miRNA-lipid disease association network based on text mining. This could help future experimental studies that aim to validate predicted gene targets.
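
The one-sided Fisher's exact test mentioned in the abstract above can be sketched with the standard library alone. The abstract does not say how the contingency table was built, so the layout assumed here (abstracts mentioning both the miRNA and the disease, only one of them, or neither) and the counts are hypothetical; the function name is also an illustration, not the authors' code.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Right-tailed Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of observing `a` or more co-occurrences when
    the row and column totals are held fixed (hypergeometric tail).
    """
    row1 = a + b          # e.g. abstracts mentioning the miRNA (assumed layout)
    col1 = a + c          # e.g. abstracts mentioning the disease
    total = a + b + c + d
    k_max = min(row1, col1)
    tail = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(a, k_max + 1)
    )
    return tail / comb(total, col1)


# Hypothetical counts: 3 abstracts mention both terms, 1 only the miRNA,
# 1 only the disease, and 3 mention neither.
p_value = fisher_one_sided(3, 1, 1, 3)
print(round(p_value, 4))
```

Ranking miRNA-disease pairs by a p-value of this kind is one plausible way to arrive at a "top 20 significant associations" list; with many pairs tested, a multiple-testing correction would normally be considered as well.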

    Toward insights on determining factors for high activity in antimicrobial peptides via machine learning

    The continued and general rise of antibiotic resistance in pathogenic microbes is a well-recognized global threat. Host defense peptides (HDPs), a component of the innate immune system, have demonstrated promising potential to become next-generation antibiotics effective against a plethora of pathogens. While the effectiveness of antimicrobial HDPs has been extensively demonstrated in experimental studies, theoretical insights on the mechanism by which these peptides function are comparatively limited. In particular, experimental studies of antimicrobial peptide (AMP) mechanisms are limited in the number of different peptides investigated and the type of peptide parameters considered. This study makes use of the random forest algorithm for classifying antimicrobial activity as well as for identifying the molecular descriptors underpinning the antimicrobial activity of the investigated peptides. Subsequent manual interpretation of the identified important descriptors revealed that polarity-solubility properties are necessary for the membrane-lytic antimicrobial activity of HDPs.

    Conceptual Map of Computational Drug Discovery
