
    A unifying view for performance measures in multi-class prediction

    In the last few years, many different performance measures have been introduced to overcome the weaknesses of the most natural metric, Accuracy. Among them, the Matthews Correlation Coefficient has recently gained popularity among researchers, not only in machine learning but also in application fields such as bioinformatics. Nonetheless, further novel functions are being proposed in the literature. We show that Confusion Entropy, a recently introduced classifier performance measure for multi-class problems, has a strong (monotone) relation with the multi-class generalization of a classical metric, the Matthews Correlation Coefficient. Computational evidence in support of the claim is provided, together with an outline of the theoretical explanation.
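As an illustration of why Accuracy alone can mislead, the sketch below computes Accuracy and the multi-class Matthews Correlation Coefficient from a confusion matrix. It is not taken from the paper: the function names and the example matrix are assumptions, the MCC is written in its standard multi-class (R_K) form, and Confusion Entropy is deliberately left out.

```python
from math import sqrt


def accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total


def multiclass_mcc(cm):
    """Multi-class MCC; cm[i][j] counts samples of true class i predicted as j."""
    n = len(cm)
    s = sum(sum(row) for row in cm)                 # total number of samples
    c = sum(cm[k][k] for k in range(n))             # correctly classified samples
    t = [sum(cm[k]) for k in range(n)]              # true count per class (row sums)
    p = [sum(cm[i][k] for i in range(n)) for k in range(n)]  # predicted count per class
    num = c * s - sum(pk * tk for pk, tk in zip(p, t))
    den = sqrt((s * s - sum(pk * pk for pk in p)) * (s * s - sum(tk * tk for tk in t)))
    # Convention assumed here: return 0.0 when the denominator vanishes.
    return num / den if den else 0.0


# Hypothetical imbalanced problem: the classifier always predicts class 0.
cm = [[90, 0, 0],
      [5, 0, 0],
      [5, 0, 0]]
print(accuracy(cm), multiclass_mcc(cm))
```

On this made-up matrix the Accuracy is high although the classifier carries no information about the minority classes, which the MCC reflects.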

    Verifying the fully “Laplacianised” posterior Naïve Bayesian approach and more

    Mussa and Glen would like to thank Unilever for financial support, whereas Mussa and Mitchell thank the BBSRC for funding this research through grant BB/I00596X/1. Mitchell thanks the Scottish Universities Life Sciences Alliance (SULSA) for financial support.
    Background: In a recent paper, Mussa, Mitchell and Glen (MMG) have mathematically demonstrated that the “Laplacian Corrected Modified Naïve Bayes” (LCMNB) algorithm can be viewed as a variant of the so-called Standard Naïve Bayes (SNB) scheme, in which the role played by the absence of compound features in assigning a compound to its appropriate class is ignored. MMG have also proffered guidelines regarding the conditions under which this omission may hold. Utilising three data sets, the present paper examines the validity of these guidelines in practice. The paper also extends MMG’s work and introduces a new version of the SNB classifier: “Tapered Naïve Bayes” (TNB). TNB does not discard the role of absence of a feature out of hand, nor does it fully consider its role. Hence, TNB encapsulates both SNB and LCMNB.
    Results: LCMNB, SNB and TNB performed differently on classifying 4,658, 5,031 and 1,149 ligands (all chosen from the ChEMBL Database) distributed over 31 enzymes, 23 membrane receptors, and one ion-channel, four transporters and one transcription factor as their target proteins. When the number of features utilised was equal to or smaller than the “optimal” number of features for a given data set, SNB classifiers systematically gave better classification results than those yielded by LCMNB classifiers. The opposite was true when the number of features employed was markedly larger than the “optimal” number of features for this data set. Nonetheless, these LCMNB performances were worse than the classification performance achieved by SNB when the “optimal” number of features for the data set was utilised. TNB classifiers systematically outperformed both SNB and LCMNB classifiers.
    Conclusions: The classification results obtained in this study concur with the mathematically based guidelines given in MMG’s paper: ignoring the role of absence of a feature out of hand does not necessarily improve the classification performance of the SNB approach; if anything, it could make the performance of the SNB method worse. The results obtained also lend support to the rationale on which the TNB algorithm rests: handled judiciously, taking into account the absence of features can enhance (not impair) the discriminatory classification power of the SNB approach.
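The idea of neither discarding nor fully counting absent features can be pictured with a small Bernoulli naïve Bayes sketch. Everything here is an assumption made for illustration: the `absence_weight` parameter, the add-one smoothing and the toy data are not from the paper, and the abstract does not state how TNB or the Laplacian correction is actually defined. With `absence_weight=1.0` absent features contribute fully; with `0.0` they are ignored.

```python
from math import log


def train(X, y, alpha=1.0):
    """Estimate log priors and smoothed P(feature present | class).

    X is a list of binary feature vectors, y the matching class labels.
    Add-alpha smoothing is assumed; it need not match the paper's correction.
    """
    model = {}
    n_feat = len(X[0])
    for c in sorted(set(y)):
        rows = [x for x, label in zip(X, y) if label == c]
        p = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
             for j in range(n_feat)]
        model[c] = (log(len(rows) / len(X)), p)
    return model


def score(model, x, absence_weight=1.0):
    """Log score per class; absent features are scaled by absence_weight."""
    scores = {}
    for c, (log_prior, p) in model.items():
        s = log_prior
        for xj, pj in zip(x, p):
            if xj:
                s += log(pj)                          # feature present
            else:
                s += absence_weight * log(1.0 - pj)   # feature absent (tapered)
        scores[c] = s
    return scores


def predict(model, x, absence_weight=1.0):
    scores = score(model, x, absence_weight)
    return max(scores, key=scores.get)


# Hypothetical toy data: four compounds, three binary features, two classes.
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1]]
y = ["a", "a", "b", "b"]
model = train(X, y)
print(predict(model, [1, 0, 0], absence_weight=0.5))
```

Sweeping `absence_weight` between 0 and 1 on held-out data would be one way to explore the trade-off the abstract describes, under the assumptions above.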

    Artificial Neural Network for Predicting the Success Rate before Graft Transplant

    Machine learning models such as the artificial neural network, radial basis function and ARTMAP have shown promising applications in the medical industry. The present work is a comparative analysis of these three models. The results indicate that, of the three, the artificial neural network gave comparatively better numeric values. The accuracies of the artificial neural network, radial basis function and ARTMAP were found to be 98.9708%, 97.2556% and 58.1475%, respectively. According to the literature survey performed, this topic has received little attention, especially in India. Based on the findings, the artificial neural network appears to be the best model for predicting graft survival in liver transplantation.

    A Comprehensive Study of k-Portfolios of Recent SAT Solvers

    These are the slides for the paper "A Comprehensive Study of k-Portfolios of Recent SAT Solvers", presented at the conference [*SAT 2022*](http://satisfiability.org/SAT22/). You can find the paper [here](https://www.doi.org/10.4230/LIPIcs.SAT.2022.2)

    Cascading Machine Learning to Attack Bitcoin Anonymity

    Bitcoin is a decentralized, pseudonymous cryptocurrency that is one of the most used digital assets to date. Its unregulated nature and the inherent anonymity of its users have led to a dramatic increase in its use for illicit activities. This calls for the development of novel methods capable of characterizing different entities in the Bitcoin network. In this paper, a method to attack Bitcoin anonymity is presented, leveraging a novel cascading machine learning approach that requires only a few features directly extracted from Bitcoin blockchain data. Cascading, used to enrich entity information with data from previous classifications, led to considerably improved multi-class classification performance, with Precision values close to 1.0 for each considered class. Final models were implemented and compared using different machine learning models and showed significantly higher accuracy compared to their baseline implementation. Our approach can contribute to the development of effective tools for Bitcoin entity characterization, which may assist in uncovering illegal activities.
    Comment: 15 pages, 7 figures, 4 tables, presented in 2019 IEEE International Conference on Blockchain (Blockchain
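The general idea of cascading, where each stage sees the original features plus the outputs of the previous stage, can be sketched as follows. This is only an illustration under stated assumptions: the nearest-centroid classifier, the use of centroid distances as the enrichment, and the toy data are all hypothetical and are not the models or features used in the paper.

```python
def centroids(X, y):
    """Mean feature vector per class label."""
    out = {}
    for label in sorted(set(y)):
        rows = [x for x, l in zip(X, y) if l == label]
        out[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return out


def distances(cents, x):
    """Squared Euclidean distance from x to each centroid, in sorted label order."""
    return [sum((a - b) ** 2 for a, b in zip(x, c)) for _, c in sorted(cents.items())]


def predict(cents, x):
    """Label of the nearest centroid."""
    labels = sorted(cents)
    d = distances(cents, x)
    return labels[d.index(min(d))]


def fit_cascade(X, y, stages=2):
    """Train stages in sequence; each later stage is fitted on the original
    features enriched with the previous stage's outputs."""
    models, feats = [], [list(x) for x in X]
    for _ in range(stages):
        cents = centroids(feats, y)
        models.append(cents)
        # Note: outputs are computed in-sample here; a careful setup would
        # use out-of-fold predictions to avoid leakage.
        feats = [list(x) + distances(cents, f) for x, f in zip(X, feats)]
    return models


def predict_cascade(models, x):
    """Replay the enrichment through every stage, then classify with the last."""
    f = list(x)
    for cents in models[:-1]:
        f = list(x) + distances(cents, f)
    return predict(models[-1], f)


# Hypothetical toy data: two features, two entity classes.
X = [[0, 0], [0, 1], [5, 5], [5, 6]]
y = ["a", "a", "b", "b"]
models = fit_cascade(X, y, stages=2)
print(predict_cascade(models, [0, 0.5]), predict_cascade(models, [5, 5.5]))
```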

    A deep learning approach for lower back-pain risk prediction during manual lifting

    Occupationally induced back pain is a leading cause of reduced productivity in industry. Detecting when a worker is lifting incorrectly and is at increased risk of back injury offers significant potential benefits. These include increased quality of life for the worker due to lower rates of back injury, and fewer workers' compensation claims and less missed time for the employer. However, recognizing lifting risk is challenging due to typically small datasets and subtle underlying features in accelerometer and gyroscope data. A novel method to classify a lifting dataset using a 2D convolutional neural network (CNN) and no manual feature extraction is proposed in this paper; the dataset consisted of 10 subjects lifting at various relative distances from the body, with 720 total trials. The proposed deep CNN displayed greater accuracy (90.6%) compared to an alternative CNN and a multilayer perceptron (MLP). A deep CNN could be adapted to classify many other activities that traditionally pose greater challenges in industrial environments due to their size and complexity.
    Comment: 21 pages, 10 figures

    A fast and efficient gene-network reconstruction method from multiple over-expression experiments

    Background: Reverse engineering of gene regulatory networks presents one of the big challenges in systems biology. Gene regulatory networks are usually inferred from a set of single-gene over-expression and/or knockout experiments. Functional relationships between genes are retrieved either from the steady-state gene expressions or from the respective time series.
    Results: We present a novel algorithm for gene network reconstruction on the basis of steady-state gene-chip data from over-expression experiments. The algorithm is based on a straightforward solution of a linear gene-dynamics equation, where experimental data is fed in as a first predictor for the solution. We compare the algorithm's performance with the NIR algorithm, both on the well-known E. coli experimental data and on in-silico experiments.
    Conclusion: We show superiority of the proposed algorithm in the number of correctly reconstructed links and discuss computational time and robustness. The proposed algorithm is not limited by combinatorial explosion problems and can, in principle, be used for large networks.
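To make the notion of a linear gene-dynamics equation concrete, the sketch below assumes the textbook model dx/dt = A x + u, so that each steady state satisfies A x = -u. Stacking one over-expression experiment per column gives A = -U X^{-1}. This is a plain matrix inversion under that assumption, not the algorithm proposed in the paper; the function names and the two-gene example are hypothetical.

```python
def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def negate(m):
    return [[-v for v in row] for row in m]


def infer_network(X, U):
    """Recover A from A X = -U. Columns of X are measured steady states,
    columns of U the applied over-expression perturbations."""
    return matmul(negate(U), invert(X))


# Hypothetical two-gene network: gene 0 activates gene 1, both self-degrade.
A_true = [[-1.0, 0.0],
          [0.5, -1.0]]
U = [[1.0, 0.0],
     [0.0, 1.0]]                              # unit over-expression of each gene in turn
X = matmul(negate(invert(A_true)), U)         # simulated noise-free steady states
A_est = infer_network(X, U)
print(A_est)
```

With noisy data or fewer experiments than genes, X is no longer cleanly invertible, which is where a dedicated reconstruction method would be needed.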