    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as ongoing research, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NNs) are popular biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been applied successfully in many areas, including science, engineering, medicine, business, banking, telecommunications, and many other fields. This paper highlights, from a data mining perspective, the implementation of NNs using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their use in bioinformatics and financial data analysis tasks.
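    Below is a minimal, hedged sketch of the two learning regimes the overview discusses, using scikit-learn on a toy dataset; the dataset, layer sizes, and hyperparameters are illustrative assumptions and are not taken from the paper.

```python
# Minimal sketch: supervised and unsupervised learning with scikit-learn.
# Dataset, layer sizes, and hyperparameters are illustrative assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Supervised learning: a small feed-forward network for classification.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
print("classification accuracy:", clf.score(X_test, y_test))

# Unsupervised learning: cluster analysis on the same features
# (k-means here as a simple stand-in for neural clustering such as SOMs).
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("cluster sizes:", [int((clusters == k).sum()) for k in range(3)])
```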

    Visual diagnosis of tree boosting methods

    Tree boosting, which combines weak learners (typically decision trees) to generate a strong learner, is a highly effective and widely used machine learning method. However, the development of a high-performance tree boosting model is a time-consuming process that requires numerous trial-and-error experiments. To tackle this issue, we have developed a visual diagnosis tool, BOOSTVis, to help experts quickly analyze and diagnose the training process of tree boosting. In particular, we have designed a temporal confusion matrix visualization and combined it with a t-SNE projection and a tree visualization. These visualization components work together to provide a comprehensive overview of a tree boosting model and enable an effective diagnosis of an unsatisfactory training process. Two case studies conducted on the Otto Group Product Classification Challenge dataset demonstrate that BOOSTVis can provide informative feedback and guidance to improve understanding and diagnosis of tree boosting algorithms.
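    As a rough illustration of the idea behind a temporal confusion matrix, the sketch below trains a scikit-learn gradient boosting model, computes one confusion matrix per boosting stage, and builds a t-SNE projection of the test samples; BOOSTVis itself is not used here, and the dataset and model settings are illustrative assumptions.

```python
# Illustrative sketch only: track how the confusion matrix of a boosting model
# evolves over iterations, and project the samples with t-SNE for inspection.
from sklearn.datasets import load_digits
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.manifold import TSNE
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

gb = GradientBoostingClassifier(n_estimators=50, random_state=0)
gb.fit(X_train, y_train)

# "Temporal" view: one confusion matrix per boosting stage.
temporal_cms = [confusion_matrix(y_test, y_stage)
                for y_stage in gb.staged_predict(X_test)]
print("stages tracked:", len(temporal_cms))

# 2-D projection of the test samples for visual inspection of confused classes.
embedding = TSNE(n_components=2, random_state=0).fit_transform(X_test)
print("t-SNE embedding shape:", embedding.shape)
```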

    From predicting to analyzing HIV-1 resistance to broadly neutralizing antibodies

    Treatment with broadly neutralizing antibodies (bNAbs) has recently proven effective against HIV-1 infections in humanized mice, non-human primates, and humans. For optimal treatment, susceptibility of the patient’s viral strains to a particular bNAb has to be ensured. Since no computational approaches have so far been available, susceptibility can only be tested in expensive and time-consuming neutralization experiments. Here, we present well-performing computational models (AUC up to 0.84) that can predict HIV-1 resistance to bNAbs given the envelope sequence of the virus. Because the models learn important binding sites of the bNAbs from the envelope sequence, they are also biologically meaningful and useful for epitope recognition. In addition to the prediction result, we provide a motif logo that displays the contribution of the pivotal residues of the test sequence to the prediction. As our prediction models are based on non-linear kernels, we introduce a new visualization technique to improve model interpretability. Moreover, we confirmed previous experimental findings that there is a trend towards antibody resistance in the subtype B population of the virus. While previous experiments considered rather small and selected cohorts, we were able to show a similar trend for the global HIV-1 population comprising all major subtypes by predicting the neutralization sensitivity of around 36,000 HIV-1 sequences, a scale-up that is very difficult to achieve in an experimental setting.
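    The sketch below is not the authors' model; it only illustrates the general setup of predicting resistance from sequence with a non-linear kernel: envelope sequences are encoded as overlapping k-mer counts, an RBF-kernel classifier is trained, and performance is scored with cross-validated ROC AUC. The toy sequences and labels are placeholders.

```python
# Illustrative sketch under assumed toy data: k-mer features + RBF-kernel SVM
# for sequence-based resistance prediction, scored by ROC AUC.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

sequences = ["MRVKEKYQHLWRWGWRWGTML", "MRVKGIRKNYQHLWRWGTML",
             "MRVMGIQRNCQHLWRWGTML", "MRVTGIRKNYQHWWRWGTML"] * 10
labels = np.array([0, 1, 0, 1] * 10)  # 1 = resistant to the bNAb (placeholder)

# Overlapping 3-mers of the amino-acid sequence as features.
vec = CountVectorizer(analyzer="char", ngram_range=(3, 3))
X = vec.fit_transform(sequences)

# Non-linear (RBF) kernel classifier, evaluated by cross-validated ROC AUC.
clf = SVC(kernel="rbf")
auc = cross_val_score(clf, X, labels, cv=5, scoring="roc_auc").mean()
print(f"mean ROC AUC: {auc:.2f}")
```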

    To trust or not to trust an explanation: using LEAF to evaluate local linear XAI methods

    The main objective of eXplainable Artificial Intelligence (XAI) is to provide effective explanations for black-box classifiers. The existing literature lists many desirable properties for explanations to be useful, but there is no consensus on how to quantitatively evaluate explanations in practice. Moreover, explanations are typically used only to inspect black-box models, and the proactive use of explanations as a decision support is generally overlooked. Among the many approaches to XAI, a widely adopted paradigm is that of Local Linear Explanations, with LIME and SHAP emerging as state-of-the-art methods. We show that these methods are plagued by many defects, including unstable explanations, divergence of actual implementations from the promised theoretical properties, and explanations for the wrong label. This highlights the need for standard and unbiased evaluation procedures for Local Linear Explanations in the XAI field. In this paper we address the problem of identifying a clear and unambiguous set of metrics for the evaluation of Local Linear Explanations. This set includes both existing and novel metrics defined specifically for this class of explanations. All metrics have been included in an open Python framework, named LEAF. The purpose of LEAF is to provide a reference for end users to evaluate explanations in a standardised and unbiased way, and to guide researchers towards developing improved explainable techniques.
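    As one concrete example of the kind of property such metrics probe, the sketch below re-runs LIME several times on the same instance and checks how consistent the selected features are; this is an illustrative stability check in the spirit of the paper, not LEAF's actual metric or API, and it assumes the lime package is installed.

```python
# Illustrative stability check for Local Linear Explanations: explain the same
# instance repeatedly with LIME and compare the sets of attributed features.
from lime.lime_tabular import LimeTabularExplainer
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer()
clf = RandomForestClassifier(random_state=0).fit(data.data, data.target)

explainer = LimeTabularExplainer(
    data.data,
    feature_names=list(data.feature_names),
    class_names=list(data.target_names),
    mode="classification",
)

instance = data.data[0]
runs = []
for _ in range(5):
    exp = explainer.explain_instance(instance, clf.predict_proba, num_features=5)
    runs.append(dict(exp.as_list()))

# Simple stability signal: how often the same features appear across repeated runs.
common = set.intersection(*(set(r) for r in runs))
print("features common to all runs:", common)
```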