
    Feature construction using explanations of individual predictions

    Feature construction can contribute to the comprehensibility and performance of machine learning models. Unfortunately, it usually requires exhaustive search in the attribute space or time-consuming human involvement to generate meaningful features. We propose a novel heuristic approach for reducing the search space based on aggregation of instance-based explanations of predictive models. The proposed Explainable Feature Construction (EFC) methodology identifies groups of co-occurring attributes exposed by popular explanation methods, such as IME and SHAP. We empirically show that restricting the search to these groups significantly reduces the time of feature construction using logical, relational, Cartesian, numerical, and threshold num-of-N and X-of-N constructive operators. An analysis on 10 transparent synthetic datasets shows that EFC effectively identifies informative groups of attributes and constructs relevant features. Using 30 real-world classification datasets, we show significant improvements in classification accuracy for several classifiers and demonstrate the feasibility of the proposed feature construction even for large datasets. Finally, EFC generated interpretable features on a real-world problem from the financial industry, which were confirmed by a domain expert. (Comment: 54 pages, 10 figures, 22 tables.)
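
    As an illustration only (not the authors' code), a minimal Python sketch of the EFC idea: per-instance attributions (e.g. SHAP or IME output) are aggregated, attributes whose attributions co-vary across instances are grouped by a hypothetical correlation rule, and constructive operators are applied only within those groups. The threshold and grouping heuristic are assumptions made for the sketch.

```python
import numpy as np
from itertools import combinations

def cooccurring_groups(attributions, threshold=0.5):
    """Greedily group attributes whose per-instance attributions correlate.

    attributions: (n_instances, n_attributes) explanation scores.
    The correlation threshold is a hypothetical grouping rule.
    """
    corr = np.corrcoef(attributions.T)   # attribute-by-attribute correlation
    n = corr.shape[0]
    groups, assigned = [], set()
    for i in range(n):
        if i in assigned:
            continue
        group = {i} | {j for j in range(n)
                       if j != i and abs(corr[i, j]) >= threshold}
        group -= assigned
        assigned |= group
        groups.append(sorted(group))
    return groups

def cartesian_features(X, groups):
    """Add pairwise-product (Cartesian-style) features within each group only."""
    new_cols = [X[:, i] * X[:, j]
                for g in groups for i, j in combinations(g, 2)]
    return np.column_stack([X] + new_cols) if new_cols else X

# Toy usage with random attributions standing in for SHAP/IME output.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
attributions = rng.normal(size=(100, 6))
print(cooccurring_groups(attributions))
print(cartesian_features(X, [[0, 1, 2]]).shape)   # (100, 9)
```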

    Adaptive Response Function Neurons

    Biological neurons that show a locally tuned response to input may arise from the network topology of interneurons in the system. By considering such a subnetwork, a learning algorithm is developed for the online learning of the centre, width and shape of locally tuned response functions. The response function for each input is trained independently, resulting in a very good fit to the presented data. Two example networks utilising these neurons were considered. The first was a completely supervised network, while the second utilised a Kohonen-like training scheme for the hidden layer. The adaptive response function neurons (ARFNs) were able to achieve excellent class separation while maintaining good generalisation with relatively few neurons.
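
    A minimal sketch of one such neuron, assuming a generalised-Gaussian response r(x) = exp(-(|x - c| / w) ** p) and plain online gradient descent on a squared error; the paper's actual update rules and network wiring are not reproduced here.

```python
import numpy as np

class ARFNeuron:
    """One neuron with response r(x) = exp(-(|x - c| / w) ** p) (assumed form)."""

    def __init__(self, c=0.0, w=1.0, p=2.0, lr=0.05):
        self.c, self.w, self.p, self.lr = c, w, p, lr

    def response(self, x):
        return np.exp(-(abs(x - self.c) / self.w) ** self.p)

    def update(self, x, target):
        """One online gradient step on the loss 0.5 * (r - target) ** 2."""
        r = self.response(x)
        z = max(abs(x - self.c) / self.w, 1e-12)     # avoid log(0) below
        err = r - target
        dr_dz = -self.p * z ** (self.p - 1) * r      # chain rule through z
        self.c -= self.lr * err * dr_dz * (-np.sign(x - self.c) / self.w)
        self.w -= self.lr * err * dr_dz * (-z / self.w)
        self.p -= self.lr * err * (-(z ** self.p) * np.log(z) * r)
        return err

# Toy usage: fit the centre, width and shape to a bump centred at 1.5.
rng = np.random.default_rng(0)
neuron = ARFNeuron()
for _ in range(5000):
    x = rng.uniform(-3, 3)
    neuron.update(x, np.exp(-((x - 1.5) / 0.5) ** 2))
print(round(neuron.c, 2), round(neuron.w, 2), round(neuron.p, 2))
```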

    Gradient-based training and pruning of radial basis function networks with an application in materials physics

    Many applications, especially in physics and other sciences, call for easily interpretable and robust machine learning techniques. We propose a fully gradient-based technique for training radial basis function networks with an efficient and scalable open-source implementation. We derive novel closed-form optimization criteria for pruning the models for continuous as well as binary data, which arise in a challenging real-world materials physics problem. The pruned models are optimized to provide compact and interpretable versions of larger models based on informed assumptions about the data distribution. Visualizations of the pruned models provide insight into the atomic configurations that determine atom-level migration processes in solid matter; these results may inform future research on designing more suitable descriptors for use with machine learning algorithms.
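
    A minimal numpy sketch of the general idea, not the paper's open-source implementation: centres, widths and output weights of a small RBF network are all updated by gradient descent, followed by a crude magnitude-based pruning step standing in for the paper's closed-form criteria.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))   # toy 1-d regression data
y = np.sin(X[:, 0])

K = 12
C = rng.uniform(-3, 3, size=(K, 1))     # centres
s = np.ones(K)                          # widths
w = np.zeros(K)                         # output weights
lr = 0.05

for _ in range(3000):
    d2 = ((X[:, None, :] - C[None]) ** 2).sum(-1)   # (n, K) squared distances
    phi = np.exp(-d2 / (2 * s ** 2))                # RBF activations
    g = (phi @ w - y) / len(y)                      # d(mse/2) / d prediction
    gphi = np.outer(g, w) * phi                     # d loss / d activations
    w -= lr * phi.T @ g
    C -= lr * ((gphi / s ** 2)[..., None] * (X[:, None, :] - C[None])).sum(0)
    s -= lr * (gphi * d2 / s ** 3).sum(0)
    s = np.maximum(s, 1e-2)                         # keep widths positive

# Magnitude-based pruning: a stand-in, NOT the paper's closed-form criteria.
keep = np.abs(w) > 0.05 * np.abs(w).max()
C, s, w = C[keep], s[keep], w[keep]
print(f"kept {keep.sum()} of {K} basis functions")
```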

    A Review of Codebook Models in Patch-Based Visual Object Recognition

    The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties; codebook construction is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook that constructs a discriminant codebook in a one-pass design procedure, slightly outperforming more traditional approaches at drastically reduced computing times. In this review we survey several approaches proposed over the last decade, covering their choice of feature detectors, descriptors, codebook construction schemes, and classifiers for recognising objects, as well as the datasets used to evaluate the proposed methods.
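
    A minimal sketch of the codebook pipeline described above, assuming a plain k-means codebook (scikit-learn) rather than the resource-allocating variant: pooled local descriptors are clustered into visual words, and each image is encoded as a normalised histogram of codeword assignments, ready for a standard classifier.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_codebook(descriptors, k=64, seed=0):
    """Cluster pooled local descriptors into k visual words."""
    return KMeans(n_clusters=k, n_init=10, random_state=seed).fit(descriptors)

def encode(image_descriptors, codebook):
    """Map one image's descriptors to a normalised codeword histogram."""
    words = codebook.predict(image_descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Toy usage: random 128-d vectors standing in for SIFT-like descriptors.
rng = np.random.default_rng(0)
pool = rng.normal(size=(1000, 128))
cb = build_codebook(pool, k=16)
print(encode(rng.normal(size=(50, 128)), cb).shape)   # (16,)
```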

    Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer

    Support vector machines (SVMs) are established as highly successful classifiers in a broad range of applications, including numerous medical ones. Nevertheless, their current employment is restricted by a limitation in the manner in which they are trained, most often the training-validation-test or k-fold cross-validation approaches, which are wasteful both of the available data and of computational resources. This is a particularly important consideration in many medical problems, in which data availability is low (be it because of the inherent difficulty in obtaining sufficient data, or for practical reasons, e.g. pertaining to privacy and data sharing). In this paper we propose a novel approach to training SVMs which does not suffer from the aforementioned limitation and which is at the same time much more rigorous in nature, being built upon solid information-theoretic grounds. Specifically, we show how the training process, that is, the process of hyperparameter inference, can be formulated as a search for the optimal model under the minimum description length (MDL) criterion, allowing for theory- rather than empiricism-driven selection and removing the need for validation data. The effectiveness and superiority of our approach are demonstrated on the Wisconsin Diagnostic Breast Cancer Data Set.
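
    A schematic illustration only, not the paper's MDL criterion: hyperparameters are scored by a two-part description length L(model) + L(data | model), with the model cost crudely approximated by the support-vector count and the data cost by the negative log-likelihood of Platt-scaled outputs. Both stand-ins are assumptions made for the sketch.

```python
import numpy as np
from sklearn.svm import SVC

def two_part_mdl(X, y, C, gamma):
    """Hypothetical two-part code length for one hyperparameter setting."""
    clf = SVC(C=C, gamma=gamma, probability=True, random_state=0).fit(X, y)
    # probability=True fits Platt scaling internally; used here only to get
    # a likelihood for the data term of the (assumed) two-part code.
    p = np.clip(clf.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    data_bits = -np.log2(p).sum()              # L(data | model)
    model_bits = clf.n_support_.sum() * 32.0   # crude L(model) stand-in
    return data_bits + model_bits

# Toy usage: pick (C, gamma) giving the shortest total code on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)
grid = [(C, g) for C in (0.1, 1.0, 10.0) for g in (0.1, 1.0)]
best = min(grid, key=lambda cg: two_part_mdl(X, y, *cg))
print("selected (C, gamma):", best)
```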