Feature construction using explanations of individual predictions
Feature construction can contribute to comprehensibility and performance of
machine learning models. Unfortunately, it usually requires exhaustive search
in the attribute space or time-consuming human involvement to generate
meaningful features. We propose a novel heuristic approach for reducing the
search space based on aggregation of instance-based explanations of predictive
models. The proposed Explainable Feature Construction (EFC) methodology
identifies groups of co-occurring attributes exposed by popular explanation
methods, such as IME and SHAP. We empirically show that reducing the search to
these groups significantly reduces the time of feature construction using
logical, relational, Cartesian, numerical, and threshold num-of-N and X-of-N
constructive operators. An analysis on 10 transparent synthetic datasets shows
that EFC effectively identifies informative groups of attributes and constructs
relevant features. Using 30 real-world classification datasets, we show
significant improvements in classification accuracy for several classifiers and
demonstrate the feasibility of the proposed feature construction even for large
datasets. Finally, EFC generated interpretable features on a real-world problem
from the financial industry, which were confirmed by a domain expert.
Comment: 54 pages, 10 figures, 22 tables
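The aggregation idea behind the abstract's search-space reduction can be sketched as follows. This is a minimal illustration of counting attribute pairs that co-occur among each instance's strongest attributions (as produced by an explainer such as IME or SHAP), not the published EFC procedure; the function name `cooccurring_groups` and the `top_k` threshold are illustrative assumptions.

```python
import numpy as np

def cooccurring_groups(attributions, top_k=2):
    """Aggregate instance-level attribution vectors (n_instances x n_attrs)
    and count how often pairs of attributes appear together among the
    top_k strongest attributions of an instance."""
    n_attrs = attributions.shape[1]
    counts = np.zeros((n_attrs, n_attrs), dtype=int)
    for row in np.abs(attributions):
        top = np.argsort(row)[-top_k:]  # indices of the strongest attributions
        for i in top:
            for j in top:
                if i < j:
                    counts[i, j] += 1
    return counts

# Toy attribution matrix: attributes 0 and 1 dominate every instance,
# so the pair (0, 1) is the strongest co-occurrence.
expl = np.array([[0.9, 0.8, 0.1, 0.0],
                 [0.7, 0.9, 0.0, 0.1],
                 [0.8, 0.7, 0.2, 0.1]])
counts = cooccurring_groups(expl, top_k=2)
print(counts[0, 1])  # the pair (0, 1) appears in all 3 instances
```

Frequently co-occurring pairs would then be the attribute groups to which a constructive-operator search is restricted.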
Adaptive Response Function Neurons
In biological neurons, a locally tuned response to
input may arise from the network topology of
interneurons in the system.
a learning algorithm is developed for the online
learning of the centre, width and shape of locally
tuned response functions. The response function for
each input is trained independently, resulting in a very
good fit for the presented data. Two example networks
utilising these neurons were considered. The first was a
completely supervised network while the second utilised
a Kohonen-like training scheme for the hidden layer. The
adaptive response function neurons (ARFNs) were able
to achieve excellent class separation while maintaining
good generalisation with relatively few neurons.
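The online adaptation of a locally tuned response function can be sketched as a Gaussian unit whose centre and width follow the gradient of the squared error. This is a simplified stand-in for the abstract's neuron (the shape parameter is omitted), and the class name `AdaptiveResponseNeuron` and learning rate are illustrative assumptions.

```python
import math

class AdaptiveResponseNeuron:
    """One locally tuned (Gaussian) response unit whose centre and width
    are adapted online by gradient descent on the squared error."""
    def __init__(self, centre=0.0, width=1.0, lr=0.1):
        self.c, self.w, self.lr = centre, width, lr

    def response(self, x):
        return math.exp(-((x - self.c) ** 2) / (2.0 * self.w ** 2))

    def update(self, x, target):
        y = self.response(x)
        err = y - target
        # Gradients of E = 0.5 * err**2 with respect to centre and width
        dc = err * y * (x - self.c) / self.w ** 2
        dw = err * y * (x - self.c) ** 2 / self.w ** 3
        self.c -= self.lr * dc
        self.w -= self.lr * dw
        return err

neuron = AdaptiveResponseNeuron(centre=0.0, width=1.0, lr=0.5)
for _ in range(200):
    neuron.update(2.0, 1.0)  # repeatedly present x=2 with target response 1
# The centre drifts toward the presented input and the fit improves.
```

Training each input's response function independently, as the abstract describes, amounts to running one such update loop per input dimension.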
Gradient-based training and pruning of radial basis function networks with an application in materials physics
Many applications, especially in physics and other sciences, call for easily interpretable and robust machine learning techniques. We propose a fully gradient-based technique for training radial basis function networks with an efficient and scalable open-source implementation. We derive novel closed-form optimization criteria for pruning the models for continuous as well as binary data which arise in a challenging real-world material physics problem. The pruned models are optimized to provide compact and interpretable versions of larger models based on informed assumptions about the data distribution. Visualizations of the pruned models provide insight into the atomic configurations that determine atom-level migration processes in solid matter; these results may inform future research on designing more suitable descriptors for use with machine learning algorithms.
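A gradient-trained radial basis function network with a subsequent pruning pass can be sketched as below. This toy fits only the output weights with fixed centres, and a simple magnitude threshold stands in for the paper's closed-form pruning criteria; the dataset, unit count, and threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem: fit y = sin(x) on [0, 2*pi]
X = np.linspace(0.0, 2 * np.pi, 64)
y = np.sin(X)

n_units = 8
centres = np.linspace(0.0, 2 * np.pi, n_units)  # fixed centres for brevity
width = 0.8
weights = rng.normal(scale=0.1, size=n_units)

# Gaussian basis activations, shape (n_samples, n_units)
Phi = np.exp(-((X[:, None] - centres[None, :]) ** 2) / (2 * width ** 2))

lr = 0.2
for _ in range(2000):
    err = Phi @ weights - y
    weights -= lr * (Phi.T @ err) / len(X)  # gradient step on the MSE

mse = float(np.mean((Phi @ weights - y) ** 2))

# Magnitude threshold as a stand-in for the closed-form pruning criteria:
# units whose output weight is near zero contribute little and are dropped.
keep = np.abs(weights) > 0.05
```

In a fully gradient-based setting, the centres and widths would be updated by the same loop; they are held fixed here only to keep the sketch short.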
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map the low-level features into a fixed-length vector in histogram space to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, which is usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties. This is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook for constructing a discriminant codebook in a one-pass design procedure, which slightly outperforms more traditional approaches at drastically reduced computing times. In this review we survey several approaches that have been proposed over the last decade, examining their use of feature detectors, descriptors, codebook construction schemes, choice of classifiers in recognising objects, and the datasets used in evaluating the proposed methods.
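The codebook's key role described above, mapping a variable-size set of local descriptors to a fixed-length histogram, can be sketched as follows. The codebook here is hard-coded rather than learned by cluster analysis, and the function names are illustrative assumptions.

```python
import numpy as np

def nearest_codeword(descriptors, codebook):
    # Squared Euclidean distance from every descriptor to every codeword
    d = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def bow_histogram(descriptors, codebook):
    """Map a variable-size set of local descriptors to a fixed-length
    histogram over the codebook -- the representation a standard
    classifier is then trained on."""
    idx = nearest_codeword(descriptors, codebook)
    hist = np.bincount(idx, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # L1-normalise

# Toy codebook of 3 codewords in a 2-D descriptor space
codebook = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]])
descs = np.array([[0.1, -0.2], [4.8, 5.1], [5.2, 4.9], [9.7, 0.3]])
h = bow_histogram(descs, codebook)
print(h)  # proportion of descriptors assigned to each codeword
```

In practice the codebook would come from k-means (or, as in our work, a one-pass resource-allocating scheme) over training descriptors, with its size trading discriminative power against model complexity.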
Principled and data efficient support vector machine training using the minimum description length principle, with application in breast cancer
Support vector machines (SVMs) are established as highly successful classifiers in a broad range of applications, including numerous medical ones. Nevertheless, their current employment is restricted by a limitation in the manner in which they are trained, most often the training-validation-test or k-fold cross-validation approaches, which are wasteful both in terms of the use of the available data as well as computational resources. This is a particularly important consideration in many medical problems, in which data availability is low (be it because of the inherent difficulty in obtaining sufficient data, or because of practical reasons, e.g. pertaining to privacy and data sharing). In this paper we propose a novel approach to training SVMs which does not suffer from the aforementioned limitation and which is at the same time much more rigorous in nature, being built upon solid information-theoretic grounds. Specifically, we show how the training process, that is, the process of hyperparameter inference, can be formulated as a search for the optimal model under the minimum description length (MDL) criterion, allowing for theory- rather than empiricism-driven selection and removing the need for validation data. The effectiveness and superiority of our approach are demonstrated on the Wisconsin Diagnostic Breast Cancer Data Set.
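The MDL view of hyperparameter selection can be illustrated with a toy two-part code: bits to describe the model plus bits to describe the data given the model, with no validation set involved. This is an illustrative stand-in for the idea, not the paper's actual criterion; the candidate hyperparameters, error counts, and support-vector counts below are hypothetical numbers.

```python
import math

def two_part_description_length(n_errors, n_samples, n_support_vectors,
                                bits_per_parameter=16):
    """Toy two-part MDL score: bits to encode the model (here, its
    support vectors at a fixed precision) plus bits to identify which
    of the n_samples training labels the model gets wrong."""
    model_bits = n_support_vectors * bits_per_parameter
    data_bits = math.log2(math.comb(n_samples, n_errors))
    return model_bits + data_bits

# Hypothetical candidates: (hyperparameter setting, errors, support vectors).
# A very large C fits the training data almost perfectly but needs many
# support vectors, so its total description length is worst.
candidates = [("C=0.1", 12, 10), ("C=1", 5, 25), ("C=100", 1, 90)]
scores = {name: two_part_description_length(e, 200, sv)
          for name, e, sv in candidates}
best = min(scores, key=scores.get)
```

The selected model is the one that compresses the training data best, which is how the MDL criterion replaces empirical validation.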