    Bio-mimetic Spiking Neural Networks for unsupervised clustering of spatio-temporal data

    Spiking neural networks aspire to mimic the brain more closely than traditional artificial neural networks. They are characterised by a spike-like activation function inspired by the shape of an action potential in biological neurons. Spiking networks remain a niche area of research, perform worse than traditional artificial networks, and their real-world applications are limited. We hypothesised that neuroscience-inspired spiking neural networks with spike-timing-dependent plasticity demonstrate useful learning capabilities. Our objective was to identify features which play a vital role in information processing in the brain but are not commonly used in artificial networks, to implement them in spiking networks without copying the constraints that apply to living organisms, and to characterise their effect on data processing. The networks we created are not brain models; our approach can be labelled as artificial life. We performed a literature review and selected features such as local weight updates, neuronal sub-types, modularity, homeostasis and structural plasticity. We used the review as a guide for developing consecutive iterations of the network, and eventually a whole evolutionary developmental system. We analysed the model's performance on clustering of spatio-temporal data. Our results show that combining evolution and unsupervised learning leads to faster convergence on optimal solutions and better stability of fit solutions than either approach achieves separately. The choice of fitness definition affects the network's performance on both fitness-related and unrelated tasks. We found that neuron-type-specific weight homeostasis can be used to stabilise the networks, thus enabling longer training. We also demonstrated that networks with a rudimentary architecture can evolve developmental rules which improve their fitness. This interdisciplinary work contributes to three fields: it proposes novel artificial intelligence approaches, tests the possible role of the selected biological phenomena in information processing in the brain, and explores the evolution of learning in an artificial life system.
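    The spike-timing-dependent plasticity mentioned above rewards causal pre-before-post spike pairs and punishes the reverse ordering. A minimal sketch of a pair-based STDP weight update follows; the amplitudes, time constants and weight bound are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

# Pair-based STDP sketch. All parameter values are assumptions for illustration.
A_PLUS, A_MINUS = 0.01, 0.012      # potentiation / depression amplitudes
TAU_PLUS, TAU_MINUS = 20.0, 20.0   # decay time constants (ms)

def stdp_dw(t_pre, t_post):
    """Weight change for one pre/post spike pair (spike times in ms)."""
    dt = t_post - t_pre
    if dt > 0:   # pre fires before post -> strengthen (causal pair)
        return A_PLUS * np.exp(-dt / TAU_PLUS)
    else:        # post fires before (or with) pre -> weaken
        return -A_MINUS * np.exp(dt / TAU_MINUS)

w = 0.5
w += stdp_dw(t_pre=10.0, t_post=15.0)   # causal pair potentiates the synapse
w = float(np.clip(w, 0.0, 1.0))         # crude bound; real weight homeostasis is richer
```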

    Training of supermodels - in the context of weather and climate forecasting

    Given a set of imperfect weather or climate models, predictions can be improved by combining the models dynamically into a so-called 'supermodel'. The models are optimally combined to compensate for their individual errors. This differs from the standard multi-model ensemble (MME) approach, where the model output is combined statistically after the simulations. Instead, the supermodel can create a trajectory closer to observations than any of the imperfect models. By intervening during the forecast, errors can be reduced at an early stage and the ensemble can exhibit different dynamical behavior than any of the individual models. In this way, common errors between the models can be removed and new, physically correct behavior can appear. In our simplified context of models sharing the same evolution function and phase space, we can define either a connected or a weighted supermodel. A connected supermodel uses nudging to bring the models closer together, while in a weighted supermodel all model states are replaced at regular time intervals (i.e., restarted) by the weighted average of the individual model states. To obtain optimal connection coefficients or weights, we need to train the supermodel on the basis of historical observations. A standard training approach such as minimization of a cost function requires many model simulations, which is computationally very expensive. This thesis has focused on developing two new methods to train supermodels efficiently. The first method is based on an idea called cross pollination in time, where models exchange states during the training. The second method is a synchronization-based learning rule, originally developed for parameter estimation. The techniques are developed on low-order systems, such as Lorenz63, and later applied to different versions of the intermediate-complexity global coupled atmosphere-ocean-land model SPEEDO. Here the observations come from the same models, but with different parameters. The applicability of the methods to real observations is tested through their sensitivity to noisy and incomplete data. We identify the characteristics the individual models should have in order to be combined into a supermodel, as well as which physical variables should be connected in a supermodel and which should not. Both training methods result in supermodels that outperform both the individual models and the MME, for short-term predictions as well as long-term simulations. Furthermore, we show that the novel use of negative weights can improve predictions in cases where model errors do not cancel (for instance, when all models are too warm with respect to the truth). A crucial advantage of the proposed training schemes is that in the present context relatively short training periods suffice to find good solutions. Although the validity of our conclusions in the context of real observations and model scenarios has yet to be proved, our results are very encouraging. In principle, the methods are suitable for training supermodels constructed from state-of-the-art weather and climate models.
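    To make the weighted-supermodel idea concrete, the sketch below combines two imperfect Lorenz63 models by periodically restarting both from the weighted average of their states. The integrator, parameter perturbations, weights and restart interval are all illustrative assumptions; in practice the weights would be trained against historical observations.

```python
import numpy as np

def lorenz63(state, sigma, rho, beta):
    """Right-hand side of the Lorenz63 system."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def step(state, params, dt=0.01):
    return state + dt * lorenz63(state, *params)   # forward Euler, for brevity

# Two "imperfect" models: same equations, perturbed rho (assumed perturbations).
params = [(10.0, 28.0 * 1.1, 8 / 3), (10.0, 28.0 * 0.9, 8 / 3)]
weights = np.array([0.5, 0.5])    # placeholder; these are what training would optimize
states = [np.array([1.0, 1.0, 1.0]) for _ in params]

RESTART = 10   # restart interval in steps (an assumption)
for n in range(1000):
    states = [step(s, p) for s, p in zip(states, params)]
    if n % RESTART == 0:
        # Weighted supermodel: restart every model from the weighted mean state.
        mean = sum(w * s for w, s in zip(weights, states))
        states = [mean.copy() for _ in states]
```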

    Approximation of Ensemble Boundary Using Spectral Coefficients

    A spectral analysis of a Boolean function is proposed for approximating the decision boundary of an ensemble of classifiers, and an intuitive explanation of computing Walsh coefficients for the functional approximation is provided. It is shown that the difference between the first- and third-order coefficient approximations is a good indicator of optimal base classifier complexity. When combining neural networks, experimental results on a variety of artificial and real two-class problems demonstrate under what circumstances ensemble performance can be improved. For tuned base classifiers, the first-order coefficients provide performance similar to the majority vote. However, for weak/fast base classifiers, higher-order coefficient approximation may give better performance. It is also shown that higher-order coefficient approximation is superior to the AdaBoost logarithmic weighting rule when boosting weak decision tree base classifiers.
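    As a rough illustration of the spectral idea, the sketch below estimates the zeroth- and first-order Walsh coefficients of the ensemble's Boolean decision function from labelled base-classifier votes (everything mapped to {-1, +1}) and classifies with the truncated expansion. The toy data are invented; note that with equal first-order coefficients the rule reduces to a majority vote, consistent with the abstract.

```python
import numpy as np

# Toy votes: rows are samples, columns are base classifiers, values in {-1, +1}.
votes = np.array([[ 1,  1, -1],
                  [-1,  1, -1],
                  [ 1, -1,  1],
                  [-1, -1, -1]])
labels = np.array([1, -1, 1, -1])   # true class labels in {-1, +1}

w0 = labels.mean()                            # zeroth-order coefficient: E[f]
w1 = (votes * labels[:, None]).mean(axis=0)   # first-order coefficients: E[f * x_i]

def predict(x):
    """First-order truncated Walsh expansion: sign(w0 + sum_i w1_i * x_i)."""
    return np.sign(w0 + w1 @ x)

print(predict(np.array([1, -1, 1])))   # combined decision for a new vote pattern
```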

    Improving binary classification using filtering based on k-NN proximity graphs

    One way of increasing recognition ability in a classification problem is removing outlier entries, as well as redundant and unnecessary features, from the training set. Filtering and feature selection can have a large impact on classifier accuracy and area under the curve (AUC), as noisy data can confuse a classifier and lead it to learn wrong patterns in the training data. The common approach in data filtering is to use proximity graphs. However, the problem of selecting optimal filtering parameters is still insufficiently researched. In this paper, a filtering procedure based on the k-nearest-neighbour (k-NN) proximity graph is used. Filtering parameter selection is cast as an outlier minimization problem: the k-NN proximity graph, the power of the distance and the threshold parameters are selected so as to minimize the percentage of outliers in the training data. The performance of six commonly used classifiers (Logistic Regression, Naïve Bayes, Neural Network, Random Forest, Support Vector Machine and Decision Tree) and one heterogeneous classifier combiner (DES-LA) is then compared with and without filtering. Dynamic ensemble selection (DES) systems work by estimating the level of competence of each classifier in a pool of classifiers; only the most competent ones are selected to classify a given test sample. This is achieved by defining a criterion to measure the competence of the base classifiers, such as their accuracy in local regions of the feature space around the query instance. In our case the combiner is based on the local accuracy of the single classifiers, and its output is a linear combination of the single classifiers' rankings. After filtering, the accuracy of the DES-LA combiner increases substantially on low-accuracy datasets, but filtering has little impact on DES-LA performance on high-accuracy datasets. The results are discussed, and the classifiers whose performance was most affected by the pre-processing filtering step are identified. The main contributions of the paper are modifications to the DES-LA combiner and a comparative analysis of the impact of filtering on classifiers of various types. Testing the filtering algorithm on a real-world dataset (the Taiwan default credit card dataset) confirmed the efficiency of the automatic filtering approach.
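    A hedged sketch of the kind of k-NN proximity-graph filtering described here follows: a training point is discarded when the distance-power-weighted fraction of opposite-label neighbours exceeds a threshold. The function name and the default values of k, the distance power and the threshold are assumptions; these are exactly the parameters the paper tunes by minimizing the outlier percentage.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_filter(X, y, k=5, power=2.0, threshold=0.5):
    """Drop training points whose weighted opposite-label neighbour fraction
    exceeds `threshold`. Hypothetical helper; defaults are assumptions."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    dist, idx = nn.kneighbors(X)
    dist, idx = dist[:, 1:], idx[:, 1:]        # drop each point's self-neighbour
    w = 1.0 / (dist ** power + 1e-12)          # distance-power weighting
    disagree = (y[idx] != y[:, None])          # neighbours with the opposite label
    score = (w * disagree).sum(axis=1) / w.sum(axis=1)
    keep = score <= threshold                  # points below threshold are kept
    return X[keep], y[keep]
```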

    Combining diverse neural nets

    An appropriate use of neural computing techniques is to apply them to problems such as condition monitoring, fault diagnosis, control and sensing, where conventional solutions can be hard to obtain. However, when neural computing techniques are used, it is important that they are employed so as to maximise their performance and improve their reliability. Their performance is typically assessed in terms of their ability to generalise to a previously unseen test set, although unless the training set is very carefully chosen, 100% accuracy is rarely achieved. Improved performance can result when sets of neural nets are combined in ensembles, and ensembles can be viewed as an example of the reliability-through-redundancy approach that is recommended for conventional software and hardware in safety-critical or safety-related applications. Although there has been recent interest in the use of neural net ensembles, such techniques have yet to be applied to the tasks of condition monitoring and fault diagnosis. In this paper, we focus on the benefits of techniques which promote diversity amongst the members of an ensemble, such that there is a minimum number of coincident failures. The concept of ensemble diversity is considered in some detail, and a hierarchy of four levels of diversity is presented. This hierarchy is then used in the description of the application of ensemble-based techniques to the case study of fault diagnosis of a diesel engine.
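    One simple way to quantify the coincident failures that diversity is meant to minimise is to tabulate, for each test sample, how many ensemble members fail simultaneously. The sketch below does this for toy predictions; it illustrates the concept only and is not the four-level diversity hierarchy used in the paper.

```python
import numpy as np

def coincident_failures(preds, y):
    """Histogram of simultaneous failures.
    preds: (n_members, n_samples) predicted labels; y: true labels.
    Mass at high counts indicates low diversity (many coincident failures)."""
    fails = (preds != y)                  # boolean failure matrix
    simultaneous = fails.sum(axis=0)      # members failing on each sample
    return np.bincount(simultaneous, minlength=preds.shape[0] + 1)

preds = np.array([[1, 0, 1, 1],
                  [1, 1, 0, 1],
                  [0, 0, 1, 1]])
print(coincident_failures(preds, np.array([1, 0, 1, 0])))   # -> [0 3 0 1]
```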

    Deep Stacked Stochastic Configuration Networks for Lifelong Learning of Non-Stationary Data Streams

    The concept of the stochastic configuration network (SCN) offers a fast framework with a universal approximation guarantee for lifelong learning of non-stationary data streams. Its adaptive scope selection property enables proper random generation of hidden unit parameters, advancing beyond conventional randomized approaches constrained by a fixed scope of random parameters. This paper proposes the deep stacked stochastic configuration network (DSSCN) for continual learning of non-stationary data streams, which contributes two major aspects: 1) DSSCN features a self-constructing methodology for a deep stacked network structure, where hidden units and hidden layers are extracted automatically from continuously generated data streams; 2) the SCN concept is developed to randomly assign the inverse covariance matrix of the multivariate Gaussian function in the hidden-node addition step, bypassing its computationally prohibitive tuning phase. Numerical evaluation and comparison with prominent data stream algorithms under two procedures, periodic hold-out and prequential test-then-train, demonstrate the advantage of the proposed methodology.
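    The randomized Gaussian hidden node at the heart of point 2) can be pictured as follows: the node's inverse covariance matrix is drawn at random as a symmetric positive-definite matrix rather than tuned. The sketch below is only a guess at the shape of such a node; the dimensions, sampling ranges and SPD construction are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4                                   # input dimensionality (assumed)
mu = rng.uniform(-1, 1, size=d)         # random centre of the Gaussian node
A = rng.uniform(-1, 1, size=(d, d))
inv_cov = A @ A.T + np.eye(d)           # random symmetric positive-definite inverse covariance

def hidden_node(x):
    """Multivariate Gaussian activation with a randomly assigned inverse covariance."""
    diff = x - mu
    return np.exp(-0.5 * diff @ inv_cov @ diff)

print(hidden_node(rng.uniform(-1, 1, size=d)))
```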