15 research outputs found

    A System for Induction of Oblique Decision Trees

    This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hill-climbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using both real and artificial data, that analyze OC1's ability to construct oblique trees that are smaller and more accurate than their axis-parallel counterparts. We also examine the benefits of randomization for the construction of oblique decision trees. Comment: See http://www.jair.org/ for an online appendix and other files accompanying this article.
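The contrast between axis-parallel and oblique splits can be made concrete with a small sketch. The code below is not OC1 itself: the Gini impurity measure, the single-coefficient random perturbation in `hill_climb`, and the toy grid data are all assumptions chosen for illustration. It only shows that a hyperplane split can separate classes that no single axis-parallel threshold can, and that a randomized hill climb never makes the split worse.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def split_impurity(points, labels, w, b):
    """Size-weighted Gini impurity after splitting on the hyperplane w.x + b > 0."""
    above, below = [], []
    for x, y in zip(points, labels):
        side = sum(wi * xi for wi, xi in zip(w, x)) + b
        (above if side > 0 else below).append(y)
    return (len(above) * gini(above) + len(below) * gini(below)) / len(labels)

def hill_climb(points, labels, w, b, steps=2000, scale=0.5, seed=0):
    """Hypothetical helper: perturb one coefficient at a time at random and
    keep the change whenever the impurity does not get worse."""
    rng = random.Random(seed)
    coeffs = list(w) + [b]
    best = split_impurity(points, labels, coeffs[:-1], coeffs[-1])
    for _ in range(steps):
        trial = list(coeffs)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-scale, scale)
        score = split_impurity(points, labels, trial[:-1], trial[-1])
        if score <= best:
            coeffs, best = trial, score
    return coeffs[:-1], coeffs[-1], best

# Toy data (an assumption): a 10x10 integer grid, class 1 above the diagonal.
POINTS = [(i, j) for i in range(10) for j in range(10)]
LABELS = [1 if i + j > 9 else 0 for i, j in POINTS]

# Best single threshold on the first attribute (the second is symmetric).
axis_parallel = min(split_impurity(POINTS, LABELS, (1, 0), -(t + 0.5)) for t in range(9))
# One oblique hyperplane x + y > 9.5 separates the classes exactly.
oblique = split_impurity(POINTS, LABELS, (1, 1), -9.5)
# Start the climb from an axis-parallel split and let it tilt.
start = split_impurity(POINTS, LABELS, (1.0, 0.0), -4.5)
w, b, climbed = hill_climb(POINTS, LABELS, (1.0, 0.0), -4.5)
```

On this grid the oblique split reaches zero impurity while every axis-parallel threshold leaves mixed classes on at least one side, which is the situation in which a single oblique node can replace a staircase of axis-parallel ones.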

    History and Philosophy of Neural Networks

    This chapter conceives the history of neural networks emerging from two millennia of attempts to rationalise and formalise the operation of mind. It begins with a brief review of early classical conceptions of the soul, seating the mind in the heart; then discusses the subsequent Cartesian split of mind and body, before moving to analyse in more depth the twentieth century hegemony identifying mind with brain; the identity that gave birth to the formal abstractions of brain and intelligence we know as ‘neural networks’. The chapter concludes by analysing this identity - of intelligence and mind with mere abstractions of neural behaviour - by reviewing various philosophical critiques of formal connectionist explanations of ‘human understanding’, ‘mathematical insight’ and ‘consciousness’; critiques which, if correct, in an echo of Aristotelian insight, suggest that cognition may be more profitably understood not just as a result of [mere abstractions of] neural firings, but as a consequence of real, embodied neural behaviour, emerging in a brain, seated in a body, embedded in a culture and rooted in our world; the so-called 4Es approach to cognitive science: the Embodied, Embedded, Enactive, and Ecological conceptions of mind.

    Hardware neuromorphic learning systems utilizing memristive devices

    As the efficiency of neuromorphic systems improves, biologically-inspired learning techniques are becoming more and more appealing for various computing applications, ranging from pattern and character recognition to general purpose reconfigurable logic. Due to their functional similarities to synapses in the brain, memristors are becoming a key element in the hardware realization of perceptron-based learning systems. By pairing memristive devices with a perceptron-based neuron model, previous work has shown that an efficient and low area neural logic block (NLB) can be developed. However, the use of a simple threshold activation function has limited the set of learnable functions for a single block, resulting in the need for multiple layers to implement certain functions. This complicates the training process, decreases the scalability of the system, and increases the overall energy and delay of large networks. In this work, three novel NLB designs are presented that overcome the limitations of previous hardware NLBs. First, an Adaptive Neural Logic Block (ANLB) and Robust Adaptive Neural Logic Block (RANLB) are proposed. By integrating an adaptive activation function into a perceptron model, these designs are capable of rapidly learning any function in a single layer. Next, a Multi Threshold Neural Logic Block (MTNLB) is proposed in which a static activation function is used to obtain the same functionality with minimal overhead. Using a Verilog-AMS model of a physical memristor, the proposed NLBs are applied to implement both reconfigurable logic and an Optical Character Recognition (OCR) system. When considering the MTNLB as a building block for ISCAS-85 benchmark circuits, it provides EDP improvements of over 90 percent over a standard LUT implementation on all benchmark circuits and up to a 99 percent improvement over a threshold NLB implementation. 
As a compromise, the ANLB and RANLB provide less of an EDP improvement in a static system, but achieve faster training convergence times for all functions. To show how the proposed design can simplify an OCR application, a simple 8x8 digit recognition system is developed. Using only four 16-input NLBs for each digit, the system is able to develop a model of each digit in only 90 µs and correctly classify the majority of test images.
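The limitation of a simple threshold activation, and the way a static multi-threshold activation can lift it, can be sketched in software. This is not a model of the MTNLB circuit: the band-shaped activation in `multi_threshold_unit`, the weight grid, and the choice of XOR as the test function are assumptions made for illustration.

```python
from itertools import product

# XOR is the classic function a single threshold unit cannot represent.
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

def threshold_unit(w1, w2, theta, x):
    """Perceptron with a simple threshold activation."""
    return 1 if w1 * x[0] + w2 * x[1] >= theta else 0

def multi_threshold_unit(w1, w2, low, high, x):
    """Hypothetical static activation with two thresholds: fires only when
    the weighted sum falls inside the band [low, high)."""
    s = w1 * x[0] + w2 * x[1]
    return 1 if low <= s < high else 0

def single_threshold_can_learn_xor(grid):
    """Exhaustively search a grid of weights and thresholds for an XOR solution."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(threshold_unit(w1, w2, theta, x) == y for x, y in XOR.items()):
            return True
    return False

GRID = [k / 2 for k in range(-8, 9)]  # -4.0 .. 4.0 in steps of 0.5
found_single = single_threshold_can_learn_xor(GRID)
band_outputs = {x: multi_threshold_unit(1, 1, 0.5, 1.5, x) for x in XOR}
```

The search finds no single-threshold solution because XOR is not linearly separable, whereas one unit with equal weights and a band between 0.5 and 1.5 reproduces it in a single layer.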

    Intelligent video surveillance

    This thesis focuses on new and modified algorithms for object detection, recognition and tracking within the context of video analytics. Manual video surveillance has been proven to have low effectiveness and, at the same time, high expense, because it depends on the manual labour of operators, who are additionally prone to erroneous decisions. As the number of surveillance cameras increases, there is a strong need to automate video analytics. The benefits of this approach can be found in both military and civilian applications. For military applications, it can help in the localisation and tracking of objects of interest. For civilian applications, similar object localisation procedures can make criminal investigations more effective by extracting meaningful data from massive volumes of video footage. Recently, the wide accessibility of consumer unmanned aerial vehicles has become a new threat: even the simplest and cheapest airborne vessels can carry some cargo, which means they can be turned into a serious weapon. They can also be used for spying, which poses a threat to private life. Autonomous car driving systems are now impossible without machine vision methods. Industrial applications require automatic quality control, including non-destructive methods and particularly methods based on video analysis. All these applications give strong evidence of a practical need for machine vision algorithms for object detection, tracking and classification, and motivated this thesis. The contributions to knowledge of the thesis consist of two main parts: video tracking, and object detection and recognition, unified by their common applicability to video analytics problems. The novel algorithms for object detection and tracking described in this thesis are unsupervised and have only a small number of parameters. 
The approach is based on rigid motion segmentation by Bayesian filtering. The Bayesian filter, which was proposed specially for this method and contributes to its novelty, is formulated as a generic approach and then applied to video analytics problems. The method is augmented with optional object coordinate estimation using a plain two-dimensional terrain assumption, which gives a basis for using the algorithm inside larger sensor data fusion models. The proposed approach for object detection and classification is based on the evolving systems concept and the new Typicality-Eccentricity Data Analytics (TEDA) framework. The methods are capable of solving classical problems of data mining: clustering, classification, and regression. The methods are proposed in a domain-independent way and are capable of addressing shift and drift in data streams. Examples are given for the clustering and classification of imagery data. For all the developed algorithms, the experiments have shown sustainable results on the testing data. The practical applications of the proposed algorithms are carefully examined and tested.

    The hardware implementation of an artificial neural network using stochastic pulse rate encoding principles

    In this thesis the development of a hardware artificial neuron device and artificial neural network using stochastic pulse rate encoding principles is considered. After a review of neural network architectures and algorithmic approaches suitable for hardware implementation, a critical review of hardware techniques which have been considered in analogue and digital systems is presented. New results are presented demonstrating the potential of two learning schemes which adapt by the use of a single reinforcement signal. The techniques for computation using stochastic pulse rate encoding are presented and extended with novel circuits relevant to the hardware implementation of an artificial neural network. The generation of random numbers is the key to the encoding of data into the stochastic pulse rate domain. The formation of random numbers and multiple random bit sequences from a single PRBS generator has been investigated. Two techniques, Simulated Annealing and Genetic Algorithms, have been applied successfully to the problem of optimising the configuration of a PRBS random number generator for the formation of multiple random bit sequences and hence random numbers. A complete hardware design for an artificial neuron using stochastic pulse rate encoded signals has been described, designed, simulated, fabricated and tested before configuration of the device into a network to perform simple test problems. The implementation has shown that the processing elements of the artificial neuron are small and simple, but that there can be a significant overhead for the encoding of information into the stochastic pulse rate domain. The stochastic artificial neuron has the capability of on-line weight adaption. The implementation of reinforcement schemes using the stochastic neuron as a basic element is discussed.
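Why the processing elements can be small while the encoding carries the overhead can be illustrated with a software sketch of stochastic pulse rate encoding. It assumes values in [0, 1] encoded as the probability of a 1 in a bit stream, and it uses Python's `random.Random` as a stand-in for the PRBS generator discussed in the thesis; the function names are hypothetical.

```python
import random

def encode(p, n_bits, rng):
    """Encode a value p in [0, 1] as a random bit stream whose pulse rate is p."""
    return [1 if rng.random() < p else 0 for _ in range(n_bits)]

def decode(bits):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(bits) / len(bits)

def multiply(a_bits, b_bits):
    """Multiply two encoded values with one AND gate per bit pair.
    This only works when the two streams are statistically independent."""
    return [a & b for a, b in zip(a_bits, b_bits)]

rng = random.Random(42)          # stand-in for a hardware PRBS generator
N_BITS = 20000
a = encode(0.6, N_BITS, rng)
b = encode(0.5, N_BITS, rng)

product_estimate = decode(multiply(a, b))   # close to 0.6 * 0.5
correlated = decode(multiply(a, a))         # equals decode(a), not 0.6 ** 2
```

The multiplier is a single AND gate, but the result is only an estimate whose accuracy grows with stream length, and feeding the same stream to both inputs returns the original value rather than its square. That dependence on independent sequences is consistent with the abstract's emphasis on forming multiple random bit sequences.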

    Learning Non-Linearly Separable Boolean Functions With Linear Threshold Unit Trees and Madaline-Style Networks

    This paper investigates an algorithm for the construction of decision trees comprised of linear threshold units and also presents a novel algorithm for the learning of nonlinearly separable boolean functions using Madaline-style networks which are isomorphic to decision trees. The construction of such networks is discussed, and their performance in learning is compared with standard Back-Propagation on a sample problem in which many irrelevant attributes are introduced. Littlestone's Winnow algorithm is also explored within this architecture as a means of learning in the presence of many irrelevant attributes. The learning ability of this Madaline-style architecture on non-optimal (larger than necessary) networks is also explored. Introduction We initially examine a non-incremental algorithm that learns binary classification tasks by producing decision trees of linear threshold units (LTU trees). This decision tree bears some similarity to the decision trees produced by ID3 (Quinlan 19..
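Winnow's suitability for problems with many irrelevant attributes can be sketched on a single linear threshold unit. The code below is a minimal version of Littlestone's Winnow1 (multiplicative promotion, elimination on false positives) learning a monotone disjunction; it does not reproduce the paper's LTU-tree or Madaline-style architecture, and the target function, sample size, and parameter values are assumptions.

```python
import random

def winnow_predict(weights, x, theta):
    """Linear threshold unit: fire when the weight of the active inputs exceeds theta."""
    return 1 if sum(w for w, xi in zip(weights, x) if xi) > theta else 0

def winnow_train(samples, n, alpha=2.0, max_epochs=100):
    """Winnow1: multiply active weights by alpha on a false negative,
    set active weights to zero on a false positive."""
    weights = [1.0] * n
    theta = n / 2
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in samples:
            if winnow_predict(weights, x, theta) == y:
                continue
            mistakes += 1
            for i, xi in enumerate(x):
                if xi:
                    weights[i] = weights[i] * alpha if y == 1 else 0.0
        if mistakes == 0:   # a full pass without errors: the unit is consistent
            break
    return weights, theta

# Toy task (an assumption): 20 boolean attributes, only three are relevant.
N = 20
RELEVANT = (0, 3, 7)
rng = random.Random(1)
SAMPLES = []
for _ in range(200):
    x = tuple(rng.randint(0, 1) for _ in range(N))
    SAMPLES.append((x, 1 if any(x[i] for i in RELEVANT) else 0))

weights, theta = winnow_train(SAMPLES, N)
errors = sum(winnow_predict(weights, x, theta) != y for x, y in SAMPLES)
```

Because updates are multiplicative, the number of mistakes grows only logarithmically with the number of attributes for targets of this kind, which is why the algorithm tolerates many irrelevant inputs.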

    Evolving Artificial Neural Networks using Cartesian Genetic Programming

    NeuroEvolution is the application of Evolutionary Algorithms to the training of Artificial Neural Networks. NeuroEvolution is thought to possess many benefits over traditional training methods including: the ability to train recurrent network structures, the capability to adapt network topology, being able to create heterogeneous networks of arbitrary transfer functions, and allowing application to reinforcement as well as supervised learning tasks. This thesis presents a series of rigorous empirical investigations into many of these perceived advantages of NeuroEvolution. In this work it is demonstrated that the ability to simultaneously adapt network topology along with connection weights represents a significant advantage of many NeuroEvolutionary methods. It is also demonstrated that the ability to create heterogeneous networks comprising a range of transfer functions represents a further significant advantage. This thesis also investigates many potential benefits and drawbacks of NeuroEvolution which have been largely overlooked in the literature. This includes the presence and role of genetic redundancy in NeuroEvolution's search and whether program bloat is a limitation. The investigations presented focus on the use of a recently developed NeuroEvolution method based on Cartesian Genetic Programming. This thesis extends Cartesian Genetic Programming such that it can represent recurrent program structures, allowing for the creation of recurrent Artificial Neural Networks. This newly developed extension, Recurrent Cartesian Genetic Programming, and its application to Artificial Neural Networks are demonstrated to be extremely competitive in the domain of series forecasting.
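The genetic redundancy mentioned above can be seen in a small sketch of how a feed-forward Cartesian Genetic Programming genome is decoded. The encoding here, a list of `(function_index, input_a, input_b)` genes with plain arithmetic node functions, is an illustrative assumption; the thesis applies CGP to neural networks with weights and transfer functions, and its recurrent extension is not shown.

```python
# Node functions a gene can select (illustrative choices, not the thesis's set).
FUNCTIONS = [
    lambda a, b: a + b,
    lambda a, b: a - b,
    lambda a, b: a * b,
    lambda a, b: max(a, b),
]

def evaluate(genome, outputs, inputs):
    """Decode the genome left to right. Index k < len(inputs) refers to a
    program input; larger indexes refer to earlier nodes (feed-forward only)."""
    values = list(inputs)
    for func, a, b in genome:
        values.append(FUNCTIONS[func](values[a], values[b]))
    return [values[o] for o in outputs]

def active_nodes(genome, outputs, n_inputs):
    """Walk back from the outputs to find the nodes that affect the result.
    Every other node is redundant genetic material."""
    active = set()
    stack = [o for o in outputs if o >= n_inputs]
    while stack:
        idx = stack.pop()
        if idx in active:
            continue
        active.add(idx)
        _, a, b = genome[idx - n_inputs]
        stack.extend(src for src in (a, b) if src >= n_inputs)
    return active

# Two inputs (indexes 0 and 1), four nodes (indexes 2 to 5), one output.
GENOME = [
    (0, 0, 1),  # node 2 = x0 + x1
    (2, 2, 2),  # node 3 = node2 * node2
    (1, 0, 1),  # node 4 = x0 - x1, never used by the output
    (3, 3, 0),  # node 5 = max(node3, x0)
]
OUTPUTS = [5]

result = evaluate(GENOME, OUTPUTS, (2, 3))
used = active_nodes(GENOME, OUTPUTS, n_inputs=2)
```

Node 4 is evaluated but never reaches the output, so mutating its genes leaves the behaviour unchanged until a later mutation connects it. This is the kind of inactive material whose role in the search the thesis examines.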

    Understanding deep architectures and the effect of unsupervised pre-training

    This thesis studies a class of algorithms called deep architectures. We argue that models that are based on a shallow composition of local features are not appropriate for the set of real-world functions and datasets that are of interest to us, namely data with many factors of variation. Modelling such functions and datasets is important if we are hoping to create an intelligent agent that can learn from complicated data. Deep architectures are hypothesized to be a step in the right direction, as they are compositions of nonlinearities and can learn compact distributed representations of data with many factors of variation. Training fully-connected artificial neural networks, the most common form of a deep architecture, was not possible before Hinton (2006) showed that one can use stacks of unsupervised Restricted Boltzmann Machines to initialize or pre-train a supervised multi-layer network. This breakthrough has been influential, as the basic idea of using unsupervised learning to improve generalization in deep networks has been reproduced in a multitude of other settings and models. In this thesis, we cast the deep learning ideas and techniques as defining a special kind of inductive bias. This bias is defined not only by the kind of functions that are eventually represented by such deep models, but also by the learning process that is commonly used for them. This work is a study of the reasons why this class of functions generalizes well, the situations where they should work well, and the qualitative statements that one could make about such functions. This thesis is thus an attempt to understand why deep architectures work. In the first of the articles presented we study the question of how well our intuitions about the need for deep models correspond to functions that they can actually model well. In the second article we perform an in-depth study of why unsupervised pre-training helps deep learning and explore a variety of hypotheses that give us an intuition for the dynamics of learning in such architectures. Finally, in the third article, we want to better understand what a deep architecture models, qualitatively speaking. Our visualization approach enables us to understand the representations and invariances modelled and learned by deeper layers.