
    A Neural Network Architecture for Autonomous Learning, Recognition, and Prediction in a Nonstationary World

    In a constantly changing world, humans are adapted to alternate routinely between attending to familiar objects and testing hypotheses about novel ones. We can rapidly learn to recognize and name novel objects without unselectively disrupting our memories of familiar ones. We can notice fine details that differentiate nearly identical objects and generalize across broad classes of dissimilar objects. This chapter describes a class of self-organizing neural network architectures, called ARTMAP, that are capable of fast, yet stable, on-line recognition learning, hypothesis testing, and naming in response to an arbitrary stream of input patterns (Carpenter, Grossberg, Markuzon, Reynolds, and Rosen, 1992; Carpenter, Grossberg, and Reynolds, 1991). The intrinsic stability of ARTMAP allows the system to learn incrementally for an unlimited period of time. System stability properties can be traced to the structure of its learned memories, which encode clusters of attended features into its recognition categories, rather than slow averages of category inputs. The level of detail in the learned attentional focus is determined moment-by-moment, depending on predictive success: an error due to over-generalization automatically focuses attention on additional input details, enough of which are learned in a new recognition category so that the predictive error will not be repeated. An ARTMAP system creates an evolving map between a variable number of learned categories that compress one feature space (e.g., visual features) to learned categories of another feature space (e.g., auditory features). Input vectors can be either binary or analog. Computational properties of the networks enable them to perform significantly better in benchmark studies than alternative machine learning, genetic algorithm, or neural network models. Some of the critical problems that challenge and constrain any such autonomous learning system will next be illustrated. Design principles that work together to solve these problems are then outlined. These principles are realized in the ARTMAP architecture, which is specified as an algorithm. Finally, ARTMAP dynamics are illustrated by means of a series of benchmark simulations. Advanced Research Projects Agency (N00014-92-J-4015); British Petroleum (89A-1204); National Science Foundation (IRI-90-J-4015); Office of Naval Research (N00014-91-J-4100); Air Force Office of Scientific Research (F49620-92-J-0225

    One-to-many face recognition with bilinear CNNs

    The recent explosive growth in convolutional neural network (CNN) research has produced a variety of new architectures for deep learning. One intriguing new architecture is the bilinear CNN (B-CNN), which has shown dramatic performance gains on certain fine-grained recognition problems [15]. We apply this new CNN to the challenging new face recognition benchmark, the IARPA Janus Benchmark A (IJB-A) [12]. It features faces from a large number of identities in challenging real-world conditions. Because the face images were not identified automatically using a computerized face detection system, it does not have the bias inherent in such a database. We demonstrate the performance of the B-CNN model beginning from an AlexNet-style network pre-trained on ImageNet. We then show results for fine-tuning using a moderate-sized and public external database, FaceScrub [17]. We also present results with additional fine-tuning on the limited training data provided by the protocol. In each case, the fine-tuned bilinear model shows substantial improvements over the standard CNN. Finally, we demonstrate how a standard CNN pre-trained on a large face database, the recently released VGG-Face model [20], can be converted into a B-CNN without any additional feature training. This B-CNN improves upon the CNN performance on the IJB-A benchmark, achieving 89.5% rank-1 recall. Comment: Published version at WACV 201
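
    The abstract above does not spell out how a B-CNN combines features. As a rough illustration only, the sketch below shows bilinear pooling as it is commonly described for B-CNNs: outer products of two feature vectors are summed over image locations, followed by a signed square root and L2 normalization. The function name `bilinear_pool` and the toy inputs are hypothetical, and this is not the authors' implementation.

```python
import math


def bilinear_pool(feats_a, feats_b):
    """Sum-pool outer products of two feature streams over locations.

    feats_a, feats_b: lists with one entry per spatial location; each entry
    is a list of floats (channel activations). Hypothetical helper used only
    to illustrate the idea, not taken from the paper.
    """
    if len(feats_a) != len(feats_b) or not feats_a:
        raise ValueError("both streams need the same, non-zero number of locations")
    ca, cb = len(feats_a[0]), len(feats_b[0])
    pooled = [0.0] * (ca * cb)
    # Accumulate the outer product at every location (sum pooling).
    for fa, fb in zip(feats_a, feats_b):
        for i, x in enumerate(fa):
            for j, y in enumerate(fb):
                pooled[i * cb + j] += x * y
    # Signed square root, then L2 normalization.
    signed = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in signed))
    return [v / norm for v in signed] if norm > 0 else signed
```

    Passing the same feature list for both streams corresponds to the symmetric case, which is one way to read the abstract's statement that a standard pre-trained CNN can be converted into a B-CNN without additional feature training; that reading is an assumption.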

    Neural Networks for Programming Quantum Annealers

    Quantum machine learning has the potential to enable advances in artificial intelligence, such as solving problems intractable on classical computers. Some fundamental ideas behind quantum machine learning are similar to kernel methods in classical machine learning. Both process information by mapping it into high-dimensional vector spaces without explicitly calculating their numerical values. We explore a setup for performing classification on labeled classical datasets, consisting of a classical neural network connected to a quantum annealer. The neural network programs the quantum annealer's controls and thereby maps the annealer's initial states into new states in the Hilbert space. The neural network's parameters are optimized to maximize the distance between states corresponding to inputs from different classes and to minimize the distance between quantum states corresponding to the same class. Recent literature showed that at least some of the "learning" is due to the quantum annealer, by connecting a small linear network to a quantum annealer and using it to learn small and linearly inseparable datasets. In this study, we consider a similar but distinct case, in which a fully-fledged classical neural network is connected to a small quantum annealer. In such a setting, the classical neural network already has built-in nonlinearity and learning power and can handle the classification problem alone; we want to see whether an additional quantum layer could boost its performance. We simulate this system to learn several common datasets, including those for image and sound recognition. We conclude that adding a small quantum annealer does not provide a significant benefit over just using a regular (nonlinear) classical neural network. Comment: 15 pages and 9 figure

    Analysis of Touchless Mouse Technology for Physical Disabilities

    We present a touchless mouse system that surpasses past attempts by using the deep learning models DenseNet169 and DenseNet201 together with an ensemble model. For feature extraction, the system uses these two state-of-the-art convolutional neural network architectures, which were trained on massive datasets and perform remarkably well on computer vision tasks. Their feature extraction capabilities make precise recognition and interpretation of hand motions and movements possible. To improve performance further, an ensemble model is built by integrating the results of DenseNet169 and DenseNet201. The ensemble technique improves the accuracy, stability, and generalizability of hand gesture detection by capitalising on the differences between the two models. The DenseNet169, DenseNet201, and Ensemble models are compared with one another and with several other deep learning and ensemble learning models. The Ensemble model reached the highest accuracy, 99.62 per cent.
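
    The abstract does not say how the outputs of the two networks are integrated. One common option, shown here purely as an assumption, is soft voting: the per-class probabilities of the two models are averaged and the highest-scoring class is chosen. The function `soft_vote` and the `weight_a` parameter are hypothetical names for illustration.

```python
def soft_vote(probs_a, probs_b, weight_a=0.5):
    """Combine two per-class probability lists by weighted averaging.

    probs_a, probs_b: class probabilities from two models (for example the
    two DenseNet variants). Averaging is an assumed combination rule; the
    source does not state which rule was actually used.
    Returns (predicted_class_index, combined_probabilities).
    """
    if len(probs_a) != len(probs_b) or not probs_a:
        raise ValueError("both models must score the same, non-empty set of classes")
    combined = [weight_a * pa + (1.0 - weight_a) * pb
                for pa, pb in zip(probs_a, probs_b)]
    # Index of the largest combined score (first one wins on ties).
    label = max(range(len(combined)), key=combined.__getitem__)
    return label, combined
```

    With equal weights, a class that one model rates moderately and the other rates highly can outrank a class that only one model favours, which is the kind of disagreement an ensemble is meant to exploit.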

    Dynamic Hand Gesture Recognition Using Ultrasonic Sonar Sensors and Deep Learning

    The space of hand gesture recognition using radar and sonar is dominated mostly by radar applications. In addition, the machine learning algorithms used by these systems are typically based on convolutional neural networks, with some applications exploring the use of long short-term memory networks. The first goal of this study was to design and build a sonar system that can classify hand gestures using a machine learning approach. The second was to compare convolutional neural networks with long short-term memory networks as a means of classifying hand gestures using sonar. A Doppler sonar system was designed and built to sense hand gestures. The sonar system is a multi-static system containing one transmitter and three receivers, and it can measure the Doppler frequency shifts caused by dynamic hand gestures. Since the system uses three receivers, three different Doppler frequency channels are measured. Three additional differential frequency channels are formed by computing the differences between the frequencies of each pair of receivers. These six channels are used as inputs to the deep learning models. Two different deep learning algorithms were used to classify the hand gestures: a Doppler biLSTM network [1] and a CNN [2]. Six basic hand gestures (two along each of the x-, y-, and z-axes) and two rotational hand gestures were recorded with both the left and right hands at different distances. Ten-fold cross-validation is used to evaluate the networks' performance and classification accuracy. The LSTM was able to classify the six basic gestures with an accuracy of at least 96%, but with the addition of the two rotational gestures, the accuracy drops to 47%. This result is acceptable since the basic gestures are more commonly used than rotational gestures. The CNN was able to classify all the gestures with an accuracy of at least 98%. Additionally, the LSTM network is able to classify separate left- and right-hand gestures with an accuracy of 80%, and the CNN with an accuracy of 83%. The study shows that CNN is the most widely used algorithm for hand gesture recognition as it can consistently classify gestures with various degrees of complexity. The study also shows that the LSTM network can classify hand gestures with a high degree of accuracy. More experimentation, however, needs to be done in order to increase the complexity of recognisable gestures.
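
    The six-channel input described in the abstract can be sketched as follows: three measured Doppler channels plus three pairwise differences between receivers. The function name `build_channels`, the ordering of the pairs, and the sign convention (earlier receiver minus later receiver) are assumptions made for illustration; the abstract does not specify them.

```python
from itertools import combinations


def build_channels(doppler):
    """Return six channels from three receivers' Doppler-shift series.

    doppler: list of three equal-length lists of Doppler shifts (Hz), one
    per receiver. Output order (assumed): r1, r2, r3, r1-r2, r1-r3, r2-r3.
    """
    if len(doppler) != 3:
        raise ValueError("expected exactly three receiver channels")
    length = len(doppler[0])
    if any(len(ch) != length for ch in doppler):
        raise ValueError("all receiver channels must have the same length")
    # Differential channels: sample-wise difference for each receiver pair.
    diffs = [
        [a - b for a, b in zip(doppler[i], doppler[j])]
        for i, j in combinations(range(3), 2)
    ]
    return [list(ch) for ch in doppler] + diffs
```

    For example, `build_channels([[1.0, 2.0], [0.5, 1.0], [0.0, 4.0]])` yields six lists of two samples each, which could then be stacked as the input to a classifier.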

    Recruitment of visual cortex for language processing in blind individuals: A neurobiological model

    After sensory deprivation, the visual cortex is functionally recruited into non-visual cognitive language and semantic processing. Why this functional organization takes place and how its underlying mechanisms work at the neuronal circuit level is still unclear. Here, we use a biologically constrained network model implementing anatomical structure, neurophysiological function and connectivity of the fronto-temporo-occipital cortex to simulate word-meaning acquisition in visually deprived and undeprived (‘healthy control’) brains. Whereas in the ‘undeprived’ simulations only words denoting visual entities grew into the visual domain, the ‘blind’ models unexpectedly produced word-related neuronal circuits extending into visual cortex for all semantic categories (and especially for those carrying action-related meaning). Additionally, during word recognition, the blind model showed longer-lasting spiking neural activity than the sighted model, a sign of enhanced verbal working memory due to the additional neural recruitment. Three factors are crucial for explaining this deprivation-related growth: (i) changes in the network’s activity balance brought about by the absence of uncorrelated sensory input, (ii) the connectivity structure of the network, and (iii) Hebbian correlation learning. By offering a neurobiological account for neural changes of language processing due to visual deprivation, our model bridges the gap between cellular-level mechanisms and system-level language function in blind humans.