21 research outputs found

    A Minimal Architecture for General Cognition

    A minimalistic cognitive architecture called MANIC is presented. The MANIC architecture requires only three function-approximating models and one state machine. Even with so few major components, it is theoretically sufficient to achieve functional equivalence with all other cognitive architectures, and it can be practically trained. Instead of seeking to transfer architectural inspiration from biology into artificial intelligence, MANIC seeks to minimize novelty and follow the most well-established constructs that have evolved within various sub-fields of data science. From this perspective, MANIC offers an alternate approach to a long-standing objective of artificial intelligence. This paper provides a theoretical analysis of the MANIC architecture.
    Comment: 8 pages, 8 figures, conference, Proceedings of the 2015 International Joint Conference on Neural Networks

    Determination of the Learning Parameters of the Back-Propagation Learning Algorithm Using Genetic Algorithms

    In this study, the learning parameters of the back-propagation learning algorithm used to train a feed-forward neural network were determined using genetic algorithms. The learning parameters are known as the learning and momentum coefficients. These parameters govern properties such as increasing the network's learning speed, damping oscillations that may arise during learning, and escaping local minima. Choosing them appropriately is therefore very important for training the network effectively. To determine the learning parameters with a genetic algorithm, a four-layer feed-forward network was designed, and its three learning and three momentum coefficients were encoded as a genetic chromosome. The aim of the study is to select the most suitable chromosome. Specially defined two-dimensional regression problems were used to test the proposed method. The tests showed that the proposed method is more effective than the conventional fixed-parameter learning algorithm.
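The tuning scheme described above can be sketched as a small genetic algorithm over six coefficients. Everything below is a hypothetical stand-in: the fitness function replaces the study's actual training error, and the parameter ranges, "ideal" values, and GA settings are assumptions for illustration, not values from the study.

```python
import random

random.seed(42)  # reproducible run for this sketch

N_GENES = 6       # 3 learning-rate + 3 momentum coefficients
POP_SIZE = 20
GENERATIONS = 40

def fitness(chrom):
    """Stand-in for 'regression error after training the four-layer net'.

    Hypothetical: pretends eta = 0.3 and alpha = 0.7 are ideal for every layer.
    """
    etas, alphas = chrom[:3], chrom[3:]
    return (sum((e - 0.3) ** 2 for e in etas)
            + sum((a - 0.7) ** 2 for a in alphas))

def make_chrom():
    return [random.uniform(0.0, 1.0) for _ in range(N_GENES)]

def crossover(a, b):
    point = random.randrange(1, N_GENES)   # one-point crossover
    return a[:point] + b[point:]

def mutate(chrom, rate=0.1):
    return [random.uniform(0.0, 1.0) if random.random() < rate else g
            for g in chrom]

def evolve():
    pop = [make_chrom() for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        pop.sort(key=fitness)                  # lower error is fitter
        elite = pop[:POP_SIZE // 2]            # truncation selection, elitist
        children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                    for _ in range(POP_SIZE - len(elite))]
        pop = elite + children
    return min(pop, key=fitness)

best = evolve()
print("best chromosome:", best, "error:", fitness(best))
```

Because the elite half is carried over unchanged, the best error never increases from one generation to the next, mirroring the "select the most suitable chromosome" objective of the study.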

    Gauss-newton Based Learning For Fully Recurrent Neural Networks

    The thesis discusses a novel off-line and on-line learning approach for Fully Recurrent Neural Networks (FRNNs). The most popular algorithm for training FRNNs, the Real Time Recurrent Learning (RTRL) algorithm, employs the gradient descent technique for finding the optimum weight vectors in the recurrent neural network. Within the framework of the research presented, new off-line and on-line variations of RTRL are introduced, based on the Gauss-Newton method. The method itself is an approximate Newton's method tailored to the specific optimization problem (non-linear least squares), which aims to speed up FRNN training. The new approach stands as a robust and effective compromise between the original gradient-based RTRL (low computational complexity, slow convergence) and Newton-based variants of RTRL (high computational complexity, fast convergence). By gathering information over time to form Gauss-Newton search vectors, the new learning algorithm, GN-RTRL, is capable of converging faster to a better-quality solution than the original algorithm. Experimental results reflect these qualities of GN-RTRL, as well as the fact that GN-RTRL may in practice have lower computational cost in comparison, again, to the original RTRL.
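The Gauss-Newton step at the heart of GN-RTRL can be illustrated on a toy nonlinear least-squares problem. This is a minimal sketch of the general method, not the thesis code; the exponential model and all numbers are invented for illustration.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
y = np.exp(0.5 * x)               # synthetic data, true parameter a = 0.5

def gauss_newton(a, steps=10):
    """Fit y = exp(a * x) by Gauss-Newton on residuals r(a) = y - exp(a*x)."""
    for _ in range(steps):
        r = y - np.exp(a * x)         # residual vector
        J = -x * np.exp(a * x)        # Jacobian dr/da (single-parameter case)
        delta = -(J @ r) / (J @ J)    # solve (J^T J) delta = -J^T r
        a += delta
    return a

a_hat = gauss_newton(0.0)
print(a_hat)   # converges to the true parameter 0.5
```

The update uses only first derivatives of the residuals (J^T J approximates the Hessian), which is exactly the compromise the abstract describes: cheaper than a full Newton step, but much faster-converging than plain gradient descent.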

    Temporal nonlinear dimensionality reduction


    An Improved Bees Algorithm for Training Deep Recurrent Networks for Sentiment Classification

    Recurrent neural networks (RNNs) are powerful tools for learning information from temporal sequences. Designing an optimum deep RNN is difficult due to configuration and training issues, such as vanishing and exploding gradients. In this paper, a novel metaheuristic optimisation approach is proposed for training deep RNNs for the sentiment classification task. The approach employs an enhanced Ternary Bees Algorithm (BA-3+), which handles large-dataset classification problems by considering only three individual solutions in each iteration. BA-3+ combines the collaborative search of three bees to find the optimal set of trainable parameters of the proposed deep recurrent learning architecture: local learning with exploitative search uses a greedy selection strategy; stochastic gradient descent (SGD) learning stabilised by singular value decomposition (SVD) addresses vanishing and exploding gradients in the decision parameters; and global learning with explorative search achieves faster convergence without getting trapped in local optima. BA-3+ has been tested on the sentiment classification task, classifying symmetrically and asymmetrically distributed datasets from different domains, including Twitter, product reviews, and movie reviews. Comparative results were obtained against advanced deep language models and the Differential Evolution (DE) and Particle Swarm Optimization (PSO) algorithms. BA-3+ converged to the global minimum faster than DE and PSO, and it outperformed SGD, DE, and PSO on the Turkish and English datasets. Accuracy and the F1 measure improved by at least 30–40% over the standard SGD algorithm on all classification datasets. Accuracy rates of the RNN model trained with BA-3+ ranged from 80% to 90%, while the RNN trained with SGD achieved between 50% and 60% on most datasets. The performance of the RNN model with BA-3+ was as good as that of the Tree-LSTM and Recursive Neural Tensor Network (RNTN) language models, which achieved accuracies of up to 90% on some datasets. The improved accuracy and convergence results show that BA-3+ is an efficient, stable algorithm for this complex classification task and can handle the vanishing and exploding gradient problems of deep RNNs.
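The three-bee iteration described above can be sketched on a stand-in scalar loss. This is not the authors' BA-3+ implementation (which trains a deep RNN and stabilises gradients with SVD); it only illustrates the exploitative/gradient/explorative split with greedy selection over three candidates per iteration.

```python
import random

def loss(w):
    # Stand-in scalar objective; in the paper this is the deep RNN's
    # sentiment-classification loss over its trainable parameters.
    return (w - 2.0) ** 2

def grad(w):
    return 2.0 * (w - 2.0)

def ba3(iterations=100, lr=0.1):
    best = random.uniform(-10.0, 10.0)
    for _ in range(iterations):
        local = best + random.uniform(-0.1, 0.1)   # exploitative bee: local search
        sgd = best - lr * grad(best)               # gradient bee: SGD-style step
        scout = random.uniform(-10.0, 10.0)        # explorative bee: global sample
        # greedy selection keeps the best of the three bees and the incumbent
        best = min([best, local, sgd, scout], key=loss)
    return best

w = ba3()
print(w)   # converges to the minimiser at 2.0
```

Because the incumbent is always retained, the loss is monotonically non-increasing, while the scout bee retains a chance of escaping a poor basin, which is the convergence/exploration balance the abstract claims for BA-3+.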

    A Study of Learning Issues in Feedforward Neural Networks

    When training a feedforward neural network with stochastic gradient descent, some batches of patterns may not be learned correctly, causing the network to fail at predictions in the areas adjacent to those patterns. This problem has usually been resolved by directly adding more complexity to the network, normally by increasing the number of learning layers, which makes the network heavier to run. In this paper, the properties and the effect of such patterns on the network are analysed, and two main reasons why the patterns are not learned correctly are distinguished: the disappearance of the Jacobian gradient in the processing layers of the network, and the gradient of those patterns pointing in the opposite direction. A simplified experiment has been carried out on a simple neural network, and the errors appearing during and after training have been monitored. The data obtained suggest that the initial hypothesis about the causes is correct. Finally, some corrections to the network are proposed with the aim of solving these training issues and offering sufficiently correct predictions while increasing the complexity of the network as little as possible. The authors were supported by the government of the Basque Country through the research grant ELKARTEK KK-2021/00014 BASQNET (Estudio de nuevas técnicas de inteligencia artificial basadas en Deep Learning dirigidas a la optimización de procesos industriales).
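The gradient-direction conflict the paper identifies can be illustrated with a toy diagnostic: compare each pattern's gradient with the mean batch gradient and flag patterns whose gradients point the opposite way. The linear model, data, and corruption step below are hypothetical stand-ins, not the paper's experiment.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))           # 8 patterns, 3 inputs
y = X @ np.array([1.0, -2.0, 0.5])    # targets from a known weight vector
y[0] = -y[0]                          # corrupt one pattern so it may conflict
w = np.zeros(3)                       # a linear unit with squared error

# per-pattern gradient of 0.5 * (x.w - y)^2 with respect to w
per_sample_grads = (X @ w - y)[:, None] * X        # shape (8, 3)
batch_grad = per_sample_grads.mean(axis=0)

# cosine of each pattern's gradient against the batch direction
cos = (per_sample_grads @ batch_grad) / (
    np.linalg.norm(per_sample_grads, axis=1) * np.linalg.norm(batch_grad) + 1e-12
)
conflicting = np.where(cos < 0)[0]    # patterns pulling against the batch step
print(conflicting)
```

A pattern with a strongly negative cosine is pulled away from by every batch update, which matches the "opposite direction of the gradient" failure mode the paper distinguishes.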