53 research outputs found

    Discretized Bayesian pursuit – A new scheme for reinforcement learning

    The success of Learning Automata (LA)-based estimator algorithms over the classical Linear Reward-Inaction (L_RI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then to gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive when pursuing actions based on their estimated reward probabilities. Learning should then ideally proceed in progressively larger steps, as the reward probability estimates become more accurate. This paper introduces a new estimator algorithm, the Discretized Bayesian Pursuit Algorithm (DBPA), that achieves this. The DBPA is implemented by linearly discretizing the action probability space of the Bayesian Pursuit Algorithm (BPA) [1]. The key innovation is that the linear discrete updating rules mitigate the counter-intuitive behavior of the corresponding linear continuous updating rules, by augmenting them with the reward probability estimates. Extensive experimental results show the superiority of DBPA over previous estimator algorithms. Indeed, the DBPA is probably the fastest reported LA to date.
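
The shrinking step size of a continuous linear scheme, as described in the abstract above, can be illustrated with a minimal sketch of the textbook Linear Reward-Inaction update. This is not the DBPA or the BPA, whose update rules are not given in this listing; the function names `lri_update` and `reward_step_sizes` and the parameter `rate` are hypothetical and chosen only for illustration.

```python
def lri_update(probs, chosen, rewarded, rate):
    """Textbook Linear Reward-Inaction update on an action probability vector.

    On a reward, the chosen action gains rate * (1 - p_chosen) and every
    other action is scaled by (1 - rate). On a penalty nothing changes.
    """
    if not rewarded:
        return list(probs)  # "inaction": penalties leave the vector as-is
    updated = [(1.0 - rate) * p for p in probs]
    updated[chosen] = probs[chosen] + rate * (1.0 - probs[chosen])
    return updated


def reward_step_sizes(start, rate, rounds):
    """Record how much action 0 gains on each of `rounds` consecutive rewards."""
    probs = [start, 1.0 - start]
    steps = []
    for _ in range(rounds):
        new_probs = lri_update(probs, 0, True, rate)
        steps.append(new_probs[0] - probs[0])
        probs = new_probs
    return steps


if __name__ == "__main__":
    # Each successive rewarded step is smaller than the previous one.
    print(reward_step_sizes(start=0.5, rate=0.1, rounds=5))
```

Because the gain on a reward is proportional to `1 - p_chosen`, the steps shrink as the scheme becomes more certain, which is the behavior the abstract calls counter-intuitive once reward probability estimates are available.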

    A Stochastic Search on the Line-Based Solution to Discretized Estimation

    Recently, Oommen and Rueda [11] presented a strategy by which the parameters of a binomial/multinomial distribution can be estimated when the underlying distribution is nonstationary. The method has been referred to as the Stochastic Learning Weak Estimator (SLWE), and is based on the principles of continuous stochastic Learning Automata (LA). In this paper, we consider a new family of stochastic discretized weak estimators pertinent to tracking time-varying binomial distributions. As opposed to the SLWE, our proposed estimator is discretized, i.e., the estimate can assume only a finite number of values. It is well known in the field of LA that discretized schemes converge faster than their continuous counterparts. By virtue of discretization, our estimator realizes extremely fast adjustments of the running estimates by jumps, and it is thus able to robustly, and very quickly, track changes in the parameters of the distribution after a switch has occurred in the environment. The design principle of our strategy is based on a solution, pioneered by Oommen [7], for the Stochastic Search on the Line (SSL) problem. The SSL solution proposed in [7] assumes the existence of an Oracle which informs the LA whether to go “right” or “left”. In our application domain, in order to achieve efficient estimation, we have to first infer (or rather simulate) such an Oracle. To overcome this difficulty, we construct an “Artificial Oracle” that suggests whether we are to increase the current estimate or to decrease it. The paper briefly reports conclusive experimental results that demonstrate the ability of the proposed estimator to cope with non-stationary environments with a high adaptation rate, and with an accuracy that depends on its resolution. The results which we present are, to the best of our knowledge, the first reported results that resolve the problem of discretized weak estimation using an SSL-based solution.
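
For orientation, the continuous baseline that the abstract above contrasts against can be sketched as follows. The sketch assumes the multiplicative update usually associated with the binomial SLWE, in which the estimate decays by a factor `lam` and is pulled toward each new observation; the discretized, Oracle-based estimator proposed in the paper is not reproduced here. The names `weak_estimate` and `switching_stream`, and the default values, are hypothetical.

```python
import random


def weak_estimate(samples, lam=0.9, initial=0.5):
    """Track the probability of observing 1 in a 0/1 stream.

    Assumed update: estimate <- lam * estimate + (1 - lam) * x, so older
    observations are forgotten geometrically and the estimate can follow
    a distribution that changes over time.
    """
    estimate = initial
    history = []
    for x in samples:
        estimate = lam * estimate + (1.0 - lam) * x
        history.append(estimate)
    return history


def switching_stream(n, p_before, p_after, seed=0):
    """Bernoulli stream whose parameter switches halfway through."""
    rng = random.Random(seed)
    half = n // 2
    return [
        1 if rng.random() < (p_before if i < half else p_after) else 0
        for i in range(n)
    ]


if __name__ == "__main__":
    stream = switching_stream(2000, p_before=0.2, p_after=0.8)
    history = weak_estimate(stream, lam=0.99)
    print(history[999], history[-1])  # estimate just before the switch, and at the end
```

A value of `lam` closer to 1 gives smoother but slower tracking; the estimate here is continuous-valued, whereas the estimator in the abstract is restricted to a finite set of values.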

    Generalized pursuit learning schemes: new families of continuous and discretized learning automata

    On merging the fields of neural networks and adaptive data structures to yield new pattern recognition methodologies

    The aim of this talk is to explain a pioneering exploratory research endeavour that attempts to merge two completely different fields in Computer Science so as to yield very fascinating results. These are the well-established fields of Neural Networks (NNs) and Adaptive Data Structures (ADS) respectively. The field of NNs deals with the training and learning capabilities of a large number of neurons, each possessing minimal computational properties. On the other hand, the field of ADS concerns designing, implementing and analyzing data structures which adaptively change with time so as to optimize some access criteria. In this talk, we shall demonstrate how these fields can be merged, so that the neural elements are themselves linked together using a data structure. This structure can be a singly-linked or doubly-linked list, or even a Binary Search Tree (BST). While the results themselves are quite generic, in particular, we shall, as a prima facie case, present the results in which a Self-Organizing Map (SOM) with an underlying BST structure can be adaptively re-structured using conditional rotations. These rotations on the nodes of the tree are local and are performed in constant time, guaranteeing a decrease in the Weighted Path Length of the entire tree. As a result, the algorithm, referred to as the Tree-based Topology-Oriented SOM with Conditional Rotations (TTO-CONROT), converges in such a manner that the neurons are ultimately placed in the input space so as to represent its stochastic distribution. Besides, the neighborhood properties of the neurons suit the best BST that represents the data

    Modeling a teacher in a tutorial-like system using Learning Automata

    The goal of this paper is to present a novel approach to model the behavior of a Teacher in a Tutorial-like system. In this model, the Teacher is capable of presenting teaching material from a Socratic-type Domain model via multiple-choice questions. Since this knowledge is stored in the Domain model in chapters with different levels of complexity, the Teacher is able to present learning material of varying degrees of difficulty to the Students. In our model, we propose that the Teacher will be able to assist the Students to learn the more difficult material. In order to achieve this, he provides them with hints that are relative to the difficulty of the learning material presented. This enables the Students to cope with the process of handling more complex knowledge, and to be able to learn it appropriately. To our knowledge, the findings of this study are novel to the field of intelligent adaptation using Learning Automata (LA). The novelty lies in the fact that the learning system has a strategy by which it can deal with increasingly more complex/difficult Environments (or domains from which the learning has to be achieved). In our approach, the convergence of the Student models (represented by LA) is driven not only by the response of the Environment (Teacher), but also by the hints that are provided by the latter. Our proposed Teacher model has been tested against different benchmark Environments, and the results of these simulations have demonstrated the salient aspects of our model. The main conclusion is that Normal and Below-Normal learners benefited significantly from the hints provided by the Teacher, while the benefits to (brilliant) Fast learners were marginal. This seems to be in line with our subjective understanding of the behavior of real-life Students.

    A Learning Automata Based Solution to Service Selection in Stochastic Environments

    With the abundance of services available in today’s world, identifying those of high quality is becoming increasingly difficult. Reputation systems can offer generic recommendations by aggregating user-provided opinions about service quality, but they are prone to ballot stuffing and badmouthing. In general, unfair ratings may degrade the trustworthiness of reputation systems, and changes in service quality over time render previous ratings unreliable. In this paper, we provide a novel solution to the above problems based on Learning Automata (LA), which can learn the optimal action when operating in unknown stochastic environments. Furthermore, they combine rapid and accurate convergence with low computational complexity. In addition to its computational simplicity, unlike most reported approaches, our scheme does not require prior knowledge of the degree of any of the above-mentioned problems with reputation systems. Instead, it gradually learns which users provide fair ratings, and which users provide unfair ratings, even when users unintentionally make mistakes. Comprehensive empirical results show that our LA-based scheme efficiently handles any degree of unfair ratings (as long as ratings are binary). Furthermore, if the quality of services and/or the trustworthiness of users change, our scheme is able to robustly track such changes over time. Finally, the scheme is ideal for decentralized processing. Accordingly, we believe that our LA-based scheme forms a promising basis for improving the performance of reputation systems in general.

    A hierarchical learning scheme for solving the Stochastic Point Location problem

    This paper deals with the Stochastic Point Location (SPL) problem. It presents a solution which is novel in both philosophy and strategy to all the reported related learning algorithms. The SPL problem concerns the task of a Learning Mechanism attempting to locate a point on a line. The mechanism interacts with a random environment which essentially informs it, possibly erroneously, if the unknown parameter is on the left or the right of a given point, which is also the current guess. The first pioneering work [6] on the SPL problem presented a solution which operates a one-dimensional controlled Random Walk (RW) in a discretized space to locate the unknown parameter. The primary drawback of the latter scheme is the fact that the steps made are always very conservative. If the step size is decreased, the scheme yields a higher accuracy, but the convergence speed is correspondingly decreased. In this paper we introduce the Hierarchical Stochastic Searching on the Line (HSSL) solution. The HSSL solution is shown to provide orders of magnitude faster convergence when compared to the original SPL solution reported in [6]. The heart of the HSSL strategy involves performing a controlled RW on a discretized space which, unlike the traditional RWs, is not structured on the line per se, but rather on a binary tree described by intervals on the line. The overall learning scheme is shown to be optimal if the effectiveness of the environment, p, is greater than the golden ratio conjugate [4], which, in itself, is a very intriguing phenomenon. The solution has been both analyzed analytically and simulated, with extremely fascinating results. The strategy presented here can be utilized to determine the best parameter to be used in any optimization problem, and also in any application where the SPL can be applied [6].
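
The baseline that the abstract above improves on, a controlled random walk on a discretized line guided by a possibly erroneous environment, can be sketched as follows. This is only an illustration of the line-based walk, not the HSSL tree-based scheme, whose details are not given in this listing; the function name `spl_random_walk`, its parameters, and the choice to start at the midpoint are assumptions made for the example.

```python
import random


def spl_random_walk(target, p, resolution=100, steps=5000, seed=0):
    """Search for `target` in [0, 1] with one-step moves on a grid.

    At each step the environment indicates whether the target lies to the
    right or left of the current guess, and it is correct with probability
    p (its "effectiveness"). The guess moves one grid cell accordingly.
    """
    rng = random.Random(seed)
    position = resolution // 2  # assumed starting point: middle of the line
    for _ in range(steps):
        guess = position / resolution
        true_direction = 1 if target > guess else -1
        # The environment reports the wrong direction with probability 1 - p.
        direction = true_direction if rng.random() < p else -true_direction
        position = min(resolution, max(0, position + direction))
    return position / resolution


if __name__ == "__main__":
    print(spl_random_walk(target=0.7, p=0.9))
```

Raising `resolution` shrinks the step size and so improves the attainable accuracy, but the walk then needs more steps to reach the target, which is the trade-off the abstract describes.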

    Pression foncière, monétarisation et individualisation des systèmes de production en zone cotonnière au Togo

    Increased land pressure and the monetization of exchange are transforming production systems: agriculture becomes settled, yields and labor productivity fall, and migration intensifies. Cash crops expand through an increase in the area cultivated per worker, which favors a simplification of cropping systems. Diverse farmer strategies are identified.

    Stochastic searching on the line and its applications to parameter learning in nonlinear optimization

    Stochastic Automata-Based Estimators for Adaptively Compressing Files With Nonstationary Distributions
