5 research outputs found

    Discretized Bayesian pursuit – A new scheme for reinforcement learning

    Get PDF
The success of Learning Automata (LA)-based estimator algorithms over the classical, Linear Reward-Inaction (L_RI)-like schemes can be explained by their ability to pursue the actions with the highest reward probability estimates. Without access to reward probability estimates, it makes sense for schemes like the L_RI to first make large exploring steps, and then gradually turn exploration into exploitation by making progressively smaller learning steps. However, this behavior becomes counter-intuitive when pursuing actions based on their estimated reward probabilities. Learning should then ideally proceed in progressively larger steps as the reward probability estimates become more accurate. This paper introduces a new estimator algorithm, the Discretized Bayesian Pursuit Algorithm (DBPA), that achieves this. The DBPA is implemented by linearly discretizing the action probability space of the Bayesian Pursuit Algorithm (BPA) [1]. The key innovation is that the linear discrete updating rules mitigate the counter-intuitive behavior of the corresponding linear continuous updating rules by augmenting them with the reward probability estimates. Extensive experimental results show the superiority of the DBPA over previous estimator algorithms. Indeed, the DBPA is probably the fastest LA reported to date.
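The pursuit-with-discretized-steps idea in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' exact DBPA: it uses the posterior mean of a Beta(1, 1) conjugate prior as the Bayesian reward-probability estimate (the published BPA uses a different Bayesian summary), and moves probability mass toward the currently best-estimated action in fixed steps of size 1/(resolution × num_actions). The function names and parameters are illustrative assumptions.

```python
import random

def discretized_bayesian_pursuit(env, num_actions, resolution, steps):
    """Sketch of a discretized Bayesian pursuit update.

    env(a) -> 1 (reward) or 0 (penalty); `resolution` controls the
    fixed step size delta = 1 / (resolution * num_actions).
    """
    delta = 1.0 / (resolution * num_actions)
    p = [1.0 / num_actions] * num_actions   # action selection probabilities
    rewards = [0] * num_actions             # reward counts per action
    trials = [0] * num_actions              # selection counts per action
    for _ in range(steps):
        a = random.choices(range(num_actions), weights=p)[0]
        r = env(a)
        rewards[a] += r
        trials[a] += 1
        # Bayesian (posterior-mean) reward-probability estimates,
        # assuming a uniform Beta(1, 1) prior on each action.
        est = [(rewards[i] + 1) / (trials[i] + 2) for i in range(num_actions)]
        best = max(range(num_actions), key=lambda i: est[i])
        # Discretized pursuit: shift mass toward `best` in fixed steps.
        for i in range(num_actions):
            if i != best:
                p[i] = max(p[i] - delta, 0.0)
        p[best] = 1.0 - sum(p[i] for i in range(num_actions) if i != best)
    return p
```

The discrete step contrasts with continuous pursuit schemes, whose step sizes shrink as the probability vector approaches an absorbing corner.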

    Generalized pursuit learning schemes: new families of continuous and discretized learning automata

    Full text link

    The design of absorbing Bayesian pursuit algorithms and the formal analyses of their ε-optimality

    Get PDF
The fundamental phenomenon that has been used to enhance the convergence speed of learning automata (LA) is that of incorporating the running maximum likelihood (ML) estimates of the action reward probabilities into the probability updating rules for selecting the actions. The frontiers of this field have recently been expanded by replacing the ML estimates with their corresponding Bayesian counterparts, which incorporate the properties of the conjugate priors. These constitute the Bayesian pursuit algorithm (BPA) and the discretized Bayesian pursuit algorithm. Although these algorithms have been designed and efficiently implemented, and are, arguably, the fastest and most accurate LA reported in the literature, the proofs of their ε-optimal convergence have remained open. This is precisely the intent of this paper: we present a single unifying analysis by which the ε-optimality of both the continuous and discretized schemes is proven. We emphasize that unlike the ML-based pursuit schemes, the Bayesian schemes must consider not only the estimates themselves but also the distributional forms of their conjugate posteriors and their higher-order moments, all of which render the proofs particularly challenging. As far as we know, apart from the results themselves, the methodologies of this proof are unreported in the literature; they are both pioneering and novel.
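The conjugate-posterior machinery the abstract refers to can be made concrete. Assuming (as is standard for Bernoulli rewards, though the paper's exact prior is not stated here) a uniform Beta(1, 1) prior on the reward probability of action i, with s_i observed rewards and f_i penalties, the posterior and the higher-order moments entering such an analysis are:

```latex
\hat{d}_i \sim \mathrm{Beta}(1+s_i,\; 1+f_i), \qquad
\mathbb{E}\bigl[\hat{d}_i\bigr] = \frac{1+s_i}{2+s_i+f_i},
\qquad
\mathbb{E}\bigl[\hat{d}_i^{\,k}\bigr]
  = \prod_{r=0}^{k-1} \frac{1+s_i+r}{2+s_i+f_i+r}.
```

Unlike an ML point estimate, the full Beta posterior carries this distributional shape, which is why the Bayesian proofs must track moments rather than a single estimate.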

    String taxonomy using learning automata

    No full text
A typical syntactic pattern recognition (PR) problem involves comparing a noisy string with every element of a dictionary, H. The problem of classification can be greatly simplified if the dictionary is partitioned into a set of subdictionaries. In this case, the classification can be hierarchical: the noisy string is first compared to a representative element of each subdictionary, and the closest match within the winning subdictionary is subsequently located. Indeed, the entire problem of subdividing a set of strings into subsets where each subset contains "similar" strings has been referred to as the "String Taxonomy Problem." To our knowledge there is no reported solution to this problem (see footnote 2). In this paper we present a learning-automaton-based solution to string taxonomy. The solution utilizes the Object Migrating Automaton (OMA), whose power in clustering objects and images [33], [35] has been reported. The power of the scheme for string taxonomy has been demonstrated using random strings and garbled versions of string representations of fragments of macromolecules.
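The two-stage hierarchical classification described above can be sketched directly, assuming the Levenshtein (edit) distance as the string metric; the function names and the representative-selection interface below are illustrative assumptions, not the paper's implementation.

```python
def edit_distance(a, b):
    """Classical Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def hierarchical_classify(noisy, subdictionaries, representatives):
    """Two-stage classification over a partitioned dictionary H.

    `subdictionaries` is a list of string lists; `representatives[k]`
    is the representative element of subdictionary k.  The noisy string
    is first matched against the representatives, then against the
    members of the winning subdictionary only.
    """
    k = min(range(len(representatives)),
            key=lambda i: edit_distance(noisy, representatives[i]))
    return min(subdictionaries[k], key=lambda s: edit_distance(noisy, s))
```

The saving is that each query costs one comparison per subdictionary plus one per member of the chosen subdictionary, instead of one per element of H; the taxonomy problem the paper solves is how to form those subdictionaries in the first place.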

    String Taxonomy Using Learning Automata

    No full text
    A typical syntactic pattern recognition (PR) problem involves comparing a noisy string with every element of a dictionary, H. The problem of classification can be greatly simplified if the dictionary is partitioned into a set of sub-dictionaries. In this case, the classification can be hierarchical: the noisy string is first compared to a representative element of each sub-dictionary, and the closest match within the sub-dictionary is subsequently located. Indeed, the entire problem of sub-dividing a set of strings into subsets where each subset contains "similar" strings has been referred to as the "String Taxonomy Problem". To our knowledge there is no reported solution to this problem (see footnote on Page 2). In this paper we shall present a learning-automaton based solution to string taxonomy. The solution utilizes the Object Migrating Automaton (OMA), whose power in clustering objects and images [33, 35] has been reported. The power of the scheme for string taxonomy has been demonstrated using random strings and garbled versions of string representations of fragments of macromolecules.