    Uniform Inductive Improvement

    We examine uniform procedures for improving the scientific competence of inductive inference machines. Formally, such procedures are construed as recursive operators. Several senses of improvement are considered, including (a) enlarging the class of functions on which success is certain, and (b) transforming probable success into certain success.
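    To make the setting concrete, here is a minimal Python sketch of a recursive operator acting on inference machines. The encoding and the particular patching trick are illustrative assumptions, not the paper's construction, which is defined over formal program encodings.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: an inductive inference machine (IIM) maps a
# finite data sequence [(x, f(x)), ...] to a conjectured total function.
Hypothesis = Callable[[int], int]
IIM = Callable[[List[Tuple[int, int]]], Hypothesis]

def patching_operator(machine: IIM) -> IIM:
    """A toy uniform operator on machines: the output machine emits the
    input machine's conjecture, patched to agree with every datum seen
    so far.  It illustrates the *shape* of a uniform improvement
    procedure, not the paper's actual operators."""
    def improved(data: List[Tuple[int, int]]) -> Hypothesis:
        base = machine(data)
        table: Dict[int, int] = dict(data)
        return lambda x: table.get(x, base(x))
    return improved

# Usage: a naive machine that always conjectures the zero function
# becomes data-consistent once the operator is applied to it.
naive: IIM = lambda data: (lambda x: 0)
h = patching_operator(naive)([(0, 3), (1, 5)])
assert (h(0), h(1), h(2)) == (3, 5, 0)
```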

    PAC learning of probability distributions over a discrete domain

    We investigate learning of classes of distributions over a discrete domain in a PAC context. We introduce two paradigms of PAC learning, namely absolute PAC learning, which is independent of the representation of the class of hypotheses, and PAC learning with respect to the indexes, which heavily depends on such representations. We characterize non-computable learnability in both contexts. We then investigate efficient learning strategies, i.e., strategies that can be simulated by a polynomial-time Turing machine. One such strategy is the frequentist one: the learner conjectures a hypothesis as close as possible to the empirical distribution given by the relative frequencies of the examples. We characterize the classes of distributions that are absolutely PAC learnable by this strategy, and we relate frequentist learning with respect to the indexes to the NP = RP problem. Finally, we present another strategy for learning with respect to the indexes, namely learning by tests.
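    As a concrete illustration of the frequentist strategy, the sketch below conjectures the hypothesis in a finite class that is closest to the empirical distribution of the sample. The choice of L1 distance as the notion of "close" is an assumption; the abstract does not fix a metric.

```python
from collections import Counter
from typing import Dict, List, Sequence

Distribution = Dict[str, float]  # probability mass over a discrete domain

def empirical(sample: Sequence[str]) -> Distribution:
    """Relative frequencies of the observed examples."""
    n = len(sample)
    return {x: c / n for x, c in Counter(sample).items()}

def l1_distance(p: Distribution, q: Distribution) -> float:
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def frequentist_learner(sample: Sequence[str],
                        hypotheses: List[Distribution]) -> Distribution:
    """Conjecture the hypothesis closest to the empirical distribution."""
    emp = empirical(sample)
    return min(hypotheses, key=lambda h: l1_distance(h, emp))

# Usage over the domain {a, b}: the sample's empirical distribution is
# (0.75, 0.25), so the (0.9, 0.1) hypothesis is conjectured.
H = [{"a": 0.5, "b": 0.5}, {"a": 0.9, "b": 0.1}]
print(frequentist_learner(["a", "a", "b", "a"], H))
```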

    On the Impact of Forgetting on Learning Machines

    People tend not to have perfect memories when it comes to learning, or to anything else for that matter. Most formal studies of learning, however, assume a perfect memory; some approaches have merely restricted the number of items that can be retained. We introduce a complexity-theoretic accounting of memory utilization by learning machines. In our new model, memory is measured in bits as a function of the size of the input, and there is a hierarchy of learnability based on increasing memory allotment. The lower-bound results are proved using an unusual combination of pumping and mutual recursion theorem arguments. For technical reasons, it was necessary to consider two types of memory: long-term and short-term.
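    A minimal sketch of the bit-measured model, assuming a simple integer encoding of long-term memory: the learner may keep only limit(n) bits of state between examples, so older information is necessarily forgotten. The bound function and the state encoding are illustrative assumptions, not the paper's construction.

```python
import math
from typing import Callable

class BitBoundedLearner:
    """Toy model: after n examples the learner retains only `limit(n)`
    bits of long-term memory; the current example itself plays the role
    of short-term memory."""

    def __init__(self, limit: Callable[[int], int]):
        self.limit = limit
        self.state = 0   # long-term memory, encoded as an integer
        self.seen = 0

    def observe(self, example: int) -> int:
        self.seen += 1
        raw = self.state * 31 + example          # fold the example in
        mask = (1 << self.limit(self.seen)) - 1  # enforce the bit bound
        self.state = raw & mask                  # high-order bits are forgotten
        return self.state                        # conjecture derived from memory

# Usage: allow roughly log2(n) bits of memory after n examples.
learner = BitBoundedLearner(lambda n: max(1, int(math.log2(n + 1))))
print([learner.observe(x) for x in [5, 3, 8, 2]])
```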

    A Framework for Aggregation of Multiple Reinforcement Learning Algorithms

    Aggregation of multiple Reinforcement Learning (RL) algorithms is a new and effective technique for improving the quality of Sequential Decision Making (SDM). The quality of an SDM solution depends on long-term rather than instantaneous rewards. RL methods are often adopted for SDM problems, but although many RL algorithms have been developed, none is consistently better than the others; moreover, the parameters of RL algorithms significantly influence learning performance, and there is no universal rule to guide the choice of algorithm or the setting of parameters. To handle this difficulty, a new multiple-RL system, the Aggregated Multiple Reinforcement Learning System (AMRLS), is developed. In AMRLS, each RL algorithm (learner) learns individually in a learning module and provides its output to an intelligent aggregation module, which dynamically aggregates these outputs into a final decision; all learners then take the chosen action and update their policies individually, and the two processes alternate. AMRLS can thus handle dynamic learning problems without searching for the optimal learning algorithm or the optimal values of the learning parameters. It is claimed that several complementary learning algorithms can be integrated in AMRLS to improve learning performance in terms of success rate, robustness, confidence, redundancy, and complementarity.

    There are two strategies for learning an optimal policy with RL methods. One is Value Function Learning (VFL), which learns an optimal policy expressed as a value function; Temporal Difference RL (TDRL) methods are examples of this strategy. The other is Direct Policy Search (DPS), which searches for the optimal policy directly in the policy space; Genetic Algorithm (GA)-based RL (GARL) methods are instances of this strategy. A hybrid learning architecture, HGATDRL, is proposed to combine GARL and TDRL and improve learning ability.

    AMRLS and HGATDRL are tested on several SDM problems, including a maze world, a pursuit domain, a cart-pole balancing system, the mountain car problem, and a flight control system. Experimental results show that the proposed framework and method enhance the learning ability and improve the learning performance of a multiple-RL system.
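    Below is a minimal sketch of the propose-aggregate-act-update loop described above, assuming tabular Q-learners as the individual modules and majority voting as the aggregation rule; the names QLearner, aggregate, and amrls_step, and the Gym-style env.step interface, are all hypothetical, and the thesis's dynamic aggregation is richer than a vote.

```python
import random
from collections import defaultdict

class QLearner:
    """One learning module: tabular Q-learning with its own parameters."""
    def __init__(self, actions, alpha, gamma=0.9, eps=0.1):
        self.q = defaultdict(float)
        self.actions, self.alpha, self.gamma, self.eps = actions, alpha, gamma, eps

    def propose(self, state):
        if random.random() < self.eps:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s2):
        best = max(self.q[(s2, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (r + self.gamma * best - self.q[(s, a)])

def aggregate(proposals):
    """Aggregation module sketch: majority vote over the learners'
    proposed actions, ties broken at random."""
    votes = defaultdict(int)
    for a in proposals:
        votes[a] += 1
    top = max(votes.values())
    return random.choice([a for a, c in votes.items() if c == top])

def amrls_step(learners, env, state):
    """One interaction step: propose, aggregate, act, then let every
    learner update on the jointly taken action.  `env` is assumed to
    expose a step(action) -> (next_state, reward) interface."""
    action = aggregate([m.propose(state) for m in learners])
    next_state, reward = env.step(action)
    for m in learners:
        m.update(state, action, reward, next_state)
    return next_state

# Three complementary learners differing only in learning rate:
learners = [QLearner(actions=[0, 1], alpha=a) for a in (0.1, 0.3, 0.5)]
```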

    Probability and plurality for aggregations of learning machines
