23,415 research outputs found

    Learning in Real-Time Search: A Unifying Framework

    Full text link
    Real-time search methods are suited for tasks in which the agent is interacting with an initially unknown environment in real time. In such simultaneous planning and learning problems, the agent has to select its actions in a limited amount of time, while sensing only a local part of the environment centered at the agents current location. Real-time heuristic search agents select actions using a limited lookahead search and evaluating the frontier states with a heuristic function. Over repeated experiences, they refine heuristic values of states to avoid infinite loops and to converge to better solutions. The wide spread of such settings in autonomous software and hardware agents has led to an explosion of real-time search algorithms over the last two decades. Not only is a potential user confronted with a hodgepodge of algorithms, but he also faces the choice of control parameters they use. In this paper we address both problems. The first contribution is an introduction of a simple three-parameter framework (named LRTS) which extracts the core ideas behind many existing algorithms. We then prove that LRTA*, epsilon-LRTA*, SLA*, and gamma-Trap algorithms are special cases of our framework. Thus, they are unified and extended with additional features. Second, we prove completeness and convergence of any algorithm covered by the LRTS framework. Third, we prove several upper-bounds relating the control parameters and solution quality. Finally, we analyze the influence of the three control parameters empirically in the realistic scalable domains of real-time navigation on initially unknown maps from a commercial role-playing game as well as routing in ad hoc sensor networks

    Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning

    Full text link
    A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation provides for Time Hopping similar abilities to what eligibility traces provide for conventional Reinforcement Learning. It propagates values from one state to all of its temporal predecessors using a state transitions graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process more than 3 times.Comment: 7 page

    A convergence acceleration operator for multiobjective optimisation

    Get PDF
    A novel multiobjective optimisation accelerator is introduced that uses direct manipulation in objective space together with neural network mappings from objective space to decision space. This operator is a portable component that can be hybridized with any multiobjective optimisation algorithm. The purpose of this Convergence Acceleration Operator (CAO) is to enhance the search capability and the speed of convergence of the host algorithm. The operator acts directly in objective space to suggest improvements to solutions obtained by a multiobjective evolutionary algorithm (MOEA). These suggested improved objective vectors are then mapped into decision variable space and tested. The CAO is incorporated with two leading MOEAs, the Non-Dominated Sorting Genetic Algorithm (NSGA-II) and the Strength Pareto Evolutionary Algorithm (SPEA2) and tested. Results show that the hybridized algorithms consistently improve the speed of convergence of the original algorithm whilst maintaining the desired distribution of solutions

    Towards efficient SimRank computation on large networks

    Get PDF
    SimRank has been a powerful model for assessing the similarity of pairs of vertices in a graph. It is based on the concept that two vertices are similar if they are referenced by similar vertices. Due to its self-referentiality, fast SimRank computation on large graphs poses significant challenges. The state-of-the-art work [17] exploits partial sums memorization for computing SimRank in O(Kmn) time on a graph with n vertices and m edges, where K is the number of iterations. Partial sums memorizing can reduce repeated calculations by caching part of similarity summations for later reuse. However, we observe that computations among different partial sums may have duplicate redundancy. Besides, for a desired accuracy ϵ, the existing SimRank model requires K = [logC ϵ] iterations [17], where C is a damping factor. Nevertheless, such a geometric rate of convergence is slow in practice if a high accuracy is desirable. In this paper, we address these gaps. (1) We propose an adaptive clustering strategy to eliminate partial sums redundancy (i.e., duplicate computations occurring in partial sums), and devise an efficient algorithm for speeding up the computation of SimRank to 0(Kdn2) time, where d is typically much smaller than the average in-degree of a graph. (2) We also present a new notion of SimRank that is based on a differential equation and can be represented as an exponential sum of transition matrices, as opposed to the geometric sum of the conventional counterpart. This leads to a further speedup in the convergence rate of SimRank iterations. (3) Using real and synthetic data, we empirically verify that our approach of partial sums sharing outperforms the best known algorithm by up to one order of magnitude, and that our revised notion of SimRank further achieves a 5X speedup on large graphs while also fairly preserving the relative order of original SimRank scores

    A similarity-based cooperative co-evolutionary algorithm for dynamic interval multi-objective optimization problems

    Get PDF
    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Dynamic interval multi-objective optimization problems (DI-MOPs) are very common in real-world applications. However, there are few evolutionary algorithms that are suitable for tackling DI-MOPs up to date. A framework of dynamic interval multi-objective cooperative co-evolutionary optimization based on the interval similarity is presented in this paper to handle DI-MOPs. In the framework, a strategy for decomposing decision variables is first proposed, through which all the decision variables are divided into two groups according to the interval similarity between each decision variable and interval parameters. Following that, two sub-populations are utilized to cooperatively optimize decision variables in the two groups. Furthermore, two response strategies, rgb0.00,0.00,0.00i.e., a strategy based on the change intensity and a random mutation strategy, are employed to rapidly track the changing Pareto front of the optimization problem. The proposed algorithm is applied to eight benchmark optimization instances rgb0.00,0.00,0.00as well as a multi-period portfolio selection problem and compared with five state-of-the-art evolutionary algorithms. The experimental results reveal that the proposed algorithm is very competitive on most optimization instances
    • …
    corecore