
    A new upper bound to (a variant of) the pancake problem

    The "pancake problem" asks how many prefix reversals are sufficient to sort any permutation Ο€βˆˆSk\pi \in \mathcal{S}_k to the identity. We write f(k)f(k) to denote this quantity. The best known bounds are that 1514kβˆ’O(1)≀f(k)≀1811k+O(1)\frac{15}{14}k -O(1) \le f(k)\le \frac{18}{11}k+O(1). The proof of the upper bound is computer-assisted, and considers thousands of cases. We consider h(k)h(k), how many prefix and suffix reversals are sufficient to sort any Ο€βˆˆSk\pi \in \mathcal{S}_k. We observe that 1514kβˆ’O(1)≀h(k)\frac{15}{14}k -O(1)\le h(k) still holds, and give a human proof that h(k)≀32k+O(1)h(k) \le \frac{3}{2}k +O(1). The constant "32\frac{3}{2}" is a natural barrier for the pancake problem and this variant, hence new techniques will be required to do better.Comment: 9 pages, comments welcome

    tree also barbed wire

    This manuscript contains poems

    Planning under time pressure

    Heuristic search is a technique used pervasively in artificial intelligence and automated planning. Often an agent is given a task that it would like to solve as quickly as possible. It must allocate its time between planning the actions to achieve the task and actually executing them. We call this problem planning under time pressure. Most popular heuristic search algorithms are ill-suited for this setting, as they either search a lot to find short plans or search a little and find long plans. The thesis of this dissertation is: when under time pressure, an automated agent should explicitly attempt to minimize the sum of planning and execution times, not just one or the other. This dissertation makes four contributions. First, we present new algorithms that use modern multi-core CPUs to decrease planning time without increasing execution time. Second, we introduce a new model for predicting the performance of iterative-deepening search. The model is as accurate as previous offline techniques while using less training data, and it can also be used online to reduce the overhead of iterative-deepening search, resulting in faster planning. Third, we present offline planning algorithms that directly attempt to minimize the sum of planning and execution times. Fourth, we consider algorithms that plan online, in parallel with execution. Both the offline and online algorithms account for a user-specified preference between search and execution, and can greatly outperform standard utility-oblivious techniques. By addressing the problem of planning under time pressure, these contributions demonstrate that heuristic search is no longer restricted to optimizing solution cost, obviating the need to choose between slow search times and expensive solutions.
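
    As a toy illustration only (no specific algorithm from the dissertation is reproduced here), the sketch below shows the utility-aware idea in miniature: an anytime search is stopped as soon as a round of planning saves less execution time than it cost, so the quantity being minimized is the sum of planning and execution times rather than either one alone.

```python
import time

def plan_under_time_pressure(anytime_search):
    """Consume an anytime search that yields (plan, estimated_exec_time)
    pairs with decreasing execution-time estimates, and stop planning once
    further search no longer pays for itself in saved execution time."""
    start = time.monotonic()
    best_plan, best_cost = None, float("inf")
    last_improvement_at = 0.0
    for plan, exec_time in anytime_search:
        now = time.monotonic() - start
        spent = now - last_improvement_at   # planning time for this round
        saved = best_cost - exec_time       # execution time this round saved
        best_plan, best_cost = plan, exec_time
        last_improvement_at = now
        # Stop when a round of planning saves less execution time than it
        # costs; continuing would increase planning + execution overall.
        if saved < spent:
            break
    return best_plan, best_cost
```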

    Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks

    Conventional control methods can become impractical as system complexity and data requirements grow, so more modern and efficient control methods are needed. Off-policy, model-free reinforcement learning algorithms avoid the need to work with complex system models, and they have become prominent in terms of speed and accuracy because they reuse past experience to learn optimal policies. In this study, three reinforcement learning algorithms, DDPG, TD3, and SAC, have been used to train a Fetch robotic manipulator on four different tasks in the MuJoCo simulation environment. All three algorithms are off-policy and reach their desired targets by optimizing both policy and value functions. The efficiency and speed of the three algorithms are analyzed in a controlled environment.
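
    A minimal training sketch under stated assumptions: it uses Gymnasium-Robotics' FetchReach task and Stable-Baselines3's SAC, neither of which is confirmed by the abstract, and the environment id and timestep budget are placeholders. TD3 or DDPG can be swapped in with the same interface.

```python
import gymnasium as gym
import gymnasium_robotics  # importing registers the Fetch tasks in most versions
from stable_baselines3 import SAC  # TD3 or DDPG drop in with the same interface

# Environment id is an assumption; use whichever FetchReach version your
# install registers (e.g. "FetchReach-v2" or "FetchReach-v3").
env = gym.make("FetchReach-v2")

# Fetch tasks expose goal-conditioned dict observations, hence MultiInputPolicy.
model = SAC("MultiInputPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)  # timestep budget is a placeholder
model.save("sac_fetch_reach")
```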