109 research outputs found
A new upper bound to (a variant of) the pancake problem
The "pancake problem" asks how many prefix reversals are sufficient to sort
any permutation of {1, …, n} to the identity. We write f(n) to
denote this quantity.
The best known bounds are that (15/14)n ≤ f(n) ≤ (18/11)n + O(1). The proof of the upper bound is computer-assisted, and
considers thousands of cases.
We consider g(n), how many prefix and suffix reversals are sufficient to
sort any permutation. We observe that the (18/11)n + O(1) upper bound still holds, and give a human proof of this bound.
The constant "18/11" is a natural barrier for the pancake problem and
this variant, hence new techniques will be required to do better.
Comment: 9 pages, comments welcome
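To make the operation concrete, here is a minimal Python sketch of the classic "bring the largest unplaced pancake to the top, then flip it into place" strategy, which uses at most 2(n-1) prefix reversals. It illustrates what a prefix reversal is; it is not the (18/11)n algorithm from the bound above.

```python
def prefix_reverse(perm, k):
    """Reverse the first k elements (one 'pancake flip')."""
    return perm[:k][::-1] + perm[k:]

def pancake_sort(perm):
    """Sort a permutation using only prefix reversals: for each suffix
    position (largest first), flip the biggest unplaced element to the
    front, then flip it into place. Uses at most 2(n-1) flips."""
    perm = list(perm)
    flips = []
    for size in range(len(perm), 1, -1):
        i = perm.index(max(perm[:size]))  # position of largest unplaced value
        if i == size - 1:
            continue  # already in place
        if i != 0:
            perm = prefix_reverse(perm, i + 1)  # bring it to the front
            flips.append(i + 1)
        perm = prefix_reverse(perm, size)  # flip it into position
        flips.append(size)
    return perm, flips

sorted_perm, flips = pancake_sort([3, 1, 4, 2])
# sorted_perm == [1, 2, 3, 4], using at most 2*(4-1) = 6 flips
```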
tree also barbed wire
This manuscript contains poems
Planning under time pressure
Heuristic search is a technique used pervasively in artificial intelligence and automated planning. Often an agent is given a task that it would like to solve as quickly as possible. It must allocate its time between planning the actions to achieve the task and actually executing them. We call this problem planning under time pressure. Most popular heuristic search algorithms are ill-suited for this setting, as they either search a lot to find short plans or search a little and find long plans. The thesis of this dissertation is: when under time pressure, an automated agent should explicitly attempt to minimize the sum of planning and execution times, not just one or just the other.
This dissertation makes four contributions. First, we present new algorithms that use modern multi-core CPUs to decrease planning time without increasing execution time. Second, we introduce a new model for predicting the performance of iterative-deepening search. The model is as accurate as previous offline techniques while using less training data, and can also be used online to reduce the overhead of iterative-deepening search, resulting in faster planning. Third, we present offline planning algorithms that directly attempt to minimize the sum of planning and execution times. Fourth, we consider algorithms that plan online, in parallel with execution. Both the offline and online algorithms account for a user-specified preference between search and execution, and can greatly outperform standard utility-oblivious techniques. By addressing the problem of planning under time pressure, these contributions demonstrate that heuristic search need no longer be restricted to optimizing solution cost alone, obviating the need to choose between slow search and expensive solutions.
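The thesis statement can be made concrete with a toy calculation. The function and weight names below are illustrative stand-ins for the user-specified preference mentioned in the abstract, not the dissertation's actual formulation; the candidate numbers are invented.

```python
def goal_achievement_time(planning_time, execution_time, w_plan=1.0, w_exec=1.0):
    """Objective from the thesis statement: under time pressure, minimize
    the (weighted) sum of planning and execution times, not either one
    alone. The weights model a user preference between the two."""
    return w_plan * planning_time + w_exec * execution_time

# Hypothetical profile of one search problem: more planning buys shorter plans.
candidates = [
    (0.1, 40.0),   # search a little -> long, expensive plan
    (2.0, 15.0),   # moderate search -> decent plan
    (30.0, 10.0),  # near-optimal plan, but planning time dominates
]

best = min(candidates, key=lambda c: goal_achievement_time(*c))
# best == (2.0, 15.0): neither the fastest search nor the cheapest plan wins
```

The point of the toy example is that optimizing either term alone picks a different (worse) candidate than optimizing the sum.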
Off-Policy Deep Reinforcement Learning Algorithms for Handling Various Robotic Manipulator Tasks
Conventional control methods face obstacles due to system complexity and
intense demands on data density, so more modern and efficient control
methods are required. Off-policy, model-free reinforcement learning
algorithms help avoid working with complex models. They are prominent
methods in terms of speed and accuracy because they reuse their past
experience to learn optimal policies. In this study, three reinforcement
learning algorithms, DDPG, TD3, and SAC, are used to train a Fetch robotic
manipulator on four different tasks in the MuJoCo simulation environment.
All of these algorithms are off-policy and able to achieve their desired
target by optimizing both policy and value functions. The efficiency and
speed of these three algorithms are analyzed in a controlled environment
- …