
    The Computational Complexity of Portal and Other 3D Video Games

    We classify the computational complexity of the popular video games Portal and Portal 2. We isolate individual mechanics of the game and prove NP-hardness, PSPACE-completeness, or pseudo-polynomiality depending on the specific game mechanics allowed. One of our proofs generalizes to prove NP-hardness of many other video games such as Half-Life 2, Halo, Doom, Elder Scrolls, Fallout, Grand Theft Auto, Left 4 Dead, Mass Effect, Deus Ex, Metal Gear Solid, and Resident Evil. These results build on the established literature on the complexity of video games [Aloupis et al., 2014; Cormode, 2004; Forisek, 2010; Viglietta, 2014].

    Continuous Alternation: The Complexity of Pursuit in Continuous Domains

    Complexity theory has used a game-theoretic notion, namely alternation, to great advantage in modeling parallelism and in obtaining lower bounds. The usual definition of alternation requires that transitions be made in discrete steps. The study of differential games is a classic area of optimal control, where there is continuous interaction and alternation between the players. Differential games capture many aspects of control theory and optimal control over continuous domains. In this paper, we define a generalization of the notion of alternation which applies to differential games, and which we call "continuous alternation." This approach allows us to obtain the first known complexity-theoretic results for open problems in differential games and optimal control. We concentrate our investigation on an important class of differential games, which we call polyhedral pursuit games. Pursuit games have application to many fundamental problems in autonomous robot control in the presence of an adversary. For example, this problem occurs in manufacturing environments with a single robot moving among a number of autonomous robots with unknown control programs, as well as in automatic automobile control, and collision control among aircraft and boats with unknown or adversary control. We show that in a three-dimensional pursuit game where each player's velocity is bounded (but there is no bound on acceleration), the pursuit game decision problem is hard for exponential time. This lower bound is somewhat surprising due to the sparse nature of the problem: there are only two moving objects (the players), each with only three degrees of freedom. It is also the first provably intractable result for any robotic problem with complete information; previous intractability results have relied on complexity-theoretic assumptions. Fortunately, we can counter our somewhat pessimistic lower bounds with polynomial time upper bounds for obtaining approximate solutions. In particular, we give polynomial time algorithms that approximately solve a very large class of pursuit games with arbitrarily small error. For ε > 0, this algorithm finds a winning strategy for the evader provided that there is a winning strategy that always stays at least ε distance from the pursuer and all obstacles. If the obstacles are described with n bits, then the algorithm runs in time (n/ε)^O(1), and applies to several types of pursuit games: either velocity or both acceleration and velocity may be bounded, and the bound may be of either the L2- or L∞-norm. Our algorithms also generalize to when the obstacles have constant-degree algebraic descriptions, and are allowed to have predictable movement.
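    The paper's algorithm works directly on continuous polyhedral pursuit games; as a loose illustration of the underlying discretise-and-search idea, and not the paper's construction, the sketch below fixes a small grid in place of the 1/ε resolution and computes the evader's winning region of a turn-based analogue by a greatest-fixed-point computation over the alternating game. The grid size, obstacle layout, and helper names (evader_safe_states, moves) are hypothetical.

```python
from itertools import product

# Toy turn-based grid abstraction of a pursuit game. The grid resolution
# stands in for the 1/epsilon discretisation; the evader's winning region is
# computed as a greatest fixed point over the alternating game graph.

GRID = 6                                   # hypothetical grid side length
OBSTACLES = {(2, 2), (2, 3), (3, 2)}       # hypothetical obstacle cells

def cells():
    """All obstacle-free grid cells."""
    return [(x, y) for x in range(GRID) for y in range(GRID)
            if (x, y) not in OBSTACLES]

def moves(c):
    """Unit-speed moves: stay put or step to a free 4-neighbour."""
    x, y = c
    cand = [(x, y), (x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [m for m in cand
            if 0 <= m[0] < GRID and 0 <= m[1] < GRID and m not in OBSTACLES]

def evader_safe_states():
    """States (evader, pursuer, turn) from which the evader avoids capture forever."""
    states = set(product(cells(), cells(), (0, 1)))   # turn 0 = evader, 1 = pursuer
    safe = {s for s in states if s[0] != s[1]}        # capture = co-location
    changed = True
    while changed:
        changed = False
        for (e, p, turn) in list(safe):
            if turn == 0:
                # Evader to move: it needs SOME move that stays safe.
                ok = any((m, p, 1) in safe for m in moves(e))
            else:
                # Pursuer to move: the evader must survive ALL pursuer moves.
                ok = all((e, m, 0) in safe for m in moves(p))
            if not ok:
                safe.discard((e, p, turn))
                changed = True
    return safe

if __name__ == "__main__":
    safe = evader_safe_states()
    total = 2 * len(cells()) ** 2
    print(f"{len(safe)} evader-winning states out of {total}")
```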

    Structured machine learning models for robustness against different factors of variability in robot control

    An important feature of human sensorimotor skills is our ability to learn to reuse them across different environmental contexts, in part due to our understanding of attributes of variability in these environments. This thesis explores how the structure of models used within learning for robot control could similarly help autonomous robots cope with variability, hence achieving skill generalisation. The overarching approach is to develop modular architectures that judiciously combine different forms of inductive bias for learning. In particular, we consider how models and policies should be structured in order to achieve robust behaviour in the face of different factors of variation - in the environment, in objects and in other internal parameters of a policy - with the end goal of more robust, accurate and data-efficient skill acquisition and adaptation. At a high level, variability in skill is determined by variations in constraints presented by the external environment, and in task-specific perturbations that affect the specification of optimal action. A typical example of environmental perturbation would be variation in lighting and illumination, affecting the noise characteristics of perception. An example of task perturbations would be variation in object geometry, mass or friction, and in the specification of costs associated with speed or smoothness of execution. We counteract these factors of variation by exploring three forms of structuring: utilising separate data sets curated according to the relevant factor of variation, building neural network models that incorporate this factorisation into the very structure of the networks, and learning structured loss functions. The thesis comprises four projects exploring this theme within robotics planning and prediction tasks. Firstly, in the setting of trajectory prediction in crowded scenes, we explore a modular architecture for learning static and dynamic environmental structure. We show that factorising the prediction problem from the individual representations allows for robust and label-efficient forward modelling, and relaxes the need for full model re-training in new environments. This modularity explicitly allows for a more flexible and interpretable adaptation of trajectory prediction models using pre-trained state-of-the-art models. We show that this results in more efficient motion prediction and allows for performance comparable to state-of-the-art supervised 2D trajectory prediction. Next, in the domain of contact-rich robotic manipulation, we consider a modular architecture that combines model-free learning from demonstration, in particular dynamic movement primitives (DMP), with modern model-free reinforcement learning (RL), using both on-policy and off-policy approaches. We show that factorising the skill learning problem into skill acquisition and error correction through policy adaptation strategies such as residual learning can help improve the overall performance of policies in the context of contact-rich manipulation. Our empirical evaluation demonstrates how best to do this with DMPs, and we propose "residual Learning from Demonstration" (rLfD), a framework that combines DMPs with RL to learn a residual correction policy. Our evaluations, performed both in simulation and on a physical system, suggest that applying residual learning directly in task space and operating on the full pose of the robot can significantly improve the overall performance of DMPs. We show that rLfD offers a solution that is gentle to the joints and improves the task success and generalisation of DMPs. Last but not least, our study shows that the extracted correction policies can be transferred to different geometries and frictions through few-shot task adaptation. Third, we employ meta-learning to learn time-invariant reward functions, wherein both the objectives of a task (i.e., the reward functions) and the policy for performing that task optimally are learnt simultaneously. We propose a novel inverse reinforcement learning (IRL) formulation that allows us to 1) vary the length of execution by learning time-invariant costs, and 2) relax the temporal alignment requirements for learning from demonstration. We apply our method to two different types of cost formulations and evaluate their performance in the context of learning reward functions for simulated placement and peg-in-hole tasks executed on a 7DoF Kuka IIWA arm. Our results show that our approach enables learning temporally invariant rewards from misaligned demonstrations that can also generalise spatially to out-of-distribution tasks. Finally, we employ our observations to evaluate adversarial robustness in the context of transfer learning from a source network trained on CIFAR-100 to a target network trained on CIFAR-10. Specifically, we study the effects of using robust optimisation in the source and target networks. This allows us to identify transfer learning strategies under which adversarial defences are successfully retained, in addition to revealing potential vulnerabilities. We study the extent to which adversarially robust features can preserve their defence properties against black-box and white-box attacks under three different transfer learning strategies. Our empirical evaluations give insights into how well adversarial robustness under transfer learning can generalise.
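    As a rough sketch of the residual-correction idea described above, and not the thesis's implementation, the code below adds a learned residual to a DMP-style base action in task space; ToyDMP, ResidualPolicy, and the one-dimensional dynamics are simplified placeholders for illustration only.

```python
import numpy as np

# Sketch of the residual-correction idea: the commanded action is a base
# DMP-style behaviour plus a learned residual applied in task space. The
# "residual policy" here is a random placeholder standing in for an
# RL-trained network.

class ToyDMP:
    """Minimal critically damped attractor towards a goal (1-D stand-in for a DMP)."""
    def __init__(self, goal, k=25.0):
        self.goal = goal
        self.k = k
        self.d = 2.0 * np.sqrt(k)          # critical damping

    def action(self, x, v):
        return self.k * (self.goal - x) - self.d * v   # desired acceleration

class ResidualPolicy:
    """Placeholder for the learned correction pi_res(s)."""
    def __init__(self, scale=0.1, seed=0):
        self.scale = scale
        self.rng = np.random.default_rng(seed)

    def action(self, state):
        return self.scale * self.rng.standard_normal()

def rollout(base, residual, x0=0.0, v0=0.0, dt=0.01, steps=300):
    """Integrate base + residual accelerations and return the final position."""
    x, v = x0, v0
    for _ in range(steps):
        a = base.action(x, v) + residual.action((x, v))   # base + residual, task space
        v += a * dt
        x += v * dt
    return x

if __name__ == "__main__":
    final = rollout(ToyDMP(goal=1.0), ResidualPolicy())
    print(f"final position after residual-corrected rollout: {final:.3f}")
```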

    A study of mobile robot motion planning

    This thesis studies motion planning for mobile robots in various environments. The basic tools for the research are the configuration space and the visibility graph. A new approach is developed which generates a smoothed minimum time path. The difference between this and the Minimum Time Path at Visibility Node (MTPVN) is that there is more clearance between the robot and the obstacles, and so it is safer. The accessibility graph plays an important role in motion planning for a massless mobile robot in dynamic environments. It can generate a minimum time motion in O(n² log n) computation time, where n is the number of vertices of all the polygonal obstacles. If the robot is not considered to be massless (that is, it requires time to accelerate), the space-time approach becomes a 3D problem which requires exponential time and memory. A new approach is presented here based on the improved accessibility polygon and improved accessibility graph, which generates a minimum time motion for a mobile robot with mass in O((n+k)² log(n+k)) time, where n is the number of vertices of the obstacles and k is the number of obstacles. Since k is much less than n, the computation time for this approach is almost the same as that of the accessibility graph approach. The accessibility graph approach is extended to solve motion planning for robots in three-dimensional environments. The three-dimensional accessibility graph is constructed based on the concept of the accessibility polyhedron. Based on the properties of minimum time motion, an approach is proposed to search the three-dimensional accessibility graph to generate the minimum time motion. Motion planning in binary image representation environments is also studied. Fuzzy logic based digital image processing has been studied. The concept of the Fuzzy Principal Index Of Area Coverage (PIOAC) is proposed to recognise and match objects in consecutive images. Experiments show that PIOAC is useful in recognising objects. The visibility graph of a binary image representation environment is very inefficient, so the approach usually used to plan motion for such an environment is the quadtree approach. In this research, polygonizing the obstacles is proposed instead, so that the approaches developed for the various environments can be used to solve the motion planning problem without any modification. A simulation system is designed to simulate the approaches.
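    As a simplified illustration of the visibility-graph machinery the thesis builds on, and not the accessibility-graph or minimum-time algorithms themselves, the sketch below connects mutually visible vertices of convex polygonal obstacles and runs Dijkstra on Euclidean edge lengths; the geometry helpers and the single square obstacle are illustrative assumptions, and the result is a shortest-distance path rather than a minimum-time motion.

```python
import heapq
import math

# Simplified visibility-graph planner for a point robot among convex polygonal
# obstacles: connect mutually visible vertices (plus start and goal), then run
# Dijkstra on Euclidean edge lengths.

def ccw(a, b, c):
    """Twice the signed area of triangle abc (sign gives orientation)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p1, p2, q1, q2):
    """Proper crossing test; shared endpoints do not count as a crossing."""
    if {p1, p2} & {q1, q2}:
        return False
    d1, d2 = ccw(q1, q2, p1), ccw(q1, q2, p2)
    d3, d4 = ccw(p1, p2, q1), ccw(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0

def inside(pt, poly):
    """Ray-casting point-in-polygon test."""
    x, y = pt
    hit = False
    for i in range(len(poly)):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            hit = not hit
    return hit

def visible(a, b, obstacles, edges):
    """a sees b if segment ab crosses no obstacle edge and its midpoint is not
    inside any obstacle (the midpoint test suffices for convex polygons)."""
    if any(segments_cross(a, b, e1, e2) for e1, e2 in edges):
        return False
    mid = ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)
    return not any(inside(mid, poly) for poly in obstacles)

def shortest_path_length(start, goal, obstacles):
    """Visibility graph over obstacle vertices plus start/goal, then Dijkstra."""
    edges = [(poly[i], poly[(i + 1) % len(poly)])
             for poly in obstacles for i in range(len(poly))]
    boundary = {frozenset(e) for e in edges}
    nodes = [start, goal] + [v for poly in obstacles for v in poly]
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            if frozenset((a, b)) in boundary or visible(a, b, obstacles, edges):
                w = math.dist(a, b)
                graph[a].append((b, w))
                graph[b].append((a, w))
    dist, pq = {start: 0.0}, [(0.0, start)]
    while pq:                                 # standard Dijkstra
        d, u = heapq.heappop(pq)
        if u == goal:
            return d
        if d > dist.get(u, math.inf):
            continue
        for v, w in graph[u]:
            if d + w < dist.get(v, math.inf):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return math.inf

if __name__ == "__main__":
    square = [(2.0, 1.0), (4.0, 1.0), (4.0, 3.0), (2.0, 3.0)]   # one convex obstacle
    print(f"path length: {shortest_path_length((0.0, 0.0), (6.0, 2.0), [square]):.3f}")
```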