
    Language Understanding for Text-based Games Using Deep Reinforcement Learning

    In this paper, we consider the task of learning control policies for text-based games. In these games, all interactions in the virtual world are through text and the underlying state is not observed. The resulting language barrier makes such environments challenging for automatic game players. We employ a deep reinforcement learning framework to jointly learn state representations and action policies using game rewards as feedback. This framework enables us to map text descriptions into vector representations that capture the semantics of the game states. We evaluate our approach on two game worlds, comparing against baselines using bag-of-words and bag-of-bigrams for state representations. Our algorithm outperforms the baselines on both worlds, demonstrating the importance of learning expressive representations. Comment: 11 pages, Appearing at EMNLP, 201
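As a rough illustration of the bag-of-words baseline mentioned in the abstract above, the sketch below turns a text description into a count vector and applies one temporal-difference update to a linear Q-function. It is not the paper's method (which learns the representation jointly with the policy); the function names, the vocabulary, and the values of `alpha` and `gamma` are all assumptions made for this example.

```python
from collections import Counter


def bag_of_words(text, vocab):
    """Count how often each vocabulary word occurs in the description."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocab]


def q_value(weights, features):
    """Linear Q-value estimate for one action: dot(weights, features)."""
    return sum(w * f for w, f in zip(weights, features))


def td_update(weights, features, reward, next_q_max, alpha=0.1, gamma=0.9):
    """One Q-learning step: move Q(s, a) toward reward + gamma * max Q(s', a')."""
    target = reward + gamma * next_q_max
    error = target - q_value(weights, features)
    return [w + alpha * error * f for w, f in zip(weights, features)]


# Hypothetical three-word vocabulary and game description.
vocab = ["door", "key", "dark"]
state = bag_of_words("The door is dark dark", vocab)
weights = td_update([0.0, 0.0, 0.0], state, reward=1.0, next_q_max=0.0)
```

A fixed count vector like this ignores word order, which is one reason a learned representation can do better on such games.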

    Combining reinforcement learning and conventional control to improve automatic guided vehicles tracking of complex trajectories

    With the rapid growth of logistics transportation in the framework of Industry 4.0, automated guided vehicle (AGV) technologies have advanced quickly. These systems present two coupled control problems: the control of the longitudinal velocity, essential to ensure the application requirements such as throughput and tag time, and the trajectory tracking control, necessary to ensure the proper accuracy in loading and unloading manoeuvres. When the paths are very short or have abrupt changes, the kinematic constraints play a restrictive role, and the tracking control becomes more challenging. In this case, advanced control strategies such as those based on intelligent techniques, including machine learning (ML), can be useful. Hence, in this work, we present an intelligent hybrid control scheme that combines reinforcement learning-based control (RLC) with conventional PI regulators to face both control problems simultaneously. On the one hand, PIs are used to control the speed of each wheel. On the other hand, the input reference of these regulators is calculated by the RLC in order to reduce the guiding error of the path tracking and to maintain the longitudinal speed. The latter is compared with a PID path following controller. The PID regulators have been tuned by genetic algorithms. The RLC allows the vehicle to learn how to improve the trajectory tracking in an adaptive way and thus, the AGV can face disturbances or unknown physical system parameters that may change due to friction and degradation of AGV mechanical components. Extensive simulation experiments of the proposed intelligent control strategy on a hybrid tricycle and differential AGV model that considers the kinematics and the dynamics of the vehicle prove the efficiency of the approach when following different demanding trajectories. Compared with the optimized PID, the RL tracking controller gives errors around 70% smaller, and the average maximum error is 48% lower. Open access funding enabled and organized by Projekt DEAL.
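The inner loop of the hybrid scheme described above can be sketched as one discrete PI regulator per wheel, each fed a speed reference from an outer controller. This is a minimal sketch, not the authors' implementation: the class and function names, the gains, and the time step are assumptions, and the references that the RLC would compute are passed in as plain numbers.

```python
class PIController:
    """Discrete proportional-integral regulator for one wheel speed."""

    def __init__(self, kp, ki):
        self.kp = kp
        self.ki = ki
        self.integral = 0.0  # accumulated error over time

    def step(self, reference, measurement, dt):
        """Return the control output for one sampling period of length dt."""
        error = reference - measurement
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


def wheel_commands(speed_refs, measured_speeds, controllers, dt):
    """Run each wheel's PI loop on the reference chosen by the outer controller."""
    return [
        controller.step(ref, meas, dt)
        for controller, ref, meas in zip(controllers, speed_refs, measured_speeds)
    ]


# Hypothetical gains and references; in the paper the references come from the RLC.
controllers = [PIController(kp=2.0, ki=1.0), PIController(kp=2.0, ki=1.0)]
commands = wheel_commands([1.0, 2.0], [0.0, 2.0], controllers, dt=0.5)
```

Separating the two layers this way keeps the wheel-speed loops simple and lets the learned component act only on their set points.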

    An Agent-Based Approach to Self-Organized Production

    The chapter describes the modeling of a material handling system with the production of individual units in a scheduled order. The units represent the agents in the model and are transported in the system, which is abstracted as a directed graph. Since the hindrances of units on their path to the destination can lead to inefficiencies in the production, the blockages of units are to be reduced. Therefore, the units operate in the system by means of local interactions in the conveying elements and indirect interactions based on a measure of possible hindrances. If most of the units behave cooperatively ("socially"), the blockings in the system are reduced. A simulation based on the model shows the collective behavior of the units in the system. The transport processes in the simulation can be compared with the processes in a real plant, which allows conclusions about the consequences for the production based on the superordinate planning. Comment: For related work see http://www.soms.ethz.c

    Mathematical Modelling and Methods for Load Balancing and Coordination of Multi-Robot Stations

    The automotive industry is moving from mass production towards an individualized production; individualizing parts aims to improve product quality and to reduce costs and material waste. This thesis concerns aspects of load balancing and coordination of multi-robot stations in the automotive manufacturing industry, considering efficient algorithms required by an individualized production. The goal of the load balancing problem is to improve the equipment utilization. Several approaches for solving the load balancing problem are suggested along with details on mathematical tools and subroutines employed. Our contributions to the solution of the load balancing problem are fourfold. First, to circumvent robot coordination we construct disjoint robot programs, which require no coordination schemes, are flexible, admit competitive cycle times for several industrial instances, and may be preferred in an individualized production. Second, since solving the task assignment problem for generating the disjoint robot programs was found to be unreasonably time-consuming, we model it as a generalized unrelated parallel machine problem with set packing constraints and suggest a tailored Lagrangian-based branch-and-bound algorithm. Third, a continuous collision detection method needs to determine whether the sweeps of multiple moving robots are disjoint. We suggest using the maximum velocity of each robot along with distance computations at certain robot configurations to derive a function that provides lower bounds on the minimum distance between the sweeps. The lower bounding function is iteratively minimized and updated with new distance information; our method is substantially faster than previously developed methods. Fourth, to allow for load balancing of complex multi-robot stations we generalize the disjoint robot programs into sequences of such; for some instances this procedure provides a significant equipment utilization improvement in comparison with previous automated methods.
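One way to read the third contribution above is as a Lipschitz-style bound: if two robots are a distance d apart at time t and their maximum speeds sum to L, they remain at least d - L·|s - t| apart at any time s. The sketch below iteratively samples the distance where this piecewise-linear lower bound is smallest until the bound is positive everywhere. It is an assumption-laden illustration rather than the thesis's algorithm: the `distance` callback, the common time parametrization of both robots, and all names are hypothetical.

```python
def sweeps_disjoint(distance, lipschitz, t_end, max_iter=1000):
    """Decide whether distance(t) > 0 on [0, t_end], assuming distance is
    Lipschitz with constant `lipschitz` (e.g. the sum of maximum speeds).

    Returns True (proven disjoint), False (a sampled distance is <= 0),
    or None if undecided after max_iter distance evaluations.
    """
    samples = [(0.0, distance(0.0)), (t_end, distance(t_end))]
    for _ in range(max_iter):
        if min(d for _, d in samples) <= 0.0:
            return False
        # Between neighbouring samples the bound is lowest where the two
        # downward cones d_i - lipschitz * |t - t_i| intersect.
        worst_bound, worst_t = None, None
        for (ta, da), (tb, db) in zip(samples, samples[1:]):
            bound = 0.5 * (da + db) - 0.5 * lipschitz * (tb - ta)
            if worst_bound is None or bound < worst_bound:
                worst_bound = bound
                worst_t = 0.5 * (ta + tb) + (da - db) / (2.0 * lipschitz)
        if worst_bound > 0.0:
            return True
        # Refine the bound with a new distance computation at its minimizer.
        samples.append((worst_t, distance(worst_t)))
        samples.sort()
    return None


# Hypothetical distance profiles over a unit time interval.
clear = sweeps_disjoint(lambda t: abs(t - 0.5) + 0.2, lipschitz=2.0, t_end=1.0)
touching = sweeps_disjoint(lambda t: abs(t - 0.5) - 0.1, lipschitz=2.0, t_end=1.0)
```

The appeal of such a scheme is that each expensive distance computation rules out a whole time interval, so far fewer evaluations are needed than with uniform sampling.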

    Mathematical Modelling for Load Balancing and Minimization of Coordination Losses in Multirobot Stations

    The automotive industry is moving from mass production towards an individualized production, in order to improve product quality and reduce costs and material waste. This thesis concerns aspects of load balancing of industrial robots in the automotive manufacturing industry, considering efficient algorithms required by an individualized production. The goal of the load balancing problem is to improve the equipment utilization. Several approaches for solving the load balancing problem are presented along with details on mathematical tools and subroutines employed. Our contributions to the solution of the load balancing problem are manifold. First, to circumvent robot coordination we have constructed disjoint robot programs, which require no coordination schemes, are more flexible, admit competitive cycle times for some industrial instances, and may be preferred in an individualized production. Second, since solving the task assignment problem for generating the disjoint robot programs was found to be unreasonably time-consuming, we modelled it as a generalized unrelated parallel machine problem with set packing constraints and suggested a tighter model formulation, which was proven to be much more tractable for a branch-and-cut solver. Third, continuous collision detection must determine whether the sweeps of multiple moving robots are disjoint. Our solution uses the maximum velocity of each robot along with distance computations at certain robot configurations to derive a function that provides lower bounds on the minimum distance between the sweeps. The lower bounding function is iteratively minimized and updated with new distance information; our method is substantially faster than previously developed methods.