
    Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder

    In this paper, we present a hierarchical path planning framework called SG-RL (subgoal graphs-reinforcement learning) to plan rational paths for agents maneuvering in continuous and uncertain environments. By "rational", we mean that (1) path planning is efficient enough to eliminate first-move lags, and (2) paths are collision-free and smooth, with the agents' kinematic constraints satisfied. SG-RL works in a two-level manner. At the first level, SG-RL uses a geometric path-planning method, Simple Subgoal Graphs (SSG), to efficiently find optimal abstract paths, also called subgoal sequences. At the second level, SG-RL uses an RL method, Least-Squares Policy Iteration (LSPI), to learn near-optimal motion-planning policies that generate kinematically feasible and collision-free trajectories between adjacent subgoals. The first advantage of the proposed method is that SSG overcomes the sparse-reward and local-minima limitations that RL agents face; thus, LSPI can be used to generate paths in complex environments. The second advantage is that, when the environment changes slightly (e.g., unexpected obstacles appear), SG-RL does not need to reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI can handle such uncertainties through its generalization ability. Simulation experiments in representative scenarios demonstrate that, compared with existing methods, SG-RL works well on large-scale maps with relatively low action-switching frequencies and shorter path lengths, and that it can deal with small changes in environments. We further demonstrate that the design of reward functions and the types of training environments are important factors for learning feasible policies. Comment: 20 pages
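    The two-level structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain Dijkstra search stands in for SSG at the first level, and a hand-written straight-line stepper stands in for the learned LSPI policy at the second level. The graph, coordinates, step size, and all function names are hypothetical.

```python
import heapq
import math


def shortest_subgoal_sequence(graph, start, goal):
    """High level: Dijkstra over a subgoal graph {node: {neighbor: cost}}.

    Stand-in for SSG; returns the list of subgoals or None if unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist.get(node, math.inf):
            continue  # stale heap entry
        for nbr, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nbr, math.inf):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]


def local_policy(position, subgoal, step=0.5):
    """Low level: placeholder for a learned policy (assumption, not LSPI).

    Moves at most `step` straight toward the subgoal and snaps onto it
    once it is within reach.
    """
    dx, dy = subgoal[0] - position[0], subgoal[1] - position[1]
    d = math.hypot(dx, dy)
    if d <= step:
        return subgoal
    return (position[0] + step * dx / d, position[1] + step * dy / d)


def plan(graph, coords, start, goal):
    """Chain the two levels: subgoal sequence first, then local motion."""
    seq = shortest_subgoal_sequence(graph, start, goal)
    if seq is None:
        return None, []
    pos = coords[start]
    traj = [pos]
    for sg in seq[1:]:
        target = coords[sg]
        while pos != target:
            pos = local_policy(pos, target)
            traj.append(pos)
    return seq, traj


# Hypothetical map: no direct A-C edge, e.g. because an obstacle blocks it.
coords = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}
graph = {"A": {"B": 3.0}, "B": {"A": 3.0, "C": 4.0}, "C": {"B": 4.0}}
seq, traj = plan(graph, coords, "A", "C")
```

    Because the low-level policy only ever has to reach the next subgoal, each local problem is short-horizon, which is the property the abstract credits for avoiding sparse rewards and local minima.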

    Synthesis and inclusion behavior of a heterotritopic receptor based on hexahomotrioxacalix[3]arene

    A heterotritopic hexahomotrioxacalix[3]arene receptor capable of binding two alkali metals and a transition metal in a cooperative fashion was synthesized. The binding model was investigated by ¹H NMR titration experiments in CDCl₃–CD₃CN (10:1, v/v), and the results revealed that the transition metal was bound at the upper rim and the alkali metals at the lower and upper rims. Interestingly, the alkali metal ions Li⁺ and Na⁺ bind at the lower and upper rim, respectively, depending on the dimensions of the alkali metal ions relative to the size of the cavities formed by the calix[3]arene derivative. The hexahomotrioxacalix[3]arene receptor acts as a heterotritopic receptor, binding the transition metal ion Ag⁺ and the alkali metal ions Li⁺ and Na⁺. These findings did not extend to alkali metal ions of other sizes, such as K⁺ and Cs⁺.

    Where to serve and return in Badminton Men's Double?

    This study analyzes the service and return landing areas in badminton men's doubles, based on data extracted from 20 badminton matches. We find that most services land near the center line, while returns tend to land in the crossing areas of the serving team's court. Using generalized logit models, we are able to predict the return landing area from features of the service and return round. We find that the direction of the service and the footwork and grip of the receiver can indicate his intended return landing area. Additionally, we discover that servers tend to intercept in specific areas depending on their serving position. Our results offer valuable insights into the strategic decisions made by players in the service and return of a badminton rally.
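    As a rough illustration of how a generalized (baseline-category) logit model turns features into landing-area probabilities, consider the sketch below. The landing-area labels, the two features, and every coefficient are made up for illustration; none of them come from the study.

```python
import math


def generalized_logit_probs(features, coefs, baseline):
    """Baseline-category logit: P(area) is proportional to exp(intercept + w . x).

    `coefs` maps each non-baseline category to (intercept, weights);
    the baseline category's linear predictor is fixed at 0.
    """
    scores = {baseline: 0.0}
    for cat, (intercept, weights) in coefs.items():
        scores[cat] = intercept + sum(w * x for w, x in zip(weights, features))
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {cat: math.exp(s - top) for cat, s in scores.items()}
    total = sum(exps.values())
    return {cat: e / total for cat, e in exps.items()}


# Hypothetical features: [service aimed at center line (0/1), forehand grip (0/1)]
coefs = {
    "front_left": (0.2, [0.8, -0.3]),   # invented coefficients
    "front_right": (-0.1, [-0.5, 0.6]),
}
probs = generalized_logit_probs([1, 0], coefs, baseline="rear")
predicted_area = max(probs, key=probs.get)
```

    In practice the coefficients would be estimated from the match data with a statistics package rather than set by hand.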

    Impacts of pollution abatement projects on happiness: an exploratory study in China

    Pollution has been a global concern in recent decades, prompting related actions in an increasing number of areas. While pollution can lead to unhappiness, will pollution abatement simply increase people's happiness? We analyze happiness data collected before and after China's South-to-North Water Diversion Eastern Route Pollution Control Project to test this idea. The empirical results indicate that the pollution abatement project may not enhance happiness while it is under way. Some residents may temporarily sacrifice something (such as employment or income), which offsets the positive effect of environmental improvement on their happiness. In subgroup analyses, rural residents are found to benefit more from environmental improvement and to suffer less in happiness than urban residents, and the same holds for low-income people compared with high-income people. The findings offer new insights into the costs of pollution in terms of happiness and highlight the need to abate pollution meticulously. © 2020 Elsevier Ltd.

    Dual-loop force-displacement mixed control strategy and its application on the quasi-static test

    The quasi-static test is a well-known and powerful method for evaluating the seismic performance of structural components and systems. One of the most important challenges in quasi-static testing is achieving precise boundary conditions, especially for the axial loading of vertical components. The requirements of synchronized displacement loading and a target axial force conflict with each other. To address this, a dual-loop force-displacement mixed control strategy is proposed. The approach is successfully verified through quasi-static testing of a full-scale concrete-filled steel tube column. The control targets are achieved with excellent control performance.