Combining Subgoal Graphs with Reinforcement Learning to Build a Rational Pathfinder
In this paper, we present a hierarchical path planning framework called SG-RL
(subgoal graphs-reinforcement learning), to plan rational paths for agents
maneuvering in continuous and uncertain environments. By "rational", we mean
(1) efficient path planning that eliminates first-move lags; and (2) paths that
are collision-free, smooth, and satisfy the agents' kinematic constraints. SG-RL works in a
two-level manner. At the first level, SG-RL uses a geometric path-planning
method, i.e., Simple Subgoal Graphs (SSG), to efficiently find optimal abstract
paths, also called subgoal sequences. At the second level, SG-RL uses an RL
method, i.e., Least-Squares Policy Iteration (LSPI), to learn near-optimal
motion-planning policies which can generate kinematically feasible and
collision-free trajectories between adjacent subgoals. The first advantage of
the proposed method is that SSG overcomes the sparse-reward and local-minima
problems that limit RL agents; thus, LSPI can be used to generate paths in
complex environments. The second advantage is that, when the environment
changes slightly (e.g., when unexpected obstacles appear), SG-RL does not need to
reconstruct subgoal graphs and replan subgoal sequences using SSG, since LSPI
can deal with uncertainties by exploiting its generalization ability to handle
changes in environments. Simulation experiments in representative scenarios
demonstrate that, compared with existing methods, SG-RL can work well on
large-scale maps with relatively low action-switching frequencies and shorter
path lengths, and SG-RL can deal with small changes in environments. We further
demonstrate that the design of reward functions and the types of training
environments are important factors for learning feasible policies.
Comment: 20 page
Synthesis and inclusion behavior of a heterotritopic receptor based on hexahomotrioxacalix[3]arene
A heterotritopic hexahomotrioxacalix[3]arene receptor capable of binding two alkali metals and a transition metal in a cooperative fashion was synthesized. The binding model was investigated by ¹H NMR titration experiments in CDCl₃–CD₃CN (10:1, v/v), and the results revealed that the transition metal was bound at the upper rim and the alkali metals at the lower and upper rims. Interestingly, the alkali metal ions Li⁺ and Na⁺ bind at the lower and upper rims, respectively, depending on the size of the ion relative to the cavities formed by the calix[3]arene derivative. The hexahomotrioxacalix[3]arene receptor thus acts as a heterotritopic receptor, binding the transition metal ion Ag⁺ and the alkali metal ions Li⁺ and Na⁺. These findings did not extend to alkali metal ions of other sizes, such as K⁺ and Cs⁺
Where to serve and return in Badminton Men's Double?
This study aims to analyze the service and return landing areas in badminton
men's double, based on data extracted from 20 badminton matches. We find that
most services land near the center-line, while returns tend to land in the
crossing areas of the serving team's court. Using generalized logit models, we
are able to predict the return landing area based on features of the service
and return round. We find that the direction of the service and the footwork
and grip of the receiver could indicate his intended return landing area.
Additionally, we discover that servers tend to intercept in specific areas
based on their serving position. Our results offer valuable insights into the
strategic decisions made by players in the service and return of a badminton
rally
Impacts of pollution abatement projects on happiness: an exploratory study in China
Pollution has been a global concern in recent decades, prompting related actions in an increasing number of areas. While pollution can lead to unhappiness, will pollution abatement simply increase people's happiness? We analyze relevant happiness data collected before and after China's South-to-North Water Diversion Eastern Route Pollution Control Project to test this idea. The empirical results indicate that the pollution abatement project may not enhance happiness while it is under way. Some residents may temporarily sacrifice something (such as employment or income), which offsets the positive effect of environmental improvement on their happiness. In subgroup analyses, rural residents are found to benefit more from environmental improvement and to suffer less in happiness than urban residents, and the same holds for low-income people compared with high-income people. The findings offer new insights into the costs of pollution in terms of happiness and highlight the need to abate pollution meticulously. © 2020 Elsevier Lt
Dual-loop force-displacement mixed control strategy and its application on the quasi-static test
The quasi-static test is a well-known and powerful method for evaluating the seismic performance of structural components and systems. One of the most important challenges in quasi-static testing is achieving precise boundary conditions, especially for the axial loading of vertical components. The requirements of synchronized displacement loading and a target axial force conflict with each other. To address this, a dual-loop force-displacement mixed control strategy is proposed. The approach is successfully verified through quasi-static testing of a full-scale concrete-filled steel tube column. The control targets are achieved with excellent control performance