Snyder's Model -- de Sitter Special Relativity Duality and de Sitter Gravity
Between Snyder's quantized space-time model in de Sitter space of momenta and
dS special relativity on dS spacetime with Beltrami coordinates, there is a
one-to-one dual correspondence supported by a minimum-uncertainty-like
argument. Together with the Planck length, the radius of the dS spacetime
should be a fundamental constant. The two lengths lead to a dimensionless
constant. These indicate that physics at these two scales should be dual to
each other, and that between them there is gravity of local dS invariance
characterized by this constant. A simple model of dS gravity with a gauge-like
action on umbilical manifolds may show these features. It can pass the
observational tests and support the duality. Comment: 32 page
Newton-Hooke Limit of Beltrami-de Sitter Spacetime, Principles of Galilei-Hooke's Relativity and Postulate on Newton-Hooke Universal Time
Based on the Beltrami-de Sitter spacetime, we present the Newton-Hooke model
under the Newton-Hooke contraction of the spacetime with respect to the
transformation group, algebra and geometry. It is shown that in Newton-Hooke
space-time there are inertial-type coordinate systems and inertial-type
observers, which move along straight lines with uniform velocity and are
invariant under the Newton-Hooke group. In order to determine uniquely the
Newton-Hooke limit, we propose the Galilei-Hooke's relativity principle as well
as the postulate on Newton-Hooke universal time. All results are readily
extended to the Newton-Hooke model as a contraction of Beltrami-anti-de Sitter
spacetime with negative cosmological constant. Comment: 25 pages, 3 figures; some misprints corrected
Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective
Reinforcement learning (RL) is a powerful tool for solving complex
decision-making problems, but its lack of transparency and interpretability has
been a major challenge in domains where decisions have significant real-world
consequences. In this paper, we propose a novel Advantage Actor-Critic with
Reasoner (A2CR), which can be easily applied to Actor-Critic-based RL models
and make them interpretable. A2CR consists of three interconnected networks:
the Policy Network, the Value Network, and the Reasoner Network. By predefining
and classifying the underlying purpose of the actor's actions, A2CR
automatically generates a more comprehensive and interpretable paradigm for
understanding the agent's decision-making process. It offers a range of
functionalities such as purpose-based saliency, early failure detection, and
model supervision, thereby promoting responsible and trustworthy RL.
Evaluations conducted in action-rich Super Mario Bros environments yield
intriguing findings: Reasoner-predicted label proportions decrease for
"Breakout" and increase for "Hovering" as the exploration level of the RL
algorithm intensifies. Additionally, purpose-based saliencies are more focused
and comprehensible
- …