A robust test for the stationarity assumption in sequential decision making
Reinforcement learning (RL) is a powerful technique that allows an autonomous agent to learn an optimal policy to maximize the expected return. The optimality of various RL algorithms relies on the stationarity assumption, which requires time-invariant state transition and reward functions. However, deviations from stationarity over extended periods often occur in real-world applications like robotics control, health care and digital marketing, resulting in suboptimal policies learned under the stationarity assumption. In this paper, we propose a model-based doubly robust procedure for testing the stationarity assumption and detecting change points in offline RL settings with a certain degree of homogeneity. Our proposed testing procedure is robust to model misspecifications and can effectively control type-I error while achieving high statistical power, especially in high-dimensional settings. Extensive comparative simulations and a real-world interventional mobile health example illustrate the advantages of our method in detecting change points and optimizing long-term rewards in high-dimensional, non-stationary environments.
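To make the idea of testing stationarity and locating a change point concrete, here is a minimal sketch. It is not the doubly robust procedure the abstract proposes: it swaps in a textbook CUSUM statistic with a permutation p-value, applied to a one-dimensional reward sequence, and it assumes the only form of non-stationarity is a single shift in the mean reward. The function names `cusum_change_point` and `permutation_p_value` are hypothetical.

```python
import numpy as np


def cusum_change_point(rewards):
    """Return (statistic, split) for the most likely single mean shift.

    `split` is the number of observations before the estimated change.
    Assumes the sequence is not constant (non-zero sample variance).
    """
    x = np.asarray(rewards, dtype=float)
    n = x.size
    # Centered partial sums: a large |S_k| suggests the mean differs
    # before and after position k. The last partial sum is always 0.
    partial = np.cumsum(x - x.mean())[:-1]
    scale = x.std(ddof=1) * np.sqrt(n)
    stats = np.abs(partial) / scale
    k = int(np.argmax(stats))
    return float(stats[k]), k + 1


def permutation_p_value(rewards, n_perm=500, seed=0):
    """Permutation p-value for the null of no mean shift (exchangeable data)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(rewards, dtype=float)
    observed, _ = cusum_change_point(x)
    # Shuffling destroys any time ordering, so it simulates the stationary null.
    exceed = sum(
        cusum_change_point(rng.permutation(x))[0] >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Toy sequence: mean reward jumps from 0 to 1 after 50 steps.
    toy = [0.0] * 50 + [1.0] * 50
    stat, split = cusum_change_point(toy)
    print(split, permutation_p_value(toy))
```

On the toy sequence the estimated split is 50 and the permutation p-value is small. Unlike the method described above, this sketch ignores states, actions and transition dynamics, so it only illustrates the testing logic.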
A reinforcement learning framework for dynamic mediation analysis
Mediation analysis learns the causal effect transmitted via mediator variables between treatments and outcomes, and receives increasing attention in various scientific domains to elucidate causal relations. Most existing works focus on point-exposure studies where each subject only receives one treatment at a single time point. However, there are a number of applications (e.g., mobile health) where the treatments are sequentially assigned over time and the dynamic mediation effects are of primary interest. Proposing a reinforcement learning (RL) framework, we are the first to evaluate dynamic mediation effects in settings with infinite horizons. We decompose the average treatment effect into an immediate direct effect, an immediate mediation effect, a delayed direct effect, and a delayed mediation effect. Upon the identification of each effect component, we further develop robust and semi-parametrically efficient estimators under the RL framework to infer these causal effects. The superior performance of the proposed method is demonstrated through extensive numerical studies, theoretical results, and an analysis of a mobile health dataset. A Python implementation of the proposed procedure is available at https://github.com/linlinlin97/MediationRL
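The four-way decomposition described in this abstract can be written schematically as follows. The abbreviations are shorthand introduced here for illustration and are not necessarily the paper's notation; the formal definitions of each component are given in the paper itself.

```latex
% Schematic decomposition of the average treatment effect (ATE).
% IDE: immediate direct effect,  IME: immediate mediation effect,
% DDE: delayed direct effect,    DME: delayed mediation effect.
\mathrm{ATE} = \mathrm{IDE} + \mathrm{IME} + \mathrm{DDE} + \mathrm{DME}
```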
Multivariate Dynamic Mediation Analysis under a Reinforcement Learning Framework
Mediation analysis is an important analytic tool commonly used in a broad
range of scientific applications. In this article, we study the problem of
mediation analysis when there are multivariate and conditionally dependent
mediators, and when the variables are observed over multiple time points. The
problem is challenging, because the effect of a mediator involves not only the
path from the treatment to this mediator itself at the current time point, but
also all possible paths pointed to this mediator from its upstream mediators,
as well as the carryover effects from all previous time points. We propose a
novel multivariate dynamic mediation analysis approach. Drawing inspiration
from the Markov decision process model that is frequently employed in
reinforcement learning, we introduce a Markov mediation process paired with a
system of time-varying linear structural equation models to formulate the
problem. We then formally define the individual mediation effect, built upon
the idea of simultaneous interventions and intervention calculus. We next
derive the closed-form expression and propose an iterative estimation procedure
under the Markov mediation process model. We study both the asymptotic property
and the empirical performance of the proposed estimator, and further illustrate
our method with a mobile health application.
DNet: distributional network for distributional individualized treatment effects
There is a growing interest in developing methods to estimate individualized treatment effects (ITEs) for various real-world applications, such as e-commerce and public health. This paper presents a novel architecture, called DNet, to infer distributional ITEs. DNet can learn the entire outcome distribution for each treatment, whereas most existing methods primarily focus on the conditional average treatment effect and ignore the conditional variance around its expectation. Additionally, our method excels in settings with heavy-tailed outcomes and outperforms state-of-the-art methods in extensive experiments on benchmark and real-world datasets. DNet has also been successfully deployed in a widely used mobile app with millions of daily active users.
Water entry of slender segmented projectile connected by spring
An object entering water experiences a large impact acceleration at the initial stage of water entry, which can cause structural damage to objects that are dropped or launched into the water. To reduce the peak impact acceleration, a spring-connected segmented projectile with a compressible nose was designed. Using an inertial measurement unit and a high-speed camera, the influence of the nose compressibility on the initial impact acceleration was qualitatively investigated. The experimental results demonstrate that introducing a spring between the nose and the main body of the projectile can significantly suppress the peak acceleration during the early stage of impact (0–50 ms). Furthermore, the maximum impact acceleration experienced by the main body is related only to the maximum compression of the nose, regardless of the spring stiffness. In addition, the spring has a slight effect on the non-dimensional pinch-off times of the cavity but increases the initial velocity required for cavity pinch-off events to occur on the side of the main body.
Dynamics and hydrodynamic efficiency of diving beetle while swimming
The diving beetle, an excellent biological prototype for bionic underwater vehicles, can achieve forward swimming, backward swimming, and flexible cornering by swinging its two powerful hind legs. An in-depth study of their propulsion performance will contribute to the design of micro underwater vehicles. In this paper, the kinematic and dynamic parameters and the hydrodynamic efficiency of the diving beetle are studied by analysing swimming videos with motion capture technology, combined with CFD simulations. The results show that the hind legs of the diving beetle achieve high propulsion force and low return resistance during one propulsion cycle in both the forward and backward swimming modes. The propulsion efficiencies of forward and backward swimming are 0.47 and 0.30, respectively. Although the efficiency of backward swimming is lower, the diving beetle can reach a higher speed in a short time in this mode, which can help it avoid natural enemies. In the backward swimming mode, there is a long period of passive swing of the hind legs, and larger drag arises at higher speed during the recovery stroke, which reduces the propulsion efficiency to a certain extent. Reasonable planning of the swing speed of the hind legs during the power stroke and the recovery stroke can yield the highest propulsion efficiency for this propulsion method. This work will be useful for the development of bionic propulsion systems for micro underwater vehicles.
Effects of eigen and actual frequencies of soft elastic surfaces on droplet rebound from stationary flexible feather vanes
The aim of this paper is to investigate the effect of the eigenfrequency and the actual frequency of an elastic surface on droplet rebound. The elastic surface used in this study is a set of stationary flexible feather vanes. A fluid-structure interaction (FSI) numerical model is proposed to predict the phenomenon and is then validated against experiments in which droplets impact the stationary flexible feather vanes. The effect of the mass and stiffness of the surface is analysed. First, a suitable combination of surface mass and stiffness enhances droplet rebound. Second, a small-mass system with a higher eigenfrequency decreases the minimum contact time. Finally, the actual frequencies of the elastic surface, approximately 75 Hz, can accelerate droplet rebound in all cases.
Two-Photon Rabi Splitting in a Coupled System of a Nanocavity and Exciton Complexes
Two-photon Rabi splitting in a cavity-dot system provides a basis for
multi-qubit coherent control in quantum photonic networks. Here we report on
two-photon Rabi splitting in a strongly coupled cavity-dot system. The quantum
dot was grown intentionally large in size for large oscillator strength and
small biexciton binding energy. Both exciton and biexciton transitions couple
to a high quality factor photonic crystal cavity with large coupling strengths
over 130 μeV. Furthermore, the small binding energy enables the cavity to
simultaneously couple with two exciton states. Thereby two-photon Rabi
splitting between biexciton and cavity is achieved, which can be well
reproduced by theoretical calculations with quantum master equations.
Titanium Nitride Film on Sapphire Substrate with Low Dielectric Loss for Superconducting Qubits
Dielectric loss is one of the major decoherence sources of superconducting
qubits. Contemporary high-coherence superconducting qubits are formed by
material systems mostly consisting of superconducting films on substrate with
low dielectric loss, where the loss mainly originates from the surfaces and
interfaces. Among the multiple candidates for material systems, a combination
of titanium nitride (TiN) film and sapphire substrate has good potential
because of its chemical stability against oxidization, and high quality at
interfaces. In this work, we report a TiN film deposited onto sapphire
substrate achieving low dielectric loss at the material interface. Through the
systematic characterizations of a series of transmon qubits fabricated with
identical batches of TiN base layers, but different geometries of qubit
shunting capacitors with various participation ratios of the material
interface, we quantitatively extract the loss tangent value at the
substrate-metal interface smaller than in 1-nm disordered
layer. By optimizing the interface participation ratio of the transmon qubit,
we reproducibly achieve qubit lifetimes of up to 300 μs and quality factors
approaching 8 million. We demonstrate that TiN film on sapphire substrate is an
ideal material system for high-coherence superconducting qubits. Our analyses
further suggest that the interface dielectric loss around the Josephson
junction part of the circuit could be the dominant limitation of lifetimes for
state-of-the-art transmon qubits.