Finite-time analysis of single-timescale actor-critic
Actor-critic methods have achieved significant success in many challenging
applications. However, their finite-time convergence is still poorly understood
in the most practical single-timescale form. Existing works on analyzing
single-timescale actor-critic have been limited to i.i.d. sampling or the tabular
setting for simplicity. We investigate the more practical online
single-timescale actor-critic algorithm on continuous state space, where the
critic assumes linear function approximation and updates with a single
Markovian sample per actor step. Previous analysis has been unable to establish
convergence for such a challenging scenario. We demonstrate that the online
single-timescale actor-critic method provably finds an $\epsilon$-approximate
stationary point under standard assumptions, with a sample complexity
guarantee that can be further improved under i.i.d. sampling. Our novel framework
systematically evaluates and controls the error propagation between the actor
and critic. It offers a promising approach for analyzing other single-timescale
reinforcement learning algorithms as well.
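As a rough illustration of the setting described in this abstract, the sketch below runs an online single-timescale actor-critic loop in which the critic uses linear function approximation and is updated with one Markovian sample per actor step. The toy finite MDP, the softmax policy, the discounted TD(0) critic, and all names (`single_timescale_actor_critic`, `alpha`, `beta`) are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def single_timescale_actor_critic(P, R, phi, gamma=0.9, alpha=0.05,
                                  beta=0.05, steps=2000, seed=0):
    """Toy online actor-critic with step sizes of the same order.

    P   : (S, A, S) transition probabilities (assumed toy MDP)
    R   : (S, A) rewards
    phi : (S, d) state features shared by actor and critic (assumption)
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions, _ = P.shape
    d = phi.shape[1]
    theta = np.zeros((d, n_actions))  # actor: softmax logits phi(s) @ theta
    w = np.zeros(d)                   # critic: V(s) ~ phi(s) @ w
    s = 0
    for _ in range(steps):
        # Sample one action from the current softmax policy.
        logits = phi[s] @ theta
        pi = np.exp(logits - logits.max())
        pi /= pi.sum()
        a = rng.choice(n_actions, p=pi)
        # One Markovian transition along a single trajectory (no resets).
        s_next = rng.choice(n_states, p=P[s, a])
        # TD error computed with the current critic.
        delta = R[s, a] + gamma * phi[s_next] @ w - phi[s] @ w
        # Critic: a single TD(0) update per actor step.
        w = w + beta * delta * phi[s]
        # Actor: policy-gradient step using the TD error as the advantage.
        grad_log_pi = np.outer(phi[s], np.eye(n_actions)[a] - pi)
        theta = theta + alpha * delta * grad_log_pi
        s = s_next
    return theta, w
```

Here "single-timescale" is reflected only in `alpha` and `beta` being constants of the same order; a two-timescale variant would instead let the ratio `alpha / beta` shrink over time.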
Heterogeneous Federated Learning on a Graph
Federated learning, where algorithms are trained across multiple
decentralized devices without sharing local data, is increasingly popular in
distributed machine learning practice. Typically, a graph structure exists
behind local devices for communication. In this work, we consider parameter
estimation in federated learning with data distribution and communication
heterogeneity, as well as limited computational capacity of local devices. We
encode the distribution heterogeneity by parametrizing distributions on local
devices with a set of distinct parameter vectors. We then propose to
jointly estimate the parameters of all devices under the $M$-estimation framework
with fused Lasso regularization, encouraging equal estimates of
parameters on connected devices in the graph. We provide a general result for our
estimator depending on the graph, which can be further calibrated to obtain
convergence rates for various specific problem setups. Surprisingly, our
estimator attains the optimal rate under a certain graph fidelity condition on
the graph, as if we could aggregate all samples sharing the same distribution. If the
graph fidelity condition is not met, we propose an edge selection procedure via
multiple testing to ensure the optimality. To ease the burden of local
computation, a decentralized stochastic version of ADMM is provided, with a
convergence rate expressed in terms of the number of iterations.
We highlight that our algorithm transmits only parameters along edges of
the graph at each iteration, without requiring a central machine, which preserves
privacy. We further extend it to the case where devices are randomly
inaccessible during the training process, with a similar algorithmic
convergence guarantee. The computational and statistical efficiency of our
method is evidenced by simulation experiments and the 2020 US presidential
election data set. Comment: 61 pages, 4 figures
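To make the joint estimation idea concrete, the following minimal sketch evaluates a fused-Lasso-regularized objective over devices connected by graph edges. It assumes a least-squares loss as the M-estimation loss and an elementwise L1 penalty on parameter differences across edges; the function name `fused_lasso_objective` and the argument layout are hypothetical, and the paper's exact loss and penalty may differ.

```python
import numpy as np


def fused_lasso_objective(thetas, X, y, edges, lam):
    """Sum of local losses plus a fused Lasso penalty over graph edges.

    thetas : (n_devices, p) array, one parameter vector per device
    X, y   : lists of local design matrices and responses (assumed data)
    edges  : iterable of (i, j) device pairs that can communicate
    lam    : regularization strength
    """
    # Local least-squares losses (an assumed instance of an M-estimation loss).
    loss = sum(0.5 * np.mean((X[i] @ thetas[i] - y[i]) ** 2)
               for i in range(len(thetas)))
    # The penalty pulls estimates on connected devices toward each other.
    penalty = sum(np.abs(thetas[i] - thetas[j]).sum() for i, j in edges)
    return loss + lam * penalty
```

The penalty vanishes exactly when connected devices share the same estimate, which is how the regularizer lets devices with a common distribution pool information.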
Optimization Landscape of Policy Gradient Methods for Discrete-time Static Output Feedback
Significant progress has recently been made in understanding the
optimization landscape of policy gradient methods for optimal control
of linear time-invariant (LTI) systems. Compared with state-feedback control,
output-feedback control is more prevalent since the underlying state of the
system may not be fully observed in many practical settings. This paper
analyzes the optimization landscape inherent to policy gradient methods when
applied to static output feedback (SOF) control in discrete-time LTI systems
subject to a quadratic cost. We begin by establishing key properties of the
SOF cost, including coercivity, L-smoothness, and an M-Lipschitz continuous
Hessian. Despite the absence of convexity, we leverage these properties to
derive novel findings regarding convergence (and nearly dimension-free rate) to
stationary points for three policy gradient methods, including the vanilla
policy gradient method, the natural policy gradient method, and the
Gauss-Newton method. Moreover, we prove that the vanilla policy
gradient method exhibits linear convergence towards local minima when
initialized near such minima. The paper concludes by presenting numerical
examples that validate our theoretical findings. These results not only
characterize the performance of gradient descent for optimizing the SOF problem
but also provide insights into the effectiveness of general policy gradient
methods within the realm of reinforcement learning.
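For readers who want to see what the SOF cost and its gradient look like in practice, here is a minimal sketch for small systems. It assumes the control law u = -K y with y = C x, an infinite-horizon quadratic cost with weights `Q` and `R`, and an initial-state covariance `Sigma0`; these conventions and the helper names (`dlyap`, `sof_cost_and_grad`) are illustrative assumptions rather than the paper's notation.

```python
import numpy as np


def dlyap(A, Q):
    """Solve X = A X A^T + Q by vectorization (adequate for small systems)."""
    n = A.shape[0]
    vec = np.linalg.solve(np.eye(n * n) - np.kron(A, A), Q.reshape(-1))
    return vec.reshape(n, n)


def sof_cost_and_grad(K, A, B, C, Q, R, Sigma0):
    """Cost J(K) = tr(P_K Sigma0) and its gradient for u = -K C x."""
    A_K = A - B @ K @ C  # closed-loop dynamics
    if np.max(np.abs(np.linalg.eigvals(A_K))) >= 1.0:
        raise ValueError("K is not stabilizing; the cost is infinite")
    # Value matrix: P = A_K^T P A_K + Q + C^T K^T R K C
    P = dlyap(A_K.T, Q + C.T @ K.T @ R @ K @ C)
    # Accumulated state covariance: Sigma = A_K Sigma A_K^T + Sigma0
    Sigma = dlyap(A_K, Sigma0)
    cost = np.trace(P @ Sigma0)
    grad = 2.0 * ((R + B.T @ P @ B) @ K @ C - B.T @ P @ A) @ Sigma @ C.T
    return cost, grad
```

Under these assumptions, one vanilla policy gradient step is `K = K - eta * grad` for a small step size `eta`, provided the updated gain remains stabilizing.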
Cooling and Crack Suppression of Bone Material Drilling Based on Microtextured Bit Modeled on Dung Beetle
In recent years, the number of patients with orthopedic diseases such as cervical spondylosis has increased, raising the demand for orthopedic surgery. However, thermal necrosis and bone cracks caused by surgery severely restrict the development of orthopedic surgery. High drilling temperatures during bone surgery lead to cell death, and cracks that form easily during drilling cause secondary damage that impairs recovery. To address these problems, this paper designs a bionic drill bit based on the microstructure of the dung beetle's head and back. The microstructure configuration parameters were optimized by numerical analysis, and the bionic bits were prepared with an optical fiber laser marking machine. Through drilling tests, mathematical models of drilling temperature and crack generation based on the microstructure characteristic parameters were established using infrared thermal imaging and acoustic emission signal technology, and the cooling mechanism and crack suppression strategy were studied. The experimental results show that at a speed of 60 m/min, the cooling effects of bionic bits T1 and T2 are 15.31% and 19.78%, respectively, and both bits show an obvious crack suppression effect. This research provides a new idea for precise and efficient machining of bone materials, and the results will help improve design and manufacturing technology and the level of theoretical research in the field of bone drilling tools.
Lattice distortion inducing exciton splitting and coherent quantum beating in CsPbI3 perovskite quantum dots
Anisotropic exchange-splitting in semiconductor quantum dots (QDs) results in
bright-exciton fine-structure-splitting (FSS) important for quantum information
processing. Direct measurement of FSS usually requires single/few QDs at
liquid-helium temperatures, because of its sensitivity to QD size and shape,
whereas measuring and controlling FSS at the ensemble level seems to be
impossible unless all the dots are made to be nearly the same. Here we report
strong bright-exciton FSS up to 1.6 meV in solution-processed CsPbI3 perovskite
QDs, manifested as quantum beats in ensemble-level transient absorption at
liquid-nitrogen to room temperatures. The splitting is robust to QD size and
shape heterogeneity, and increases with decreasing temperature, pointing
towards a mechanism associated with orthorhombic distortion of perovskite
lattice. Effective-mass-approximation calculations reveal an intrinsic
"fine-structure gap" that agrees well with the observed FSS. This gap stems
from an avoided crossing of bright-excitons confined in
orthorhombically-distorted QDs that are bounded by the pseudocubic {100} family
of planes.
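As a back-of-the-envelope check on the time scale involved, the relation below converts the largest reported splitting into a beat period. It assumes the standard picture in which two coherently excited levels separated by an energy $\Delta_{\mathrm{FSS}}$ beat at frequency $\Delta_{\mathrm{FSS}}/h$; the symbols $T_{\mathrm{beat}}$ and $\Delta_{\mathrm{FSS}}$ are introduced here for illustration and do not appear in the abstract.

```latex
T_{\mathrm{beat}} = \frac{h}{\Delta_{\mathrm{FSS}}}
  = \frac{4.136 \times 10^{-15}\ \mathrm{eV\,s}}{1.6 \times 10^{-3}\ \mathrm{eV}}
  \approx 2.6 \times 10^{-12}\ \mathrm{s} = 2.6\ \mathrm{ps}
```

A picosecond-scale period of this kind is what would make the splitting visible as oscillations in transient absorption, under the stated assumption.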
Large-scale single-photon imaging
Benefiting from their single-photon sensitivity, single-photon avalanche diode
(SPAD) arrays have been widely applied in various fields such as fluorescence
lifetime imaging and quantum computing. However, large-scale high-fidelity
single-photon imaging remains a major challenge because of the complex hardware
manufacturing process and heavy noise of SPAD arrays. In this work, we
introduce deep learning into SPAD, enabling super-resolution single-photon
imaging over an order of magnitude, with significant enhancement of bit depth
and imaging quality. We first studied the complex photon flow model of SPAD
electronics to accurately characterize multiple physical noise sources, and
collected a real SPAD image dataset (64 × 32 pixels, 90 scenes, 10
different bit depths, 3 different illumination fluxes, 2790 images in total) to
calibrate noise model parameters. With this real-world physical noise model, we
synthesized, for the first time, a large-scale realistic single-photon image
dataset (image pairs of 5 different resolutions with maximum megapixels, 17250
scenes, 10 different bit depths, 3 different illumination fluxes, 2.6 million
images in total) for subsequent network training. To tackle the severe
super-resolution challenge of SPAD inputs with low bit depth, low resolution,
and heavy noise, we further built a deep transformer network with a
content-adaptive self-attention mechanism and gated fusion modules, which can
capture global contextual features to remove multi-source noise and extract
full-frequency details. We applied the technique to a series of experiments
including macroscopic and microscopic imaging, microfluidic inspection, and
Fourier ptychography. The experiments validate the technique's state-of-the-art
super-resolution SPAD imaging performance, with a PSNR more than 5 dB higher
than that of existing methods.
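Since the comparison above is stated in PSNR, a short sketch of how that metric is usually computed may help. The function name `psnr` and the default peak value of 1.0 (images scaled to [0, 1]) are assumptions; the abstract does not specify the evaluation settings.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a 5 dB gain corresponds to reducing the MSE by a factor of about 3.2 (10^0.5).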