149 research outputs found
Imagine, Initialize, and Explore: An Effective Exploration Method in Multi-Agent Reinforcement Learning
Effective exploration is crucial to discovering optimal strategies for
multi-agent reinforcement learning (MARL) in complex coordination tasks.
Existing methods mainly utilize intrinsic rewards to enable committed
exploration or use role-based learning for decomposing joint action spaces
instead of directly conducting a collective search in the entire
action-observation space. However, they often face challenges obtaining
specific joint action sequences to reach successful states in long-horizon
tasks. To address this limitation, we propose Imagine, Initialize, and Explore
(IIE), a novel method that offers a promising solution for efficient
multi-agent exploration in complex scenarios. IIE employs a transformer model
to imagine how the agents reach a critical state that can influence each
other's transition functions. Then, we initialize the environment at this state
using a simulator before the exploration phase. We formulate the imagination as
a sequence modeling problem, where the states, observations, prompts, actions,
and rewards are predicted autoregressively. The prompt consists of
timestep-to-go, return-to-go, influence value, and one-shot demonstration,
specifying the desired state and trajectory as well as guiding the action
generation. By initializing agents at the critical states, IIE significantly
increases the likelihood of discovering potentially important under-explored
regions. Despite its simplicity, empirical results demonstrate that our method
outperforms multi-agent exploration baselines on the StarCraft Multi-Agent
Challenge (SMAC) and SMACv2 environments. Particularly, IIE shows improved
performance in the sparse-reward SMAC tasks and produces more effective
curricula over the initialized states than other generative methods, such as
CVAE-GAN and diffusion models.
Comment: The 38th Annual AAAI Conference on Artificial Intelligence
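The prompt described in the abstract above includes a timestep-to-go and a return-to-go for each step. A minimal sketch of how such targets could be computed from one episode's rewards follows. The function name `prompt_targets`, the undiscounted sum, and the convention that timestep-to-go counts the current step are assumptions made for illustration; the abstract does not specify them, and the influence value and one-shot demonstration are not modeled here.

```python
from typing import List, Tuple


def prompt_targets(rewards: List[float]) -> List[Tuple[int, float]]:
    """Return (timestep_to_go, return_to_go) for every step of one episode.

    Assumptions (not taken from the abstract): returns are undiscounted,
    and timestep_to_go counts the steps remaining including the current one.
    """
    horizon = len(rewards)
    targets: List[Tuple[int, float]] = []
    running = 0.0
    # Walk backwards so each return-to-go is a running suffix sum.
    for t in reversed(range(horizon)):
        running += rewards[t]
        targets.append((horizon - t, running))
    targets.reverse()
    return targets


if __name__ == "__main__":
    # Sparse-reward episode: the only reward arrives at the final step.
    for step, (ttg, rtg) in enumerate(prompt_targets([0.0, 0.0, 1.0])):
        print(f"step={step} timestep_to_go={ttg} return_to_go={rtg}")
```

In a sparse-reward episode such as the one in the usage example, every step carries the same return-to-go, so the timestep-to-go is the only part of these two targets that changes along the trajectory.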
Greedy-based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
Due to the representation limitation of the joint Q value function,
multi-agent reinforcement learning methods with linear value decomposition
(LVD) or monotonic value decomposition (MVD) suffer from relative
overgeneralization. As a result, they cannot ensure optimal consistency (i.e.,
the correspondence between individual greedy actions and the maximal true Q
value). In this paper, we derive the expression of the joint Q value function
of LVD and MVD. According to the expression, we draw a transition diagram,
where each self-transition node (STN) is a possible convergence. To ensure
optimal consistency, the optimal node is required to be the unique STN.
Therefore, we propose the greedy-based value representation (GVR), which turns
the optimal node into an STN via inferior target shaping and further eliminates
the non-optimal STNs via superior experience replay. In addition, GVR achieves
an adaptive trade-off between optimality and stability. Our method outperforms
state-of-the-art baselines in experiments on various benchmarks. Theoretical
proofs and empirical results on matrix games demonstrate that GVR ensures
optimal consistency under sufficient exploration.
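Relative overgeneralization under linear value decomposition can be illustrated on a small two-agent matrix game. The sketch below is not GVR. It only fits an additive model q1(a1) + q2(a2) to a payoff matrix by least squares and shows that the individual greedy actions miss the optimal joint action. The payoff values, the uniform weighting over joint actions, and the function names are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical 3x3 cooperative payoff matrix: the optimum (0, 0) is
# surrounded by heavily penalized miscoordination outcomes.
PAYOFF: List[List[float]] = [
    [8.0, -12.0, -12.0],
    [-12.0, 0.0, 0.0],
    [-12.0, 0.0, 0.0],
]


def lvd_fit(payoff: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Least-squares fit of payoff[i][j] ~ q1[i] + q2[j].

    With every joint action weighted equally, the closed-form solution is
    row mean + column mean - grand mean; the grand mean is split evenly
    between the two agents here (any split gives the same joint fit).
    """
    n, m = len(payoff), len(payoff[0])
    grand = sum(sum(row) for row in payoff) / (n * m)
    q1 = [sum(row) / m - grand / 2 for row in payoff]
    q2 = [sum(payoff[i][j] for i in range(n)) / n - grand / 2 for j in range(m)]
    return q1, q2


def greedy(q: List[float]) -> int:
    """Index of the largest individual utility (first index on ties)."""
    return max(range(len(q)), key=q.__getitem__)


if __name__ == "__main__":
    q1, q2 = lvd_fit(PAYOFF)
    a1, a2 = greedy(q1), greedy(q2)
    print("greedy joint action:", (a1, a2), "payoff:", PAYOFF[a1][a2])
    print("optimal payoff:", max(max(row) for row in PAYOFF))
```

Because the row and column containing the optimum have the lowest averages, the fitted individual utilities prefer the safe actions, which is the inconsistency between greedy actions and the maximal true Q value that the abstract describes.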
Multiuser Resource Allocation for Semantic-Relay-Aided Text Transmissions
Semantic communication (SemCom) is an emerging technology that extracts
useful meaning from data and sends only relevant semantic information. Thus, it
has great potential to improve the spectrum efficiency of conventional
wireless systems with bit transmissions, especially in low signal-to-noise
ratio (SNR) and small bandwidth regions. However, the existing works have
mostly overlooked the constraints of mobile devices, which may not have
sufficient capabilities to implement resource-demanding semantic
encoder/decoder based on deep learning. To address this issue, we propose in
this paper a new semantic relay (SemRelay), which is equipped with a semantic
receiver to assist multiuser text transmissions. Specifically, the SemRelay
decodes semantic information from a base station and forwards it to the users
using conventional bit transmission, hence effectively improving text
transmission efficiency. To study the multiuser resource allocation, we
formulate an optimization problem to maximize the multiuser weighted sum-rate
by jointly designing the SemRelay transmit power allocation and system
bandwidth allocation. Although this problem is non-convex and hence challenging
to solve, we propose an efficient algorithm to obtain its high-quality
suboptimal solution by using the block coordinate descent method. Last,
numerical results show the effectiveness of the proposed algorithm as well as
the superior performance of the proposed SemRelay over the conventional
decode-and-forward (DF) relay, especially in the small-bandwidth region.
Comment: 6 pages, 3 figures, accepted for IEEE Global Communication Conference
(GLOBECOM) 2023 Workshop on Semantic Communication for 6
Three-dimensional canine displacement patterns in response to translation and controlled tipping retraction strategies
OBJECTIVE: To validate whether applying a well-defined initial three-dimensional (3D) load can create consistently expected tooth movement in patients.
MATERIALS AND METHODS: Twenty-one patients who needed bilateral canine retraction to close extraction space were selected for this split-mouth clinical trial. After initial alignment and leveling, two canines in each patient were randomly assigned to receive either translation (TR) or controlled tipping (CT) load. The load was delivered by segmental T-loops designed to give specific initial moment/force ratios to the canines in each treatment interval (TI), verified with an orthodontic force tester. Maxillary dental casts were made before canine retraction and after each TI. The casts were digitized with a 3D laser scanner. The digital models were superimposed on the palatal rugae region. The 3D canine displacements and the displacement patterns in terms of TR, CT, and torque were calculated for each TI.
RESULTS: The method can reliably detect a TR displacement greater than 0.3 mm and a rotation greater than 1.5°. Ninety-two TIs had displacements that were greater than 0.3 mm and were used for further analysis. Most displacements were oriented within ±45° from the distal direction. The displacement pattern in terms of TR or CT was not uniquely controlled by the initial moment/force ratio.
CONCLUSIONS: The initial load system is not the only key factor controlling tooth movement. Even when a segmental T-loop with a well-controlled load system is used, large variations in canine displacement can be expected clinically.
Hounsfield unit change in root and alveolar bone during canine retraction
INTRODUCTION: The objective of this study was to determine the Hounsfield unit (HU) changes in the alveolar bone and root surfaces during controlled canine retractions.
METHODS: Eighteen maxillary canine retraction patients were selected for this split-mouth design clinical trial. The canines in each patient were randomly assigned to receive either translation or controlled tipping treatment. Pretreatment and posttreatment cone-beam computed tomography scans of each patient were used to determine tooth movement direction and HU changes. The alveolar bone and the root surface were each divided into 108 divisions, and the HUs in each division were measured. Mixed-model analysis of variance was applied to test the HU change distribution at the P < 0.05 significance level.
RESULTS: The HU changes varied with the directions relative to the canine movement. The HU reductions occurred at the root surfaces. Larger reductions occurred in the divisions that were perpendicular to the moving direction. However, HUs decreased in the alveolar bone in the moving direction. The highest HU reduction was at the coronal level.
CONCLUSIONS: HU reduction occurs on the root surface in the direction perpendicular to tooth movement and in the alveolar bone in the direction of tooth movement when a canine is retracted.
Quantum Image Processing and Its Application to Edge Detection: Theory and Experiment
Processing of digital images is continuously gaining in volume and relevance,
with concomitant demands on data storage, transmission and processing power.
Encoding the image information in quantum-mechanical systems instead of
classical ones and replacing classical with quantum information processing may
alleviate some of these challenges. By encoding and processing the image
information in quantum-mechanical systems, we here demonstrate the framework of
quantum image processing, where a pure quantum state encodes the image
information: we encode the pixel values in the probability amplitudes and the
pixel positions in the computational basis states. Our quantum image
representation reduces the required number of qubits compared to existing
implementations, and we present image processing algorithms that provide
exponential speed-up over their classical counterparts. For the commonly used
task of detecting the edge of an image, we propose and implement a quantum
algorithm that completes the task with only one single-qubit operation,
independent of the size of the image. This demonstrates the potential of
quantum image processing for highly efficient image and video processing in the
big data era.
Comment: 13 pages, including 9 figures and 5 appendixes
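The encoding and the edge-detection step described in the abstract above can be mimicked classically on a tiny example. The sketch below assumes that the single-qubit operation is a Hadamard gate applied to the least significant position qubit; the abstract does not name the gate, so this choice, the function names, and the 8-pixel test row are illustrative assumptions. It simulates the state vector directly and therefore shows only the arithmetic, not any quantum speed-up.

```python
import math
from typing import List


def encode(pixels: List[float]) -> List[float]:
    """Amplitude-encode pixel values: normalize so squared amplitudes sum to 1."""
    if len(pixels) == 0 or len(pixels) & (len(pixels) - 1):
        raise ValueError("pixel count must be a power of two")
    norm = math.sqrt(sum(p * p for p in pixels))
    if norm == 0:
        raise ValueError("image must contain a non-zero pixel")
    return [p / norm for p in pixels]


def hadamard_on_last_qubit(amps: List[float]) -> List[float]:
    """Apply H to the least significant qubit of a real state vector.

    Each neighbouring pair (a, b) at indices (2k, 2k+1) becomes
    ((a + b) / sqrt(2), (a - b) / sqrt(2)).
    """
    s = 1.0 / math.sqrt(2.0)
    out: List[float] = []
    for k in range(0, len(amps), 2):
        a, b = amps[k], amps[k + 1]
        out.extend([s * (a + b), s * (a - b)])
    return out


def edge_pairs(pixels: List[float], tol: float = 1e-9) -> List[int]:
    """Indices k for which pixels 2k and 2k+1 differ (an edge inside the pair)."""
    amps = hadamard_on_last_qubit(encode(pixels))
    return [k // 2 for k in range(1, len(amps), 2) if abs(amps[k]) > tol]


if __name__ == "__main__":
    row = [1, 1, 1, 5, 5, 5, 5, 5]  # one intensity jump, between pixels 2 and 3
    print(edge_pairs(row))
```

In this sketch only boundaries inside the pairs (2k, 2k+1) are visible; a boundary between pixels 2k+1 and 2k+2 would need a second pass on a shifted copy of the image or some equivalent extension.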
Does temporary transfer to preoperative hemodialysis influence postoperative outcomes in patients on peritoneal dialysis? A retrospective cohort study
BACKGROUND: The associations between preoperative transfer to hemodialysis (HD) and postoperative outcomes in patients on chronic peritoneal dialysis (PD) remain unknown. We conducted this retrospective cohort study to investigate whether preoperative HD influences surgical outcomes in PD patients undergoing major surgery.
METHODS: All chronic PD patients who underwent major surgery from January 1, 2007, to December 31, 2020, at Peking University First Hospital were screened. Major surgery was defined as a surgical procedure under general, lumbar, or epidural anesthesia with more than an overnight hospital stay. Patients under the age of 18, patients with a dialysis duration of less than 3 months, and patients who underwent renal implantation surgery or procedures exclusively aimed at placing or removing PD catheters were excluded. The included patients were divided into an HD group and a PD group based on their preoperative dialysis status.
RESULTS: Of 105 PD patients enrolled, 65 continued PD and 40 switched to HD preoperatively. Patients with preoperative HD were significantly more likely to develop postoperative hyperkalemia. The total complication rate was numerically higher in patients undergoing preoperative HD. After adjustment, the incidence of postoperative hyperkalemia and the rates of other postoperative complications were similar between groups. There was no difference in long-term survival between the two groups.
CONCLUSIONS: It does not seem indispensable for PD patients to switch to temporary HD before major surgery.