Scalable and Sample Efficient Distributed Policy Gradient Algorithms in Multi-Agent Networked Systems
This paper studies a class of multi-agent reinforcement learning (MARL)
problems where the reward that an agent receives depends on the states of other
agents, but the next state only depends on the agent's own current state and
action. We name it REC-MARL, which stands for REward-Coupled Multi-Agent
Reinforcement Learning. REC-MARL has a range of important applications such as
real-time access control and distributed power control in wireless networks.
This paper presents a distributed and optimal policy gradient algorithm for
REC-MARL. The proposed algorithm is distributed in two aspects: (i) the learned
policy is a distributed policy that maps a local state of an agent to its local
action and (ii) the learning/training is distributed, during which each agent
updates its policy based on its own and neighbors' information. The learned
policy is provably optimal among all local policies and its regret bounds
depend on the dimension of local states and actions. This distinguishes our
result from most existing results on MARL, which often obtain stationary-point
policies. The experimental results of our algorithm for the real-time access
control and power control in wireless networks show that our policy
significantly outperforms the state-of-the-art algorithms and well-known
benchmarks.
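The problem structure described in this abstract (local transitions, coupled rewards) can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's algorithm or environment: the power-level state, the `{-1, 0, +1}` actions, the rate-style reward, and the names `local_transition`, `coupled_reward`, `step`, and `MAX_POWER` are all hypothetical.

```python
from math import log2

MAX_POWER = 3  # hypothetical highest discrete power level

def local_transition(power, action):
    """Next local state: depends only on the agent's own state and action."""
    return max(0, min(MAX_POWER, power + action))  # action in {-1, 0, +1}

def coupled_reward(i, powers, neighbors, noise=1.0):
    """Reward of agent i: its own rate, reduced by its neighbors' power states."""
    interference = sum(powers[j] for j in neighbors[i])
    return log2(1 + powers[i] / (noise + interference))

def step(powers, actions, neighbors):
    """One joint step: coupled rewards on current states, then local transitions."""
    rewards = [coupled_reward(i, powers, neighbors) for i in range(len(powers))]
    next_powers = [local_transition(p, a) for p, a in zip(powers, actions)]
    return next_powers, rewards

neighbors = {0: [1], 1: [0]}  # two mutually interfering agents (assumed topology)
next_powers, rewards = step([1, 1], [+1, 0], neighbors)
```

In this toy, each agent needs only its own state and action to compute its next state, while computing its reward requires its neighbors' states. This is the coupling the abstract refers to.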
Joint Beamforming Design for RIS-Assisted Integrated Sensing and Communication Systems
Integrated sensing and communication (ISAC) has been envisioned as a
promising technology to tackle the spectrum congestion problem for future
networks. In this correspondence, we investigate deploying a reconfigurable
intelligent surface (RIS) in an ISAC system for achieving better performance.
In particular, a multi-antenna base station (BS) simultaneously serves multiple
single-antenna users with the assistance of a RIS and detects potential
targets. The active beamforming of the BS and the passive beamforming of the
RIS are jointly optimized to maximize the achievable sum-rate of the
communication users while satisfying the constraint of beampattern similarity
for radar sensing, the restriction of the RIS, and the transmit power budget.
An efficient alternating algorithm based on the fractional programming (FP),
majorization-minimization (MM), and manifold optimization methods is developed
to convert the resulting non-convex optimization problem into two solvable
sub-problems and solve them iteratively. Simulation studies illustrate the
benefit of deploying a RIS in ISAC systems and the effectiveness of the
proposed algorithm.
Comment: Accepted by IEEE TV
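The objective named in this abstract, the achievable sum-rate under joint active and passive beamforming, can be evaluated for a given pair of designs as sketched below. The channel model is an assumption and is not stated in the abstract: a direct BS-to-user link plus a reflected link through unit-modulus RIS phase shifts, with the inner product taken without conjugation. The function names are hypothetical. The sketch omits the radar beampattern constraint and the power budget, and it does not implement the FP/MM/manifold optimization.

```python
import cmath
from math import log2, pi

def effective_channel(h_direct, h_ris, G, phases):
    """Per-antenna gains h_direct + h_ris * diag(e^{j*phase}) * G for one user.

    h_direct: M complex gains (BS antenna -> user)
    h_ris:    N complex gains (RIS element -> user)
    G:        N x M complex gains (BS antenna -> RIS element)
    phases:   N RIS phase shifts in radians (unit-modulus reflection)
    """
    n_elements = len(phases)
    return [
        h_direct[m]
        + sum(h_ris[n] * cmath.exp(1j * phases[n]) * G[n][m] for n in range(n_elements))
        for m in range(len(h_direct))
    ]

def sum_rate(channels, beams, noise_power):
    """Sum over users of log2(1 + SINR_k), in bit/s/Hz.

    channels[k] is user k's effective channel; beams[k] is its BS beamformer.
    """
    total = 0.0
    for k, h in enumerate(channels):
        gains = [abs(sum(hm * wm for hm, wm in zip(h, w))) ** 2 for w in beams]
        interference = sum(gains) - gains[k]  # power leaked from other users' beams
        total += log2(1 + gains[k] / (interference + noise_power))
    return total

# One user, one antenna, one RIS element: an aligned phase adds the two paths.
aligned = effective_channel([1], [1], [[1]], [0.0])
opposed = effective_channel([1], [1], [[1]], [pi])
rate_aligned = sum_rate([aligned], [[1]], noise_power=1.0)
rate_opposed = sum_rate([opposed], [[1]], noise_power=1.0)
```

In this single-link case the aligned phase doubles the effective channel amplitude, while the opposed phase nearly cancels it. This is why the RIS phases are worth optimizing jointly with the BS beamformers.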
Anticancer activity of a thymidine quinoxaline conjugate is modulated by cytosolic thymidine pathways
Background: High levels of thymidine kinase 1 (TK1) and thymidine phosphorylase (TYMP) are key molecular targets of thymidine therapeutics in cancer treatment. The dual roles of TYMP as a tumor growth factor and a key activation enzyme of anticancer metabolites have resulted in mixed outcomes in cancer patients. In this study, we investigated the roles of TK1 and TYMP on a thymidine quinoxaline conjugate (dT-QX) to evaluate an alternative that circumvents the contradictory role of TYMP. Methods: TK1 and TYMP levels in multiple liver cell lines were assessed along with the cytotoxicity of the thymidine conjugate. Cellular accumulation of the thymidine conjugate was determined with organelle-specific dyes. The impacts of TK1 and TYMP were evaluated with siRNA/shRNA suppression and pseudoviral overexpression. Immunohistochemical analysis was performed on both normal and tumor tissues. An in vivo study was carried out with a subcutaneous liver tumor model. Results: We found that the thymidine conjugate had varied activities in liver cancer cells with different levels of TK1 and TYMP. The conjugate accumulated mainly in the endoplasmic reticulum, consistent with cytosolic pathways. TK1 was responsible for the cytotoxicity, yet high levels of TYMP counteracted such activities. Levels of TYMP and TK1 in the liver tumor tissues were significantly higher than those in normal liver tissues. Induced TK1 overexpression decreased the selectivity of dT-QX due to the concurrent cytotoxicity in normal cells. In contrast, shRNA suppression of TYMP significantly enhanced the selectivity of the conjugate in vitro and reduced tumor growth in vivo. Conclusions: TK1 was responsible for the anticancer activity of dT-QX, while high levels of TYMP counteracted this activity. The counteraction by TYMP could be overcome with RNA silencing to significantly enhance dT-QX selectivity in cancer cells.
STUDY ON EARTHQUAKE DESTRUCTION MODE OF THE LARGEST CANAL CROSSING HIGHWAY BRIDGE BASED ON IEM BOUNDARY IN SOUTH-TO-NORTH WATER DIVERSION
To study the dynamic failure mechanism and damage development of a highway bridge structure under boundary effects over the duration of seismic loading, the Wenchang Highway Bridge, the largest canal-crossing highway bridge in the South-to-North Water Diversion, is taken as an example for seismic design analysis. Based on coupled finite element and infinite element theory, an infinite element method (IEM) boundary and a concrete damage plasticity model are introduced, and a half-space free-field model is established to study the energy dispersion of waves at the boundary and to verify the absorption effect of the IEM boundary on wave energy. Seismic response analysis of the bridge structure was carried out under different peak acceleration intensities. The results show that, under the selected artificial waves, damage to the bridge is concentrated mainly at the junction of the box girder supported by the pier, the bottom of the pier, and the junction of the pier and beam. The damage tends to develop downward near the bottom of the box girder. The damage at both ends of the beam extends from the ends toward the middle, and the bottom and top of the pier show penetrating damage. These are weak points in seismic design. At a horizontal peak acceleration of 0.6g, damage occurred at the bottom of the box girder in addition to the pier column. Therefore, when the horizontal peak acceleration of the seismic wave is greater than 0.6g, attention should be paid to failure of the bottom of the box girder. Moreover, the IEM boundary controls the far-field energy dissipation of the wave well and is simpler and more efficient than the viscous-spring boundary.
An Efficient Dynamic Multi-Sources To Single-Destination (DMS-SD) Algorithm In Smart City Navigation Using Adjacent Matrix
Dijkstra's algorithm is one of the most popular classic path planning
algorithms, achieving optimal solutions across a wide range of challenging
tasks. However, it only calculates the shortest distance from one vertex to
another, which is hard to directly apply to the Dynamic Multi-Sources to
Single-Destination (DMS-SD) problem. This paper proposes a modified Dijkstra
algorithm to address the DMS-SD problem, where the destination can be
dynamically changed. Our method deploys the concept of Adjacent Matrix from
Floyd's algorithm and achieves the goal with mathematical calculations. We
formally show that all-pairs shortest distance information in Floyd's algorithm
is not required in our algorithm. Extensive experiments verify the scalability
and optimality of the proposed method. Comment: International Conference On Human-Centered Cognitive Systems (HCCS)
202
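For orientation, a common baseline for the multi-source, single-destination setting is to run Dijkstra's algorithm once from the destination over reversed edges, reading the graph from an adjacency matrix. The sketch below shows that standard baseline only, not the modified algorithm proposed in the paper. The function name `distances_to_destination` and the example graph are hypothetical.

```python
import heapq
from math import inf

def distances_to_destination(adj, dest):
    """Shortest distance from every vertex to `dest`.

    adj[u][v] is the weight of the directed edge u -> v, or inf if absent.
    Weights are assumed non-negative, as Dijkstra's algorithm requires.
    """
    n = len(adj)
    dist = [inf] * n
    dist[dest] = 0
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        # Relax incoming edges u -> v, i.e. traverse the graph in reverse.
        for u in range(n):
            w = adj[u][v]
            if w != inf and d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))
    return dist

adj = [
    [0,   4,   inf, inf],
    [inf, 0,   1,   7],
    [inf, inf, 0,   2],
    [inf, inf, inf, 0],
]
sources = [0, 1]
to_3 = distances_to_destination(adj, dest=3)
print({s: to_3[s] for s in sources})  # {0: 7, 1: 3}

# If the destination changes, this baseline simply reruns the search.
to_2 = distances_to_destination(adj, dest=2)
```

One search from the destination serves every source at once, so no all-pairs table is needed. The cost of this baseline is that each destination change triggers a full recomputation.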
A collaborative and dynamic multi-source single-destination navigation algorithm for smart cities
- …