
    Spatial-Temporal-Aware Safe Multi-Agent Reinforcement Learning of Connected Autonomous Vehicles in Challenging Scenarios

    Communication technologies enable coordination among connected and autonomous vehicles (CAVs). However, it remains unclear how to utilize shared information to improve the safety and efficiency of the CAV system. In this work, we propose a framework of constrained multi-agent reinforcement learning (MARL) with a parallel safety shield for CAVs in challenging driving scenarios. The coordination mechanisms of the proposed MARL include information sharing and cooperative policy learning, with a Graph Convolutional Network (GCN)-Transformer as a spatial-temporal encoder that enhances each agent's environment awareness. The safety shield module, with Control Barrier Function (CBF)-based safety checking, protects the agents from taking unsafe actions. We design a constrained multi-agent advantage actor-critic (CMAA2C) algorithm to train safe and cooperative policies for CAVs. With experiments deployed in the CARLA simulator, we verify the effectiveness of the safety checking, spatial-temporal encoder, and coordination mechanisms designed in our method through comparative experiments in several challenging scenarios with defined hazard vehicles (HAZV). Results show that our proposed methodology significantly increases system safety and efficiency in challenging scenarios. Comment: This paper has been accepted by the 2023 IEEE International Conference on Robotics and Automation (ICRA 2023). 6 pages, 5 figure

    Shared Information-Based Safe And Efficient Behavior Planning For Connected Autonomous Vehicles

    Recent advancements in wireless technology enable connected autonomous vehicles (CAVs) to gather data via vehicle-to-vehicle (V2V) communication, such as processed LIDAR and camera data from other vehicles. In this work, we design an integrated information-sharing and safe multi-agent reinforcement learning (MARL) framework for CAVs that takes advantage of the extra information when making decisions to improve traffic efficiency and safety. We first use weight-pruned convolutional neural networks (CNNs) to process the raw image and point cloud LIDAR data locally at each autonomous vehicle, and share the CNN output with neighboring CAVs. We then design a safe actor-critic algorithm that utilizes both a vehicle's local observation and the information received via V2V communication to explore an efficient behavior planning policy with safety guarantees. Using the CARLA simulator for experiments, we show that our approach improves the CAV system's efficiency in terms of average velocity and comfort under different CAV ratios and different traffic densities. We also show that our approach avoids the execution of unsafe actions and always maintains a safe distance from other vehicles. We construct an obstacle-at-corner scenario to show that the shared vision can help CAVs observe obstacles earlier and take action to avoid traffic jams. Comment: This paper received the Best Paper Award in the DCAA workshop of AAAI 202

    Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications

    Reward design is a key component of deep reinforcement learning (DRL), yet some tasks and designers' objectives may be unnatural to define as a scalar cost function. Among the various techniques, formal methods integrated with DRL have garnered considerable attention due to their expressiveness and flexibility in defining the reward and requirements for different states and actions of the agent. However, how to leverage Signal Temporal Logic (STL) to guide multi-agent reinforcement learning (MARL) reward design remains unexplored. Complex interactions, heterogeneous goals, and critical safety requirements in multi-agent systems make this problem even more challenging. In this paper, we propose a novel STL-guided multi-agent reinforcement learning framework. The STL requirements are designed to include both task specifications according to the objective of each agent and safety specifications, and the robustness values of the STL specifications are leveraged to generate rewards. We validate the advantages of our method through empirical studies. The experimental results demonstrate significant reward performance improvements compared to MARL without STL guidance, along with a remarkable increase in the overall safety rate of the multi-agent systems.
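    To illustrate how STL robustness values can serve as rewards, here is a minimal sketch using the standard quantitative semantics of STL: "always" takes the minimum margin over a trace, "eventually" takes the maximum, and a conjunction takes the minimum of its parts. The specific formulas, the signal names (`gap`, `dist_to_goal`), and the thresholds are illustrative assumptions, not the specifications used in the paper.

```python
from typing import Sequence


def always_greater(signal: Sequence[float], threshold: float) -> float:
    """Robustness of G(signal > threshold): worst-case margin over the trace."""
    return min(x - threshold for x in signal)


def eventually_less(signal: Sequence[float], threshold: float) -> float:
    """Robustness of F(signal < threshold): best-case margin over the trace."""
    return max(threshold - x for x in signal)


def stl_reward(
    gap: Sequence[float],
    dist_to_goal: Sequence[float],
    d_safe: float = 2.0,       # hypothetical safety distance
    goal_radius: float = 0.5,  # hypothetical goal tolerance
) -> float:
    """Robustness of (G gap > d_safe) AND (F dist_to_goal < goal_radius).

    A positive value means the trace satisfies both the safety and the
    task specification; a negative value means at least one is violated.
    """
    safety = always_greater(gap, d_safe)
    task = eventually_less(dist_to_goal, goal_radius)
    return min(safety, task)  # conjunction = minimum of the robustness values


if __name__ == "__main__":
    safe_trace = stl_reward(gap=[5.0, 3.0, 4.0], dist_to_goal=[10.0, 4.0, 0.2])
    unsafe_trace = stl_reward(gap=[5.0, 1.0, 4.0], dist_to_goal=[10.0, 4.0, 0.2])
    print(safe_trace, unsafe_trace)
```

    Because the robustness value is signed and graded, it tells the learner not only whether a specification was met but by how much, which is what makes it usable as a reward signal.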

    SIRT1 Activation by Resveratrol Alleviates Cardiac Dysfunction via Mitochondrial Regulation in Diabetic Cardiomyopathy Mice

    Background. Diabetic cardiomyopathy (DCM) is a major threat to diabetic patients. Silent information regulator 1 (SIRT1) has a regulatory effect on mitochondrial dynamics, which is associated with DCM pathological changes. Our study aims to investigate whether resveratrol, a SIRT1 activator, could exert a protective effect against DCM. Methods and Results. Cardiac-specific SIRT1 knockout (SIRT1KO) mice were generated using the Cre-loxP system. SIRT1KO mice displayed symptoms of DCM, including cardiac hypertrophy and dysfunction, insulin resistance, and abnormal glucose metabolism. DCM and SIRT1KO hearts showed impaired mitochondrial biogenesis and function, while SIRT1 activation by resveratrol reversed this in DCM mice. High glucose caused increased apoptosis and impaired mitochondrial biogenesis and function in cardiomyocytes, which was alleviated by resveratrol. SIRT1 deletion by both SIRT1KO and shRNA abolished the beneficial effects of resveratrol. Furthermore, the function of SIRT1 is mediated via its deacetylation effect on peroxisome proliferator-activated receptor gamma coactivator 1-alpha (PGC-1α), thus inducing increased expression of nuclear respiratory factor 1 (NRF-1), NRF-2, estrogen-related receptor-α (ERR-α), and mitochondrial transcription factor A (TFAM). Conclusions. Cardiac deletion of SIRT1 caused phenotypes resembling DCM. Activation of SIRT1 by resveratrol ameliorated cardiac injuries in DCM through PGC-1α-mediated mitochondrial regulation. Collectively, SIRT1 may serve as a potential therapeutic target for DCM.

    Searching for the nano-Hertz stochastic gravitational wave background with the Chinese Pulsar Timing Array Data Release I

    Observing and timing a group of millisecond pulsars (MSPs) with high rotational stability enables the direct detection of gravitational waves (GWs). The GW signals can be identified from the spatial correlations encoded in the times-of-arrival of widely spaced pulsar pairs. The Chinese Pulsar Timing Array (CPTA) is a collaboration aiming at direct GW detection with observations carried out using Chinese radio telescopes. This short article serves as a 'table of contents' for a forthcoming series of papers related to the CPTA Data Release 1 (CPTA DR1), which uses observations from the Five-hundred-meter Aperture Spherical radio Telescope (FAST). Here, after summarizing the time span and accuracy of CPTA DR1, we report the key results of our statistical inference, finding a correlated signal with amplitude $\log A_{\rm c} = -14.4\,^{+1.0}_{-2.8}$ for a spectral index in the range $\alpha \in [-1.8, 1.5]$, assuming a GW background (GWB) induced quadrupolar correlation. The search for the Hellings-Downs (HD) correlation curve is also presented: some evidence for the HD correlation has been found, with a $4.6\sigma$ statistical significance achieved using the discrete frequency method around the frequency of 14 nHz. We expect that the future International Pulsar Timing Array data analysis and the next CPTA data release will be more sensitive to the nHz GWB, which could verify the current results. Comment: 18 pages, 6 figures, submitted to "Research in astronomy and astrophysics" 22nd March 202
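    For readers unfamiliar with the Hellings-Downs curve mentioned above, the following sketch evaluates its standard textbook form for two distinct pulsars separated by an angle θ on the sky: Γ(θ) = (3/2) x ln x − x/4 + 1/2 with x = (1 − cos θ)/2. The function name and the normalization (Γ → 1/2 as θ → 0) are choices made here for illustration; this is not the CPTA analysis code.

```python
import math


def hellings_downs(theta: float) -> float:
    """Expected timing-residual correlation for two distinct pulsars
    separated by angle `theta` (radians) under an isotropic GW background."""
    x = (1.0 - math.cos(theta)) / 2.0
    if x == 0.0:
        # Limit of x*ln(x) as x -> 0 is 0, so the curve tends to 1/2.
        return 0.5
    return 1.5 * x * math.log(x) - x / 4.0 + 0.5


if __name__ == "__main__":
    # Tabulate the curve every 30 degrees of angular separation.
    for deg in range(0, 181, 30):
        print(f"{deg:3d} deg  {hellings_downs(math.radians(deg)):+.4f}")
```

    The curve starts at 1/2, dips negative at intermediate separations, and recovers to 1/4 for pulsars on opposite sides of the sky; this quadrupolar signature is what a pulsar timing array looks for in the pairwise correlations.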

    Hominin occupation of the Chinese Loess Plateau since about 2.1 million years ago

    Considerable attention has been paid to dating the earliest appearance of hominins outside Africa. The earliest skeletal and artefactual evidence for the genus Homo in Asia currently comes from Dmanisi, Georgia, and is dated to approximately 1.77-1.85 million years ago (Ma)(1). Two incisors that may belong to Homo erectus come from Yuanmou, south China, and are dated to 1.7 Ma(2); the next-oldest evidence is an H. erectus cranium from Lantian (Gongwangling), which has recently been dated to 1.63 Ma(3), and the earliest hominin fossils from the Sangiran dome in Java, which are dated to about 1.5-1.6 Ma(4). Artefacts from Majuangou III(5) and Shangshazui(6) in the Nihewan basin, north China, have also been dated to 1.6-1.7 Ma. Here we report an Early Pleistocene and largely continuous artefact sequence from Shangchen, which is a newly discovered Palaeolithic locality of the southern Chinese Loess Plateau, near Gongwangling in Lantian county. The site contains 17 artefact layers that extend from palaeosol S15 (dated to approximately 1.26 Ma) to loess L28, which we date to about 2.12 Ma. This discovery implies that hominins left Africa earlier than indicated by the evidence from Dmanisi.