
    Causal Reinforcement Learning: A Survey

    Reinforcement learning is an essential paradigm for solving sequential decision problems under uncertainty. Despite many remarkable achievements in recent decades, applying reinforcement learning methods in the real world remains challenging. One of the main obstacles is that reinforcement learning agents lack a fundamental understanding of the world and must therefore learn from scratch through numerous trial-and-error interactions. They may also face challenges in providing explanations for their decisions and generalizing the acquired knowledge. Causality, however, offers a notable advantage as it can formalize knowledge in a systematic manner and leverage invariance for effective knowledge transfer. This has led to the emergence of causal reinforcement learning, a subfield of reinforcement learning that seeks to enhance existing algorithms by incorporating causal relationships into the learning process. In this survey, we comprehensively review the literature on causal reinforcement learning. We first introduce the basic concepts of causality and reinforcement learning, and then explain how causality can address core challenges in non-causal reinforcement learning. We categorize and systematically review existing causal reinforcement learning approaches based on their target problems and methodologies. Finally, we outline open issues and future directions in this emerging field. Comment: 48 pages, 10 figures

    Training-Time-Friendly Network for Real-Time Object Detection

    Modern object detectors can rarely achieve short training time, fast inference speed, and high accuracy at the same time. To strike a balance among them, we propose the Training-Time-Friendly Network (TTFNet). In this work, we start with light-head, single-stage, and anchor-free designs, which enable fast inference speed. Then, we focus on shortening training time. We notice that encoding more training samples from annotated boxes plays a similar role to increasing batch size, which helps enlarge the learning rate and accelerate the training process. To this end, we introduce a novel approach using Gaussian kernels to encode training samples. Besides, we design the initiative sample weights for better information utilization. Experiments on MS COCO show that our TTFNet has great advantages in balancing training time, inference speed, and accuracy. It reduces training time by more than seven times compared to previous real-time detectors while maintaining state-of-the-art performance. In addition, our super-fast versions of TTFNet-18 and TTFNet-53 can outperform SSD300 and YOLOv3, respectively, using less than one-tenth of their training time. The code has been made available at https://github.com/ZJULearning/ttfnet. Comment: Accepted to AAAI 2020 (8 pages, 3 figures)

    Damage Characteristics of Argillaceous Quartz Sandstone Mesostructure under Different Wetting-drying Conditions

    Extensive water–rock interaction in the Three Gorges Reservoir area (TGRA) of the Yangtze River leads to rock mass deterioration along the reservoir banks. However, the evolution of minerals and its effect on the mesostructure deterioration of rocks under wetting–drying cycles remain unknown. Therefore, wetting–drying cycle tests were conducted on peculiar argillaceous quartz sandstone from the TGRA under neutral (pH = 7) and alkaline (pH = 10) water environments. Here, we provide detailed physical and microscopy image data to determine how mineral behavior controls the evolution of the sandstone’s mesostructure. Under the neutral condition, repeated “absorption and swelling–dehydration and contraction” of clay minerals leads to a repeated physical action of “squeezing–unloading” in the interior of the rock. This results in the initiation and gradual expansion of cracks in the framework mineral quartz, so that failure progresses from the interior to the exterior. In contrast, under the alkaline condition, dissolution on the surface of quartz particles leads to the expansion and connection of pores, implying that failure of the sandstone progresses from the exterior to the interior. Moreover, the internal mechanical analysis indicates that the minerals are under high pressure in the neutral solution because of the expansion of clay minerals. In the alkaline water environment, however, increased porosity significantly reduces the extrusion pressure on the framework mineral quartz, which is therefore not easily broken. Thus, the evolution behavior of minerals in different water environments plays an important role in the damage of the rock.

    CD24 Expression as a Marker for Predicting Clinical Outcome in Human Gliomas

    CD24 is overexpressed in glioma cells in vitro and in vivo. However, the correlation of its expression with clinicopathological parameters of gliomas and its prognostic significance in this tumor remain largely unknown. To address this problem, 151 glioma specimens and 10 nonneoplastic brain tissues were collected. Quantitative real-time PCR, immunochemistry assay, and Western blot analysis were carried out to investigate the expression of CD24. The results showed that CD24 was overexpressed in gliomas. Its expression levels in glioma tissues with higher grade (P < 0.001) and lower KPS (P < 0.001) were significantly higher than in those with lower grade and higher KPS, respectively. Cox multifactor analysis showed that CD24 (P = 0.02) was an independent prognostic factor for human glioma. Our data provide convincing evidence for the first time that the overexpression of CD24 at gene and protein levels is correlated with advanced clinicopathological parameters and poor prognosis in patients with glioma.

    PI-RCNN: An Efficient Multi-sensor 3D Object Detector with Point-based Attentive Cont-conv Fusion Module

    LIDAR point clouds and RGB images are both essential for 3D object detection, so many state-of-the-art 3D detection algorithms are dedicated to fusing these two types of data effectively. However, their fusion methods based on Bird's Eye View (BEV) or voxel format are not accurate. In this paper, we propose a novel fusion approach named the Point-based Attentive Cont-conv Fusion (PACF) module, which fuses multi-sensor features directly on 3D points. In addition to continuous convolution, we add a Point-Pooling and an Attentive Aggregation operation to make the fused features more expressive. Moreover, based on the PACF module, we propose a 3D multi-sensor multi-task network called Pointcloud-Image RCNN (PI-RCNN for short), which handles the image segmentation and 3D object detection tasks. PI-RCNN employs a segmentation sub-network to extract full-resolution semantic feature maps from images and then fuses the multi-sensor features via the PACF module. Benefiting from the effectiveness of the PACF module and the expressive semantic features from the segmentation module, PI-RCNN substantially improves 3D object detection. We demonstrate the effectiveness of the PACF module and PI-RCNN on the KITTI 3D Detection benchmark, where our method achieves state-of-the-art results on the 3D AP metric. Comment: 8 pages, 5 figures