118 research outputs found

    3D Ground Truth Generation Using Pre-Trained Deep Neural Networks

    Training 3D object detectors on publicly available data has been limited to small datasets due to the large amount of effort required to generate annotations. The difficulty of labeling in 3D using 2.5D sensors, such as LIDAR, is attributed to the high spatial reasoning skills required to deal with occlusion and partial viewpoints. Additionally, the current methods to label 3D objects are cognitively demanding due to frequent task switching. Reducing both task complexity and the amount of task switching done by annotators is key to reducing the effort and time required to generate 3D bounding box annotations. We therefore seek to reduce the burden on the annotators by leveraging existing 3D object detectors using deep neural networks. This work introduces a novel ground truth generation method that combines human supervision with pre-trained neural networks to generate per-instance 3D point cloud segmentation, 3D bounding boxes, and class annotations. The annotators provide object anchor clicks which behave as a seed to generate instance segmentation results in 3D. The points belonging to each instance are then used to regress object centroids, bounding box dimensions, and object orientation. The deep neural network model used to generate the segmentation masks and bounding box parameters is based on the PointNet architecture. We develop our approach with reliance on the KITTI dataset to analyze the quality of the generated ground truth. The neural network model is trained on the KITTI training split and the 3D bounding box outputs are generated using annotation clicks collected from the validation split. The validation split of the KITTI detection dataset contains 3712 frames of point cloud and image scenes, and it took 16.35 hours to label with the proposed method. Based on these results, our approach is 19 times faster than the latest published 3D object annotation scheme. Additionally, it is found that the annotators spent less time per object as the number of objects in the scenes increases, making the method very efficient for multi-object labeling. Furthermore, the quality of the generated 3D bounding boxes, using the labeling method, is compared against the KITTI ground truth. It is shown that the model performs on par with the current state-of-the-art 3D detectors and the labeling procedure does not negatively impact the output quality of the bounding boxes. Lastly, the proposed scheme is applied to previously unseen data from the Autonomoose self-driving vehicle to demonstrate generalization capabilities of the network.
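As a rough check on the figures quoted in this abstract, the average labeling time per frame follows directly from the reported totals. The sketch below only restates that arithmetic; the function and constant names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract above.
FRAMES = 3712        # frames in the KITTI validation split
LABEL_HOURS = 16.35  # total annotation time reported for that split


def seconds_per_frame(hours: float, frames: int) -> float:
    """Average annotation time per frame, in seconds."""
    return hours * 3600.0 / frames


if __name__ == "__main__":
    print(f"{seconds_per_frame(LABEL_HOURS, FRAMES):.2f} s per frame")
```

With the reported numbers this works out to a little under 16 seconds per frame on average.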

    Joint 3D Proposal Generation and Object Detection from View Aggregation

    We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion on high resolution feature maps to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state-of-the-art results on the KITTI 3D object detection benchmark while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles. Code is at: https://github.com/kujason/avod Comment: For any inquiries contact aharakeh(at)uwaterloo(dot)c

    An Efficient Transmission Power Control Scheme for Temperature Variation in Wireless Sensor Networks

    Wireless sensor networks collect data from several nodes dispersed at remote sites. Sensor nodes can be installed in harsh environments such as deserts, cities, and indoors, where the link quality changes considerably over time. In particular, changes in transmission power may be caused by temperature, humidity, and other factors. To compensate for link quality changes, existing schemes detect the link quality changes between nodes and control transmission power through a series of feedback processes, but these approaches can cause heavy overhead because of the additional control packets needed. In this paper, the change of the link quality according to temperature is examined through empirical experimentation. A new power control scheme is proposed that combines temperature-aware link quality compensation with a closed-loop feedback process to adapt to link quality changes. We prove that the proposed scheme effectively adapts the transmission power to the changing link quality with less control overhead and energy consumption.
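The combination described in this abstract, an open-loop temperature compensation plus a closed-loop correction, can be sketched roughly as follows. This is not the paper's scheme: the class name, the linear temperature model, the assumption that link quality degrades as temperature rises, and every default value (reference temperature, levels per degree, target RSSI, deadband, level range) are hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TemperatureAwarePowerController:
    """Hypothetical controller; all defaults are illustrative assumptions."""

    base_level: int                  # power level tuned at the reference temperature
    ref_temp_c: float = 25.0         # assumed reference temperature
    levels_per_deg_c: float = 0.1    # assumed linear compensation slope
    target_rssi_dbm: float = -85.0   # assumed link-quality target
    deadband_db: float = 3.0         # feedback errors inside this band are ignored
    min_level: int = 0
    max_level: int = 31

    def _clamp(self, level: int) -> int:
        return max(self.min_level, min(self.max_level, level))

    def power_level(self, temp_c: float) -> int:
        """Open-loop step: shift the base level using the locally sensed temperature.

        No control packets are needed for this part.
        """
        offset = round(self.levels_per_deg_c * (temp_c - self.ref_temp_c))
        return self._clamp(self.base_level + offset)

    def on_feedback(self, rssi_dbm: float) -> None:
        """Closed-loop step: nudge the base level when the receiver reports
        an RSSI outside the deadband around the target."""
        error_db = self.target_rssi_dbm - rssi_dbm
        if error_db > self.deadband_db:
            self.base_level = self._clamp(self.base_level + 1)
        elif error_db < -self.deadband_db:
            self.base_level = self._clamp(self.base_level - 1)
```

In this sketch the temperature term handles predictable drift locally, so feedback packets are only needed to correct the residual error, which is one way the control overhead could be reduced.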

    DR.CPO: Diversified and Realistic 3D Augmentation via Iterative Construction, Random Placement, and HPR Occlusion

    In autonomous driving, data augmentation is commonly used for improving 3D object detection. The most basic methods include insertion of copied objects and rotation and scaling of the entire training frame. Numerous variants have been developed as well. The existing methods, however, are considerably limited when compared to the variety of real-world possibilities. In this work, we develop a diversified and realistic augmentation method that can flexibly construct a whole-body object, freely locate and rotate the object, and apply self-occlusion and external-occlusion accordingly. To improve the diversity of the whole-body object construction, we develop an iterative method that stochastically combines multiple objects observed from the real world into a single object. Unlike the existing augmentation methods, the constructed objects can be randomly located and rotated in the training frame because proper occlusions can be reflected to the whole-body objects in the final step. Finally, proper self-occlusion at each local object level and external-occlusion at the global frame level are applied using the computationally efficient Hidden Point Removal (HPR) algorithm. HPR is also used for adaptively controlling the point density of each object according to the object's distance from the LiDAR. Experimental results show that the proposed DR.CPO algorithm is data-efficient and model-agnostic without incurring any computational overhead. Also, DR.CPO can improve mAP performance by 2.08% when compared to the best 3D detection result known for the KITTI dataset. The code is available at https://github.com/SNU-DRL/DRCPO.gi
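The HPR step mentioned in this abstract is generally described as spherical flipping followed by a convex hull: points whose flipped images lie on the hull are treated as visible from the viewpoint. The sketch below is a simplified 2D version in pure Python, not the DR.CPO code, which works on 3D LiDAR points. The function names and the `radius_factor` parameter are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Set, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def _convex_hull_indices(pts: Sequence[Point]) -> Set[int]:
    """Indices of strict hull vertices (Andrew's monotone chain)."""
    order = sorted(range(len(pts)), key=lambda i: pts[i])
    if len(order) < 3:
        return set(order)

    def half(seq) -> List[int]:
        chain: List[int] = []
        for i in seq:
            # Pop while the last turn is clockwise or collinear.
            while len(chain) >= 2 and _cross(pts[chain[-2]], pts[chain[-1]], pts[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    return set(half(order)) | set(half(reversed(order)))


def hidden_point_removal_2d(
    points: Sequence[Point],
    viewpoint: Point = (0.0, 0.0),
    radius_factor: float = 100.0,  # hypothetical knob: flipping radius / max distance
) -> List[int]:
    """Return indices of points estimated to be visible from `viewpoint`."""
    if not points:
        return []
    rel = [(x - viewpoint[0], y - viewpoint[1]) for x, y in points]
    norms = [math.hypot(x, y) for x, y in rel]
    radius = radius_factor * max(norms)

    flipped: List[Point] = []
    for (x, y), n in zip(rel, norms):
        if n == 0.0:
            flipped.append((0.0, 0.0))  # point coincides with the viewpoint
            continue
        # Spherical flipping: same direction, new distance 2 * radius - n.
        scale = (2.0 * radius - n) / n
        flipped.append((x * scale, y * scale))

    flipped.append((0.0, 0.0))  # the viewpoint itself joins the hull input
    hull = _convex_hull_indices(flipped)
    return sorted(i for i in hull if i < len(points))
```

For example, with a wall of three points at `x = 2` and a fourth point directly behind it at `(4, 0)`, viewed from the origin, the sketch keeps the three wall points and drops the fourth. A larger `radius_factor` makes the test more permissive, so the value would need tuning for real data.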

    Enhanced Electrochemical Performances of Hollow-Structured N-Doped Carbon Derived from a Zeolitic Imidazole Framework (ZIF-8) Coated by Polydopamine as an Anode for Lithium-Ion Batteries

    Doping heteroatoms such as nitrogen (N) and boron (B) into the framework of carbon materials is one of the most efficient methods to improve the electrical performance of carbon-based electrodes. In this study, N-doped carbon has been facilely synthesized using a ZIF-8/polydopamine precursor. The polyhedral structure of ZIF-8 and the effective surface-coating capability of dopamine enabled the formation of N-doped carbon with a hollow structure. The ZIF-8 polyhedron served as a sacrificial template for hollow structures, and dopamine participated as a donor of the nitrogen element. When compared to ZIF-8-derived carbon, the hollow-structured N-doped carbon (HSNC) electrode showed an improved reversible capacity of approximately 1398 mAh·g−1 after 100 cycles, with excellent cycling retention at a voltage range of 0.01 to 3.0 V using a current density of 0.1 A·g−1.

    Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders

    Knowledge distillation (KD) has been a ubiquitous method for model compression to strengthen the capability of a lightweight model with the transferred knowledge from the teacher. In particular, KD has been employed in quantization-aware training (QAT) of Transformer encoders like BERT to improve the accuracy of the student model with the reduced-precision weight parameters. However, little is understood about which of the various KD approaches best fits the QAT of Transformers. In this work, we provide an in-depth analysis of the mechanism of KD on attention recovery of quantized large Transformers. In particular, we reveal that the previously adopted MSE loss on the attention score is insufficient for recovering the self-attention information. Therefore, we propose two KD methods: attention-map and attention-output losses. Furthermore, we explore the unification of both losses to address task-dependent preference between attention-map and output losses. The experimental results on various Transformer encoder models demonstrate that the proposed KD methods achieve state-of-the-art accuracy for QAT with sub-2-bit weight quantization. Comment: EMNLP 2022 Main Track Long Pape
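The abstract names the two losses without giving formulas, so the following is only a plausible sketch under stated assumptions: the attention-map loss is taken to be a KL divergence between the softmax-normalized teacher and student attention rows, the attention-output loss is taken to be a mean squared error on the attention outputs, and the "unified" loss is taken to be a weighted sum. The function names and the `map_weight` parameter are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one attention row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_map_loss(teacher_scores: Matrix, student_scores: Matrix) -> float:
    """Assumed form: mean KL(teacher || student) over attention rows."""
    total = 0.0
    for t_row, s_row in zip(teacher_scores, student_scores):
        p, q = softmax(t_row), softmax(s_row)
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return total / len(teacher_scores)


def attention_output_loss(teacher_out: Matrix, student_out: Matrix) -> float:
    """Assumed form: mean squared error over all attention-output elements."""
    sq_err = 0.0
    count = 0
    for t_row, s_row in zip(teacher_out, student_out):
        for t, s in zip(t_row, s_row):
            sq_err += (t - s) ** 2
            count += 1
    return sq_err / count


def combined_loss(map_loss: float, output_loss: float, map_weight: float = 0.5) -> float:
    """Hypothetical unification: convex combination of the two losses."""
    return map_weight * map_loss + (1.0 - map_weight) * output_loss
```

In this sketch `map_weight` stands in for the task-dependent preference mentioned in the abstract; how the paper actually balances the two terms is not stated here.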

    Token-Scaled Logit Distillation for Ternary Weight Generative Language Models

    Generative Language Models (GLMs) have shown impressive performance in tasks such as text generation, understanding, and reasoning. However, the large model size poses challenges for practical deployment. To solve this problem, Quantization-Aware Training (QAT) has become increasingly popular. However, current QAT methods for generative models have resulted in a noticeable loss of accuracy. To counteract this issue, we propose a novel knowledge distillation method specifically designed for GLMs. Our method, called token-scaled logit distillation, prevents overfitting and provides superior learning from the teacher model and ground truth. This research marks the first evaluation of ternary weight quantization-aware training of large-scale GLMs with less than 1.0 degradation in perplexity and no loss of accuracy in a reasoning task.
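To make "ternary weight" concrete, the sketch below maps a weight vector to codes in {-1, 0, +1} with one shared scale, using the threshold heuristic from Ternary Weight Networks (threshold of about 0.7 times the mean absolute weight). This is a generic illustration and an assumption on our part; the abstract does not say which quantizer the paper uses, and the function name is hypothetical.

```python
from typing import List, Tuple


def ternarize(weights: List[float], threshold_ratio: float = 0.7) -> Tuple[List[int], float]:
    """Return (codes, alpha) so that each weight is approximated by alpha * code.

    Uses the TWN-style heuristic: weights whose magnitude is at most
    threshold_ratio * mean(|w|) become 0, and the rest become +1 or -1.
    """
    if not weights:
        return [], 0.0
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    threshold = threshold_ratio * mean_abs

    codes = [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]

    # Shared scale: mean magnitude of the weights that survived the threshold.
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, alpha


if __name__ == "__main__":
    codes, alpha = ternarize([0.9, -1.1, 0.05, -0.02, 1.0])
    print(codes, alpha)
```

In quantization-aware training a mapping like this would be applied in the forward pass while full-precision weights are still updated, which is the setting where a distillation loss from the teacher is added.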