118 research outputs found
3D Ground Truth Generation Using Pre-Trained Deep Neural Networks
Training 3D object detectors on publicly available data has been limited to small datasets
due to the large amount of effort required to generate annotations. The difficulty of labeling
in 3D using 2.5D sensors, such as LIDAR, is attributed to the high spatial reasoning skills
required to deal with occlusion and partial viewpoints. Additionally, the current methods
to label 3D objects are cognitively demanding due to frequent task switching. Reducing
both task complexity and the amount of task switching done by annotators is key to
reducing the effort and time required to generate 3D bounding box annotations. We
therefore seek to reduce the burden on the annotators by leveraging existing 3D object
detectors using deep neural networks.
This work introduces a novel ground truth generation method that combines human
supervision with pre-trained neural networks to generate per-instance 3D point cloud segmentation,
3D bounding boxes, and class annotations. The annotators provide object
anchor clicks which behave as a seed to generate instance segmentation results in 3D. The
points belonging to each instance are then used to regress object centroids, bounding box
dimensions, and object orientation. The deep neural network model used to generate the
segmentation masks and bounding box parameters is based on the PointNet architecture.
We develop our approach on the KITTI dataset, which we also use to analyze the quality
of the generated ground truth. The neural network model is trained on the KITTI training
split, and the 3D bounding box outputs are generated using annotation clicks collected
from the validation split. The validation split of the KITTI detection dataset contains 3712
frames of point cloud and image scenes, and it took 16.35 hours to label with the proposed
method. Based on these results, our approach is 19 times faster than the latest published
3D object annotation scheme. Additionally, it is found that the annotators spent less
time per object as the number of objects in the scenes increases, making the method very efficient
for multi-object labeling. Furthermore, the quality of the generated 3D bounding boxes,
using the labeling method, is compared against the KITTI ground truth. It is shown that
the model performs on par with the current state-of-the-art 3D detectors and the labeling
procedure does not negatively impact the output quality of the bounding boxes. Lastly, the
proposed scheme is applied to previously unseen data from the Autonomoose self-driving
vehicle to demonstrate the generalization capabilities of the network.
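The labeling figures quoted in this abstract imply an average time per frame. The short calculation below only restates the two numbers given above (16.35 hours, 3712 frames); the variable names are illustrative and do not come from the paper.

```python
# Average labeling time per frame, from the figures quoted in the abstract.
total_hours = 16.35      # total labeling time reported
num_frames = 3712        # frames in the labeled split, as reported

seconds_per_frame = total_hours * 3600 / num_frames
print(f"{seconds_per_frame:.2f} s per frame")
```

This works out to roughly 16 seconds per frame on average, independent of how many objects each frame contains.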
Joint 3D Proposal Generation and Object Detection from View Aggregation
We present AVOD, an Aggregate View Object Detection network for autonomous
driving scenarios. The proposed neural network architecture uses LIDAR point
clouds and RGB images to generate features that are shared by two subnetworks:
a region proposal network (RPN) and a second stage detector network. The
proposed RPN uses a novel architecture capable of performing multimodal feature
fusion on high resolution feature maps to generate reliable 3D object proposals
for multiple object classes in road scenes. Using these proposals, the second
stage detection network performs accurate oriented 3D bounding box regression
and category classification to predict the extents, orientation, and
classification of objects in 3D space. Our proposed architecture is shown to
produce state of the art results on the KITTI 3D object detection benchmark
while running in real time with a low memory footprint, making it a suitable
candidate for deployment on autonomous vehicles. Code is at:
https://github.com/kujason/avod
Comment: For any inquiries contact aharakeh(at)uwaterloo(dot)c
An Efficient Transmission Power Control Scheme for Temperature Variation in Wireless Sensor Networks
Wireless sensor networks collect data from several nodes dispersed at remote sites. Sensor nodes can be installed in harsh environments such as deserts, cities, and indoors, where the link quality changes considerably over time. Particularly, changes in transmission power may be caused by temperature, humidity, and other factors. In order to compensate for link quality changes, existing schemes detect the link quality changes between nodes and control transmission power through a series of feedback processes, but these approaches can cause heavy overhead with the additional control packets needed. In this paper, the change of the link quality according to temperature is examined through empirical experimentation. A new power control scheme combining both temperature-aware link quality compensation and a closed-loop feedback process to adapt to link quality changes is proposed. We prove that the proposed scheme effectively adapts the transmission power to the changing link quality with less control overhead and energy consumption.
DR.CPO: Diversified and Realistic 3D Augmentation via Iterative Construction, Random Placement, and HPR Occlusion
In autonomous driving, data augmentation is commonly used for improving 3D
object detection. The most basic methods include insertion of copied objects
and rotation and scaling of the entire training frame. Numerous variants have
been developed as well. The existing methods, however, are considerably limited
when compared to the variety of real-world possibilities. In this work, we
develop a diversified and realistic augmentation method that can flexibly
construct a whole-body object, freely locate and rotate the object, and apply
self-occlusion and external-occlusion accordingly. To improve the diversity of
the whole-body object construction, we develop an iterative method that
stochastically combines multiple objects observed from the real world into a
single object. Unlike the existing augmentation methods, the constructed
objects can be randomly located and rotated in the training frame because
proper occlusions can be applied to the whole-body objects in the final step.
Finally, proper self-occlusion at each local object level and
external-occlusion at the global frame level are applied using the Hidden Point
Removal (HPR) algorithm, which is computationally efficient. HPR is also used for
adaptively controlling the point density of each object according to the
object's distance from the LiDAR. Experimental results show that the proposed
DR.CPO algorithm is data-efficient and model-agnostic without incurring any
computational overhead. Also, DR.CPO can improve mAP performance by 2.08% when
compared to the best 3D detection result known for the KITTI dataset. The code is
available at https://github.com/SNU-DRL/DRCPO.gi
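The abstract names Hidden Point Removal but does not spell it out. The sketch below is a minimal 2D illustration of the commonly described HPR operator (spherical flipping of the points about the viewpoint, followed by a convex hull test), not the DR.CPO implementation. The function names, the `radius_factor` parameter, and the reduction to 2D are assumptions made for brevity.

```python
import math

def _cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull_indices(pts):
    """Andrew's monotone chain; returns the set of indices of hull vertices."""
    order = sorted(range(len(pts)), key=lambda i: pts[i])
    if len(order) < 3:
        return set(order)

    def half_hull(seq):
        chain = []
        for i in seq:
            # Drop points that make a right turn or are collinear.
            while len(chain) >= 2 and _cross(pts[chain[-2]], pts[chain[-1]], pts[i]) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    return set(half_hull(order)) | set(half_hull(reversed(order)))

def hidden_point_removal_2d(points, viewpoint, radius_factor=100.0):
    """Return sorted indices of points estimated visible from `viewpoint`.

    Assumes no point coincides with the viewpoint (hypothetical helper).
    """
    shifted = [(x - viewpoint[0], y - viewpoint[1]) for x, y in points]
    norms = [math.hypot(x, y) for x, y in shifted]
    radius = radius_factor * max(norms)
    flipped = []
    for (x, y), n in zip(shifted, norms):
        # Spherical flipping: reflect each point across a circle of this radius.
        scale = (n + 2.0 * (radius - n)) / n
        flipped.append((x * scale, y * scale))
    flipped.append((0.0, 0.0))  # the viewpoint itself joins the hull computation
    hull = convex_hull_indices(flipped)
    return sorted(i for i in hull if i < len(points))

# Point 1 sits directly behind point 0 as seen from the origin.
points = [(1.0, 0.0), (2.0, 0.0), (0.0, 1.0), (0.0, -1.0), (-1.0, 0.0)]
visible = hidden_point_removal_2d(points, viewpoint=(0.0, 0.0))
```

In this toy scene the point at (2, 0) is dropped because the point at (1, 0) lies in front of it on the same ray. A 3D version would follow the same two steps with a 3D convex hull routine.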
Enhanced Electrochemical Performances of Hollow-Structured N-Doped Carbon Derived from a Zeolitic Imidazole Framework (ZIF-8) Coated by Polydopamine as an Anode for Lithium-Ion Batteries
Doping heteroatoms such as nitrogen (N) and boron (B) into the framework of carbon materials is one of the most efficient methods to improve the electrical performance of carbon-based electrodes. In this study, N-doped carbon has been facilely synthesized using a ZIF-8/polydopamine precursor. The polyhedral structure of ZIF-8 and the effective surface-coating capability of dopamine enabled the formation of N-doped carbon with a hollow structure. The ZIF-8 polyhedron served as a sacrificial template for hollow structures, and dopamine participated as a donor of the nitrogen element. When compared to ZIF-8-derived carbon, the HSNC electrode showed an improved reversible capacity of approximately 1398 mAh·g⁻¹ after 100 cycles, with excellent cycling retention at a voltage range of 0.01 to 3.0 V using a current density of 0.1 A·g⁻¹.
Understanding and Improving Knowledge Distillation for Quantization-Aware Training of Large Transformer Encoders
Knowledge distillation (KD) has been a ubiquitous method for model
compression to strengthen the capability of a lightweight model with the
transferred knowledge from the teacher. In particular, KD has been employed in
quantization-aware training (QAT) of Transformer encoders like BERT to improve
the accuracy of the student model with the reduced-precision weight parameters.
However, little is understood about which of the various KD approaches best
fits the QAT of Transformers. In this work, we provide an in-depth analysis of
the mechanism of KD on attention recovery of quantized large Transformers. In
particular, we reveal that the previously adopted MSE loss on the attention
score is insufficient for recovering the self-attention information. Therefore,
we propose two KD methods: attention-map and attention-output losses.
Furthermore, we explore the unification of both losses to address
task-dependent preference between attention-map and output losses. The
experimental results on various Transformer encoder models demonstrate that the
proposed KD methods achieve state-of-the-art accuracy for QAT with sub-2-bit
weight quantization.
Comment: EMNLP 2022 Main Track Long Paper
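The abstract distinguishes three distillation targets: the attention score, the attention map, and the attention output. The sketch below shows how the three quantities relate and how a loss on each can be computed between a teacher and a student. The choice of KL divergence for the map and MSE for the output is an assumption for illustration (the abstract does not state the distance functions), and all function names are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of attention scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def attention_from_scores(scores, values):
    """Attention map = softmax(scores); attention output = map @ values."""
    attn_map = [softmax(row) for row in scores]
    return attn_map, matmul(attn_map, values)

def mse(a, b):
    """Mean squared error between two equally shaped matrices."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def kl_rows(teacher_map, student_map, eps=1e-12):
    """Mean KL(teacher || student) over attention rows (assumed map loss)."""
    total = 0.0
    for t_row, s_row in zip(teacher_map, student_map):
        total += sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(t_row, s_row))
    return total / len(teacher_map)

# Toy example: the student scores equal the teacher scores shifted by +3 per row.
values = [[1.0, 0.0], [0.0, 1.0]]
teacher_scores = [[1.0, 2.0], [0.5, 0.0]]
student_scores = [[s + 3.0 for s in row] for row in teacher_scores]

t_map, t_out = attention_from_scores(teacher_scores, values)
s_map, s_out = attention_from_scores(student_scores, values)

score_loss = mse(teacher_scores, student_scores)   # loss on raw scores
map_loss = kl_rows(t_map, s_map)                   # loss on softmax maps
output_loss = mse(t_out, s_out)                    # loss on attention outputs
```

Because softmax is invariant to a constant shift per row, the score loss here is large while the map and output losses vanish. This only illustrates that the three targets measure different things; it does not reproduce the paper's analysis.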
Token-Scaled Logit Distillation for Ternary Weight Generative Language Models
Generative Language Models (GLMs) have shown impressive performance in tasks
such as text generation, understanding, and reasoning. However, the large model
size poses challenges for practical deployment. To solve this problem,
Quantization-Aware Training (QAT) has become increasingly popular. However,
current QAT methods for generative models have resulted in a noticeable loss of
accuracy. To counteract this issue, we propose a novel knowledge distillation
method specifically designed for GLMs. Our method, called token-scaled logit
distillation, prevents overfitting and provides superior learning from the
teacher model and ground truth. This research marks the first evaluation of
ternary weight quantization-aware training of large-scale GLMs with less than
1.0 degradation in perplexity and no loss of accuracy in a reasoning task.
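For readers unfamiliar with ternary weights, the sketch below shows one common way to map full-precision weights to the three values {-alpha, 0, +alpha}. It follows the threshold heuristic often used for ternary weight networks (threshold at 0.7 times the mean absolute weight); this is an assumption for illustration and is not claimed to be the quantizer used in the paper. The function name and `threshold_ratio` parameter are hypothetical.

```python
def ternarize(weights, threshold_ratio=0.7):
    """Map weights to codes in {-1, 0, +1} plus a shared scale alpha.

    Assumed heuristic: zero out weights whose magnitude is at most
    threshold_ratio * mean(|w|); alpha is the mean magnitude of the rest.
    """
    mean_abs = sum(abs(w) for w in weights) / len(weights)
    delta = threshold_ratio * mean_abs
    codes = [0 if abs(w) <= delta else (1 if w > 0 else -1) for w in weights]
    kept = [abs(w) for w, c in zip(weights, codes) if c != 0]
    alpha = sum(kept) / len(kept) if kept else 0.0
    return codes, alpha

weights = [0.9, -0.8, 0.05, -0.1, 0.5]
codes, alpha = ternarize(weights)
dequantized = [alpha * c for c in codes]  # values the ternary layer would use
```

In quantization-aware training, a mapping of this kind is applied in the forward pass while full-precision weights continue to receive gradient updates.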