Search CORE

9 research outputs found

Real-Time Fully Unsupervised Domain Adaptation for Lane Detection in Autonomous Driving

Author: Bhardwaj Kshitij
Goldhahn Ryan
Raychowdhury Arijit
Wan Zishen
Publication date: 28/06/2023

While deep neural networks are used heavily in autonomous driving, they need to be adapted to new, unseen environmental conditions for which they were not trained. We focus on the safety-critical application of lane detection and propose a lightweight, fully unsupervised, real-time adaptation approach that adapts only the batch-normalization parameters of the model. We demonstrate that our technique can perform inference, followed by on-device adaptation, under a tight constraint of 30 FPS on an Nvidia Jetson Orin. It achieves accuracy (92.19% on average) similar to that of a state-of-the-art semi-supervised adaptation algorithm, which does not support real-time adaptation. Comment: Accepted at the 2023 Design, Automation & Test in Europe Conference (DATE 2023) - Late Breaking Result

arXiv.org e-Print Archive

RobotPerf: An Open-Source, Vendor-Agnostic, Benchmarking Suite for Evaluating Robotics Computing System Performance

Author: Bakhshalipour Mohammad
Corradi Giulio
Crespo-Álvarez Martiño
Gibbons Phillip B.
Hsiao Yu-Shun
Jabbour Jason
Martínez-Fariña Alejandra
Mayoral-Vilches Víctor
Nagras Prateek
Neuman Sabrina M.
Panigrahi Smruti
Pinzger Martin
Plancher Brian
Rass Stefan
Reddi Vijay Janapa
Reina-Muñoz Juan Manuel
Roy Niladri
Stewart Matthew
Vikhe Gaurav
Wan Zishen
Publication date: 17/09/2023

We introduce RobotPerf, a vendor-agnostic benchmarking suite designed to evaluate robotics computing performance across a diverse range of hardware platforms using ROS 2 as its common baseline. The suite encompasses ROS 2 packages covering the full robotics pipeline and integrates two distinct benchmarking approaches: black-box testing, which measures performance by eliminating upper layers and replacing them with a test application, and grey-box testing, an application-specific measure that observes internal system states with minimal interference. Our benchmarking framework provides ready-to-use tools and is easily adaptable for the assessment of custom ROS 2 computational graphs. Drawing from the knowledge of leading robot architects and system architecture experts, RobotPerf establishes a standardized approach to robotics benchmarking. As an open-source initiative, RobotPerf remains committed to evolving with community input to advance the future of hardware-accelerated robotics.

arXiv.org e-Print Archive

QuaRL: Quantization for Sustainable Reinforcement Learning

Author: Barth-Maron Gabriel
Chitlangia Sharad
Faust Aleksandra
Krishnan Srivatsan
Lam Maximilian
Reddi Vijay Janapa
Wan Zishen
Publication date: 27/11/2021

Deep reinforcement learning has achieved significant milestones; however, the computational demands of reinforcement learning training and inference remain substantial. Quantization is an effective method to reduce the computational overheads of neural networks, though in the context of reinforcement learning, it is unknown whether quantization's computational benefits outweigh the accuracy costs introduced by the corresponding quantization error. To quantify this tradeoff, we perform a broad study applying quantization to reinforcement learning. We apply standard quantization techniques such as post-training quantization (PTQ) and quantization-aware training (QAT) to a comprehensive set of reinforcement learning tasks (Atari, Gym), algorithms (A2C, DDPG, DQN, D4PG, PPO), and models (MLPs, CNNs) and show that policies may be quantized to 8 bits without degrading reward, enabling significant inference speedups on resource-constrained edge devices. Motivated by the effectiveness of standard quantization techniques on reinforcement learning policies, we introduce a novel quantization algorithm, ActorQ, for quantized actor-learner distributed reinforcement learning training. By leveraging full-precision optimization on the learner and quantized execution on the actors, ActorQ enables 8-bit inference while maintaining convergence. We develop a system for quantized reinforcement learning training around ActorQ and demonstrate end-to-end speedups of >1.5× to 2.5× over full-precision training on a range of tasks (Deepmind Control Suite). Finally, we break down the various runtime costs of distributed reinforcement learning training (such as communication time, inference time, model load time, etc.) and evaluate the effects of quantization on these system attributes. Comment: Equal contribution from first three authors. Updating with QuaRL for sustainable (carbon emissions) RL result.
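As a rough illustration of the 8-bit post-training quantization the abstract mentions, the sketch below applies generic uniform affine quantization to a small list of weights. It is not the QuaRL or ActorQ implementation; the function names `quantize_uint8` and `dequantize` and the sample weights are assumptions made for this example.

```python
def quantize_uint8(values):
    """Uniform affine quantization of floats to integers in [0, 255]."""
    # Widen the range to include zero so that 0.0 is represented exactly.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(-lo / scale)
    # Round to the nearest integer step, shift by the zero point, then clamp.
    q = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map 8-bit integers back to approximate float values."""
    return [(x - zero_point) * scale for x in q]


# Hypothetical full-precision weights, chosen only for illustration.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

In this sketch the reconstruction error of every weight stays within one quantization step (`scale`), which is the kind of error the abstract weighs against the computational benefit.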

arXiv.org e-Print Archive

Influences of group velocity dispersion on ultrafast pulse shaping in time lens

Author: Peng Xie
Ranzani L
Xie P
Yishan Wang
Yu Wen
Zishen Wan
Publication venue: IOP Publishing

Crossref

Improving compute in-memory ECC reliability with successive correction

Author: Crafton Brian
De Vivek
Raychowdhury Arijit
Spetalnick Samuel
Tokunaga Carlos
Wan Zishen
Wu Wei
Yoon Jong-Hyeok
Publication venue: Association for Computing Machinery
Publication date: 13/07/2022

Compute in-memory (CIM) is an exciting technique that minimizes data transport, maximizes memory throughput, and performs computation on the bitline of memory sub-arrays. This is especially interesting for machine learning applications, where increased memory bandwidth and analog domain computation offer improved area and energy efficiency. Unfortunately, CIM faces new challenges traditional CMOS architectures have avoided. In this work, we explore the impact of device variation (calibrated with measured data on foundry RRAM arrays) and propose a new class of error correcting codes (ECC) for hard and soft errors in CIM. We demonstrate single, double, and triple error correction offering over 16,000× reduction in bit error rate over a design without ECC and over 427× over prior work, while consuming only 29.1% area and 26.3% power overhead. © 2022 ACM
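For readers unfamiliar with error correction, the sketch below shows single-error correction with the classic Hamming(7,4) code. It is background only and is not the new class of codes proposed in the paper; the function names `hamming74_encode` and `hamming74_decode` are assumptions made for this example.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome is the 1-based position of the flipped bit; 0 means no error.
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Flip one bit of a codeword and recover the original data.
word = [1, 0, 1, 1]
corrupted = hamming74_encode(word)
corrupted[4] ^= 1
recovered = hamming74_decode(corrupted)
```

Correcting two or three errors per word, as the abstract reports, requires stronger codes than this one.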

DGIST Library Institutional Repository

Greedy approximation for the minimum connected dominating set with labeling

Author: AL Chiu
D Coudert
D Du
E Sampathkumar
F Dai
F Wang
GR Grimmett
H Broersma
H Eriksson
J Wu
L Ruan
Majun Shi
MR Garey
R Chang
S Guha
S Krumke
S Stefanakos
S Yuan
U Feige
Wei Wang
Y Wan
Y Xiong
Zishen Yang
Publication venue: Springer Science and Business Media LLC

Crossref