Search CORE

59,012 research outputs found

High efficiency compression for object detection

Author: Bajic Ivan V.
Choi Hyomin
Publication venue
Publication date: 15/02/2018
Field of study

Image and video compression has traditionally been tailored to human vision. However, modern applications such as visual analytics and surveillance rely on computers seeing and analyzing the images before (or instead of) humans. For these applications, it is important to adjust compression to computer vision. In this paper we present a bit allocation and rate control strategy that is tailored to object detection. Using the initial convolutional layers of a state-of-the-art object detector, we create an importance map that can guide bit allocation to areas that are important for object detection. The proposed method enables bit rate savings of 7% or more compared to default HEVC, at the equivalent object detection rate.Comment: The paper is published in IEEE ICASSP 18

arXiv.org e-Print Archive

Crossref

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design

Author: Cai Yuxuan
Li Hongjia
Li Yanyu
Niu Wei
Ren Bin
Tang Xulong
Wang Yanzhi
Yuan Geng
Publication venue
Publication date: 30/12/2020
Field of study

The rapid development and wide utilization of object detection techniques have aroused attention on both accuracy and speed of object detectors. However, the current state-of-the-art object detection works are either accuracy-oriented using a large model but leading to high latency or speed-oriented using a lightweight model but sacrificing accuracy. In this work, we propose YOLObile framework, a real-time object detection on mobile devices via compression-compilation co-design. A novel block-punched pruning scheme is proposed for any kernel size. To improve computational efficiency on mobile devices, a GPU-CPU collaborative scheme is adopted along with advanced compiler-assisted optimizations. Experimental results indicate that our pruning scheme achieves 14

\times

compression rate of YOLOv4 with 49.0 mAP. Under our YOLObile framework, we achieve 17 FPS inference speed using GPU on Samsung Galaxy S20. By incorporating our proposed GPU-CPU collaborative scheme, the inference speed is increased to 19.1 FPS, and outperforms the original YOLOv4 by 5

\times

speedup. Source code is at: \url{https://github.com/nightsnack/YOLObile}

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Spiking Neural Network for Ultra-low-latency and High-accurate Object Detection

Author: Gao Zeyu
Lu Yanfeng
Qiao Hong
Qu Jinye
Tang Huajin
Zhang Tielin
Publication venue
Publication date: 27/06/2023
Field of study

Spiking Neural Networks (SNNs) have garnered widespread interest for their energy efficiency and brain-inspired event-driven properties. While recent methods like Spiking-YOLO have expanded the SNNs to more challenging object detection tasks, they often suffer from high latency and low detection accuracy, making them difficult to deploy on latency sensitive mobile platforms. Furthermore, the conversion method from Artificial Neural Networks (ANNs) to SNNs is hard to maintain the complete structure of the ANNs, resulting in poor feature representation and high conversion errors. To address these challenges, we propose two methods: timesteps compression and spike-time-dependent integrated (STDI) coding. The former reduces the timesteps required in ANN-SNN conversion by compressing information, while the latter sets a time-varying threshold to expand the information holding capacity. We also present a SNN-based ultra-low latency and high accurate object detection model (SUHD) that achieves state-of-the-art performance on nontrivial datasets like PASCAL VOC and MS COCO, with about remarkable 750x fewer timesteps and 30% mean average precision (mAP) improvement, compared to the Spiking-YOLO on MS COCO datasets. To the best of our knowledge, SUHD is the deepest spike-based object detection model to date that achieves ultra low timesteps to complete the lossless conversion.Comment: 14 pages, 10 figure

arXiv.org e-Print Archive

Increasing Compression Ratio of Low Complexity Compressive Sensing Video Encoder with Application-Aware Configurable Mechanism

Author: Luo Li
Qiao Fei
Yang Huazhong
Yu Shuang
Publication venue
Publication date: 06/11/2013
Field of study

With the development of embedded video acquisition nodes and wireless video surveillance systems, traditional video coding methods could not meet the needs of less computing complexity any more, as well as the urgent power consumption. So, a low-complexity compressive sensing video encoder framework with application-aware configurable mechanism is proposed in this paper, where novel encoding methods are exploited based on the practical purposes of the real applications to reduce the coding complexity effectively and improve the compression ratio (CR). Moreover, the group of processing (GOP) size and the measurement matrix size can be configured on the encoder side according to the post-analysis requirements of an application example of object tracking to increase the CR of encoder as best as possible. Simulations show the proposed framework of encoder could achieve 60X of CR when the tracking successful rate (SR) is still keeping above 90%.Comment: 5 pages with 6figures and 1 table,conferenc

arXiv.org e-Print Archive

Crossref

Deep Learning with Passive Optical Nonlinear Mapping

Author: Cao Hui
Eliezer Yaniv
Gigan Sylvain
Kim Kyungduk
Shaughnessy Liam
Xia Fei
Publication venue
Publication date: 18/07/2023
Field of study

Deep learning has fundamentally transformed artificial intelligence, but the ever-increasing complexity in deep learning models calls for specialized hardware accelerators. Optical accelerators can potentially offer enhanced performance, scalability, and energy efficiency. However, achieving nonlinear mapping, a critical component of neural networks, remains challenging optically. Here, we introduce a design that leverages multiple scattering in a reverberating cavity to passively induce optical nonlinear random mapping, without the need for additional laser power. A key advantage emerging from our work is that we show we can perform optical data compression, facilitated by multiple scattering in the cavity, to efficiently compress and retain vital information while also decreasing data dimensionality. This allows rapid optical information processing and generation of low dimensional mixtures of highly nonlinear features. These are particularly useful for applications demanding high-speed analysis and responses such as in edge computing devices. Utilizing rapid optical information processing capabilities, our optical platforms could potentially offer more efficient and real-time processing solutions for a broad range of applications. We demonstrate the efficacy of our design in improving computational performance across tasks, including classification, image reconstruction, key-point detection, and object detection, all achieved through optical data compression combined with a digital decoder. Notably, we observed high performance, at an extreme compression ratio, for real-time pedestrian detection. Our findings pave the way for novel algorithms and architectural designs for optical computing.Comment: 16 pages, 7 figure

arXiv.org e-Print Archive

Visual Importance-Biased Image Synthesis Animation

Author: Brown Ross
Maeder Anthony
Pham Binh
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2003
Field of study

Present ray tracing algorithms are computationally intensive, requiring hours of computing time for complex scenes. Our previous work has dealt with the development of an overall approach to the application of visual attention to progressive and adaptive ray-tracing techniques. The approach facilitates large computational savings by modulating the supersampling rates in an image by the visual importance of the region being rendered. This paper extends the approach by incorporating temporal changes into the models and techniques developed, as it is expected that further efficiency savings can be reaped for animated scenes. Applications for this approach include entertainment, visualisation and simulation

Queensland University of Technology ePrints Archive

Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving

Author: López Antonio M.
Wang Peng
Xu Jiaolong
Yang Heng
Publication venue
Publication date: 25/05/2019
Field of study

Autonomous driving has harsh requirements of small model size and energy efficiency, in order to enable the embedded system to achieve real-time on-board object detection. Recent deep convolutional neural network based object detectors have achieved state-of-the-art accuracy. However, such models are trained with numerous parameters and their high computational costs and large storage prohibit the deployment to memory and computation resource limited systems. Low-precision neural networks are popular techniques for reducing the computation requirements and memory footprint. Among them, binary weight neural network (BWN) is the extreme case which quantizes the float-point into just

1

bit. BWNs are difficult to train and suffer from accuracy deprecation due to the extreme low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of BWN using a full-precision teacher network. We built DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conduct experiments on KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.Comment: Accepted by ICRA 201

arXiv.org e-Print Archive

Crossref

Diposit Digital de Documents de la UAB

In-Band Disparity Compensation for Multiview Image Compression and View Synthesis

Author: Anantrasirichai N
Bull DR
Canagarajah CN
Redmill DW
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2010
Field of study

Explore Bristol Research