Search CORE

54,262 research outputs found

Design Methodology for Face Detection Acceleration

Author: Acasandrei Laurentiu
Barriga Barros Ángel
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013
Field of study

A design methodology to accelerate the face detection for embedded systems is described, starting from high level (algorithm optimization) and ending with low level (software and hardware codesign) by addressing the issues and the design decisions made at each level based on the performance measurements and system limitations. The implemented embedded face detection system consumes very little power compared with the traditional PC software implementations while maintaining the same detection accuracy. The proposed face detection acceleration methodology is suitable for real time applications.Ministerio español de Ciencia y Tecnología TEC2011-24319Junta de Andalucía FEDER P08-TIC-0367

idUS. Depósito de Investigación Universidad de Sevilla

An Efficient and Cost Effective FPGA Based Implementation of the Viola-Jones Face Detection Algorithm

Author: Ababei Cristinel
Bader Curtis
Irgens Peter
Lé Theresa
Saxena Devansh
Publication venue: e-Publications@Marquette
Publication date: 01/01/2017
Field of study

We present an field programmable gate arrays (FPGA) based implementation of the popular Viola-Jones face detection algorithm, which is an essential building block in many applications such as video surveillance and tracking. Our implementation is a complete system level hardware design described in a hardware description language and validated on the affordable DE2-115 evaluation board. Our primary objective is to study the achievable performance with a low-end FPGA chip based implementation. In addition, we release to the public domain the entire project. We hope that this will enable other researchers to easily replicate and compare their results to ours and that it will encourage and facilitate further research and educational ideas in the areas of image processing, computer vision, and advanced digital design and FPGA prototyping

epublications@Marquette

Directory of Open Access Journals

Rushes video summarization using a collaborative approach

Author: Bailer Werner
Bredin Hervé
Byrne Daragh
Dumont Emilie
Essid Slim
Haller Martin
Jones Gareth J.F.
Krutz Andreas
Merialdo Bernard
O'Connor Noel E.
Piatrik Tomas
Rehatschek Herwig
Sikora Thomas
Smeaton Alan F.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2008
Field of study

This paper describes the video summarization system developed by the partners of the K-Space European Network of Excellence for the TRECVID 2008 BBC rushes summarization evaluation. We propose an original method based on individual content segmentation and selection tools in a collaborative system. Our system is organized in several steps. First, we segment the video, secondly we identify relevant and redundant segments, and finally, we select a subset of segments to concatenate and build the final summary with video acceleration incorporated. We analyze the performance of our system through the TRECVID evaluation

DCU Online Research Access Service

EURECOM Repository

Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems

Author: abadi
anderson
baker
chetlur
cortes
dong
he
hsu
kim
li
real
sutton
tan
watkins
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 18/11/2018
Field of study

Deep Learning is increasingly being adopted by industry for computer vision applications running on embedded devices. While Convolutional Neural Networks' accuracy has achieved a mature and remarkable state, inference latency and throughput are a major concern especially when targeting low-cost and low-power embedded platforms. CNNs' inference latency may become a bottleneck for Deep Learning adoption by industry, as it is a crucial specification for many real-time processes. Furthermore, deployment of CNNs across heterogeneous platforms presents major compatibility issues due to vendor-specific technology and acceleration libraries. In this work, we present QS-DNN, a fully automatic search based on Reinforcement Learning which, combined with an inference engine optimizer, efficiently explores through the design space and empirically finds the optimal combinations of libraries and primitives to speed up the inference of CNNs on heterogeneous embedded devices. We show that, an optimized combination can achieve 45x speedup in inference latency on CPU compared to a dependency-free baseline and 2x on average on GPGPU compared to the best vendor library. Further, we demonstrate that, the quality of results and time "to-solution" is much better than with Random Search and achieves up to 15x better results for a short-time search

arXiv.org e-Print Archive

Crossref