Search CORE

18,387 research outputs found

GPU-based Image Analysis on Mobile Devices

Author: Ensor Andrew
Hall Seth
Publication venue
Publication date: 13/12/2011
Field of study

With the rapid advances in mobile technology many mobile devices are capable of capturing high quality images and video with their embedded camera. This paper investigates techniques for real-time processing of the resulting images, particularly on-device utilizing a graphical processing unit. Issues and limitations of image processing on mobile devices are discussed, and the performance of graphical processing units on a range of devices measured through a programmable shader implementation of Canny edge detection.Comment: Proceedings of Image and Vision Computing New Zealand 201

arXiv.org e-Print Archive

AUT Scholarly Commons

Deep Learning-Based Multiple Object Visual Tracking on Embedded System for IoT and Mobile Edge Computing Applications

Author: Blanco-Filgueira Beatriz
Brea Víctor M.
Fernández-Sanjurjo Mauro
García-Lesta Daniel
López Paula
Publication venue
Publication date: 31/07/2018
Field of study

Compute and memory demands of state-of-the-art deep learning methods are still a shortcoming that must be addressed to make them useful at IoT end-nodes. In particular, recent results depict a hopeful prospect for image processing using Convolutional Neural Netwoks, CNNs, but the gap between software and hardware implementations is already considerable for IoT and mobile edge computing applications due to their high power consumption. This proposal performs low-power and real time deep learning-based multiple object visual tracking implemented on an NVIDIA Jetson TX2 development kit. It includes a camera and wireless connection capability and it is battery powered for mobile and outdoor applications. A collection of representative sequences captured with the on-board camera, dETRUSC video dataset, is used to exemplify the performance of the proposed algorithm and to facilitate benchmarking. The results in terms of power consumption and frame rate demonstrate the feasibility of deep learning algorithms on embedded platforms although more effort to joint algorithm and hardware design of CNNs is needed.Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibl

arXiv.org e-Print Archive

Repositorio Institucional da Universidade de Santiago de Compostela

Construction and Evaluation of an Ultra Low Latency Frameless Renderer for VR.

Author: Friston S
Gaydadjiev G
Steed A
Tilbury S
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/01/2016
Field of study

© 2016 IEEE.Latency-the delay between a users action and the response to this action-is known to be detrimental to virtual reality. Latency is typically considered to be a discrete value characterising a delay, constant in time and space-but this characterisation is incomplete. Latency changes across the display during scan-out, and how it does so is dependent on the rendering approach used. In this study, we present an ultra-low latency real-time ray-casting renderer for virtual reality, implemented on an FPGA. Our renderer has a latency of 1 ms from tracker to pixel. Its frameless nature means that the region of the display with the lowest latency immediately follows the scan-beam. This is in contrast to frame-based systems such as those using typical GPUs, for which the latency increases as scan-out proceeds. Using a series of high and low speed videos of our system in use, we confirm its latency of 1 ms. We examine how the renderer performs when driving a traditional sequential scan-out display on a readily available HMO, the Oculus Rift OK2. We contrast this with an equivalent apparatus built using a GPU. Using captured human head motion and a set of image quality measures, we assess the ability of these systems to faithfully recreate the stimuli of an ideal virtual reality system-one with a zero latency tracker, renderer and display running at 1 kHz. Finally, we examine the results of these quality measures, and how each rendering approach is affected by velocity of movement and display persistence. We find that our system, with a lower average latency, can more faithfully draw what the ideal virtual reality system would. Further, we find that with low display persistence, the sensitivity to velocity of both systems is lowered, but that it is much lower for ours

UCL Discovery

Spiral - Imperial College Digital Repository

Benchmarking CPUs and GPUs on embedded platforms for software receiver usage

Author: Bär W.
Closas Gómez Pau
Dampf J.
Fürlinger K.
García Molina J. A.
Pany T.
Stöber C.
Winkel J.
Publication venue
Publication date: 01/01/2015
Field of study

Smartphones containing multi-core central processing units (CPUs) and powerful many-core graphics processing units (GPUs) bring supercomputing technology into your pocket (or into our embedded devices). This can be exploited to produce power-efficient, customized receivers with flexible correlation schemes and more advanced positioning techniques. For example, promising techniques such as the Direct Position Estimation paradigm or usage of tracking solutions based on particle filtering, seem to be very appealing in challenging environments but are likewise computationally quite demanding. This article sheds some light onto recent embedded processor developments, benchmarks Fast Fourier Transform (FFT) and correlation algorithms on representative embedded platforms and relates the results to the use in GNSS software radios. The use of embedded CPUs for signal tracking seems to be straight forward, but more research is required to fully achieve the nominal peak performance of an embedded GPU for FFT computation. Also the electrical power consumption is measured in certain load levels.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC