206 research outputs found
RSGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Stereo depth estimation is used for many computer vision applications. Though
many popular methods strive solely for depth quality, for real-time mobile
applications (e.g. prosthetic glasses or micro-UAVs), speed and power
efficiency are equally, if not more, important. Many real-world systems rely on
Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but
power efficiency is hard to achieve with conventional hardware, making the use
of embedded devices such as FPGAs attractive for low-power applications.
However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so
most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA
context, the accuracy of SGM has been improved by More Global Matching (MGM),
which also helps tackle the streaking artifacts that afflict SGM. In this
paper, we propose a novel, resource-efficient method that is inspired by MGM's
techniques for improving depth quality, but which can be implemented to run in
real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI
and Middlebury), we show that in comparison to other real-time capable stereo
approaches, we can achieve a state-of-the-art balance between accuracy, power
efficiency and speed, making our approach highly desirable for use in real-time
systems with limited power.Comment: Accepted in FPT 2018 as Oral presentation, 8 pages, 6 figures, 4
table
A Proposed Decision−Support System for (Renal) Cancer Imaging
Medical practitioners who treat cancer patients have crucial decisions to make about the best course of action for individual patients, but limited time in which to make those decisions, and limited information to help them do so. Our aim is to explore the requirements of a decision-support system which will aid them in this task. This system will be initially targeted at renal (kidney) cancer, but is intended to be extensible to other cancers in the future
Straight to Shapes: Real-time Detection of Encoded Shapes
Current object detection approaches predict bounding boxes that provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to regress directly to objects’ shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher order concepts such as view similarity, pose variation and occlusion. To achieve this, we use a denoising convolutional auto-encoder to learn a low-dimensional shape embedding space. We place the decoder network after a fast end-to end deep convolutional network that is trained to regress directly to the shape vectors provided by the auto-encoder. This yields what to the best of our knowledge is the first real-time shape prediction network, running at 35 FPS on a high-end desktop. With higher-order shape reasoning well integrated into the network pipeline, the network shows the useful practical quality of generalising to unseen categories that are similar to the ones in the training set, something that most existing approaches fail to handle
Deep fully-connected part-based models for human pose estimation
We propose a 2D multi-level appearance representation of the human body in RGB images, spatially modelled using a fully-connected graphical model. The appearance model is based on a CNN body part detector, which uses shared features in a cascade architecture to simultaneously detect body parts with different levels of granularity. We use a fully-connected Conditional Random Field (CRF) as our spatial model, over which approximate inference is efficiently performed using the Mean-Field algorithm, implemented as a Recurrent Neural Network (RNN). The stronger visual support from body parts with different levels of granularity, along with the fully-connected pairwise spatial relations, which have their weights learnt by the model, improve the performance of the bottom-up part detector. We adopt an end-to-end training strategy to leverage the potential of both our appearance and spatial models, and achieve competitive results on the MPII and LSP datasets
- …
