Cross-Scale Cost Aggregation for Stereo Matching
Human beings process stereoscopic correspondence across multiple scales.
However, this bio-inspiration is ignored by state-of-the-art cost aggregation
methods for dense stereo correspondence. In this paper, a generic cross-scale
cost aggregation framework is proposed to allow multi-scale interaction in cost
aggregation. We first reformulate cost aggregation from a unified
optimization perspective and show that different cost aggregation methods
essentially differ in the choices of similarity kernels. Then, an inter-scale
regularizer is introduced into optimization and solving this new optimization
problem leads to the proposed framework. Since the regularization term is
independent of the similarity kernel, various cost aggregation methods can be
integrated into the proposed general framework. We show that the cross-scale
framework is important as it effectively and efficiently expands
state-of-the-art cost aggregation methods and leads to significant
improvements, when evaluated on Middlebury, KITTI and New Tsukuba datasets.
Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). 2014 (poster, 29.88%)
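The inter-scale regularizer described in the abstract above can be read as a small linear system per pixel and disparity. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each scale s already has an aggregated cost a[s] (from any similarity kernel) and that the regularizer is a quadratic penalty lam * (z[s] - z[s-1])^2 between neighbouring scales. The function name `cross_scale_costs` and the sample values are hypothetical.

```python
def cross_scale_costs(a, lam):
    """Minimise sum_s (z[s] - a[s])**2 + lam * sum_s (z[s] - z[s-1])**2.

    a   : per-scale aggregated costs for one pixel and one disparity
    lam : inter-scale regularisation weight (lam = 0 disables coupling)
    Setting the gradient to zero gives a tridiagonal system, solved here
    with the Thomas algorithm.
    """
    n = len(a)
    if n == 1:
        return [float(a[0])]
    # Diagonal: 1 + lam * (number of neighbouring scales); off-diagonal: -lam.
    diag = [1.0 + lam * ((s > 0) + (s < n - 1)) for s in range(n)]
    off = -lam
    upper = [0.0] * n  # modified upper-diagonal coefficients
    rhs = [0.0] * n    # modified right-hand side
    upper[0] = off / diag[0]
    rhs[0] = a[0] / diag[0]
    for s in range(1, n):  # forward elimination
        denom = diag[s] - off * upper[s - 1]
        upper[s] = off / denom
        rhs[s] = (a[s] - off * rhs[s - 1]) / denom
    z = [0.0] * n
    z[-1] = rhs[-1]
    for s in range(n - 2, -1, -1):  # back substitution
        z[s] = rhs[s] - upper[s] * z[s + 1]
    return z


if __name__ == "__main__":
    per_scale = [10.0, 4.0, 1.0]  # illustrative costs at three scales
    for lam in (0.0, 1.0, 100.0):
        print(lam, cross_scale_costs(per_scale, lam))
```

With lam = 0 each scale keeps its own cost, and as lam grows the costs at all scales are pulled towards a common value. Because the coupling term does not involve the similarity kernel, the same solve can follow any per-scale aggregation method.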
RSGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems
Stereo depth estimation is used for many computer vision applications. Though
many popular methods strive solely for depth quality, for real-time mobile
applications (e.g. prosthetic glasses or micro-UAVs), speed and power
efficiency are equally, if not more, important. Many real-world systems rely on
Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but
power efficiency is hard to achieve with conventional hardware, making the use
of embedded devices such as FPGAs attractive for low-power applications.
However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so
most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA
context, the accuracy of SGM has been improved by More Global Matching (MGM),
which also helps tackle the streaking artifacts that afflict SGM. In this
paper, we propose a novel, resource-efficient method that is inspired by MGM's
techniques for improving depth quality, but which can be implemented to run in
real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI
and Middlebury), we show that in comparison to other real-time capable stereo
approaches, we can achieve a state-of-the-art balance between accuracy, power
efficiency and speed, making our approach highly desirable for use in real-time
systems with limited power.
Comment: Accepted in FPT 2018 as Oral presentation, 8 pages, 6 figures, 4
tables
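For readers unfamiliar with SGM, the sketch below shows the standard single-path cost recurrence that SGM sums over several directions. It is not the raster-respecting variant proposed in the paper. The function name `aggregate_path`, the penalties `p1` and `p2`, and the toy cost table are assumptions chosen for illustration.

```python
def aggregate_path(costs, p1, p2):
    """Aggregate matching costs along one path (e.g. left to right).

    costs[x][d] : matching cost of pixel x on the path at disparity d
    p1          : penalty for a disparity change of exactly 1
    p2          : penalty for any larger disparity change
    Returns a table of the same shape holding the path costs.
    """
    n_disp = len(costs[0])
    prev = list(costs[0])  # the first pixel has no predecessor
    out = [prev]
    for x in range(1, len(costs)):
        prev_min = min(prev)
        cur = []
        for d in range(n_disp):
            best = prev[d]                      # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)
            if d + 1 < n_disp:
                best = min(best, prev[d + 1] + p1)
            best = min(best, prev_min + p2)     # arbitrary jump
            # Subtracting prev_min keeps the values bounded along the path.
            cur.append(costs[x][d] + best - prev_min)
        out.append(cur)
        prev = cur
    return out


if __name__ == "__main__":
    toy = [[0, 5, 5], [5, 0, 5], [5, 5, 0]]
    print(aggregate_path(toy, p1=1, p2=4))
```

Full SGM repeats this recurrence along several path directions and sums the results before picking the minimum-cost disparity per pixel. Some of those directions need pixels that arrive later in raster order, which is one reason streaming hardware implementations tend to use only a subset of paths.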
Two-Dimensional Gel Electrophoresis Image Registration Using Block-Matching Techniques and Deformation Models
Block-matching techniques have been widely used in the task of estimating displacement in medical images, and they represent the best approach in scenes with deformable structures such as tissues, fluids, and gels. In this article, a new iterative block-matching technique, based on successive deformation, search, fitting, filtering, and interpolation stages, is proposed to measure elastic displacements in two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) images. The proposed technique uses different deformation models in the task of correlating proteins in real 2D electrophoresis gel images, obtaining an accuracy of 96.6% and improving the results obtained with other techniques. This technique represents a general solution, being easy to adapt to different 2D deformable cases and providing an experimental reference for block-matching algorithms.
Galicia. Consellería de Economía e Industria; 10MDS014CT
Galicia. Consellería de Economía e Industria; 10SIN105004PR
Instituto de Salud Carlos III; PI13/0028
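As background for the search stage mentioned in the abstract above, the sketch below shows plain exhaustive block matching with a sum-of-absolute-differences (SAD) score. It is a generic baseline and not the iterative deformation-based technique the article proposes. The function names, the block size and the search radius are assumptions.

```python
def sad(img_a, img_b, ay, ax, by, bx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(img_a[ay + i][ax + j] - img_b[by + i][bx + j])
        for i in range(size)
        for j in range(size)
    )


def match_block(ref, target, y, x, size, radius):
    """Find the displacement (dy, dx) of the ref block at (y, x) in target.

    Every displacement within +/- radius pixels is tried, and the one with
    the lowest SAD wins. Candidates falling outside target are skipped.
    Assumes the block at (y, x) lies fully inside both images.
    """
    height, width = len(target), len(target[0])
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ty, tx = y + dy, x + dx
            if ty < 0 or tx < 0 or ty + size > height or tx + size > width:
                continue
            score = sad(ref, target, y, x, ty, tx, size)
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


if __name__ == "__main__":
    ref = [[0] * 6 for _ in range(6)]
    target = [[0] * 6 for _ in range(6)]
    spot = [[9, 7], [5, 3]]  # a small distinctive "protein spot"
    for i in range(2):
        for j in range(2):
            ref[1 + i][1 + j] = spot[i][j]
            target[2 + i][3 + j] = spot[i][j]
    print(match_block(ref, target, y=1, x=1, size=2, radius=2))
```

Running this search over a grid of blocks gives a sparse displacement field. An iterative scheme would then fit, filter and interpolate that field before the next search.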
Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks
The use of RGB-D information for salient object detection has been
extensively explored in recent years. However, relatively few efforts have been
put towards modeling salient object detection in real-world human activity
scenes with RGB-D. In this work, we fill the gap by making the following
contributions to RGB-D salient object detection. (1) We carefully collect a new
SIP (salient person) dataset, which consists of ~1K high-resolution images that
cover diverse real-world scenes from various viewpoints, poses, occlusions,
illuminations, and backgrounds. (2) We conduct a large-scale (and, so far, the
most comprehensive) benchmark comparing contemporary methods, which has long
been missing in the field and can serve as a baseline for future research. We
systematically summarize 32 popular models and evaluate 18 of these 32 models
on seven datasets containing a total of about 97K images. (3) We propose a
simple general architecture, called Deep Depth-Depurator Network (D3Net). It
consists of a depth depurator unit (DDU) and a three-stream feature learning
module (FLM), which performs low-quality depth map filtering and cross-modal
feature learning respectively. These components form a nested structure and are
elaborately designed to be learned jointly. D3Net exceeds the performance of
any prior contenders across all five metrics under consideration, thus serving
as a strong model to advance research in this field. We also demonstrate that
D3Net can be used to efficiently extract salient object masks from real scenes,
enabling an effective background-changing application at a speed of 65 fps on a
single GPU. All the saliency maps, our new SIP dataset, the D3Net model, and
the evaluation tools are publicly available at
https://github.com/DengPingFan/D3NetBenchmark.
Comment: Accepted in TNNLS20. 15 pages, 12 figures. Code:
https://github.com/DengPingFan/D3NetBenchmar
Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements
Optical flow is the pattern of apparent motion of objects in a scene. The
computation of optical flow is a critical component in numerous computer vision
tasks such as object detection, visual object tracking, and activity
recognition. Despite a lot of research, efficiently managing abrupt changes in
motion remains a challenge in motion estimation. This paper proposes novel
variational regularization methods to address this problem since they allow
combining different mathematical concepts into a joint energy minimization
framework. In this work, we incorporate concepts from signal sparsity into
variational regularization for motion estimation. The proposed regularization
uses a robust l1 norm, which promotes sparsity and handles motion
discontinuities. By using this regularization, we promote the sparsity of the
optical flow gradient. This sparsity helps recover a signal even with just a
few measurements. We explore recovering optical flow from a limited set of
linear measurements using this regularizer. Our findings show that leveraging
the sparsity of the derivatives of optical flow reduces computational
complexity and memory needs.
Comment: 12 pages, 9 figures, and 3 tables
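The claim that an l1 penalty on the flow gradient handles motion discontinuities can be illustrated with a toy comparison. The sketch below is not the paper's energy functional: it assumes a 1D flow profile and compares an l1 penalty with a squared l2 penalty on its finite differences. All names and values are illustrative.

```python
def gradient(flow):
    """Forward finite differences of a 1D flow profile."""
    return [b - a for a, b in zip(flow, flow[1:])]


def l1_penalty(flow):
    """Sum of absolute gradient values (promotes sparse gradients)."""
    return sum(abs(g) for g in gradient(flow))


def l2_penalty(flow):
    """Sum of squared gradient values (promotes smooth gradients)."""
    return sum(g * g for g in gradient(flow))


if __name__ == "__main__":
    step = [0, 0, 0, 5, 5, 5]  # abrupt motion boundary
    ramp = [0, 1, 2, 3, 4, 5]  # same total change, spread out
    print(l1_penalty(step), l1_penalty(ramp))  # 5 5
    print(l2_penalty(step), l2_penalty(ramp))  # 25 5
```

The l1 penalty charges the sharp step and the smooth ramp equally, so it does not push the solution towards blurring a motion boundary. The squared penalty charges the step five times as much. The step profile also has a single non-zero gradient entry, which is the kind of sparsity that makes recovery from few measurements plausible.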
Holoscopic 3D image depth estimation and segmentation techniques
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.
Today’s 3D imaging techniques offer significant benefits over conventional 2D imaging techniques. The presence of natural depth information in the scene affords the observer an overall improved sense of reality and naturalness. A variety of systems attempting to reach this goal, such as stereoscopic and auto-stereoscopic systems, have been designed by many independent research groups. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, because users are required to focus on the screen plane (accommodation) while converging their eyes to a point in space in a different plane (convergence). Holoscopy is a 3D technology that aims to overcome these limitations of current 3D technology and was recently developed at Brunel University. This work is part W4.1 of the 3D VIVANT project, which is funded by the EU under the ICT program and coordinated by Dr. Aman Aggoun at Brunel University, West London, UK. The objective of the work described in this thesis is to develop estimation and segmentation techniques that are capable of estimating precise 3D depth and are applicable to holoscopic 3D imaging systems. Particular emphasis is given to automatic techniques, i.e. algorithms with broad generalisation abilities are favoured, as no constraints are placed on the setting. The algorithms should provide invariance to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, presence of noise and changes in lighting). Moreover, they should be able to estimate depth information from both types of holoscopic 3D images, i.e. unidirectional and omnidirectional, which give horizontal parallax and full parallax (vertical and horizontal), respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision.
In particular, emphasis is placed on automating thresholding techniques and cue identification for the development of robust algorithms. A method for depth-through-disparity feature analysis has been built based on the existing correlation between the pixels at one micro-lens pitch, which has been exploited to extract the viewpoint images (VPIs). The corresponding displacement among the VPIs has been exploited to estimate the depth information map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting features that have been used in this approach. These techniques offer a solution to the problem of setting and extracting reliable features automatically to improve the performance of depth estimation in terms of generalization, speed and quality. Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Therefore, sub-pixel shift and integration, a novel interpolation technique, has been used in this approach to generate super-resolution VPIs. By shift and integration of a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide field of view (FOV), which means that the holoscopic 3D image system can be converted into a multi-view 3D image pixel format. Both depth accuracy and a fast execution time have been achieved, improving the 3D depth map. For a 3D object to be recognized, the related foreground regions and depth information map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single-viewpoint segmentation were developed.
Both techniques improve on existing methods because they are simple to use and fully automatic, producing the 3D depth interactive map without human interaction. The final contribution is a performance evaluation that provides an equitable measure of the success of the proposed techniques for foreground object segmentation, 3D depth interactive map creation and 2D super-resolution viewpoint generation. No-reference image quality assessment metrics and their correlation with the human perception of quality are used, with the help of human participants in a subjective manner.
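The viewpoint image extraction mentioned in the thesis abstract above can be sketched as a strided resampling of the recorded image. The code is an illustration under stated assumptions, not the thesis implementation: it assumes a unidirectional (horizontal-parallax) holoscopic image whose micro-lens pitch is a whole number of pixels aligned with the pixel grid. The function name `extract_viewpoint_images` is hypothetical.

```python
def extract_viewpoint_images(image, pitch):
    """Split a unidirectional holoscopic image into viewpoint images.

    image : list of rows (each row a list of pixel values)
    pitch : number of pixel columns under one micro-lens
    Viewpoint image k collects column k under every micro-lens, i.e.
    columns k, k + pitch, k + 2 * pitch, and so on.
    """
    width = len(image[0])
    if width % pitch != 0:
        raise ValueError("image width must be a multiple of the lens pitch")
    return [[row[k::pitch] for row in image] for k in range(pitch)]


if __name__ == "__main__":
    holoscopic = [
        [0, 1, 2, 3, 4, 5],
        [10, 11, 12, 13, 14, 15],
    ]
    for k, vpi in enumerate(extract_viewpoint_images(holoscopic, pitch=3)):
        print(k, vpi)
```

Each viewpoint image has only width / pitch columns, which reflects the resolution limitation the abstract refers to and motivates the super-resolution step. An omnidirectional image would need the same striding along rows as well.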