Cross-Scale Cost Aggregation for Stereo Matching

Author: Dongbo Min
Kang Zhang
Lifeng Sun
Qi Tian
Shiqiang Yang
Shuicheng Yan
Yuqiang Fang
Publication date: 03/03/2014

Human beings process stereoscopic correspondence across multiple scales. However, this bio-inspiration is ignored by state-of-the-art cost aggregation methods for dense stereo correspondence. In this paper, a generic cross-scale cost aggregation framework is proposed to allow multi-scale interaction in cost aggregation. We first reformulate cost aggregation from a unified optimization perspective and show that different cost aggregation methods essentially differ in their choice of similarity kernel. Then, an inter-scale regularizer is introduced into the optimization, and solving this new optimization problem leads to the proposed framework. Since the regularization term is independent of the similarity kernel, various cost aggregation methods can be integrated into the proposed general framework. We show that the cross-scale framework is important as it effectively and efficiently expands state-of-the-art cost aggregation methods and leads to significant improvements when evaluated on the Middlebury, KITTI and New Tsukuba datasets.

Comment: To Appear in 2013 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2014 (poster, 29.88%
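The optimization view summarized in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a weighted least-squares data term per scale (whose minimizer is the normalized weighted average of the raw costs) and a quadratic penalty between the costs at neighbouring scales. The function names `weighted_average` and `cross_scale_combine` and the weight `lam` are hypothetical, introduced only for illustration.

```python
import numpy as np

def weighted_average(costs, kernel):
    """Minimizer of sum_j K_j * (z - C_j)^2: the kernel-weighted mean.

    costs:  raw matching costs C_j in the support window of one pixel.
    kernel: non-negative similarity weights K_j (box, bilateral, ...).
    """
    return float(np.dot(kernel, costs) / kernel.sum())

def cross_scale_combine(per_scale_costs, lam):
    """Combine per-scale aggregated costs of one pixel at one disparity.

    Minimizes  sum_s (z_s - v_s)^2 + lam * sum_{s>=1} (z_s - z_{s-1})^2,
    where v_s is the aggregated cost at scale s. Setting the gradient to
    zero gives the linear system (I + lam * L) z = v, with L the Laplacian
    of a chain linking neighbouring scales (an assumed form of the
    inter-scale regularizer).
    """
    v = np.asarray(per_scale_costs, dtype=float)
    n = len(v)
    laplacian = np.zeros((n, n))
    for s in range(n - 1):
        laplacian[s, s] += 1.0
        laplacian[s + 1, s + 1] += 1.0
        laplacian[s, s + 1] -= 1.0
        laplacian[s + 1, s] -= 1.0
    return np.linalg.solve(np.eye(n) + lam * laplacian, v)

# Aggregated costs of one pixel at three scales (finest first).
per_scale = [3.0, 1.0, 2.0]
regularized = cross_scale_combine(per_scale, lam=0.5)
```

With `lam=0` the scales stay independent and the input is returned unchanged. As `lam` grows, the per-scale costs are pulled towards a common value. Because the regularizer only touches the combination step, any kernel can be used in `weighted_average`, which mirrors the abstract's point that the regularization term is independent of the similarity kernel.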

arXiv.org e-Print Archive

CiteSeerX

Crossref

R$^3$SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

Author: Cavallari Tommaso
Golodetz Stuart
Rahnama Oscar
Torr Philip H. S.
Walker Simon
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/10/2018

Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power.

Comment: Accepted in FPT 2018 as Oral presentation, 8 pages, 6 figures, 4 tables

arXiv.org e-Print Archive

Crossref

Two-Dimensional Gel Electrophoresis Image Registration Using Block-Matching Techniques and Deformation Models

Author: Dorado Julián
Fernández-Lozano Carlos
Rabuñal Juan R.
Rodríguez Álvaro
Publication venue: 'Elsevier BV'
Publication date: 01/01/2014

Block-matching techniques have been widely used in the task of estimating displacement in medical images, and they represent the best approach in scenes with deformable structures such as tissues, fluids, and gels. In this article, a new iterative block-matching technique, based on successive deformation, search, fitting, filtering, and interpolation stages, is proposed to measure elastic displacements in two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) images. The proposed technique uses different deformation models in the task of correlating proteins in real 2D electrophoresis gel images, obtaining an accuracy of 96.6% and improving the results obtained with other techniques. This technique represents a general solution, being easy to adapt to different 2D deformable cases and providing an experimental reference for block-matching algorithms.

Galicia. Consellería de Economía e Industria; 10MDS014CT
Galicia. Consellería de Economía e Industria; 10SIN105004PR
Instituto de Salud Carlos III; PI13/0028
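As a rough illustration of the search stage mentioned in this abstract, the sketch below performs a plain exhaustive block search using the sum of absolute differences (SAD). It is a generic textbook form, not the iterative technique of the article: the deformation, fitting, filtering and interpolation stages are omitted, and the function name `match_block` and its parameters are hypothetical.

```python
import numpy as np

def match_block(reference, target, top, left, size, search):
    """Find the displacement of one square block by exhaustive SAD search.

    reference, target: 2D grayscale images of the same shape.
    top, left:         upper-left corner of the block in `reference`.
    size:              block side length in pixels.
    search:            maximum displacement tested in each direction.
    Returns ((dy, dx), cost) for the lowest-cost candidate.
    """
    block = reference[top:top + size, left:left + size].astype(float)
    best, best_cost = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the target image.
            if y < 0 or x < 0 or y + size > target.shape[0] or x + size > target.shape[1]:
                continue
            candidate = target[y:y + size, x:x + size].astype(float)
            cost = np.abs(block - candidate).sum()
            if cost < best_cost:
                best_cost, best = cost, (dy, dx)
    return best, best_cost

# Synthetic check: the target is the reference shifted 2 rows down, 3 columns left.
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
target = np.roll(reference, shift=(2, -3), axis=(0, 1))
displacement, cost = match_block(reference, target, top=12, left=12, size=8, search=4)
```

In this synthetic case the block is recovered at a displacement of 2 rows and -3 columns with zero cost. On real gel images a similarity measure that tolerates intensity changes, such as normalized cross-correlation, would likely be a better choice than SAD.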

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Stereo Matching Algorithm Based On Illumination Control To Improve The Accuracy

Author: Abu Hassan Anwar Hasni
Hamzah Rostam Affendi
Ibrahim Haidi
Publication venue: 'Slovenian Society for Stereology and Quantitative Image Analysis'
Publication date: 01/01/2016

Crossref

Directory of Open Access Journals

Universiti Teknikal Malaysia Melaka (UTeM) Repository

Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks

Author: Cheng Ming-Ming
Fan Deng-Ping
Hou Qibin
Lin Zheng
Liu Yun
Zhang Zhao
Zhao Jia-Xing
Zhu Menglong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/07/2020

The use of RGB-D information for salient object detection has been extensively explored in recent years. However, relatively few efforts have been put towards modeling salient object detection in real-world human activity scenes with RGB-D. In this work, we fill the gap by making the following contributions to RGB-D salient object detection. (1) We carefully collect a new SIP (salient person) dataset, which consists of ~1K high-resolution images that cover diverse real-world scenes from various viewpoints, poses, occlusions, illuminations, and backgrounds. (2) We conduct a large-scale (and, so far, the most comprehensive) benchmark comparing contemporary methods, which has long been missing in the field and can serve as a baseline for future research. We systematically summarize 32 popular models and evaluate 18 of the 32 models on seven datasets containing a total of about 97K images. (3) We propose a simple general architecture, called Deep Depth-Depurator Network (D3Net). It consists of a depth depurator unit (DDU) and a three-stream feature learning module (FLM), which perform low-quality depth map filtering and cross-modal feature learning, respectively. These components form a nested structure and are elaborately designed to be learned jointly. D3Net exceeds the performance of any prior contenders across all five metrics under consideration, thus serving as a strong model to advance research in this field. We also demonstrate that D3Net can be used to efficiently extract salient object masks from real scenes, enabling an effective background-changing application with a speed of 65 fps on a single GPU. All the saliency maps, our new SIP dataset, the D3Net model, and the evaluation tools are publicly available at https://github.com/DengPingFan/D3NetBenchmark.

Comment: Accepted in TNNLS20. 15 pages, 12 figures. Code: https://github.com/DengPingFan/D3NetBenchmar

arXiv.org e-Print Archive

Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements

Author: Abbas Ghulam
Bouzerdoum Abdesselam
Nawaz Muhammad Wasim
Rahman Muhammad Mahboob Ur
Rashid Faizan
Publication date: 12/01/2024

Optical flow is the pattern of apparent motion of objects in a scene. The computation of optical flow is a critical component in numerous computer vision tasks such as object detection, visual object tracking, and activity recognition. Despite a lot of research, efficiently managing abrupt changes in motion remains a challenge in motion estimation. This paper proposes novel variational regularization methods to address this problem since they allow combining different mathematical concepts into a joint energy minimization framework. In this work, we incorporate concepts from signal sparsity into variational regularization for motion estimation. The proposed regularization uses a robust l1 norm, which promotes sparsity and handles motion discontinuities. By using this regularization, we promote the sparsity of the optical flow gradient. This sparsity helps recover a signal even with just a few measurements. We explore recovering optical flow from a limited set of linear measurements using this regularizer. Our findings show that leveraging the sparsity of the derivatives of optical flow reduces computational complexity and memory needs.

Comment: 12 pages, 9 figures, and 3 tables
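For orientation, the kind of energy this abstract describes can be written in a generic textbook form. This is a sketch and not necessarily the paper's exact formulation: the linearized brightness-constancy data term, the weight $\lambda$, the measurement matrix $\Phi$ and the measurement vector $y$ are assumptions introduced here.

```latex
% Variational flow with an l1 penalty on the gradient of the flow field (u, v)
E(u, v) = \int_\Omega \left( I_x u + I_y v + I_t \right)^2 \, d\mathbf{x}
        + \lambda \int_\Omega \left( \lVert \nabla u \rVert_1 + \lVert \nabla v \rVert_1 \right) d\mathbf{x}

% Recovery of a flow field w = (u, v) from a reduced set of linear measurements y
\min_{w} \; \lVert \nabla w \rVert_1 \quad \text{subject to} \quad y = \Phi w
```

The l1 penalty tolerates a few large gradient values, which lets motion boundaries stay sharp, while pushing the gradient to zero elsewhere. That gradient sparsity is what makes recovery from fewer measurements than unknowns plausible.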

arXiv.org e-Print Archive

Holoscopic 3D image depth estimation and segmentation techniques

Author: Alazawi Eman
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London.

Today's 3D imaging techniques offer significant benefits over conventional 2D imaging techniques. The presence of natural depth information in the scene affords the observer an overall improved sense of reality and naturalness. A variety of systems attempting to reach this goal, such as stereoscopic and auto-stereoscopic systems, have been designed by many independent research groups. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, because users are required to focus on the screen plane (accommodation) while converging their eyes on a point in space in a different plane (convergence). Holoscopy is a 3D technology that aims to overcome these limitations of current 3D technology and was recently developed at Brunel University. This work is part W4.1 of the 3D VIVANT project, which is funded by the EU under the ICT program and coordinated by Dr. Aman Aggoun at Brunel University, West London, UK.

The objective of the work described in this thesis is to develop estimation and segmentation techniques that are capable of estimating precise 3D depth and are applicable to holoscopic 3D imaging systems. Particular emphasis is given to automatic techniques, i.e. the work favours algorithms with broad generalisation abilities, as no constraints are placed on the setting. Such algorithms should provide invariance to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, presence of noise and changes in lighting). Moreover, they should be able to estimate depth information from both types of holoscopic 3D images, i.e. unidirectional and omnidirectional, which give horizontal parallax and full parallax (vertical and horizontal), respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision. In particular, emphasis is placed on the automation of thresholding techniques and cue identification for the development of robust algorithms.

A method for depth-through-disparity feature analysis has been built based on the existing correlation between the pixels at a one micro-lens pitch, which has been exploited to extract the viewpoint images (VPIs). The corresponding displacement among the VPIs has been exploited to estimate the depth information map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting features that have been used in this approach. These techniques offer a solution to the problem of setting and extracting reliable features automatically to improve the performance of the depth estimation in terms of generalisation, speed and quality.

Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Therefore, sub-pixel shift and integration, a novel interpolation technique, has been used in this approach to generate super-resolution VPIs. By shift and integration of a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide field of view (FOV). This means that the holoscopic 3D image system can be converted into a multi-view 3D image pixel format. Both depth accuracy and a fast execution time have been achieved, improving the 3D depth map.

For a 3D object to be recognized, the related foreground regions and depth information map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single-viewpoint segmentation were developed. Both techniques offer improvements over the existing methods because they are simple to use and fully automatic, producing the 3D depth interactive map without human interaction.

The final contribution is a performance evaluation that provides an equitable measurement of the success of the proposed techniques for foreground object segmentation, 3D depth interactive map creation and the generation of 2D super-resolution viewpoint techniques. No-reference image quality assessment metrics and their correlation with the human perception of quality are used, with the help of human participants in a subjective manner.

Brunel University Research Archive