543 research outputs found

    R³SGM: Real-time Raster-Respecting Semi-Global Matching for Power-Constrained Systems

    Stereo depth estimation is used for many computer vision applications. Though many popular methods strive solely for depth quality, for real-time mobile applications (e.g. prosthetic glasses or micro-UAVs), speed and power efficiency are equally, if not more, important. Many real-world systems rely on Semi-Global Matching (SGM) to achieve a good accuracy vs. speed balance, but power efficiency is hard to achieve with conventional hardware, making the use of embedded devices such as FPGAs attractive for low-power applications. However, the full SGM algorithm is ill-suited to deployment on FPGAs, and so most FPGA variants of it are partial, at the expense of accuracy. In a non-FPGA context, the accuracy of SGM has been improved by More Global Matching (MGM), which also helps tackle the streaking artifacts that afflict SGM. In this paper, we propose a novel, resource-efficient method that is inspired by MGM's techniques for improving depth quality, but which can be implemented to run in real time on a low-power FPGA. Through evaluation on multiple datasets (KITTI and Middlebury), we show that in comparison to other real-time capable stereo approaches, we can achieve a state-of-the-art balance between accuracy, power efficiency and speed, making our approach highly desirable for use in real-time systems with limited power. Comment: Accepted in FPT 2018 as an oral presentation; 8 pages, 6 figures, 4 tables.
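
For context, conventional SGM (on which the above builds) aggregates per-pixel matching costs along several scanline paths. A minimal sketch of the standard per-path recursion, in the generic formulation due to Hirschmüller rather than the FPGA-specific variant proposed in the paper, is:

    L_r(\mathbf{p}, d) = C(\mathbf{p}, d) - \min_k L_r(\mathbf{p} - \mathbf{r}, k)
        + \min\Big( L_r(\mathbf{p} - \mathbf{r}, d),\;
                    L_r(\mathbf{p} - \mathbf{r}, d - 1) + P_1,\;
                    L_r(\mathbf{p} - \mathbf{r}, d + 1) + P_1,\;
                    \min_k L_r(\mathbf{p} - \mathbf{r}, k) + P_2 \Big)

Here C(p, d) is the matching cost of pixel p at disparity d, r is the path direction, and P_1, P_2 penalize small and large disparity changes; the final disparity is chosen by minimizing the sum of L_r over all paths. MGM modifies this recursion so that each update draws on more than one already-computed neighbouring pixel per path, which is what reduces the streaking artifacts mentioned above.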

    A Study of a Real-Time Stereo Matching System for High-Resolution Images Using GPUs (GPUを用いた高解像度画像の実時間ステレオマッチングシステムの研究)

    筑波大学 (University of Tsukuba), 201

    3D Motion Analysis via Energy Minimization

    This work deals with 3D motion analysis from stereo image sequences for driver assistance systems. It consists of two parts: the estimation of motion from the image data and the segmentation of moving objects in the input images. The content can be summarized with the technical term machine visual kinesthesia, the sensation or perception and cognition of motion. In the first three chapters, the importance of motion information is discussed for driver assistance systems, for machine vision in general, and for the estimation of ego motion. The next two chapters elaborate on motion perception, analyzing the apparent movement of pixels in image sequences for both monocular and binocular camera setups. Then, the obtained motion information is used to segment moving objects in the input video. Thus, one can clearly identify the thread from analyzing the input images to describing the input images by means of stationary and moving objects. Finally, I present possibilities for future applications based on the contents of this thesis. Previous work in each case is presented in the respective chapters. Although the overarching issue of motion estimation from image sequences is related to practice, there is nothing as practical as a good theory (Kurt Lewin). Several problems in computer vision are formulated as intricate energy minimization problems. In this thesis, motion analysis in image sequences is thoroughly investigated, showing that splitting an original complex problem into simplified sub-problems yields improved accuracy, increased robustness, and a clear and accessible approach to state-of-the-art motion estimation techniques. In Chapter 4, optical flow is considered. Optical flow is commonly estimated by minimizing a combined energy, consisting of a data term and a smoothness term. These two parts are decoupled, yielding a novel and iterative approach to optical flow. The derived Refinement Optical Flow framework is a clear and straightforward approach to computing the apparent image motion vector field. Furthermore, this currently yields the most accurate motion estimation techniques in the literature. Much as this is an engineering approach of fine-tuning precision to the last detail, it helps to get a better insight into the problem of motion estimation. This profoundly contributes to state-of-the-art research in motion analysis, in particular facilitating the use of motion estimation in a wide range of applications. In Chapter 5, scene flow is rethought. Scene flow stands for the three-dimensional motion vector field for every image pixel, computed from a stereo image sequence. Again, decoupling the commonly coupled approach of estimating three-dimensional position and three-dimensional motion yields an approach to scene flow estimation with more accurate results and a considerably lower computational load. It results in a dense scene flow field and enables additional applications based on the dense three-dimensional motion vector field, which are to be investigated in the future. One such application is the segmentation of moving objects in an image sequence. Detecting moving objects within the scene is one of the most important capabilities when analyzing image sequences from a dynamic environment. This is presented in Chapter 6. Scene flow and the segmentation of independently moving objects are only first steps towards machine visual kinesthesia. Throughout this work, I present possible future work to improve the estimation of optical flow and scene flow.
Chapter 7 additionally presents an outlook on future research for driver assistance applications. But there is much more to the full understanding of the three-dimensional dynamic scene. This work is meant to inspire the reader to think outside the box and contribute to the vision of building perceiving machines.
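
As background for the optical flow work in Chapter 4, a generic variational optical flow energy of the kind referred to above, with a data term and a smoothness term, can be written as follows (this is the standard family of formulations, not necessarily the exact energy used in the thesis):

    E(u, v) = \int_\Omega \Psi\big( |I(x + u, y + v, t + 1) - I(x, y, t)|^2 \big) \, dx\, dy
        + \lambda \int_\Omega \Psi\big( |\nabla u|^2 + |\nabla v|^2 \big) \, dx\, dy

Here (u, v) is the flow field, the first integral penalizes deviations from brightness constancy, the second penalizes non-smooth flow, Ψ is a robust penalty function, and λ weights the two terms. The decoupling described above alternates between minimizing the data term and the smoothness term instead of minimizing the coupled energy directly.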

    Selected Papers from the First International Symposium on Future ICT (Future-ICT 2019) in Conjunction with 4th International Symposium on Mobile Internet Security (MobiSec 2019)

    The International Symposium on Future ICT (Future-ICT 2019), in conjunction with the 4th International Symposium on Mobile Internet Security (MobiSec 2019), was held on 17–19 October 2019 in Taichung, Taiwan. The symposium provided academic and industry professionals with an opportunity to discuss the latest issues and progress in advancing smart applications based on future ICT and its related security. The symposium aimed to publish high-quality papers strictly related to the various theories and practical applications concerning advanced smart applications, future ICT, and related communications and networks. It was expected that the symposium and its publications would trigger further related research and technology improvements in this field.

    Dense Vision in Image-guided Surgery

    Image-guided surgery needs an efficient and effective camera tracking system in order to perform augmented reality for overlaying preoperative models or labelling cancerous tissues on the 2D video images of the surgical scene. Tracking in endoscopic/laparoscopic scenes, however, is an extremely difficult task, primarily due to tissue deformation, instrument invasion into the surgical scene and the presence of specular highlights. State-of-the-art feature-based SLAM systems such as PTAM fail in tracking such scenes since the number of good features to track is very limited. When the scene is smoky or when there are instrument motions, feature-based tracking fails immediately. The work of this thesis provides a systematic approach to this problem using dense vision. We initially attempted to register a 3D preoperative model with multiple 2D endoscopic/laparoscopic images using a dense method, but this approach did not perform well. We subsequently proposed stereo reconstruction to directly obtain the 3D structure of the scene. By using the dense reconstructed model together with robust estimation, we demonstrate that dense stereo tracking can be incredibly robust even within extremely challenging endoscopic/laparoscopic scenes. Several validation experiments have been conducted in this thesis. The proposed stereo reconstruction algorithm has turned out to be the state-of-the-art method on several publicly available ground-truth datasets. Furthermore, the proposed robust dense stereo tracking algorithm has proved highly accurate in a synthetic environment (< 0.1 mm RMSE) and qualitatively extremely robust when applied to real scenes from robot-assisted laparoscopic prostatectomy (RALP) surgery. This is an important step toward achieving accurate image-guided laparoscopic surgery. Open Access
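
To make the role of robust estimation above concrete, dense tracking of this kind is typically posed as minimizing a robustified sum of per-pixel residuals between the reconstructed model and the current view; a generic formulation (illustrative only, and not necessarily the exact cost used in the thesis) is:

    \min_{\xi} \sum_i \rho_\delta\big( r_i(\xi) \big), \qquad
    \rho_\delta(r) = \begin{cases} \tfrac{1}{2} r^2 & \text{if } |r| \le \delta \\ \delta\big(|r| - \tfrac{1}{2}\delta\big) & \text{otherwise} \end{cases}

Here ξ is the camera pose, r_i(ξ) is the residual of the i-th pixel or 3D point under that pose, and ρ_δ is a robust kernel (the Huber loss is shown) that bounds the influence of outlier residuals caused by specular highlights, smoke, or instruments entering the scene.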

    Structured light assisted real-time stereo photogrammetry for robotics and automation. Novel implementation of stereo matching

    In this Master's thesis project a novel implementation of a stereo matching based method is proposed. Moreover, an exhaustive analysis of the state-of-the-art algorithms in that field is outlined. Specifically, both standard and deep learning based methods have been extensively investigated, so as to provide useful insights for the designed implementation. The developed work is structured in the following manner. At first, a research phase was carried out in order to simply and rapidly test the envisioned strategy. Subsequently, a first implementation of the algorithm was designed and tested using data from the Middlebury 2014 dataset, which is one of the most widely used datasets in the computer vision area. At this stage, numerous tests were completed and various changes to the algorithm pipeline were made in order to improve the final result. Finally, after that exhaustive research phase, the actual method was designed and tested using real-environment images obtained from the stereo device developed by the company in which this work was produced. A fundamental element of the project is indeed that stereo device, as the designed algorithm is based on the data produced by the cameras that constitute it. Specifically, the main function of the system designed by LaDiMo is to make the stereo matching based procedure simultaneously faster and more accurate. Indeed, one of the main objectives of the project was to create an algorithm capable of demonstrating potential real-time performance. This has been achieved by one of the two methods created: a lightweight implementation that strongly exploits the information coming from the LaDiMo device, so as to provide accurate results while keeping the computational time short. At the end of this Master's thesis, images showing the main outcomes obtained are presented, together with a discussion of the further improvements that will be added to the project. Since the implemented method is not optimized, it only demonstrates a potential real-time implementation, which would be achieved through an efficient refactoring of the main pipeline.
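
The thesis method itself is custom and tied to the LaDiMo structured-light device, so the snippet below is only a generic off-the-shelf baseline of the kind one might compare against on Middlebury 2014-style rectified pairs: OpenCV's semi-global block matcher, with illustrative (untuned) parameters and placeholder file names.

    import cv2
    import numpy as np

    # Load a rectified stereo pair (placeholder file names).
    left = cv2.imread("im0.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("im1.png", cv2.IMREAD_GRAYSCALE)

    # Generic semi-global block matcher used here purely as a baseline;
    # the parameters below are illustrative, not tuned for any dataset.
    block = 5
    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,          # search range; must be divisible by 16
        blockSize=block,
        P1=8 * block * block,        # penalty for +/-1 disparity change
        P2=32 * block * block,       # penalty for larger disparity changes
        uniquenessRatio=10,
        speckleWindowSize=100,
        speckleRange=2,
    )

    # StereoSGBM returns fixed-point disparities scaled by 16.
    disparity = matcher.compute(left, right).astype(np.float32) / 16.0

A method such as the one developed in the thesis would then be evaluated against this kind of baseline using the standard Middlebury error metrics (e.g. the fraction of pixels whose disparity error exceeds a threshold).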

    High performance and error resilient probabilistic inference system for machine learning

    Many real-world machine learning applications can be considered as inferring the best label assignment of maximum a posteriori probability (MAP) problems. Since these MAP problems are NP-hard in general, they are often dealt with using approximate inference algorithms on Markov random fields (MRFs), such as belief propagation (BP). However, this approximate inference is still computationally demanding, and thus custom hardware accelerators have been attractive for high performance and energy efficiency. There are various custom hardware implementations that employ BP to achieve reasonable performance for real-world applications such as stereo matching. Due to the lack of convergence guarantees, however, BP often fails to provide the right answer, thus degrading the performance of the hardware. Therefore, we consider sequential tree-reweighted message passing (TRW-S), which avoids many of the convergence problems of BP via sequential execution of its computations, but which poses challenges for parallel, high-throughput implementation. In this work, we therefore propose a novel streaming hardware architecture that parallelizes the sequential computations of TRW-S. Experimental results on stereo matching benchmarks show promising performance of our hardware implementation compared to the software implementation as well as other BP-based custom hardware or GPU implementations. From this result, we further demonstrate video-rate, high-quality stereo matching using a hybrid CPU+FPGA platform. We propose three frame-level optimization techniques to fully exploit the computational resources of a hybrid CPU+FPGA platform and achieve significant speed-up. We first propose a message reuse scheme guided by simple scene change detection: it determines whether the current result is expected to be similar to the inference result of the previous frame and lets the current inference be made accordingly. We also consider frame-level parallelization to process multiple frames in parallel using the multiple FPGAs available in the platform. This parallelized hardware procedure is further pipelined with data management on the CPU to overlap the execution time of the two and thereby reduce the entire processing time of the stereo video sequence. Experimental results with real-world stereo video sequences show that our stereo matching system achieves video-rate speed for QVGA stereo videos. Next, we consider the error resilience of the message passing hardware for energy-efficient hardware implementation. Modern nanoscale CMOS process technologies suffer reliability degradation caused by process, temperature and voltage variations. Conventional approaches to dealing with such unreliability (e.g., design for the worst-case scenario) are complex and inefficient in terms of hardware resources and energy consumption. As machine learning applications are inherently probabilistic and robust to errors, statistical error compensation (SEC) techniques can play a significant role in achieving robust and energy-efficient implementations. SEC embraces the statistical nature of errors and utilizes statistical and probabilistic techniques to build robust systems; energy efficiency is obtained by trading off the enhanced robustness against energy. In this work, we analyze the error resilience of our message passing inference hardware subject to hardware errors (e.g., errors caused by timing violations in circuits) and explore the application of a popular SEC technique, algorithmic noise tolerance (ANT), to this hardware.
Analysis and simulations show that the TRW-S message passing hardware is tolerant to small-magnitude arithmetic errors, but large-magnitude errors cause significantly inaccurate inference results which need to be corrected using SEC. Experimental results show that the proposed ANT-based hardware can tolerate an error rate of 21.3% with a performance degradation of only 3.5% and an energy saving of 39.7%, compared to error-free hardware. Lastly, we extend our TRW-S hardware toward a general-purpose machine learning framework. We propose an advanced streaming architecture with a flexible choice of MRF settings to achieve a 10-40x speedup across a variety of computer vision applications. Furthermore, we provide a better theoretical understanding of the error resilience of TRW-S, and of the implications of ANT for TRW-S, under a more general MRF setting, along with strong empirical support.
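
For readers unfamiliar with the message passing computations being accelerated above, the sketch below shows min-sum (Viterbi-style) MAP inference on a single 1-D chain of an MRF with a truncated-linear pairwise cost. It is purely illustrative: it implements neither the TRW-S update schedule nor the streaming hardware architecture described above, and all names are hypothetical.

    import numpy as np

    def chain_min_sum(unary, lam, trunc):
        """MAP labeling of a 1-D chain MRF by min-sum message passing.

        `unary` has shape (num_nodes, num_labels); the pairwise cost is the
        truncated-linear V(a, b) = min(lam * |a - b|, trunc).
        """
        n, num_labels = unary.shape
        labels = np.arange(num_labels)
        pairwise = np.minimum(lam * np.abs(labels[:, None] - labels[None, :]), trunc)

        # Forward pass: msg[i][d] is the best cost of nodes 0..i-1 given that
        # node i takes label d (the message passed from node i-1 to node i).
        msg = np.zeros((n, num_labels))
        for i in range(1, n):
            prev = unary[i - 1] + msg[i - 1]
            # Minimize over the label of node i-1 (rows of the pairwise table).
            msg[i] = np.min(prev[:, None] + pairwise, axis=0)

        # Backward pass recovers the optimal label at each node.
        best = np.empty(n, dtype=int)
        best[-1] = np.argmin(unary[-1] + msg[-1])
        for i in range(n - 2, -1, -1):
            best[i] = np.argmin(unary[i] + msg[i] + pairwise[:, best[i + 1]])
        return best

    # Example: 6 nodes, 4 labels, random unary costs.
    rng = np.random.default_rng(0)
    labels = chain_min_sum(rng.random((6, 4)), lam=1.0, trunc=2.0)

On a 2-D stereo MRF, the unary costs would be per-pixel matching costs over disparities, and TRW-S passes such messages sequentially over the grid while reweighting them; that sequential dependence is what the streaming architecture described above is designed to parallelize.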