190 research outputs found

Stereo Vision and Scene Segmentation

Author: Dominio Fabio
Mattoccia Stefano
Mutto Carlo Dal
Zanuttigh Pietro
Publication venue: 'IntechOpen'
Publication date: 01/01/2012

This chapter focuses on how segmentation robustness can be improved by the 3D scene geometry provided by stereo vision systems, which are simpler and cheaper than most current range cameras. In fact, two inexpensive cameras arranged in a rig are often enough to obtain good results. Another noteworthy characteristic motivating the choice of stereo systems is that they provide both 3D geometry and color information of the framed scene without requiring further hardware. Indeed, as will be seen in the following sections, 3D geometry extraction from a framed scene by a stereo system, also known as stereo reconstruction, may be eased and improved by scene segmentation, since the correspondence search can be restricted to the same segment in the left and right images.

IntechOpen

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Archivio istituzionale della ricerca - Università di Padova

Depth Recovery with Rectification using Single-Lens Prism based Stereovision System

Author: WANG DAOLEI
Publication date: 22/08/2012

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Segment-based stereo matching algorithm with rectification for single-lens bi-prism stereovision system

Author: BAI YADING
Publication date: 21/08/2014

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Multi Cost Function Fuzzy Stereo Matching Algorithm for Object Detection and Robot Motion Control

Author: Hegde Navya Thirumaleshwar
Shetty Akhil Appu
Srinivasan C R
Vaz Aldrin Claytus
Publication venue: Universitas Muhammadiyah Yogyakarta
Publication date: 05/06/2023

Stereo matching algorithms work with multiple images of a scene, taken from two viewpoints, to generate depth information. Authors usually use a single matching function to measure similarity between corresponding regions in the images. In the present research, the authors have considered a combination of multiple data costs for disparity generation. Disparity maps generated from stereo images tend to have noisy sections. The presented research work describes a methodology to refine such disparity maps so that they can be further processed to detect obstacle regions. A novel entropy-based selective refinement (ESR) technique is proposed to refine the initial disparity map. Information from both the left and right disparity maps is used for this refinement technique. For every disparity map, block-wise entropy is calculated. The average entropy values of the corresponding positions in the disparity maps are compared. If the variation between these entropy values exceeds a threshold, then the corresponding disparity value is replaced with the mean disparity of the block with lower entropy. The results of this refinement were compared with similar methods and observed to be better. Furthermore, in this research work, the v-disparity values are used to highlight the road surface in the disparity map. The regions belonging to the sky are removed through HSV-based segmentation. The remaining regions, which are the ROIs, are refined through a u-disparity area-based technique. Based on this, the closest obstacles are detected through the use of k-means segmentation. The segmented regions are further refined through a u-disparity image information-based technique and used as masks to highlight obstacle regions in the disparity maps. This information is used in conjunction with a Kalman filter-based path planning algorithm to guide a mobile robot from a source location to a destination location while also avoiding any obstacle detected in its path.
A stereo camera setup was built and the performance of the algorithm on local real-life images, captured through the cameras, was observed. The proposed methodologies were evaluated using real-life outdoor images obtained from the KITTI dataset and images with radiometric variations from the Middlebury stereo dataset.
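The entropy comparison described in this abstract can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the function names, the non-overlapping block layout, the block size, the threshold value, and the choice to overwrite the whole block of the left map are all assumptions made for the example.

```python
import math
from collections import Counter


def block_entropy(values):
    """Shannon entropy (in bits) of the disparity values in one block."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def esr_refine(left, right, block=2, threshold=0.5):
    """Return a refined copy of the left disparity map (hypothetical helper).

    Assumes `left` and `right` are equally sized 2D lists already aligned
    to the same pixel grid. For each non-overlapping block, the entropies
    of the two maps are compared; if they differ by more than `threshold`,
    the block is filled with the mean disparity of the lower-entropy block.
    """
    h, w = len(left), len(left[0])
    out = [row[:] for row in left]
    for r0 in range(0, h, block):
        for c0 in range(0, w, block):
            cells = [(r, c)
                     for r in range(r0, min(r0 + block, h))
                     for c in range(c0, min(c0 + block, w))]
            lv = [left[r][c] for r, c in cells]
            rv = [right[r][c] for r, c in cells]
            el, er = block_entropy(lv), block_entropy(rv)
            if abs(el - er) > threshold:
                source = lv if el < er else rv  # block with lower entropy
                mean = sum(source) / len(source)
                for r, c in cells:
                    out[r][c] = mean
    return out


noisy_left = [[1, 9], [3, 7]]
smooth_right = [[5, 5], [5, 5]]
refined = esr_refine(noisy_left, smooth_right)
```

In this toy input the left block has four distinct values and the right block has one, so the left block is replaced by the right block's mean.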

Leading & Enlightening Journal UMY

Applications of Computer Vision Technologies of Automated Crack Detection and Quantification for the Inspection of Civil Infrastructure Systems

Author: Wu Liuliu
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2015

Many components of existing civil infrastructure systems, such as road pavement, bridges, and buildings, suffer from rapid aging and require enormous national resources from federal and state agencies for inspection and maintenance. Cracks are among the most important material and structural defects and must be inspected not only to maintain civil infrastructure at a high level of safety and serviceability, but also to provide early warning against failure. Conventional human visual inspection is still considered the primary inspection method. However, it is well established that human visual inspection is subjective and often inaccurate. In order to improve current manual visual inspection for crack detection and evaluation of civil infrastructure, this study explores the application of computer vision techniques as a non-destructive evaluation and testing (NDE&T) method for automated crack detection and quantification for different civil infrastructures. In this study, computer vision-based algorithms were developed and evaluated to deal with the different field inspection situations that inspectors could face in crack detection and quantification. The depth, the distance between camera and object, is a necessary extrinsic parameter that has to be measured to quantify crack size, since other parameters, such as focal length, resolution, and camera sensor size, are intrinsic and are usually provided by camera manufacturers. Thus, computer vision techniques were evaluated with different crack inspection applications with constant and variable depths. For the fixed-depth applications, computer vision techniques were applied to two field studies, including 1) automated crack detection and quantification for road pavement using the Laser Road Imaging System (LRIS), and 2) automated crack detection on bridge cable surfaces, using a cable inspection robot.
For the variable-depth applications, two field studies were conducted, including 3) automated crack recognition and width measurement of cracks in concrete bridges using a high-magnification telescopic lens, and 4) automated crack quantification and depth estimation using wearable glasses with stereovision cameras. From the realistic field applications of computer vision techniques, a novel self-adaptive image-processing algorithm was developed using a series of morphological transformations to connect fragmented crack pixels in digital images. The crack-defragmentation algorithm was evaluated with road pavement images. The results showed that the accuracy of automated crack detection, combined with an artificial neural network classifier, was significantly improved by reducing both false positives and false negatives. Using up to six crack features, including area, length, orientation, texture, intensity, and wheel-path location, crack detection accuracy was evaluated to find the optimal sets of crack features. Lab and field test results of different inspection applications show that the proposed computer vision-based crack detection and quantification algorithms can detect and quantify cracks on the surfaces of different structures and at different depths. Some guidelines for applying computer vision techniques are also suggested for each crack inspection application.
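The role of depth in quantifying crack size, as mentioned in this abstract, can be illustrated with the standard pinhole camera relation. This is a minimal sketch under stated assumptions (a fronto-parallel surface, no lens distortion, square pixels); the function names and the example numbers are hypothetical and are not taken from the thesis.

```python
def mm_per_pixel(depth_mm, focal_length_mm, sensor_width_mm, image_width_px):
    """Size on the object surface covered by one pixel (pinhole model).

    Assumes the surface is parallel to the image plane at distance depth_mm.
    """
    pixel_pitch_mm = sensor_width_mm / image_width_px  # physical pixel size
    return depth_mm * pixel_pitch_mm / focal_length_mm


def crack_width_mm(width_px, depth_mm, focal_length_mm,
                   sensor_width_mm, image_width_px):
    """Convert a crack width measured in pixels to millimetres."""
    scale = mm_per_pixel(depth_mm, focal_length_mm,
                         sensor_width_mm, image_width_px)
    return width_px * scale


# Hypothetical setup: 36 mm wide sensor, 6000 px wide image,
# 50 mm lens, surface 1 m away, crack measured as 5 px wide.
width = crack_width_mm(5, 1000.0, 50.0, 36.0, 6000)
```

With these example numbers one pixel covers about 0.12 mm, so a 5-pixel crack is about 0.6 mm wide. Doubling the depth doubles the estimate, which is why depth has to be measured when it varies.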

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

Depth Estimation - An Introduction

Author: Mezcua Belén Ruiz
Pena José M. Sánchez
Sanz Pablo Revuelta
Publication venue: 'IntechOpen'
Publication date: 11/07/2012

IntechOpen

An investigation into common challenges of 3D scene understanding in visual surveillance

Author: Tarrit Katy

Nowadays, video surveillance systems are ubiquitous. Most installations simply consist of CCTV cameras connected to a central control room and rely on human operators to interpret what they see on the screen in order to, for example, detect a crime (either during or after an event). Some modern computer vision systems aim to automate the process, at least to some degree, and various algorithms have been somewhat successful in certain limited areas. However, such systems remain inefficient in general circumstances and present real challenges yet to be solved. These challenges include the ability to recognise and ultimately predict and prevent abnormal behaviour or even reliably recognise objects, for example in order to detect left luggage or suspicious objects. This thesis first aims to study the state of the art and identify the major challenges and possible requirements of future automated and semi-automated CCTV technology in the field. This thesis presents the application of a suite of 2D and highly novel 3D methodologies that go some way to overcome current limitations. The methods presented here are based on the analysis of object features directly extracted from the geometry of the scene and start with a consideration of mainly existing techniques, such as the use of lines, vanishing points (VPs) and planes, applied to real scenes. Then, an investigation is presented into the use of richer 2.5D/3D surface normal data. In all cases the aim is to combine both 2D and 3D data to obtain a better understanding of the scene, aimed ultimately at capturing what is happening within the scene in order to be able to move towards automated scene analysis. Although this thesis focuses on the widespread application of video surveillance, an example case of the railway station environment is used to represent typical real-world challenges, where the principles can be readily extended elsewhere, such as to airports, motorways, households, shopping malls, etc.
The context of this research work, together with an overall presentation of existing methods used in video surveillance and their challenges, is described in chapter 1. Common computer vision techniques such as VP detection, camera calibration, 3D reconstruction, segmentation, etc., can be applied in an effort to extract meaning for video surveillance applications. According to the literature, these methods have been well researched and their use will be assessed in the context of current surveillance requirements in chapter 2. While existing techniques can perform well in some contexts, such as an architectural environment composed of simple geometrical elements, their robustness and performance in feature extraction and object recognition tasks are not sufficient to solve the key challenges encountered in the general video surveillance context. This is largely due to issues such as variable lighting, weather conditions, shadows, and the general complexity of the real-world environment. Chapter 3 presents the research and contribution on those topics (methods to extract optimal features for a specific CCTV application) as well as their strengths and weaknesses, to highlight that the proposed algorithm obtains better results than most due to its specific design. The comparison of current surveillance systems and methods from the literature has shown, however, that 2D data are almost constantly used for many applications. Indeed, industrial systems as well as the research community have been intensively improving 2D feature extraction methods ever since image analysis and scene understanding became of interest. The constant progress on 2D feature extraction methods throughout the years makes it almost effortless nowadays due to a large variety of techniques. Moreover, even if 2D data do not allow solving all challenges in video surveillance or other applications, they are still used as starting stages towards scene understanding and image analysis.
Chapter 4 will then explore 2D feature extraction via vanishing point detection and segmentation methods. A combination of the most common techniques and a novel approach will then be proposed to extract vanishing points from video surveillance environments. Moreover, segmentation techniques will be explored with the aim of determining how they can be used to complement vanishing point detection and lead towards 3D data extraction and analysis. In spite of the contribution above, without significant a priori information about the scene geometry, 2D data is insufficient for all but the simplest applications aimed at obtaining an understanding of a scene, where the aim is a robust detection of, say, left luggage or abnormal behaviour. Therefore, more information is required in order to be able to design a more automated and intelligent algorithm to obtain richer information from the scene geometry and so a better understanding of what is happening within. This can be overcome by the use of 3D data (in addition to 2D data), allowing the opportunity for object “classification” and from this to infer a map of functionality, describing feasible and unfeasible object functionality in a given environment. Chapter 5 presents how 3D data can be beneficial for this task and the various solutions investigated to recover 3D data, as well as some preliminary work towards plane extraction. It is apparent that VPs and planes give useful information about a scene’s perspective and can assist in 3D data recovery within a scene. However, neither VPs nor plane detection techniques alone allow the recovery of more complex generic object shapes (for example, those composed of spheres, cylinders, etc.), and any simple model will suffer in the presence of non-Manhattan features, e.g. introduced by the presence of an escalator. For this reason, a novel photometric stereo-based surface normal retrieval methodology is introduced to capture the 3D geometry of the whole scene or part of it.
Chapter 6 describes how photometric stereo allows recovery of 3D information in order to obtain a better understanding of a scene, while also partially overcoming some current surveillance challenges, such as difficulty in resolving fine detail, particularly at large standoff distances, and in isolating and recognising more complex objects in real scenes. Here items of interest may be obscured by complex environmental factors that are subject to rapid change, making, for example, the detection of suspicious objects and behaviour highly problematic. Here innovative use is made of an untapped latent capability offered within modern surveillance environments to introduce a form of environmental structuring to good advantage in order to achieve a richer form of data acquisition. This chapter also goes on to explore the novel application of photometric stereo in such diverse applications, shows how our algorithm can be incorporated into an existing surveillance system, and considers a typical real commercial application. One of the most important aspects of this research work is its application. Indeed, while most of the research literature has been based on relatively simple structured environments, the approach here has been designed to be applied to real surveillance environments, such as railway stations, airports, waiting rooms, etc., where surveillance cameras may be fixed or, in the future, form part of a mobile, free-roaming robotic surveillance device that must continually reinterpret its changing environment. So, as mentioned previously, while the main focus has been to apply this algorithm to railway station environments, the work has been approached in a way that allows adaptation to many other applications, such as autonomous robotics, and to motorway, shopping centre, street and home environments. All of these applications require a better understanding of the scene for security or safety purposes.
Finally, chapter 7 presents an overall conclusion and outlines what will be achieved in the future.

UWE Bristol Research Repository

Automatic plant features recognition using stereo vision for crop monitoring

Author: Mohammed Amean Zainab
Publication date: 01/01/2017

Machine vision and robotic technologies have the potential to accurately monitor plant parameters which reflect plant stress and water requirements, for use in farm management decisions. However, autonomous identification of individual plant leaves on a growing plant under natural conditions is a challenging task for vision-guided agricultural robots, due to the complexity of data relating to various stages of growth and ambient environmental conditions. There are numerous machine vision studies that are concerned with describing the shape of leaves that are individually presented to a camera. The purpose of these studies is to identify plant species, or to autonomously detect multiple leaves from small seedlings under greenhouse conditions. Machine vision-based detection of individual leaves, and the challenges presented by overlapping leaves on a developed plant canopy, using depth perception properties under natural outdoor conditions is yet to be reported. Stereo vision has recently emerged for use in a variety of agricultural applications and is expected to provide an accurate method for plant segmentation and identification which can benefit from depth properties and robustness. This thesis presents a plant leaf extraction algorithm using a stereo vision sensor. This algorithm is used for multiple leaf segmentation and overlapping leaf separation using a combination of image features, specifically colour, shape and depth. The separation between connected and overlapping leaves relies on the measurement of the discontinuity in depth gradient for the disparity maps. Two techniques have been developed to implement this task based on global and local measurement. A geometrical plane from each segmented leaf can be extracted and used to parameterise a 3D model of the plant image and to measure the inclination angle of each individual leaf.
The stem and branch segmentation and counting method was developed based on the vesselness measure and Hough transform technique. Furthermore, a method for reconstructing the segmented parts of hibiscus plants is presented and a 2.5D model is generated for the plant. Experimental tests were conducted with two different selected plants: cotton of different sizes, and hibiscus, in an outdoor environment under varying light conditions. The proposed algorithm was evaluated using 272 cotton and hibiscus plant images. The results show an observed enhancement in leaf detection when utilising depth features, where many leaves in various positions and shapes (single, touching and overlapping) were detected successfully. Depth properties were more effective in separating occluded and overlapping leaves, with a high separation rate of 84%, and these can be detected automatically without adding any artificial tags on the leaf boundaries. The results exhibit an acceptable segmentation rate of 78% for individual plant leaves, thereby differentiating the leaves from their complex backgrounds and from each other. The results present almost identical performance for both species under various lighting and environmental conditions. For the stem and branch detection algorithm, experimental tests were conducted on 64 colour images of both species under different environmental conditions. The results show higher stem and branch segmentation rates for hibiscus indoor images (82%) compared to hibiscus outdoor images (49.5%) and cotton images (21%). The segmentation and counting of plant features could provide accurate estimates of plant growth parameters, which can be beneficial for many agricultural tasks and applications.
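The idea of separating overlapping leaves at depth discontinuities, mentioned in this abstract, can be illustrated in one dimension. The sketch below is a simplified interpretation, not the thesis method: it scans a single row of a disparity map and starts a new segment wherever neighbouring disparities differ by more than a threshold. The function name and the `jump` value are hypothetical.

```python
def split_at_depth_jumps(disparity_row, jump=2.0):
    """Assign a segment label to each pixel of one disparity-map row.

    A new label starts wherever the disparity changes by more than `jump`
    between neighbouring pixels, which is taken here as a boundary between
    two surfaces at different depths (for example two overlapping leaves).
    """
    if not disparity_row:
        return []
    labels = [0]
    for prev, cur in zip(disparity_row, disparity_row[1:]):
        step = abs(cur - prev)
        labels.append(labels[-1] + 1 if step > jump else labels[-1])
    return labels


# Two surfaces: one near disparity 10, one near disparity 20.
row_labels = split_at_depth_jumps([10, 10, 11, 20, 20, 21])
```

Here the first three pixels share one label and the last three share another, because the only step larger than the threshold is between 11 and 20.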

University of Southern Queensland ePrints

Scene segmentation using similarity, motion and depth based cues

Author: Mitra Bhargav Kumar
Publication date: 20/09/2010

Segmentation of complex scenes to aid surveillance is still considered an open research problem. In this thesis a computational model (CM) has been developed to classify a scene into foreground, moving-shadow and background regions. It has been demonstrated how the CM, with the optional use of a channel ratio test, can be applied to demarcate foreground shadow regions in indoor scenes illuminated by a fixed incandescent source of light. A combined approach, involving the CM working in tandem with a traditional motion cue-based segmentation method, has also been constructed. In the combined approach, the CM is applied to segregate the foreground shaded regions in a current frame based on a binary mask generated using a standard background subtraction process (BSP). Various popular outlier detection strategies have been investigated to assess their suitability for automatically generating the threshold required to develop a binary mask from a difference frame, the outcome of the BSP. To evaluate the full scope of the pixel labeling capabilities of the CM and to estimate the associated time constraints, the model is deployed for foreground scene segmentation in recorded real-life video streams. The observations made validate the satisfactory performance of the model in most cases. In the second part of the thesis, depth-based cues have been exploited to perform the task of foreground scene segmentation. An active structured light-based depth-estimating arrangement has been modeled in the thesis; the choice of modeling an active system over a passive stereovision one has been made to alleviate some of the difficulties associated with the classical correspondence problem. The model developed not only facilitates use of the set-up but also makes possible a method to increase the working volume of the system without explicitly encoding the projected structured pattern.
Finally, it is explained how scene segmentation can be accomplished based solely on the structured pattern disparity information, without generating explicit depth maps. To de-noise the difference frames generated using the developed method, two median filtering schemes have been implemented. One of the schemes is advocated for practical use and is described in terms of discrete morphological operators, thus facilitating hardware realisation of the method to speed up the de-noising process.
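The step of turning a difference frame into a binary mask with an automatically chosen threshold, as described in this abstract, might look like the sketch below. The abstract does not say which outlier rule was adopted, so the median plus scaled MAD (median absolute deviation) rule used here is a stand-in chosen for illustration; the function name and the factor `k` are likewise assumptions.

```python
from statistics import median


def binary_mask(frame, background, k=3.0):
    """Background subtraction with an outlier-based threshold (illustrative).

    `frame` and `background` are equally sized 2D lists of grey levels.
    A pixel is marked foreground (1) when its absolute difference from the
    background exceeds median + k * 1.4826 * MAD of all differences.
    """
    diff = [[abs(f - b) for f, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]
    flat = [v for row in diff for v in row]
    med = median(flat)
    mad = median([abs(v - med) for v in flat])
    threshold = med + k * 1.4826 * mad  # 1.4826 scales MAD to a std. dev.
    return [[1 if v > threshold else 0 for v in row] for row in diff]


background = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
frame = [[11, 12, 11], [12, 200, 11], [13, 12, 11]]
mask = binary_mask(frame, background)
```

In this toy frame only the centre pixel differs strongly from the background, so it is the only one marked as foreground. Note that when most differences are identical the MAD is zero and the rule degenerates, which is one reason a practical system would compare several outlier strategies.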

Sussex Research Online