56 research outputs found
Literature Survey On Stereo Vision Disparity Map Algorithms
This paper presents a literature survey of existing disparity map algorithms. It focuses on the four main stages of processing proposed by Scharstein and Szeliski in their 2002 taxonomy and evaluation of dense two-frame stereo correspondence algorithms. To assist future researchers in developing their own stereo matching algorithms, a summary of the existing algorithms developed for every stage of processing is also provided. The survey also notes previous software-based and hardware-based implementations. Generally, the main processing module of a software-based implementation uses only a central processing unit. By contrast, a hardware-based implementation requires one or more additional processors for its processing module, such as a graphics processing unit or a field-programmable gate array. This literature survey also presents a method of qualitative measurement that is widely used by researchers in the area of stereo vision disparity mapping.
Design of Binocular Stereo Vision System Via CNN-based Stereo Matching Algorithm
Stereo vision is one of the representative technologies in 3D cameras; it uses multiple cameras to perceive depth in three-dimensional space.
The binocular configuration has become the most widely applied form of stereo vision.
In this thesis, we design a binocular stereo vision system based on an adjustable narrow-baseline stereo camera, which simultaneously captures the left and right images of a stereo pair.
Camera calibration and rectification are performed first to obtain rectified stereo pairs, which serve as input to the subsequent step: searching for corresponding points between the left and right images.
The stereo matching algorithm resolves this correspondence problem and plays a crucial part in our system: it produces disparity maps from which depth is predicted using the triangulation principle.
We focus on the first stage of this algorithm and propose a CNN-based approach that computes the matching cost by measuring the similarity between two image patches.
Two network architectures are presented, both based on the siamese network.
The fast network uses the cosine metric to compute similarity with satisfactory accuracy and processing speed.
The slow network instead learns a new metric, which makes the disparity prediction slightly more precise at the cost of considerably longer processing time and more parameters.
The output of either network is taken as the initial matching cost, which is followed by a series of post-processing steps, including cross-based cost aggregation and semi-global cost aggregation.
A Winner-Take-All (WTA) selection then yields the raw disparity map, which undergoes further refinement, including interpolation and image filtering.
The above networks are trained and validated on three standard stereo datasets: Middlebury, KITTI 2012, and KITTI 2015.
Comparative tests of the CNN-based methods against the census transform demonstrate that the former outperform the latter on these datasets.
The algorithm based on the fast network is adopted in our devised system.
To evaluate the performance of the binocular stereo vision system, two types of error criteria are proposed, from which the proper range of working distance under different baseline lengths is obtained.
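The matching-cost, Winner-Take-All, and triangulation steps described in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the feature vectors below are hand-written stand-ins for the siamese network's patch descriptors, only a single scanline is processed, no cost aggregation is applied, and the function names, the focal length, and the baseline are assumptions chosen for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0

def wta_disparity(left_feats, right_feats, max_disp):
    """Winner-Take-All on one rectified scanline.

    For each left pixel x, try every disparity d in [0, max_disp] that keeps
    x - d inside the image, use the negated cosine similarity as the matching
    cost, and keep the d with the lowest cost.
    """
    disparities = []
    for x, feat in enumerate(left_feats):
        best_d, best_cost = 0, float("inf")
        for d in range(min(max_disp, x) + 1):
            cost = -cosine_similarity(feat, right_feats[x - d])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities

def depth_from_disparity(d, focal_px, baseline_m):
    """Triangulation for a rectified pair: Z = f * B / d (infinite depth at d = 0)."""
    return float("inf") if d == 0 else focal_px * baseline_m / d

# Hypothetical descriptors: the left scanline is the right one shifted by 2 pixels.
right = [[1, 0], [0, 1], [1, 1], [1, 2], [2, 1], [3, 1]]
left = [[5, 5], [5, 5]] + right[:4]
disp = wta_disparity(left, right, max_disp=2)
# Assumed camera parameters: 700 px focal length, 0.1 m baseline.
depths = [depth_from_disparity(d, focal_px=700.0, baseline_m=0.1) for d in disp]
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger depth error at long range, which is one reason the usable working distance depends on the baseline length.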
ACCURATE AND FAST STEREO VISION
Stereo vision from short-baseline image pairs is one of the most active research fields in computer vision. The estimation of dense disparity maps from stereo image pairs is still a challenging task, and there is further room for improving accuracy, minimizing the computational cost, and handling outliers, low-textured areas, repeated textures, disparity discontinuities and light variations more efficiently.
This PhD thesis presents two novel methodologies relating to stereo vision from short-baseline image pairs:
I. The first methodology combines three different cost metrics, defined using colour, the CENSUS transform and SIFT (Scale Invariant Feature Transform) coefficients. The selected cost metrics are aggregated based on an adaptive weights approach, in order to calculate their corresponding cost volumes. The resulting cost volumes are merged into a combined one, following a novel two-phase strategy, which is further refined by exploiting semi-global optimization. A mean-shift segmentation-driven approach is exploited to deal with outliers in the disparity maps. Additionally, low-textured areas are handled using disparity histogram analysis, which allows for reliable disparity plane fitting on these areas.
II. The second methodology relies on content-based guided image filtering and weighted semi-global optimization. Initially, the approach uses a pixel-based cost term that combines gradient, Gabor-feature and colour information. The pixel-based matching costs are filtered by applying guided image filtering with support windows of two different sizes, so that two filtered costs are estimated for each pixel. Which of the two filtered costs is finally assigned to a pixel depends on the local image content around that pixel. The filtered cost volume is further refined by weighted semi-global optimization, which improves the disparity accuracy. The handling of occluded areas is enhanced by incorporating a straightforward and time-efficient scheme.
The evaluation results show that both methodologies are very accurate, since they efficiently handle low-textured and occluded areas as well as disparity discontinuities. Additionally, the second approach has very low computational complexity.
In addition to the two methodologies above, which take short-baseline image pairs as input, this PhD thesis presents a novel methodology for generating 3D point clouds of good accuracy from wide-baseline stereo pairs.
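As a rough illustration of the census transform used as one of the cost metrics in the first methodology, the sketch below encodes each pixel by comparing it with its neighbours and compares two codes with the Hamming distance. The window radius, the bit ordering, and the function names are assumptions made for this example; the thesis's combination with colour and SIFT costs, adaptive-weight aggregation, and semi-global optimization is not reproduced.

```python
def census_transform(img, radius=1):
    """Census code per pixel for a grayscale image given as a list of rows.

    Each neighbour in the (2*radius+1)^2 window contributes one bit, set to 1
    when the neighbour is darker than the centre. Pixels too close to the
    border to have a full window are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    code = (code << 1) | int(img[y + dy][x + dx] < img[y][x])
            out[y][x] = code
    return out

def hamming(a, b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")

patch = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
brighter = [[v + 10 for v in row] for row in patch]  # uniform brightness offset

code = census_transform(patch)[1][1]
code_brighter = census_transform(brighter)[1][1]
cost = hamming(code, code_brighter)  # 0: the code depends only on intensity ordering
```

Since the code records only the ordering of intensities within the window, a uniform brightness change between the two views leaves the cost unchanged, which is why census-based costs are often chosen when light variations are a concern.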
Segment-based stereo matching algorithm with rectification for single-lens bi-prism stereovision system
Ph.D., Doctor of Philosophy
Robust hand pose recognition from stereoscopic capture
Hand pose is emerging as an important interface for human-computer interaction. The problem of hand pose estimation from passive stereo inputs has received less attention in the literature than estimation from active depth sensors. This thesis seeks to address this gap by presenting a data-driven method to estimate a hand pose from a stereoscopic camera input, with experimental results comparable to those of more expensive active depth sensors. The frameworks presented in this thesis are based on a two-camera stereo rig, as it yields a simpler and cheaper set-up and calibration. Three frameworks are presented, describing the sequential steps taken to solve the problem of depth and pose estimation of hands.
The first is a data-driven method to estimate a high quality depth map of a hand from a stereoscopic camera input by introducing a novel regression framework. The method first computes disparity using a robust stereo matching technique. Then, it applies a machine learning technique based on Random Forest to learn the mapping between the estimated disparity and depth given ground truth data. We introduce Eigen Leaf Node Features (ELNFs) that perform feature selection at the leaf nodes in each tree to identify features that are most discriminative for depth regression. The system provides a robust method for generating a depth image with an inexpensive stereo camera.
The second framework improves on the task of hand depth estimation from stereo capture by introducing a novel superpixel-based regression framework that takes advantage of the smoothness of the depth surface of the hand. To this end, it introduces Conditional Regressive Random Forest (CRRF), a method that combines a Conditional Random Field (CRF) and a Regressive Random Forest (RRF) to model the mapping from a stereo RGB image pair to a depth image. The RRF provides a unary term that adaptively selects different stereo-matching measures as it implicitly determines matching pixels in a coarse-to-fine manner. While the RRF makes depth prediction for each super-pixel independently, the CRF unifies the prediction of depth by modeling pair-wise interactions between adjacent superpixels.
The final framework introduces a stochastic approach to propose potential depth solutions to the observed stereo capture and evaluate these proposals using two convolutional neural networks (CNNs). The first CNN, configured in a Siamese network architecture, evaluates how consistent the proposed depth solution is to the observed stereo capture. The second CNN estimates a hand pose given the proposed depth. Unlike sequential approaches that reconstruct pose from a known depth, this method jointly optimizes the hand pose and depth estimation through Markov-chain Monte Carlo (MCMC) sampling. This way, pose estimation can correct for errors in depth estimation, and vice versa.
Experimental results using an inexpensive stereo camera show that the proposed system measures pose more accurately than competing methods. More importantly, it presents the possibility of pose recovery from stereo capture that is on par with depth-based pose recovery.
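The joint optimization by Markov-chain Monte Carlo sampling mentioned in the final framework can be sketched with a generic Metropolis sampler. This is only an illustration under stated assumptions: the toy `energy` function stands in for the two CNN scores, the state is a hypothetical `(depth, pose)` pair of scalars, the proposal is a symmetric Gaussian step, and none of the names or parameter values come from the thesis.

```python
import math
import random

def metropolis_minimize(energy, propose, x0, steps, temperature=1.0, seed=0):
    """Metropolis sampling with a symmetric proposal; returns the best state seen.

    A candidate is always accepted if it lowers the energy, and otherwise
    accepted with probability exp(-(E_candidate - E_current) / temperature).
    """
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(steps):
        cand = propose(x, rng)
        e_cand = energy(cand)
        if e_cand <= e or rng.random() < math.exp((e - e_cand) / temperature):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e

def energy(state):
    """Toy stand-in for the two learned scores (assumption, not the thesis's model).

    The first term plays the role of depth-to-stereo consistency, the second
    the agreement between pose and depth, so both are optimized jointly.
    """
    depth, pose = state
    return (depth - 0.5) ** 2 + (pose - 2.0 * depth) ** 2

def propose(state, rng):
    """Symmetric Gaussian random-walk proposal over both variables."""
    depth, pose = state
    return (depth + rng.gauss(0.0, 0.1), pose + rng.gauss(0.0, 0.1))

start = (0.0, 0.0)
best_state, best_energy = metropolis_minimize(energy, propose, start,
                                              steps=2000, temperature=0.01)
```

Because each proposal perturbs depth and pose together and is scored by a single energy, an error in one variable can be corrected through the term that couples it to the other, which is the intuition behind sampling them jointly rather than sequentially.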
Economic Determinants of Rural-Urban Labor Migration: The Case of Korea
Though it is conceded that the major stream of migration in developing countries flows from rural to urban areas, research identifying the principal determinants of rural-urban migration is quite scarce, especially in quantitative terms. Therefore, this dissertation investigates hypotheses concerning the economic determinants of rural-urban migration and also examines its economic impact.
Migration can be viewed from many different perspectives, such as selectivity, pull and push, and human investment approaches. However, this study follows economic tradition and views migration in a general equilibrium context. In the economic opportunity hypothesis, discrepancies in factor payments among regions are postulated. The significance of selected economic determinants, such as the magnitude of capital investment and the relative prices of rural and urban goods, is then tested along with the economic opportunity hypothesis.
The migration model in this study is specified as a system of interrelated equations. The simultaneous equations model employed enables us to examine the effect of key variables on migration and the impact of these variables on rural and urban labor markets. The model is tested with Korean labor force data.
Rural wage rates were found to have a negative relationship with migration, whereas urban wage rates showed a positive relationship. Changes in the relative prices of rural and urban goods were found to exert a significant impact on rural-urban migration. A decrease in the prices of rural goods may induce an increase in out-migration, and an increase in the prices of urban goods may act as a pull factor that increases rural-urban migration. Thus, net out-migration may be reduced when agricultural prices increase.
An increase in capital investment in rural areas was found to reduce rural-urban migration, and an increase in capital investment in urban areas was found to encourage it. Investment was found to be the most significant variable in determining wage rates; however, it was less significant than the prices of rural and urban goods in determining rural-urban migration.
It is concluded that changing relative economic opportunities, changing output prices, and capital investments between rural and urban areas are important factors providing impetus for rural-urban migration, and that they are major economic determinants of rural-urban migration in Korea. Thus, government is faced with alternative policies for the reduction of rural-urban migration. For instance, the government may give wage subsidies to the rural employment sector, it may increase large-scale public investments, or it may allow a rise in agricultural prices. Each of these policies would tend to reduce the flow of resources from the agricultural sector to the non-agricultural sector.
A redistribution of resources may not facilitate efficient resource allocation and the optimal growth of the national economy. Therefore, the efficiency aspects of stimulating a resource flow should be examined carefully before these policy variables are implemented for achieving population redistribution.