413 research outputs found

    Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture

    Deep neural networks have been applied to a wide range of problems in recent years. In this work, a Convolutional Neural Network (CNN) is applied to the problem of determining depth from a single camera image (monocular depth). Eight different networks are designed to perform depth estimation, each suited to a different feature level; networks with different pooling sizes capture different feature levels. After this set of networks is designed, the models are combined into a single network topology using graph optimization techniques. The resulting "Semi-Parallel Deep Neural Network (SPDNN)" eliminates duplicated common network layers and can be further optimized by retraining, achieving an improved model compared to the individual topologies. In this study, four SPDNN models are trained and evaluated in two stages on the KITTI dataset. In the first stage of the experiment, the ground-truth images are provided by the benchmark; in the second stage, the ground-truth images are the depth maps produced by a state-of-the-art stereo matching method. The results of the first evaluation demonstrate that using post-processing techniques to refine the target of the network increases the accuracy of depth estimation on individual mono images. The second evaluation shows that using segmentation data alongside the original data as the input can improve the depth estimation results to the point where performance is comparable with stereo depth estimation. The computational time is also discussed in this study. Comment: 44 pages, 25 figures
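
    As an illustration of the merging idea, the following is a minimal sketch in a Keras-style API, assuming two of the eight networks share their first convolutional block before diverging with different pooling sizes; all layer names and sizes are hypothetical, not the authors' actual topology.

        # Sketch of the "semi-parallel" idea: branches that would duplicate
        # the same early layers share them once, then diverge with
        # different pooling sizes. Sizes are illustrative only.
        from tensorflow.keras import layers, Model

        inp = layers.Input(shape=(128, 416, 3))              # single mono image
        shared = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
        shared = layers.Conv2D(32, 3, padding="same", activation="relu")(shared)

        # Branch A: small pooling -> finer feature level
        a = layers.MaxPooling2D(pool_size=2)(shared)
        a = layers.Conv2D(64, 3, padding="same", activation="relu")(a)

        # Branch B: larger pooling -> coarser feature level
        b = layers.MaxPooling2D(pool_size=4)(shared)
        b = layers.Conv2D(64, 3, padding="same", activation="relu")(b)
        b = layers.UpSampling2D(size=2)(b)                   # realign resolutions

        merged = layers.Concatenate()([a, b])
        depth = layers.Conv2D(1, 3, padding="same")(merged)  # per-pixel depth map
        model = Model(inp, depth)                            # retrain the merged net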

    Wide baseline stereo matching with convex bounded-distortion constraints

    Finding correspondences in wide-baseline setups is a challenging problem. Existing approaches have focused largely on developing better feature descriptors for correspondence and on accurately recovering epipolar line constraints. This paper focuses on the challenging problem of finding correspondences once approximate epipolar constraints are given. We introduce a novel method that integrates a deformation model: specifically, we formulate the problem as finding the largest number of corresponding points related by a bounded-distortion map that obeys the given epipolar constraints. We show that, while the set of bounded-distortion maps is not convex, the subset of maps that obey the epipolar line constraints is convex, allowing us to introduce an efficient algorithm for matching. We further utilize a robust cost function for matching and employ majorization-minimization for its optimization. Our experiments indicate that our method finds significantly more accurate maps than existing approaches.
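
    In outline, the formulation reads roughly as follows (our paraphrase of the abstract, not the paper's exact notation). With source points $p_i$, candidate matches $q_i$, epipolar lines $\ell_i$, and an inlier tolerance $\epsilon$:

        \max_{f \,\in\, \mathcal{F}_K \cap \mathcal{E}} \; \bigl|\{\, i : \|f(p_i) - q_i\| \le \epsilon \,\}\bigr|,
        \qquad \mathcal{E} = \{\, f : f(p_i) \in \ell_i \ \text{for all } i \,\}

    Here $\mathcal{F}_K$ is the (non-convex) set of $K$-bounded-distortion maps and $\mathcal{E}$ is the affine set of maps obeying the epipolar line constraints; the key observation is that the intersection $\mathcal{F}_K \cap \mathcal{E}$ is convex, so a robust relaxation of this inlier count can be optimized by majorization-minimization over a convex feasible set.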

    Multi Cost Function Fuzzy Stereo Matching Algorithm for Object Detection and Robot Motion Control

    Stereo matching algorithms work with multiple images of a scene, taken from two viewpoints, to generate depth information. Authors usually use a single matching function to measure similarity between corresponding regions in the images. In the present research, the authors have considered a combination of multiple data costs for disparity generation. Disparity maps generated from stereo images tend to have noisy sections, and the presented work describes a methodology to refine such disparity maps so that they can be further processed to detect obstacle regions. A novel entropy-based selective refinement (ESR) technique is proposed to refine the initial disparity map, using information from both the left and right disparity maps. For every disparity map, block-wise entropy is calculated, and the average entropy values at corresponding positions in the two maps are compared. If the variation between these entropy values exceeds a threshold, the corresponding disparity value is replaced with the mean disparity of the block with the lower entropy. The results of this refinement were compared with similar methods and observed to be better. Furthermore, the v-disparity values are used to highlight the road surface in the disparity map, and regions belonging to the sky are removed through HSV-based segmentation. The remaining regions, which are the ROIs, are refined through a u-disparity area-based technique, and the closest obstacles are then detected using k-means segmentation. The segmented regions are further refined through a u-disparity image-information-based technique and used as masks to highlight obstacle regions in the disparity maps. This information is used in conjunction with a Kalman-filter-based path-planning algorithm to guide a mobile robot from a source location to a destination while avoiding any obstacle detected in its path. A stereo camera setup was built and the performance of the algorithm was observed on local real-life images captured through the cameras. The evaluation of the proposed methodologies was carried out using real-life outdoor images from the KITTI dataset and images with radiometric variations from the Middlebury stereo dataset.
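
    A minimal sketch of the ESR step as we read the abstract; the block size, histogram bins, and threshold below are illustrative, not the paper's values.

        # Entropy-based selective refinement (ESR), sketched: compare
        # block-wise entropy between the left and right disparity maps and,
        # where they disagree strongly, keep the lower-entropy block's mean.
        import numpy as np

        def block_entropy(block, bins=32):
            """Shannon entropy of the disparity values in one block."""
            hist, _ = np.histogram(block, bins=bins)
            p = hist / max(hist.sum(), 1)
            p = p[p > 0]
            return float(-(p * np.log2(p)).sum())

        def esr_refine(disp_left, disp_right, block=16, thresh=0.5):
            out = disp_left.astype(float).copy()
            h, w = disp_left.shape
            for y in range(0, h - block + 1, block):
                for x in range(0, w - block + 1, block):
                    bl = disp_left[y:y+block, x:x+block]
                    br = disp_right[y:y+block, x:x+block]
                    el, er = block_entropy(bl), block_entropy(br)
                    if abs(el - er) > thresh:
                        # replace with the mean disparity of the
                        # lower-entropy (less noisy) block
                        out[y:y+block, x:x+block] = (bl if el < er else br).mean()
            return out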

    Development of Correspondence Field and Its Application to Effective Depth Estimation in Stereo Camera Systems

    Stereo camera systems are still the most widely used apparatus for estimating 3D or depth information of a scene due to their low cost. Estimating depth with a stereo camera requires first estimating the disparity map using stereo matching algorithms and then calculating depth via triangulation based on the camera arrangement (the cameras' locations and orientations with respect to the scene). In almost all cases, the arrangement is determined based on human experience, since there is no effective theoretical tool to guide the design of the camera arrangement. This thesis presents the development of a novel tool, called the correspondence field (CF), and its application to optimizing the stereo camera arrangement for depth estimation.
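
    For the standard rectified two-camera arrangement, the triangulation step reduces to the textbook relation below (included for context; $f$ is the focal length, $B$ the baseline, $d$ the disparity):

        Z = \frac{f B}{d},
        \qquad
        \left|\frac{\partial Z}{\partial d}\right| = \frac{f B}{d^2} = \frac{Z^2}{f B}

    so depth error grows quadratically with distance and shrinks as the baseline grows, which is why the camera arrangement has a direct effect on the accuracy of depth estimation.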

    Fast Computation of Personalized PageRank on Large Graphs

    Thesis (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2020. Advisor: Sang-goo Lee. Computation of Personalized PageRank (PPR) in graphs is an important function that is widely utilized in myriad application domains such as search, recommendation, and knowledge discovery. Because the computation of PPR is an expensive process, a good number of innovative and efficient algorithms for computing PPR have been developed. However, efficient computation of PPR within very large graphs with over millions of nodes is still an open problem. Moreover, previously proposed algorithms cannot handle updates efficiently, thus severely limiting their capability of handling dynamic graphs. In this paper, we present a fast-converging algorithm that guarantees high and controlled precision. We improve the convergence rate of the traditional Power Iteration method by adopting successive over-relaxation and initial guess revision, a vector reuse strategy. The proposed method vastly improves on the traditional Power Iteration in terms of convergence rate and computation time, while retaining its simplicity and strictness. Since it can reuse the previously computed vectors for refreshing PPR vectors, its update performance is also greatly enhanced. Also, since the algorithm halts as soon as it reaches a given error threshold, we can flexibly control the trade-off between accuracy and time, a feature lacking in both sampling-based approximation methods and fully exact methods. Experiments show that the proposed algorithm is at least 20 times faster than the Power Iteration and outperforms other state-of-the-art algorithms.
    Contents: 1 Introduction; 2 Preliminaries: Personalized PageRank; 3 Personalized PageRank Computation with Initial Guess Revision; 4 Fully Personalized PageRank Algorithm with Initial Guess Revision; 5 Personalized PageRank Query Processing with Initial Guess Revision; 6 Conclusion.
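
    A minimal sketch of the over-relaxed iteration with vector reuse, as we read the abstract: this is a simplified Jacobi-style variant (the thesis derives the proper SOR splitting), and omega, the tolerance, and the function signature are illustrative.

        # Over-relaxed power iteration for PPR with an optional warm start.
        import numpy as np

        def ppr_sor(P, s, c=0.15, omega=1.2, tol=1e-9, x0=None, max_iter=1000):
            """P: column-stochastic transition matrix (dense or scipy-sparse),
            s: personalization vector, c: restart probability."""
            x = s.copy() if x0 is None else x0.copy()    # reuse an old PPR vector
            for _ in range(max_iter):
                gs = (1.0 - c) * (P @ x) + c * s         # plain iteration step
                x_next = (1.0 - omega) * x + omega * gs  # over-relaxed update
                err = np.abs(x_next - x).sum()
                x = x_next
                if err < tol:                            # halt at the error threshold
                    break
            return x

    After a graph update, calling ppr_sor(P_new, s, x0=old_vector) captures the initial-guess-revision idea: the previously computed vector is usually close to the new solution, so far fewer iterations are needed to refresh it.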

    Contextual Human Trajectory Forecasting within Indoor Environments and Its Applications

    A human trajectory is the likely path a human subject would take to reach a destination. Human trajectory forecasting algorithms try to estimate or predict this path, and have wide applications in robotics, computer vision, and video surveillance. Understanding human behavior can provide useful information for the design of these algorithms. Human trajectory forecasting is an interesting problem because the outcome is influenced by many factors, of which we believe the destination, the geometry of the environment, and the humans in it play a significant role. In addressing this problem, we propose a model that estimates the occupancy behavior of humans based on geometry and behavioral norms. We also develop a trajectory forecasting algorithm that understands this occupancy and leverages it for trajectory forecasting in previously unseen geometries. The algorithm can be useful in a variety of applications; in this work, we show its utility in three: person re-identification, camera placement optimization, and human tracking. Experiments were performed with real-world data and compared to state-of-the-art methods to assess the quality of the forecasting algorithm and the enhancement in the quality of the applications. The results obtained suggest a significant enhancement in the accuracy of trajectory forecasting and of the computer vision applications.
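
    One way to picture how occupancy can be leveraged for forecasting (our illustration, not the paper's algorithm): derive a per-cell traversal cost from the estimated occupancy map, so that frequently walked cells are cheap, then forecast the likely path as the least-cost route to the destination.

        # Illustrative only: forecast a trajectory as the least-cost path
        # through a cost grid derived from an occupancy map (4-connected
        # Dijkstra); e.g. cost = 1 / (occupancy + eps).
        import heapq

        def forecast_path(cost, start, goal):
            h, w = len(cost), len(cost[0])
            dist = {start: 0.0}
            prev, pq = {}, [(0.0, start)]
            while pq:
                d, (y, x) = heapq.heappop(pq)
                if (y, x) == goal:
                    break
                if d > dist.get((y, x), float("inf")):
                    continue  # stale queue entry
                for ny, nx in ((y+1, x), (y-1, x), (y, x+1), (y, x-1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        nd = d + cost[ny][nx]
                        if nd < dist.get((ny, nx), float("inf")):
                            dist[(ny, nx)] = nd
                            prev[(ny, nx)] = (y, x)
                            heapq.heappush(pq, (nd, (ny, nx)))
            path, node = [], goal       # walk back to recover the trajectory
            while node != start:
                path.append(node)
                node = prev[node]
            path.append(start)
            return path[::-1]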

    Real time motion estimation using a neural architecture implemented on GPUs

    This work describes a neural-network-based architecture that represents and estimates object motion in videos. The architecture addresses multiple computer vision tasks such as image segmentation, object representation and characterization, motion analysis, and tracking. The use of a neural network architecture allows for the simultaneous estimation of global and local motion and for the representation of deformable objects; it also avoids the problem of finding corresponding features while tracking moving objects. Due to the parallel nature of neural networks, the architecture has been implemented on GPUs, which allows the system to meet a set of requirements such as time-constraint management, robustness, high processing speed, and re-configurability. Experiments are presented that demonstrate the validity of our architecture for solving problems of mobile-agent tracking and motion analysis. This work was partially funded by the Spanish Government grant DPI2013-40534-R and the Valencian Government grant GV/2013/005.

    A New Structure of Stereo Algorithm Using Pixel Based Differences and Weighted Median Filter

    This paper proposes a new algorithm for a stereo vision system to obtain a depth map or disparity map. The proposed stereo vision algorithm consists of three stages: matching cost computation, disparity optimization, and disparity refinement. The first stage, matching cost computation, uses pixel-based difference methods, namely a combination of Absolute Difference (AD) and Gradient Matching (GM). The second stage, disparity optimization, utilizes the Winner-Takes-All (WTA) technique to select the disparity value of each pixel of the image. Finally, in the disparity refinement stage, a weighted median (WM) filter is applied to reduce and smooth the noise in the disparity map.
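
    A minimal sketch of the three-stage pipeline as we read the abstract; the blending weight and filter window are illustrative, and a plain (unweighted) median filter stands in for the paper's weighted median filter.

        # Stage 1: AD + gradient-matching cost; Stage 2: winner-takes-all;
        # Stage 3: median-filter refinement (a stand-in for the weighted
        # median used in the paper). Inputs are float grayscale images.
        import numpy as np
        from scipy.ndimage import median_filter, sobel

        def stereo_pipeline(left, right, max_disp=64, alpha=0.6):
            h, w = left.shape
            gl, gr = sobel(left, axis=1), sobel(right, axis=1)  # x-gradients
            cost = np.full((h, w, max_disp), np.inf)
            for d in range(max_disp):
                ad = np.abs(left[:, d:] - right[:, :w - d])     # absolute difference
                gm = np.abs(gl[:, d:] - gr[:, :w - d])          # gradient matching
                cost[:, d:, d] = alpha * ad + (1 - alpha) * gm  # combined cost
            disp = cost.argmin(axis=2)                          # winner-takes-all
            return median_filter(disp, size=5)                  # refinement stage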