81 research outputs found

    Binary Adaptive Semi-Global Matching Based on Image Edges

    Image-based modeling and rendering is currently one of the most challenging topics in Computer Vision and Photogrammetry. The key issue here is building a set of dense correspondence points between two images, namely dense matching or stereo matching. Among all dense matching algorithms, Semi-Global Matching (SGM) is arguably one of the most promising algorithms for real-time stereo vision. Compared with global matching algorithms, SGM aggregates matching cost from several (eight or sixteen) directions rather than only the epipolar line using Dynamic Programming (DP). Thus, SGM eliminates the classical “streaking problem” and greatly improves its accuracy and efficiency. In this paper, we aim at further improvement of SGM accuracy without increasing the computational cost. We propose setting the penalty parameters adaptively according to image edges extracted by edge detectors. We have carried out experiments on the standard Middlebury stereo dataset and evaluated the performance of our modified method with the ground truth. The results have shown a noticeable accuracy improvement compared with the results using fixed penalty parameters while the runtime computational cost was not increased
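
The edge-adaptive penalty idea in this abstract can be sketched for a single aggregation path. This is a minimal illustration, not the authors' implementation: the function name `aggregate_path`, the binary edge mask, and the penalty values `p1`, `p2_flat` and `p2_edge` are all assumptions. It uses the standard SGM path recurrence and lowers the large-jump penalty at pixels an edge detector has flagged.

```python
def aggregate_path(cost, edge, p1=1.0, p2_flat=8.0, p2_edge=2.0):
    """Aggregate matching cost along one left-to-right path (SGM recurrence).

    cost[x][d] is the matching cost of pixel x at disparity d.
    edge[x] is True where an edge detector fired (assumed binary mask).
    The penalty values are illustrative assumptions, not from the paper.
    """
    n_disp = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor on the path
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        # Adaptive step: allow cheaper disparity jumps at image edges.
        p2 = p2_edge if edge[x] else p2_flat
        row = []
        for d in range(n_disp):
            best = min(
                prev[d],                                              # same disparity
                prev[d - 1] + p1 if d > 0 else float("inf"),          # change by -1
                prev[d + 1] + p1 if d + 1 < n_disp else float("inf"), # change by +1
                prev_min + p2,                                        # larger jump
            )
            # Subtracting prev_min keeps the aggregated values bounded.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


cost = [[0, 5, 5], [5, 5, 0]]
on_edge = aggregate_path(cost, [False, True])
off_edge = aggregate_path(cost, [False, False])
```

In this toy input the second pixel prefers disparity 2 only when it lies on an edge, because the jump from disparity 0 then costs `p2_edge` instead of `p2_flat`. A full SGM implementation would sum such path costs over eight or sixteen directions before picking the minimum per pixel.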

    Enhancement of dense urban digital surface models from VHR optical satellite stereo data by pre-segmentation and object detection

    The generation of digital surface models (DSM) of urban areas from very high resolution (VHR) stereo satellite imagery requires advanced methods. In the classical approach of DSM generation from stereo satellite imagery, interest points are extracted and correlated between the stereo mates using an area-based matching followed by a least-squares sub-pixel refinement step. After region growing, the 3D point list is triangulated to the resulting DSM. In urban areas this approach fails due to the size of the correlation window, which smooths out the usual steep edges of buildings. Also, missing correlations, such as those for areas partly occluded in one or both of the images, are simply interpolated in the triangulation step. So an urban DSM generated with the classical approach is very smooth, with missing steep walls, narrow streets and courtyards. To overcome these problems, algorithms from computer vision are introduced and adapted to satellite imagery. These algorithms do not use local optimisation like area-based matching but try to optimise a (semi-)global cost function. Analysis shows that dynamic programming approaches based on epipolar images, like dynamic line warping or semi-global matching, yield the best results in terms of accuracy and processing time. These algorithms can also detect occlusions, that is, areas not visible in one or both of the stereo images. In addition, the time- and memory-consuming step of handling and triangulating large point lists can be omitted, because the algorithms operate directly on epipolar images and directly generate a so-called disparity image that fits exactly on the first of the stereo images. This disparity image, which already represents a sort of dense DSM, contains for each pixel in the image the distance measured in pixels in the epipolar direction (or a no-data value for a detected occlusion).
Despite the global optimization of the cost function, many outliers, mismatches and erroneously detected occlusions remain, especially if only one stereo pair is available. To enhance this dense DSM (the disparity image), a pre-segmentation approach is presented in this paper. Since the disparity image fits exactly on the first of the two stereo partners (transformed beforehand to epipolar geometry), a direct correlation between image pixels and derived heights (the disparities) exists. This feature of the disparity image is exploited to integrate additional knowledge from the image into the DSM. This is done by segmenting the stereo image, transferring the segmentation information to the DSM and performing a statistical analysis on each of the created DSM segments. Based on this analysis and spectral information, a coarse object detection and classification can be performed and in turn the DSM can be enhanced. After the description of the proposed method, some results are shown and discussed.
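
The per-segment statistical analysis can be illustrated with a small sketch. The abstract does not say which statistic is used, so the median and the gap-filling rule below are assumptions, as are the names `NO_DATA` and `fill_by_segment`. The sketch takes a flattened disparity image and a segment label per pixel, and replaces no-data pixels with the median disparity of their segment.

```python
from statistics import median

NO_DATA = None  # hypothetical marker for a detected occlusion


def fill_by_segment(disparity, labels):
    """Replace no-data disparities with the median of their image segment.

    disparity and labels are flat, equally long lists: one disparity value
    and one segment label per pixel. The median is an assumed choice of
    statistic; the abstract only mentions a per-segment statistical analysis.
    """
    by_segment = {}
    for d, s in zip(disparity, labels):
        if d is not NO_DATA:
            by_segment.setdefault(s, []).append(d)
    medians = {s: median(values) for s, values in by_segment.items()}
    # Segments without any valid disparity stay no-data.
    return [
        medians.get(s, NO_DATA) if d is NO_DATA else d
        for d, s in zip(disparity, labels)
    ]


filled = fill_by_segment([10, NO_DATA, 12, 3, NO_DATA], [1, 1, 1, 2, 2])
```

The same grouping could also be used to flag outliers, for example values far from the segment median, before any object classification step.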

    Upgrade of FOSS DATE plug-in: Implementation of a new radargrammetric DSM generation capability

    Synthetic Aperture Radar (SAR) satellite systems can contribute significantly to the generation of Digital Surface Models (DSMs), given their complete independence from logistic constraints on the ground and from weather conditions. In recent years, the new availability of very high resolution SAR data (up to 20 cm Ground Sample Distance) gave a new impulse to radargrammetry and allowed new applications and developments. To date, however, only a few of the software packages aimed at radargrammetric applications are free and open source. It is in this context that it was decided to widen the capabilities of the DATE (Digital Automatic Terrain Extractor) plug-in and to add the possibility of using SAR imagery for DSM stereo reconstruction (i.e. radargrammetry), in addition to the optical workflow already developed. DATE is a Free and Open Source Software (FOSS) developed at the Geodesy and Geomatics Division, University of Rome "La Sapienza", and conceived as an OSSIM (Open Source Software Image Map) plug-in. It has been developed since May 2014 in the framework of the 2014 Google Summer of Code, with the initial purpose of fully automatic DSM generation from high resolution optical satellite imagery acquired by the most common sensors. Here, the results achieved through this new capability applied to two stacks (one ascending and one descending) of three TerraSAR-X images each, acquired over the Trento (Northern Italy) test field, are presented. The global accuracies achieved are around 6 metres. These first results are promising, and further analyses are expected for a more complete assessment of the application of DATE to SAR imagery.

    Cascade Residual Learning: A Two-stage Convolutional Neural Network for Stereo Matching

    Leveraging recent developments in convolutional neural networks (CNNs), matching dense correspondence from a stereo pair has been cast as a learning problem, with performance exceeding traditional approaches. However, it remains challenging to generate high-quality disparities for the inherently ill-posed regions. To tackle this problem, we propose a novel cascade CNN architecture composed of two stages. The first stage advances the recently proposed DispNet by equipping it with extra up-convolution modules, leading to disparity images with more details. The second stage explicitly rectifies the disparity initialized by the first stage; it couples with the first stage and generates residual signals across multiple scales. The summation of the outputs from the two stages gives the final disparity. As opposed to directly learning the disparity at the second stage, we show that residual learning provides more effective refinement. Moreover, it also benefits the training of the overall cascade network. Experimentation shows that our cascade residual learning scheme provides state-of-the-art performance for matching stereo correspondence. By the time of the submission of this paper, our method ranks first in the KITTI 2015 stereo benchmark, surpassing the prior works by a noteworthy margin. Comment: Accepted at ICCVW 2017. The first two authors contributed equally to this paper.

    Open source tool for DSMs generation from high resolution optical satellite imagery. Development and testing of an OSSIM plug-in

    The fully automatic generation of digital surface models (DSMs) is still an open research issue. In recent years, computer vision algorithms have been introduced in photogrammetry in order to exploit their capabilities and efficiency in three-dimensional modelling. In this article, a new tool for fully automatic DSM generation from high resolution satellite optical imagery is presented. In particular, a new iterative approach for obtaining quasi-epipolar images from the original stereopairs has been defined and deployed. This approach is implemented in a new Free and Open Source Software (FOSS) named Digital Automatic Terrain Extractor (DATE), developed at the Geodesy and Geomatics Division, University of Rome ‘La Sapienza’, and conceived as an Open Source Software Image Map (OSSIM) plug-in. DATE key features include: the achievement of epipolarity in the object space, thanks to the ground projection of the images (Ground quasi-Epipolar Imagery (GrEI)) and the coarse-to-fine pyramidal scheme adopted; the use of computer vision algorithms in order to improve the processing efficiency and make the DSM generation process fully automatic; the free and open source aspect of the developed code. The implemented plug-in was validated through two optical datasets, GeoEye-1 and the newest Pléiades-high resolution (HR) imagery, on the Trento (Northern Italy) test site. The DSMs, generated on the basis of the metadata rational polynomial coefficients only, without any ground control point, are compared to a reference lidar in areas with different land use/land cover and morphology. The results obtained with the developed workflow are good in terms of statistical parameters (root mean square error around 5 m for GeoEye-1 and around 4 m for Pléiades-HR imagery) and comparable with the results obtained through different software by other authors on the same test site, whereas in terms of efficiency DATE outperforms most of the available commercial software.
These first achievements indicate good potential for the developed plug-in, which in the very near future will also be upgraded for synthetic aperture radar and tri-stereo optical imagery processing.

    A performance analysis of dense stereo correspondence algorithms and error reduction techniques

    Abstract: Dense stereo correspondence has been intensely studied and there exists a wide variety of proposed solutions in the literature. Different datasets have been constructed to test stereo algorithms, however, their ground truth formation and scene types vary. In this paper, state-of-the-art algorithms are compared using a number of datasets captured under varied conditions, with accuracy and density metrics forming the basis of a performance evaluation. Pre- and post-processing disparity map error reduction techniques are quantified
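
As an illustration of what accuracy and density metrics for a disparity map can look like, here is a minimal sketch. The exact definitions used in the paper are not given in the abstract, so the bad-pixel threshold, the use of `None` for missing values, and the function name `density_and_bad_pixel_rate` are assumptions.

```python
def density_and_bad_pixel_rate(estimate, truth, threshold=1.0):
    """Return (density, bad_pixel_rate) for a flattened disparity map.

    density: share of ground-truth pixels for which an estimate exists.
    bad_pixel_rate: share of those estimates that differ from the ground
    truth by more than `threshold` pixels (assumed definition).
    None marks a pixel without an estimate or without ground truth.
    """
    with_truth = sum(1 for t in truth if t is not None)
    valid = [
        (e, t) for e, t in zip(estimate, truth)
        if e is not None and t is not None
    ]
    density = len(valid) / with_truth if with_truth else 0.0
    bad = sum(1 for e, t in valid if abs(e - t) > threshold)
    bad_rate = bad / len(valid) if valid else 0.0
    return density, bad_rate


density, bad_rate = density_and_bad_pixel_rate(
    [10, None, 7, 3], [10, 5, 5, None]
)
```

Reporting both numbers together matters because an algorithm can lower its error rate simply by discarding uncertain pixels, which shows up as reduced density.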

    A Real-time Range Finding System with Binocular Stereo Vision

    To acquire range information for mobile robots, a TMS320DM642 DSP-based range finding system with binocular stereo vision is proposed. Firstly, paired images of the target are captured, and a Gaussian filter as well as improved Sobel kernels are applied. Secondly, a feature-based local stereo matching algorithm is performed so that the spatial location of the target can be determined. Finally, in order to improve the reliability and robustness of the stereo matching algorithm under complex conditions, a confidence filter and a left-right consistency filter are investigated to eliminate mismatched points. In addition, the range finding algorithm is implemented in the DSP/BIOS operating system to gain real-time control. Experimental results show that the average accuracy of range finding is more than 99% for measuring single-point distances of 120 cm in the simple scenario, and that the algorithm takes about 39 ms per ranging operation in a complex scenario. The effectiveness and feasibility of the proposed range finding system are verified.
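
Two steps named in this abstract, ranging from a matched point pair and the left-right consistency filter, can be sketched as follows. This is a generic illustration for a rectified stereo pair, not the DSP implementation described in the paper; the function names, the focal length, the baseline and the tolerance are assumptions.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_cm):
    """Range of a point from its disparity in a rectified stereo pair.

    Uses the standard relation Z = f * B / d, with the focal length f in
    pixels, the baseline B in centimetres and the disparity d in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_cm / disparity_px


def left_right_consistent(disp_left, disp_right, x, tolerance=1):
    """Check one scanline pixel against the right-image disparity map.

    A left pixel x with disparity d maps to right pixel x - d; the match
    is kept only if the right map reports nearly the same disparity there.
    """
    d = disp_left[x]
    xr = x - d
    if xr < 0 or xr >= len(disp_right):
        return False
    return abs(disp_right[xr] - d) <= tolerance


# Assumed camera values chosen so the example lands on a 120 cm range.
range_cm = depth_from_disparity(50, focal_px=800, baseline_cm=7.5)
kept = left_right_consistent([0, 0, 2, 2], [2, 2, 0, 0], x=2)
rejected = left_right_consistent([0, 0, 2, 2], [5, 2, 0, 0], x=2)
```

Because range is inversely proportional to disparity, a one-pixel matching error matters more at long range, which is one reason mismatch filters of this kind are applied before ranging.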

    Eyewitnesses’ Visual Recollection in Suspect Identification by using Facial Appearance Model

    Facial recognition has been an active field of imaging science. With recent progress in computer vision, it is extensively applied in various areas, especially in law enforcement and security. The human face is a viable biometric that can be used effectively in both identification and verification. Thus far, regardless of the facial model and relevant metrics employed, its main shortcoming is that it requires a facial image against which a comparison is made. Therefore, closed-circuit televisions and a facial database are always needed in an operational system.
Over the last few decades, unfortunately, we have experienced the emergence of asymmetric warfare, where acts of terrorism are often committed in secluded areas with no camera installed and possibly by persons whose photos have never been kept in any official database prior to the event. During subsequent investigations, the authorities have thus had to rely on traumatized and frustrated witnesses, whose testimonial accounts of the suspect's appearance are dubious and often misleading. To address this issue, this paper presents an application of a statistical appearance model of the human face to assist suspect identification based on a witness's visual recollection. An online prototype system was implemented to demonstrate its core functionalities. Both the visual and the numerical assessments reported herein clearly indicated the potential benefits of the system for the intended purpose.