
    Low-rank Based Algorithms for Rectification, Repetition Detection and De-noising in Urban Images

    In this thesis, we aim to solve the problems of automatic image rectification and repeated-pattern detection in 2D urban images using novel low-rank based techniques. Repeated patterns (such as windows, tiles, balconies and doors) are prominent and significant features in urban scenes, and detecting these periodic structures is important for city scene analysis and useful in many applications, such as photorealistic 3D reconstruction, 2D-to-3D alignment, facade parsing, city modeling, classification, navigation, visualization in 3D map environments, shape completion, cinematography and 3D games. However, both image rectification and repeated-pattern detection are challenging due to scene occlusions, varying illumination, pose variation and sensor noise. Given a 2D image of an urban scene, we first automatically rectify the facade image and extract facade textures. Based on the rectified facade texture, we develop novel algorithms that extract repeated patterns using Kronecker-product-based modeling, which rests on a solid theoretical foundation. We have tested our algorithms on a large set of images that includes building facades from Paris, Hong Kong and New York.
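    The connection between repetition and low rank can be illustrated with a small sketch. This is not the thesis's Kronecker-product algorithm; it only shows, under simplified assumptions, why a rectified facade with repeated windows is approximately low-rank and why a truncated SVD suppresses sensor noise.

```python
import numpy as np

# A rectified facade with a repeating "window" tile is (approximately)
# low-rank, so projecting a noisy observation onto its leading singular
# directions recovers the periodic structure. Illustrative toy data only.
rng = np.random.default_rng(0)
tile = np.zeros((8, 8))
tile[2:6, 2:6] = 1.0                         # one synthetic "window"
facade = np.tile(tile, (4, 5))               # repeated pattern -> rank 1
noisy = facade + 0.3 * rng.standard_normal(facade.shape)

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
k = 2                                        # keep the dominant structure
denoised = (U[:, :k] * s[:k]) @ Vt[:k, :]

err_noisy = np.linalg.norm(noisy - facade)
err_denoised = np.linalg.norm(denoised - facade)
print(err_denoised < err_noisy)              # low-rank projection helps
```

The tiled image is exactly rank one, so almost all of its energy lives in the first singular direction while the noise spreads across all of them; truncation therefore discards mostly noise.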

    Relating Multimodal Imagery Data in 3D

    This research develops and improves the fundamental mathematical approaches and techniques required to relate imagery and imagery-derived multimodal products in 3D. Image registration, in a 2D sense, will always be limited by the 3D effects of viewing geometry on the target; effects such as occlusion, parallax, shadowing, and terrain/building elevation can often be mitigated with even a modest amount of 3D target modeling. Additionally, the imaged scene may appear radically different based on the sensed modality of interest; this is evident from the differences in visible, infrared, polarimetric, and radar imagery of the same site. This thesis develops a 'model-centric' approach to relating multimodal imagery in a 3D environment. By correctly modeling a site of interest, both geometrically and physically, it is possible to remove or mitigate some of the most difficult challenges associated with multimodal image registration. To accomplish this, the mathematical framework necessary to relate imagery to geometric models is thoroughly examined. Since geometric models may need to be generated to apply this 'model-centric' approach, this research develops methods to derive 3D models from imagery and LIDAR data. Of critical note is the implementation of complementary techniques for relating multimodal imagery that utilize the geometric model in concert with physics-based modeling to simulate scene appearance under diverse imaging scenarios. Finally, the often neglected final phase, mapping localized image registration results back to the world-coordinate-system model for final data archival, is addressed. In short, once a target site is properly modeled, both geometrically and physically, it is possible to orient the 3D model to the same viewing perspective as a captured image to enable proper registration. If done accurately, the synthetic model's physical appearance can simulate the imaged modality of interest while simultaneously removing the 3D ambiguity between the model and the captured image. Once registered, the captured image can then be archived as a texture map on the geometric site model. In this way, the 3D information that was lost when the image was acquired can be regained and properly related to other datasets for data fusion and analysis.
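    The geometric core of orienting a model to a captured image is the standard pinhole projection of model vertices into pixel coordinates. The sketch below shows only that step; the calibration matrix, pose, and model points are illustrative values, not from the thesis.

```python
import numpy as np

# Project 3D site-model vertices into a camera's image plane so the model
# can be rendered from the same viewing perspective as a captured photo.
def project(points_3d, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (pinhole model)."""
    cam = points_3d @ R.T + t          # world frame -> camera frame
    uvw = cam @ K.T                    # camera frame -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])  # intrinsics
R = np.eye(3)                          # camera aligned with world axes
t = np.array([0.0, 0.0, 10.0])         # model 10 units in front of camera
corners = np.array([[-1.0, -1, 0], [1, -1, 0], [1, 1, 0], [-1, 1, 0]])

pixels = project(corners, K, R, t)
print(pixels)   # the square projects symmetrically about the principal point
```

Once the model is rendered from this perspective, the captured image can be compared, registered, and finally texture-mapped back onto the model.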

    View point robust visual search technique

    In this thesis, we have explored visual search techniques for images taken from different viewpoints and have tried to enhance matching capability under viewpoint changes. We have proposed homography-based back-projection as a post-processing stage of Compact Descriptors for Visual Search (CDVS), the new MPEG standard; moreover, we have defined an affine-adapted scale space for affine detection, which steers the Gaussian scale space to capture features from affine-transformed images; we have also developed the corresponding gradient-based affine descriptor. Using these proposed techniques, the robustness of image retrieval to affine transformations has been significantly improved. The first chapter of this thesis introduces the background on visual search. In the second chapter, we propose a homography-based back-projection used as the post-processing stage of CDVS to improve resilience to viewpoint changes. The idea behind this proposal is that each perspective projection of the image of a 2D object can be approximated by an affine transformation, and any two such affine transformations are mathematically related by a homography matrix. Given that matrix, the image can be back-projected to simulate the view from another viewpoint; truly matching images can then be declared as matches because the perspective distortion has been reduced by the back-projection. Accurate homography estimation between images from different viewpoints requires at least 4 correspondences, which can be provided by the CDVS pipeline. In this way, homography-based back-projection can be used to re-examine image pairs that lack enough matched keypoints: if a homography relation holds between them, the perspective distortion can be reduced by exploiting the few available correspondences. In experiments, this technique has proved to be quite effective, especially for images of 2D objects.
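    The homography estimation underlying the back-projection step can be sketched with the standard Direct Linear Transform (DLT) from four correspondences, the minimum the CDVS pipeline would need to supply. The correspondences below are synthetic, generated from a known homography purely for illustration.

```python
import numpy as np

# Estimate a 3x3 homography H with dst ~ H @ src from >= 4 correspondences
# using the DLT, then verify it against the ground truth used to make them.
def homography_dlt(src, dst):
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)           # null-space vector = vec(H)
    return H / H[2, 2]                 # fix the overall scale

# Ground-truth homography used to synthesize the second view's points.
H_true = np.array([[1.2, 0.1, 5.0], [0.05, 0.9, -3.0], [1e-3, 2e-3, 1.0]])
src = np.array([[0.0, 0], [100, 0], [100, 100], [0, 100]])
h = np.c_[src, np.ones(4)] @ H_true.T
dst = h[:, :2] / h[:, 2:3]             # perspective divide

H_est = homography_dlt(src, dst)
print(np.allclose(H_est, H_true, atol=1e-6))
```

With the estimated H, one image can be warped ("back-projected") toward the other's viewpoint before re-running the match, which is the essence of the proposed post-processing stage.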
    The third chapter introduces the scale space, which is also the kernel of feature detection in scale-invariant visual search techniques. The scale space, built from a series of Gaussian-blurred images, represents image structures at different levels of detail. Because the Gaussian smoothing in the scale space is isotropic, feature detection on it is not invariant to affine transformations; that is why scale-invariant visual search techniques are sensitive to affine transformations. Thus, in this chapter, we propose an affine-adapted scale space, which employs affine-steered Gaussian filters to smooth the images. This scale space is flexible to different affine transformations and represents image structures from different viewpoints well, so features from different viewpoints can be captured well. In practice, scale-invariant visual search techniques employ a pyramid structure to speed up construction. Following the affine Gaussian scale-space principles, we also propose two structures to build the affine Gaussian scale space. The affine Gaussian scale-space structure is similar to the pyramid structure because of similar sampling and cascading properties. Conversely, the affine Laplacian of Gaussian (LoG) structure is completely different: the Laplacian operator is difficult to deform affinely, so unlike the general LoG construction, which simply applies a Laplacian operation to the scale space, the affine LoG can only be obtained by affine LoG convolution and cascade implementations on the affine scale space. Using our proposed structures, both the affine Gaussian scale space and the affine LoG can be constructed. We have also explored the affine scale-space implementation in the frequency domain, examining the spectrum of Gaussian image smoothing under affine transformation, and propose two corresponding structures.
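    The affine-steered smoothing idea can be sketched by replacing the isotropic Gaussian covariance sigma^2*I with a full covariance derived from an affine map A. This is a minimal illustration under that assumption; the kernel size, sigma, and A are made up, and the thesis's pyramid and LoG structures are not reproduced.

```python
import numpy as np
from scipy.signal import convolve2d

# Build an anisotropic ("affine steered") Gaussian kernel whose covariance
# is sigma^2 * A @ A.T, so the smoothing matches the affine deformation A.
def affine_gaussian_kernel(A, sigma=2.0, radius=8):
    cov = sigma**2 * (A @ A.T)                 # steer covariance by A
    inv = np.linalg.inv(cov)
    ax = np.arange(-radius, radius + 1)
    X, Y = np.meshgrid(ax, ax)
    pts = np.stack([X, Y], axis=-1)
    quad = np.einsum('...i,ij,...j->...', pts, inv, pts)
    k = np.exp(-0.5 * quad)
    return k / k.sum()                         # normalize to unit mass

A = np.array([[1.5, 0.4], [0.0, 0.8]])         # example affine deformation
kernel = affine_gaussian_kernel(A)

rng = np.random.default_rng(1)
image = rng.standard_normal((64, 64))
smoothed = convolve2d(image, kernel, mode='same', boundary='symm')
print(smoothed.std() < image.std())            # smoothing reduces variance
```

An isotropic kernel is the special case A = I; steering by A is what lets the smoothed structures line up across views related by that affine map.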
    Generally speaking, the implementation in the frequency domain is more robust to affine transformations, at the expense of higher computational complexity. For affine-invariant visual search, it also makes sense to adopt an affine descriptor. In the fourth chapter, we propose an affine-invariant feature descriptor based on the affine gradient. State-of-the-art feature descriptors, including SIFT and the Gradient Location and Orientation Histogram (GLOH), are based on histograms of the image gradient around the detected features. If the image gradient is calculated as the difference of adjacent pixels, it is not affine invariant. Thus, in that chapter, we first propose an affine gradient that confers affine invariance on the descriptor. This affine gradient is calculated directly as the derivative of the affine-Gaussian-blurred images. To simplify the processing, we also create corresponding affine Gaussian derivative filters for the different detected scales to quickly generate the affine gradient. With this affine gradient, we can apply the same scheme as the SIFT descriptor to generate the gradient histogram; normalizing the histogram then yields the affine descriptor. This descriptor is not only affine invariant but also rotation invariant, because the orientation of the area used to form the histogram is determined by the main gradient direction around the feature. In practice, this affine descriptor is fully affine invariant and performs very well for image matching. In the concluding chapter, we draw some conclusions and describe future work.
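    The descriptor construction described above, a normalized, magnitude-weighted histogram of gradient orientations, can be sketched as follows. For brevity this uses plain finite differences rather than the affine gradient the thesis derives from affine Gaussian derivatives, so it shows only the histogram scheme, not the affine invariance.

```python
import numpy as np

# SIFT-style building block: histogram of gradient orientations over a
# patch, weighted by gradient magnitude, then L2-normalized.
def gradient_histogram(patch, n_bins=8):
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist   # normalize the histogram

# A vertical step edge: all gradient energy falls in one orientation bin.
patch = np.zeros((16, 16))
patch[:, 8:] = 1.0
h = gradient_histogram(patch)
print(np.argmax(h))   # dominant bin = horizontal (rightward) gradient
```

In the full descriptor this histogram is computed over a grid of sub-regions oriented along the dominant gradient direction, which is what adds the rotation invariance mentioned above.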

    Methods for Real-time Visualization and Interaction with Landforms

    This thesis presents methods to enrich data modeling and analysis in the geoscience domain with a particular focus on geomorphological applications. First, a short overview of the relevant characteristics of the used remote sensing data and basics of its processing and visualization are provided. Then, two new methods for the visualization of vector-based maps on digital elevation models (DEMs) are presented. The first method uses a texture-based approach that generates a texture from the input maps at runtime taking into account the current viewpoint. In contrast to that, the second method utilizes the stencil buffer to create a mask in image space that is then used to render the map on top of the DEM. A particular challenge in this context is posed by the view-dependent level-of-detail representation of the terrain geometry. After suitable visualization methods for vector-based maps have been investigated, two landform mapping tools for the interactive generation of such maps are presented. The user can carry out the mapping directly on the textured digital elevation model and thus benefit from the 3D visualization of the relief. Additionally, semi-automatic image segmentation techniques are applied in order to reduce the amount of user interaction required and thus make the mapping process more efficient and convenient. The challenge in the adaption of the methods lies in the transfer of the algorithms to the quadtree representation of the data and in the application of out-of-core and hierarchical methods to ensure interactive performance. Although high-resolution remote sensing data are often available today, their effective resolution at steep slopes is rather low due to the oblique acquisition angle. For this reason, remote sensing data are suitable to only a limited extent for visualization as well as landform mapping purposes. 
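    The image-space masking behind the stencil-buffer method can be sketched in plain numpy: the projected map polygon is rasterized into a binary mask that gates which terrain pixels receive the map color. This is a CPU illustration with an even-odd point-in-polygon test; the actual GPU stencil operations and level-of-detail handling are omitted, and the polygon and raster size are made up.

```python
import numpy as np

# Rasterize a 2D polygon (in pixel coordinates) into a boolean mask via
# the even-odd crossing test, mimicking what the stencil buffer provides.
def polygon_mask(poly, h, w):
    ys, xs = np.mgrid[0:h, 0:w]
    inside = np.zeros((h, w), dtype=bool)
    n = len(poly)
    for i in range(n):                 # toggle on each edge crossing
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % n]
        crosses = (ys < y0) != (ys < y1)
        xcross = x0 + (ys - y0) * (x1 - x0) / (y1 - y0 + 1e-12)
        inside ^= crosses & (xs < xcross)
    return inside

# A 7x7 square region inside a 12x12 raster; vertices are (x, y) pairs.
mask = polygon_mask([(2, 2), (9, 2), (9, 9), (2, 9)], 12, 12)
print(mask.sum())
```

With such a mask, the map color is composited only where the mask is true, which is exactly the role the stencil test plays on the GPU.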
    To provide an easy way to supply additional imagery, an algorithm for registering uncalibrated photos to a textured digital elevation model is presented. A particular challenge in registering the images is posed by large variations in the photos concerning resolution, lighting conditions, seasonal changes, etc. The registered photos can be used to increase the visual quality of the textured DEM, in particular at steep slopes. To this end, a method is presented that combines several georegistered photos into textures for the DEM. The difficulty in this compositing process is to create a consistent appearance and avoid visible seams between the photos. In addition, the photos provide valuable means to improve landform mapping, so an extension of the landform mapping methods is presented that allows the utilization of the registered photos during mapping. This way, a detailed and exact mapping becomes feasible even at steep slopes.

    Using information content to select keypoints for UAV image matching

    Image matching is one of the most important tasks in Unmanned Aerial Vehicle (UAV) photogrammetry applications. The number and distribution of extracted keypoints play an essential role in the reliability and accuracy of image matching and orientation results. Conventional detectors generally produce too many redundant keypoints. In this paper, we study the effect of applying various information content criteria to keypoint selection tasks. For this reason, the quality measures of entropy, spatial saliency and texture coefficient are used to select keypoints extracted using the SIFT, SURF, MSER and BRISK operators. Experiments are conducted using several synthetic and real UAV image pairs. Results show that the keypoint selection methods perform differently depending on the applied detector and scene type, but in most cases the precision of the matching results is improved by an average of 15%. In general, applying proper keypoint selection techniques can improve the accuracy and efficiency of UAV image matching and orientation results. In addition to the evaluation, a new hybrid keypoint selection method is proposed that combines all of the information content criteria discussed in this paper. This new screening method was also compared with standard SIFT keypoints, showing a 22% to 40% improvement in the bundle adjustment of UAV images.
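    One of the information-content criteria named above, entropy, can be sketched as follows: score each candidate keypoint by the Shannon entropy of the intensity patch around it and keep only the most informative candidates. The image and candidate positions are simulated here; in practice the candidates would come from SIFT, SURF, MSER or BRISK.

```python
import numpy as np

# Shannon entropy of the intensity distribution in a window around (x, y);
# flat regions score near zero, textured regions score high.
def patch_entropy(image, x, y, r=8, bins=16):
    patch = image[max(y - r, 0):y + r, max(x - r, 0):x + r]
    hist, _ = np.histogram(patch, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(2)
image = np.zeros((64, 64))
image[:, 32:] = rng.random((64, 32))     # right half: textured, informative

candidates = [(16, 32), (48, 32)]        # (x, y): flat vs textured region
scores = [patch_entropy(image, x, y) for x, y in candidates]
keep = [kp for kp, s in zip(candidates, scores) if s >= max(scores)]
print(keep)   # only the keypoint in the textured half survives
```

A real screening pipeline would combine this score with the saliency and texture criteria and keep the top-N candidates rather than a single maximum.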

    Dense monocular visual odometry aided by range sensors for mobile robots

    We propose and describe a monocular set-up that provides both Visual Odometry and Dense Reconstruction. The thesis mainly focuses on the second part and defines an algorithm for Dense Reconstruction suitable for a monocular set-up. We show that the Monocular Dense Reconstruction algorithm works in the different possible configurations; however, it struggles to solve the more complex epipolar geometries.
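    At the heart of monocular reconstruction lies two-view triangulation: recovering a 3D point from its projections in two calibrated views. The sketch below uses the standard linear (DLT) triangulation with illustrative camera matrices and an illustrative point; it is not the thesis's dense algorithm.

```python
import numpy as np

# Linear (DLT) triangulation: intersect two viewing rays given the pixel
# observations of one 3D point in two calibrated cameras.
def triangulate(P1, P2, x1, x2):
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                         # homogeneous 3D point
    return X[:3] / X[3]

K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
P1 = K @ np.c_[np.eye(3), np.zeros(3)]              # reference camera
P2 = K @ np.c_[np.eye(3), np.array([-1.0, 0, 0])]   # 1-unit baseline

def project(P, X):
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

X_true = np.array([0.5, -0.2, 4.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
print(np.allclose(X_est, X_true, atol=1e-8))
```

Dense reconstruction repeats this kind of depth estimate per pixel, which is why degenerate or near-degenerate epipolar geometries (e.g. very short baselines) are the hard cases noted above.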