
    Explicit Edge Inconsistency Evaluation Model for Color-Guided Depth Map Enhancement

    © 2016 IEEE. Color-guided depth enhancement refines depth maps under the assumption that depth edges and color edges at corresponding locations are consistent. Among methods for such low-level vision tasks, the Markov random field (MRF) and its variants are the major approaches and have dominated this area for several years. However, the assumption above is not always true. To tackle the problem, state-of-the-art solutions adjust the weighting coefficient inside the smoothness term of the MRF model. These methods lack an explicit evaluation model to quantitatively measure the inconsistency between the depth edge map and the color edge map, so they cannot adaptively control how strongly the color image guides depth enhancement, leading to defects such as texture-copy artifacts and blurred depth edges. In this paper, we propose a quantitative measurement of this inconsistency and embed it explicitly into the smoothness term. The proposed method demonstrates promising experimental results compared with benchmark and state-of-the-art methods on the Middlebury, ToF-Mark, and NYU data sets.
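    The abstract describes embedding an explicit edge-inconsistency measure into the MRF smoothness term. The sketch below is a minimal illustration of that idea, not the paper's actual model: the inconsistency measure (absolute difference of normalized gradient magnitudes), the weight `w`, and the Jacobi solver of the quadratic energy are all assumptions made for this example.

```python
import numpy as np

def edge_magnitude(img):
    """Normalized gradient magnitude as a crude edge map."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    return mag / (mag.max() + 1e-8)

def enhance_depth(depth_init, color_gray, n_iter=200, lam=0.5, sigma=0.1):
    """Toy color-guided depth refinement (both inputs are 2-D arrays of the
    same size, i.e. the raw depth is already lifted to the color grid).

    Minimizes  E(D) = sum (D - D0)^2 + lam * sum_q w_q (D_p - D_q)^2
    by Jacobi iterations, where the smoothness weight is down-weighted
    wherever the color edge map and the current depth edge map disagree.
    """
    d = depth_init.astype(np.float64).copy()
    ce = edge_magnitude(color_gray)
    for _ in range(n_iter):
        de = edge_magnitude(d)
        # Inconsistency in [0, 1]: large where color and depth edges disagree.
        inconsistency = np.abs(ce - de)
        # Guidance weight: strong across flat color regions, suppressed where
        # the color edge is judged unreliable for the depth edge.
        w = np.exp(-(ce ** 2) / sigma) * (1.0 - inconsistency)
        # 4-neighbour weighted average (Jacobi update of the quadratic energy).
        num = depth_init.astype(np.float64).copy()
        den = np.ones_like(d)
        for shift in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            dq = np.roll(d, shift, axis=(0, 1))
            wq = np.roll(w, shift, axis=(0, 1))
            num += lam * wq * dq
            den += lam * wq
        d = num / den
    return d
```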

    SC-Fuse: A Feature Fusion Approach for Unpaved Road Detection from Remotely Sensed Images

    Road network extraction from remote sensing imagery is crucial for numerous applications, ranging from autonomous navigation to urban and rural planning. A particularly challenging aspect is the detection of unpaved roads, which are often underrepresented in research and data. These roads vary in texture, width, shape, and surroundings, making their detection complex. This thesis addresses these challenges by creating a specialized dataset and introducing the SC-Fuse model. Our custom dataset comprises high-resolution remote sensing imagery that primarily targets unpaved roads of the American Midwest. To capture seasonal variation and its impact, the dataset includes images from different times of the year, covering various weather conditions and offering a comprehensive view of these changing conditions. To detect roads from our custom dataset, we developed SC-Fuse, a novel deep learning architecture designed to extract unpaved road networks from satellite imagery. The model leverages the strengths of dual feature extractors, a Swin Transformer and a residual CNN, and by combining their features SC-Fuse captures both the local and the global context of the images. The fusion is performed by a feature fusion module that uses a linear attention mechanism to keep the computation efficient, and a LinkNet-based decoder ensures precise road network reconstruction. SC-Fuse is evaluated using various metrics, including qualitative visual assessments, to test its effectiveness in unpaved road detection. Advisors: Ashok Samal and Cody Stoll
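    As a rough illustration of the dual-extractor-plus-fusion idea described above, the sketch below uses small stand-in convolutional branches instead of the actual Swin Transformer and residual CNN, and a generic kernelized linear-attention formulation; the class names, channel sizes, and decoder layout are assumptions for the example, not the SC-Fuse implementation.

```python
import torch
import torch.nn as nn

class LinearAttentionFusion(nn.Module):
    """Fuses two same-shaped feature maps with a linear (softmax-free)
    attention step, phi(Q) (phi(K)^T V), which is O(N) in spatial positions."""
    def __init__(self, dim):
        super().__init__()
        self.q, self.k, self.v = (nn.Conv2d(dim, dim, 1) for _ in range(3))
        self.out = nn.Conv2d(dim, dim, 1)

    def forward(self, local_feat, global_feat):
        b, c, h, w = local_feat.shape
        q = torch.relu(self.q(local_feat)).flatten(2)    # B x C x N
        k = torch.relu(self.k(global_feat)).flatten(2)   # B x C x N
        v = self.v(global_feat).flatten(2)               # B x C x N
        kv = torch.einsum('bcn,bdn->bcd', k, v)          # B x C x C
        z = 1.0 / (torch.einsum('bcn,bc->bn', q, k.sum(-1)) + 1e-6)
        attn = torch.einsum('bcn,bcd,bn->bdn', q, kv, z).reshape(b, c, h, w)
        return self.out(attn) + local_feat               # residual fusion

class SCFuseSketch(nn.Module):
    """Toy stand-in for the SC-Fuse idea: two encoders, fused features,
    and a LinkNet-style upsampling decoder producing a 1-channel road mask."""
    def __init__(self, dim=64):
        super().__init__()
        # Stand-ins for the residual CNN and Swin Transformer branches.
        self.local_enc = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.global_enc = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=4, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.fusion = LinearAttentionFusion(dim)
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim, dim // 2, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(dim // 2, dim // 4, 2, stride=2), nn.ReLU(),
            nn.Conv2d(dim // 4, 1, 1))

    def forward(self, x):
        fused = self.fusion(self.local_enc(x), self.global_enc(x))
        return self.decoder(fused)                       # road-probability logits
```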

    Rich probabilistic models for semantic labeling

    The goal of this monograph is to explore the methods and applications of semantic labeling. Our contributions to this rapidly evolving topic concern specific aspects of modeling and inference in probabilistic models and their applications in the interdisciplinary fields of computer vision, medical image processing, and remote sensing.

    On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey

    Stereo matching is one of the longest-standing problems in computer vision, with close to 40 years of studies and research. Throughout the years the paradigm has shifted from local, pixel-level decisions to various forms of discrete and continuous optimization, and then to data-driven, learning-based methods. Recently, the rise of machine learning and the rapid proliferation of deep learning enhanced stereo matching with new exciting trends and applications unthinkable until a few years ago. Interestingly, the relationship between these two worlds is two-way. While machine, and especially deep, learning advanced the state of the art in stereo matching, stereo itself enabled new ground-breaking methodologies such as self-supervised monocular depth estimation based on deep networks. In this paper, we review recent research in the field of learning-based depth estimation from single and binocular images, highlighting the synergies, the successes achieved so far, and the open challenges the community is going to face in the immediate future. (Comment: Accepted to TPAMI. Paper version of our CVPR 2019 tutorial: "Learning-based depth estimation from stereo and monocular images: successes, limitations and future challenges", https://sites.google.com/view/cvpr-2019-depth-from-image/home)

    S³M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving

    Semantic segmentation and stereo matching are two essential components of 3D environmental perception systems for autonomous driving. Nevertheless, conventional approaches often address these two problems independently, employing separate models for each task. This approach poses practical limitations in real-world scenarios, particularly when computational resources are scarce or real-time performance is imperative. Hence, in this article, we introduce S³M-Net, a novel joint learning framework developed to perform semantic segmentation and stereo matching simultaneously. Specifically, S³M-Net shares the features extracted from RGB images between both tasks, resulting in an improved overall scene understanding capability. This feature sharing process is realized using a feature fusion adaption (FFA) module, which effectively transforms the shared features into semantic space and subsequently fuses them with the encoded disparity features. The entire joint learning framework is trained by minimizing a novel semantic consistency-guided (SCG) loss, which places emphasis on the structural consistency in both tasks. Extensive experimental results conducted on the vKITTI2 and KITTI datasets demonstrate the effectiveness of our proposed joint learning framework and its superior performance compared to other state-of-the-art single-task networks. Our project webpage is accessible at mias.group/S3M-Net. (Comment: accepted to IEEE Trans. on Intelligent Vehicles (T-IV))
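    To make the joint-learning setup concrete, the sketch below shows a generic shared-encoder network with a segmentation head and a disparity head trained under a combined loss. It does not reproduce the FFA module or the SCG loss from the paper; the toy backbone, the crude edge-alignment consistency term, and all names and weights are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSegStereoSketch(nn.Module):
    """Toy joint network: one shared RGB encoder feeds a segmentation head
    and a disparity head (stand-in only, not the S3M-Net architecture)."""
    def __init__(self, n_classes=19, dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU())
        self.seg_head = nn.Conv2d(dim, n_classes, 1)
        self.disp_head = nn.Conv2d(dim, 1, 1)

    def forward(self, left_rgb):
        feat = self.encoder(left_rgb)
        seg = F.interpolate(self.seg_head(feat), scale_factor=2,
                            mode='bilinear', align_corners=False)
        disp = F.interpolate(self.disp_head(feat), scale_factor=2,
                             mode='bilinear', align_corners=False)
        return seg, disp.squeeze(1)

def edge_map(x):
    """Finite-difference edge magnitude, padded back to the input size."""
    gx = F.pad(x[..., :, 1:] - x[..., :, :-1], (0, 1))
    gy = F.pad(x[..., 1:, :] - x[..., :-1, :], (0, 0, 0, 1))
    return gx.abs() + gy.abs()

def joint_loss(seg_logits, disp_pred, seg_gt, disp_gt, w_consistency=0.1):
    """Per-task losses plus a crude structural-consistency term that aligns
    edges of the predicted disparity with edges of the segmentation
    confidence. The actual SCG loss is defined in the paper."""
    l_seg = F.cross_entropy(seg_logits, seg_gt)
    l_disp = F.smooth_l1_loss(disp_pred, disp_gt)
    seg_conf = seg_logits.softmax(1).max(1).values
    l_cons = F.l1_loss(edge_map(disp_pred), edge_map(seg_conf))
    return l_seg + l_disp + w_consistency * l_cons
```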

    A Multiscale Pyramid Transform for Graph Signals

    Multiscale transforms designed to process analog and discrete-time signals and images cannot be directly applied to analyze high-dimensional data residing on the vertices of a weighted graph, as they do not capture the intrinsic geometric structure of the underlying graph data domain. In this paper, we adapt the Laplacian pyramid transform for signals on Euclidean domains so that it can be used to analyze high-dimensional data residing on the vertices of a weighted graph. Our approach is to study existing methods and develop new methods for the four fundamental operations of graph downsampling, graph reduction, and the filtering and interpolation of signals on graphs. Equipped with appropriate notions of these operations, we leverage the basic multiscale constructs and intuitions from classical signal processing to generate a transform that yields both a multiresolution of graphs and an associated multiresolution of a graph signal on the underlying sequence of graphs. (Comment: 16 pages, 13 figures)
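    A single analysis level of such a pyramid combines the four operations named above. The sketch below is a deliberately simplified version under assumed choices: vertex selection by the polarity of the largest Laplacian eigenvector, induced-subgraph extraction in place of a principled graph reduction, and heat-kernel-style smoothing and neighbour averaging in place of the paper's filtering and interpolation operators.

```python
import numpy as np

def laplacian(W):
    """Combinatorial graph Laplacian from a dense weighted adjacency matrix."""
    return np.diag(W.sum(1)) - W

def one_pyramid_level(W, f, alpha=0.5):
    """One analysis level of a (simplified) Laplacian pyramid on a graph.

    W : dense weighted adjacency matrix (n x n)
    f : graph signal on the vertices (length n)
    Returns the coarse adjacency, the coarse signal, and the prediction residual.
    """
    L = laplacian(W)
    evals, evecs = np.linalg.eigh(L)
    keep = evecs[:, -1] >= 0                  # downsampling: largest-eigenvector polarity
    # Low-pass filtering: damp the signal along high-frequency Laplacian modes.
    h = evecs @ np.diag(1.0 / (1.0 + alpha * evals)) @ evecs.T
    f_smooth = h @ f
    f_coarse = f_smooth[keep]                 # coarse (downsampled) signal
    W_coarse = W[np.ix_(keep, keep)]          # induced-subgraph "reduction"
    # Interpolation back to the full vertex set: copy kept values, fill the
    # removed vertices with the average of their kept neighbours.
    f_pred = f_smooth.copy()
    for i in np.where(~keep)[0]:
        nb = np.where((W[i] > 0) & keep)[0]
        f_pred[i] = f_smooth[nb].mean() if nb.size else 0.0
    residual = f - f_pred                     # detail coefficients
    return W_coarse, f_coarse, residual
```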

    Depth Super-Resolution with Hybrid Camera System

    An important field of research in computer vision is the 3D analysis and reconstruction of objects and scenes. Currently, among all the techniques for 3D acquisition, stereo vision systems are the most common. More recently, Time-of-Flight (ToF) range cameras have been introduced. The focus of this thesis is to combine the information from the ToF camera with one or two standard cameras in order to obtain a high-resolution depth image.
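    One common way to illustrate combining a low-resolution ToF depth map with a high-resolution color camera is guided upsampling. The sketch below is a generic joint bilateral upsampling example, not the method developed in the thesis; the nearest-neighbour lift, the Gaussian spatial and range kernels, and all parameter names are assumptions for illustration.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Upsample a low-resolution ToF depth map guided by a high-resolution
    grayscale image via joint bilateral filtering.

    depth_lr : (h, w) low-resolution depth
    color_hr : (H, W) high-resolution grayscale guide, values in [0, 1]
    Returns an (H, W) depth map. Borders wrap around (toy implementation).
    """
    H, W = color_hr.shape
    h, w = depth_lr.shape
    # Nearest-neighbour lift of the ToF depth onto the color grid.
    ys = np.arange(H) * h // H
    xs = np.arange(W) * w // W
    depth_up = depth_lr[np.ix_(ys, xs)].astype(np.float64)

    out = np.zeros((H, W))
    weights = np.zeros((H, W))
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted_d = np.roll(depth_up, (dy, dx), axis=(0, 1))
            shifted_c = np.roll(color_hr, (dy, dx), axis=(0, 1))
            w_s = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))              # spatial kernel
            w_r = np.exp(-((color_hr - shifted_c) ** 2) / (2 * sigma_r ** 2))    # range kernel
            out += w_s * w_r * shifted_d
            weights += w_s * w_r
    return out / (weights + 1e-8)
```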