Explicit Edge Inconsistency Evaluation Model for Color-Guided Depth Map Enhancement
© 2016 IEEE. Color-guided depth enhancement refines depth maps under the assumption that the depth edges and the color edges at corresponding locations are consistent. Among methods for such low-level vision tasks, the Markov random field (MRF), including its variants, is one of the major approaches and has dominated this area for several years. However, the assumption above is not always true. To tackle the problem, state-of-the-art solutions adjust the weighting coefficient inside the smoothness term of the MRF model. These methods lack an explicit evaluation model to quantitatively measure the inconsistency between the depth edge map and the color edge map, so they cannot adaptively control the strength of the guidance from the color image, leading to defects such as texture-copy artifacts and blurred depth edges. In this paper, we propose a quantitative measurement of such inconsistency and explicitly embed it into the smoothness term. The proposed method demonstrates promising experimental results compared with benchmark and state-of-the-art methods on the Middlebury, ToF-Mark, and NYU data sets.
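The abstract does not spell the model out, but the underlying idea can be sketched compactly. Below is a minimal, hypothetical Python illustration of an MRF-style pairwise smoothness weight in which a scalar inconsistency score modulates the color guidance; the function and the specific inconsistency formula are our own stand-ins, not the authors' model.

```python
import numpy as np

def pairwise_weight(color, depth_init, p, q, sigma_c=10.0):
    """Illustrative smoothness weight between neighboring pixels p and q.

    Assumes `color` is an (H, W, 3) uint8 image and `depth_init` an (H, W)
    depth map normalized to [0, 1]. The `inconsistency` score is a stand-in
    for the paper's explicit evaluation model: it grows when a color edge
    appears without a matching depth edge (or vice versa).
    """
    color_grad = np.linalg.norm(color[p].astype(float) - color[q].astype(float)) / 255.0
    depth_grad = abs(depth_init[p] - depth_init[q])
    inconsistency = np.clip(abs(color_grad - depth_grad), 0.0, 1.0)
    guidance = np.exp(-(255.0 * color_grad) ** 2 / (2 * sigma_c ** 2))
    # Blend toward a neutral weight as inconsistency rises, so texture edges
    # in the color image stop forcing spurious discontinuities in the depth.
    return (1.0 - inconsistency) * guidance + inconsistency * 0.5
```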
SC-Fuse: A Feature Fusion Approach for Unpaved Road Detection from Remotely Sensed Images
Road network extraction from remote sensing imagery is crucial for numerous applications, ranging from autonomous navigation to urban and rural planning. A particularly challenging aspect is the detection of unpaved roads, often underrepresented in research and data. These roads display variability in texture, width, shape, and surroundings, making their detection quite complex. This thesis addresses these challenges by creating a specialized dataset and introducing the SC-Fuse model.
Our custom dataset comprises high-resolution remote sensing imagery that primarily targets unpaved roads of the American Midwest. To capture seasonal variation and its impact, the dataset includes images from different times of the year, covering various weather conditions and offering a comprehensive view of these changing environments.
To detect roads in our custom dataset, we developed the SC-Fuse model, a novel deep learning architecture designed to extract unpaved road networks from satellite imagery. The model leverages the strengths of dual feature extractors, a Swin Transformer and a residual CNN; by combining their features, SC-Fuse captures both the local and the global context of the images. The fusion is performed by a feature fusion module that uses a linear attention mechanism to keep the computation efficient, and a LinkNet-based decoder ensures precise road network reconstruction. SC-Fuse is evaluated with various metrics, including qualitative visual assessments, to test its effectiveness in unpaved road detection.
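The thesis abstract describes the fusion only at a high level; the following PyTorch sketch shows one plausible form of a linear-attention fusion module of the kind described, with cost linear in the number of spatial tokens. All class and variable names are illustrative assumptions, not the actual SC-Fuse code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearAttentionFusion(nn.Module):
    """Fuses CNN and Swin features with linear (kernelized) attention,
    costing O(N) in the number of spatial tokens instead of O(N^2)."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)        # queries from the CNN branch
        self.kv = nn.Linear(dim, 2 * dim)   # keys/values from the Swin branch

    def forward(self, cnn_feat, swin_feat):
        # cnn_feat, swin_feat: (B, N, C) token sequences from the two branches
        q = F.elu(self.q(cnn_feat)) + 1
        k, v = self.kv(swin_feat).chunk(2, dim=-1)
        k = F.elu(k) + 1
        kv = torch.einsum('bnc,bnd->bcd', k, v)          # (B, C, C) summary
        z = 1.0 / (torch.einsum('bnc,bc->bn', q, k.sum(dim=1)) + 1e-6)
        out = torch.einsum('bnc,bcd,bn->bnd', q, kv, z)  # attended fusion
        return out + cnn_feat                            # residual connection
```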
Advisors: Ashok Samal and Cody Stoll
Rich probabilistic models for semantic labeling
The goal of this monograph is to explore the methods and applications of semantic labeling. Our contributions to this rapidly evolving topic concern specific aspects of modeling and inference in probabilistic models and their applications in the interdisciplinary areas of computer vision, medical image processing, and remote sensing.
On the Synergies between Machine Learning and Binocular Stereo for Depth Estimation from Images: a Survey
Stereo matching is one of the longest-standing problems in computer vision
with close to 40 years of studies and research. Throughout the years the
paradigm has shifted from local, pixel-level decisions to various forms of
discrete and continuous optimization to data-driven, learning-based methods.
Recently, the rise of machine learning and the rapid proliferation of deep
learning enhanced stereo matching with new exciting trends and applications
unthinkable until a few years ago. Interestingly, the relationship between
these two worlds is two-way. While machine, and especially deep, learning
advanced the state-of-the-art in stereo matching, stereo itself enabled new
ground-breaking methodologies such as self-supervised monocular depth
estimation based on deep networks. In this paper, we review recent research in
the field of learning-based depth estimation from single and binocular images,
highlighting the synergies, the successes achieved so far and the open
challenges the community is going to face in the immediate future.
Comment: Accepted to TPAMI. Paper version of our CVPR 2019 tutorial:
"Learning-based depth estimation from stereo and monocular images: successes,
limitations and future challenges"
(https://sites.google.com/view/cvpr-2019-depth-from-image/home)
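One of the synergies this survey highlights, self-supervised monocular depth estimation trained from stereo pairs, rests on a simple reconstruction objective. Below is a minimal PyTorch sketch of that photometric loss, a simplified version of the general idea rather than any specific surveyed method.

```python
import torch
import torch.nn.functional as F

def photometric_loss(left, right, disparity):
    """Self-supervision signal used in stereo-trained monocular depth methods:
    warp the right image to the left view with the predicted disparity and
    penalize the reconstruction error. No ground-truth depth is required.

    left, right: (B, 3, H, W) images; disparity: (B, 1, H, W) in pixels.
    """
    b, _, h, w = left.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h), torch.linspace(-1, 1, w),
                            indexing='ij')
    grid = torch.stack((xs, ys), dim=-1).to(left.device).unsqueeze(0).repeat(b, 1, 1, 1)
    # Shift the x sampling coordinate by the disparity, in normalized units.
    grid[..., 0] = grid[..., 0] - 2.0 * disparity.squeeze(1) / w
    reconstruction = F.grid_sample(right, grid, align_corners=True)
    return (left - reconstruction).abs().mean()
```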
SM-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving
Semantic segmentation and stereo matching are two essential components of 3D
environmental perception systems for autonomous driving. Nevertheless,
conventional approaches often address these two problems independently,
employing separate models for each task. This approach poses practical
limitations in real-world scenarios, particularly when computational resources
are scarce or real-time performance is imperative. Hence, in this article, we
introduce SM-Net, a novel joint learning framework developed to perform
semantic segmentation and stereo matching simultaneously. Specifically,
SM-Net shares the features extracted from RGB images between both tasks,
resulting in an improved overall scene understanding capability. This feature
sharing process is realized using a feature fusion adaption (FFA) module, which
effectively transforms the shared features into semantic space and subsequently
fuses them with the encoded disparity features. The entire joint learning
framework is trained by minimizing a novel semantic consistency-guided (SCG)
loss, which places emphasis on the structural consistency in both tasks.
Extensive experimental results conducted on the vKITTI2 and KITTI datasets
demonstrate the effectiveness of our proposed joint learning framework and its
superior performance compared to other state-of-the-art single-task networks.
Our project webpage is accessible at mias.group/S3M-Net.
Comment: accepted to IEEE Trans. on Intelligent Vehicles (T-IV)
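The FFA module is only named in the abstract; a schematic PyTorch stand-in under our own assumptions, showing shared RGB features being projected into semantic space and then fused with encoded disparity features, might look like this:

```python
import torch
import torch.nn as nn

class FeatureFusionAdaption(nn.Module):
    """Schematic stand-in for the FFA module described in the abstract:
    transform shared RGB features into the semantic space, then fuse them
    with the encoded disparity features. Names are illustrative."""
    def __init__(self, channels):
        super().__init__()
        self.to_semantic = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, 1)  # 1x1 fusion conv

    def forward(self, shared_feat, disparity_feat):
        semantic_feat = self.to_semantic(shared_feat)
        fused = self.fuse(torch.cat([semantic_feat, disparity_feat], dim=1))
        return semantic_feat, fused  # inputs to the two task heads
```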
A Multiscale Pyramid Transform for Graph Signals
Multiscale transforms designed to process analog and discrete-time signals
and images cannot be directly applied to analyze high-dimensional data residing
on the vertices of a weighted graph, as they do not capture the intrinsic
geometric structure of the underlying graph data domain. In this paper, we
adapt the Laplacian pyramid transform for signals on Euclidean domains so that
it can be used to analyze high-dimensional data residing on the vertices of a
weighted graph. Our approach is to study existing methods and develop new
methods for the four fundamental operations of graph downsampling, graph
reduction, and filtering and interpolation of signals on graphs. Equipped with
appropriate notions of these operations, we leverage the basic multiscale
constructs and intuitions from classical signal processing to generate a
transform that yields both a multiresolution of graphs and an associated
multiresolution of a graph signal on the underlying sequence of graphs.
Comment: 16 pages, 13 figures
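To make the four operations concrete, here is a deliberately naive numpy sketch of one analysis level of such a pyramid, using simple stand-ins (polarity-based downsampling, one-step Laplacian smoothing, subgraph extraction, weighted-neighbor interpolation) for the paper's more careful constructions:

```python
import numpy as np

def pyramid_level(W, x):
    """One analysis level of a Laplacian pyramid for a graph signal.

    W: (n, n) symmetric adjacency matrix; x: (n,) signal on the vertices.
    Returns a reduced graph, a coarse approximation, and a detail signal.
    """
    d = W.sum(axis=1)
    L = np.diag(d) - W                          # combinatorial Laplacian
    evals, evecs = np.linalg.eigh(L)
    keep = evecs[:, -1] >= 0                    # downsample: one polarity set
    h = x - 0.5 * (L @ x) / max(evals[-1], 1e-12)   # crude low-pass filter
    coarse = h[keep]                            # coarse approximation
    # Interpolate the coarse signal back to all vertices and keep the
    # prediction error so the original signal can be recovered.
    pred = np.zeros_like(x)
    pred[keep] = coarse
    pred[~keep] = W[~keep][:, keep] @ coarse / np.maximum(
        W[~keep][:, keep].sum(axis=1), 1e-12)
    detail = x - pred
    W_coarse = W[keep][:, keep]                 # naive graph reduction
    return W_coarse, coarse, detail
```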
Depth Super-Resolution with Hybrid Camera System
An important field of research in computer vision is the 3D analysis and reconstruction of objects and scenes. Currently, among all the techniques for 3D acquisition, stereo vision systems are the most common. More recently, Time-of-Flight (ToF) range cameras have been introduced. The focus of this thesis is to combine the information from the ToF camera with one or two standard cameras, in order to obtain a high-resolution depth image.
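A common building block in ToF-plus-camera fusion pipelines of this kind is joint bilateral upsampling, in which the high-resolution color image guides the interpolation of the low-resolution ToF depth. A minimal, unoptimized Python sketch follows; it is a generic illustration, not code from the thesis.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, scale,
                             sigma_s=2.0, sigma_r=20.0, radius=2):
    """Upsample a low-res ToF depth map guided by a high-res color image.

    Each high-res pixel averages nearby low-res depth samples, weighted by
    spatial distance and by color similarity in the high-res guide image.
    """
    H, W = color_hr.shape[:2]
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            yl, xl = int(round(y / scale)), int(round(x / scale))
            acc, norm = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ys, xs = yl + dy, xl + dx
                    if not (0 <= ys < depth_lr.shape[0] and 0 <= xs < depth_lr.shape[1]):
                        continue
                    gy = min(int(ys * scale), H - 1)   # guide pixel for sample
                    gx = min(int(xs * scale), W - 1)
                    ws = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
                    dc = np.linalg.norm(color_hr[y, x].astype(float)
                                        - color_hr[gy, gx].astype(float))
                    wr = np.exp(-dc * dc / (2 * sigma_r ** 2))
                    acc += ws * wr * depth_lr[ys, xs]
                    norm += ws * wr
            out[y, x] = acc / max(norm, 1e-12)
    return out
```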
Automatic channel detection using deep learning
Picking 3D channel geobodies in seismic volumes is an important objective in seismic interpretation for hydrocarbon exploration, but manual detection of channel geobodies is a time-consuming and subjective process. The interpreter can calculate different seismic attributes, such as coherence, to aid in the manual detection of channel geobodies; however, these attributes still do not directly identify 3D channel geobodies.
Machine learning and deep learning are data-driven techniques that have recently received growing attention in fields such as medical imaging and computer vision. With large volumes of available data of different types and the development of powerful computational resources, geophysics is a promising field for applying machine learning and deep learning. Many seismic interpretation steps are analogous to problems in computer vision that have been solved successfully with deep learning; channel detection in seismic volumes, in particular, is analogous to image segmentation. Applying deep learning to seismic interpretation, specifically to automatic channel detection in 3D seismic volumes, can make the process faster and the workflow less subjective. Decision-making based on interpretations is uncertain, so uncertainties in the interpretation results are important, and deep learning can also help interpreters quantify this uncertainty.
Geological Science
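As one concrete example of the uncertainty quantification mentioned above, Monte Carlo dropout keeps dropout active at test time and reads the spread of repeated predictions as per-pixel uncertainty. A short PyTorch sketch, with an arbitrary 2D segmentation model applied to a seismic slice (all names illustrative, not from this work):

```python
import torch

def mc_dropout_uncertainty(model, seismic_slice, n_samples=20):
    """Run a segmentation network n_samples times with dropout enabled and
    return the mean channel-probability map plus a per-pixel uncertainty.

    model: any 2D segmentation network containing dropout layers;
    seismic_slice: (B, 1, H, W) tensor of seismic amplitudes.
    """
    model.eval()
    for m in model.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()  # keep only the dropout layers stochastic at inference
    with torch.no_grad():
        probs = torch.stack([torch.sigmoid(model(seismic_slice))
                             for _ in range(n_samples)])
    channel_prob = probs.mean(dim=0)   # mean channel probability map
    uncertainty = probs.std(dim=0)     # high std = uncertain interpretation
    return channel_prob, uncertainty
```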