Training of Convolutional Networks on Multiple Heterogeneous Datasets for Street Scene Semantic Segmentation
We propose a convolutional network with hierarchical classifiers for
per-pixel semantic segmentation, which is able to be trained on multiple,
heterogeneous datasets and exploit their semantic hierarchy. Our network is the
first to be simultaneously trained on three different datasets from the
intelligent vehicles domain, i.e. Cityscapes, GTSDB and Mapillary Vistas, and
is able to handle different semantic levels of detail, class imbalances, and
different annotation types, i.e. dense per-pixel and sparse bounding-box
labels. We assess our hierarchical approach by comparing it against flat,
non-hierarchical classifiers, and we show improvements in mean pixel accuracy of
13.0% for Cityscapes classes, 2.4% for Vistas classes, and 32.3% for GTSDB
classes. Our implementation achieves inference rates of 17 fps at a resolution
of 520x706 for 108 classes running on a GPU.
Comment: IEEE Intelligent Vehicles 201
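The abstract above reports its results as mean pixel accuracy. As a minimal sketch of how that metric is commonly computed (per-class pixel accuracy averaged over the classes present in the ground truth), the function below may help; the name `mean_pixel_accuracy`, the `ignore_label` parameter, and the list-of-lists label maps are assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def mean_pixel_accuracy(truth, pred, ignore_label=None):
    """Average the per-class pixel accuracy over classes present in `truth`.

    `truth` and `pred` are 2-D label maps (lists of rows) of equal shape.
    Pixels whose ground-truth label equals `ignore_label` are skipped.
    This is an illustrative sketch, not the paper's evaluation code.
    """
    correct = {}  # class id -> number of correctly predicted pixels
    total = {}    # class id -> number of ground-truth pixels
    for truth_row, pred_row in zip(truth, pred):
        for t, p in zip(truth_row, pred_row):
            if t == ignore_label:
                continue
            total[t] = total.get(t, 0) + 1
            if t == p:
                correct[t] = correct.get(t, 0) + 1
    if not total:
        raise ValueError("no labeled pixels to evaluate")
    # Each class contributes equally, regardless of how many pixels it covers.
    return sum(correct.get(c, 0) / n for c, n in total.items()) / len(total)


# Hypothetical 2x3 label maps with three classes (0, 1, 2).
truth = [[0, 0, 1],
         [1, 1, 2]]
pred = [[0, 1, 1],
        [1, 0, 2]]
score = mean_pixel_accuracy(truth, pred)
```

Because every class carries the same weight, a rare class such as a small traffic sign influences the score as much as a dominant class such as road, which is one reason this metric is often preferred when class imbalance is a concern.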
Human-Machine CRFs for Identifying Bottlenecks in Holistic Scene Understanding
Recent trends in image understanding have pushed for holistic scene
understanding models that jointly reason about various tasks such as object
detection, scene recognition, shape analysis, contextual reasoning, and local
appearance-based classifiers. In this work, we are interested in understanding
the roles of these different tasks in improved scene understanding, in
particular semantic segmentation, object detection and scene recognition.
Towards this goal, we "plug in" human subjects for each of the various
components in a state-of-the-art conditional random field model. Comparisons
among various hybrid human-machine CRFs give us indications of how much
"headroom" there is to improve scene understanding by focusing research efforts on
various individual tasks.
Combining multiple resolutions into hierarchical representations for kernel-based image classification
The geographic object-based image analysis (GEOBIA) framework has gained
increasing interest recently. Following this popular paradigm, we propose a
novel multiscale classification approach operating on a hierarchical image
representation built from two images at different resolutions. They capture the
same scene with different sensors and are naturally fused together through the
hierarchical representation, where coarser levels are built from a Low Spatial
Resolution (LSR) or Medium Spatial Resolution (MSR) image while finer levels
are generated from a High Spatial Resolution (HSR) or Very High Spatial
Resolution (VHSR) image. Such a representation allows one to benefit from
context information thanks to the coarser levels, and from information on the
spatial arrangement of subregions thanks to the finer levels. Two dedicated structured
kernels are then used to perform machine learning directly on the constructed
hierarchical representation. This strategy overcomes the limits of conventional
GEOBIA classification procedures that can handle only one or very few
pre-selected scales. Experiments run on an urban classification task show that
the proposed approach can substantially improve the classification accuracy w.r.t.
conventional approaches working on a single scale.
Comment: International Conference on Geographic Object-Based Image Analysis
(GEOBIA 2016), University of Twente in Enschede, The Netherlands