Hierarchical morphological segmentation for image sequence coding
This paper deals with a hierarchical morphological segmentation algorithm for image sequence coding. Mathematical morphology is very attractive for this purpose because it efficiently deals with geometrical features such as size, shape, contrast, or connectivity that can be considered as segmentation-oriented features. The algorithm follows a top-down procedure. It first takes into account the global information and produces a coarse segmentation, that is, with a small number of regions. Then, the segmentation quality is improved by introducing regions corresponding to more local information. The algorithm, considering sequences as being functions on a 3-D space, directly segments 3-D regions. A 3-D approach is used to get a segmentation that is stable in time and to directly solve the region correspondence problem. Each segmentation stage relies on four basic steps: simplification, marker extraction, decision, and quality estimation. The simplification removes information from the sequence to make it easier to segment. Morphological filters based on partial reconstruction are proven to be very efficient for this purpose, especially in the case of sequences. The marker extraction identifies the presence of homogeneous 3-D regions. It is based on constrained flat region labeling and morphological contrast extraction. The goal of the decision is to precisely locate the contours of regions detected by the marker extraction. This decision is performed by a modified watershed algorithm. Finally, the quality estimation concentrates on the coding residue, all the information about the 3-D regions that have not been properly segmented and therefore coded. The procedure allows the introduction of the texture and contour coding schemes within the segmentation algorithm. The coding residue is transmitted to the next segmentation stage to improve the segmentation and coding quality. 
Finally, segmentation and coding examples are presented to show the validity and interest of the coding approach.
Peer Reviewed. Postprint (published version)
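The simplification and marker extraction steps described in this abstract can be illustrated with a minimal 1-D sketch. It is a simplified stand-in, not the paper's method: it uses a plain morphological opening rather than the reconstruction-based filters the abstract mentions, works on a 1-D list rather than a 3-D sequence, and labels flat zones without any constraint. All function names are hypothetical.

```python
# Illustrative 1-D sketch only. The paper operates on 3-D image sequences and
# uses reconstruction-based filters, which are NOT reproduced here.

def erode(signal, size):
    """Flat erosion: each sample becomes the minimum over its window."""
    r = size // 2
    return [min(signal[max(0, i - r): i + r + 1]) for i in range(len(signal))]

def dilate(signal, size):
    """Flat dilation: each sample becomes the maximum over its window."""
    r = size // 2
    return [max(signal[max(0, i - r): i + r + 1]) for i in range(len(signal))]

def opening(signal, size):
    """Erosion followed by dilation; removes bright peaks narrower than `size`."""
    return dilate(erode(signal, size), size)

def label_flat_zones(signal):
    """Give each run of equal consecutive values its own integer label."""
    labels, current = [], 0
    for i, value in enumerate(signal):
        if i > 0 and value != signal[i - 1]:
            current += 1
        labels.append(current)
    return labels

signal = [0, 0, 5, 0, 0, 3, 3, 3, 0, 0]
simplified = opening(signal, 3)          # narrow peak removed, plateau kept
markers = label_flat_zones(simplified)   # one label per homogeneous region
```

In this toy input the one-sample peak is removed while the three-sample plateau survives, so the labeling step sees three homogeneous regions instead of five.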
Efficient Continuous-Time SLAM for 3D Lidar-Based Online Mapping
Modern 3D laser-range scanners have a high data rate, making online
simultaneous localization and mapping (SLAM) computationally challenging.
Recursive state estimation techniques are efficient but commit to a state
estimate immediately after a new scan is made, which may lead to misalignments
of measurements. We present a 3D SLAM approach that allows for refining
alignments during online mapping. Our method is based on efficient local
mapping and a hierarchical optimization back-end. Measurements of a 3D laser
scanner are aggregated in local multiresolution maps by means of surfel-based
registration. The local maps are used in a multi-level graph for allocentric
mapping and localization. In order to incorporate corrections when refining the
alignment, the individual 3D scans in the local map are modeled as a sub-graph
and graph optimization is performed to account for drift and misalignments in
the local maps. Furthermore, in each sub-graph, a continuous-time
representation of the sensor trajectory allows measurements to be corrected between
scan poses. We evaluate our approach in multiple experiments by showing
qualitative results. Furthermore, we quantify the map quality by an
entropy-based measure.
Comment: In: Proceedings of the International Conference on Robotics and
Automation (ICRA) 201
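The idea of correcting measurements between scan poses can be sketched as follows. The abstract does not say which continuous-time representation is used, so this sketch assumes plain linear interpolation between two planar poses `(x, y, yaw)` as a stand-in. The function names and the 2-D simplification are hypothetical.

```python
import math

# Assumption: poses are planar (x, y, yaw) and the trajectory between two
# scan poses is linearly interpolated. The paper's actual representation is
# not given in the abstract.

def interpolate_pose(t, t0, pose0, t1, pose1):
    """Return the pose at time t, with t0 <= t <= t1."""
    alpha = (t - t0) / (t1 - t0)
    x0, y0, yaw0 = pose0
    x1, y1, yaw1 = pose1
    # Interpolate the heading along the shortest arc.
    dyaw = math.atan2(math.sin(yaw1 - yaw0), math.cos(yaw1 - yaw0))
    return (x0 + alpha * (x1 - x0),
            y0 + alpha * (y1 - y0),
            yaw0 + alpha * dyaw)

def to_map_frame(point, pose):
    """Transform a sensor-frame 2-D point into the map frame using pose."""
    px, py = point
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py)

# A measurement taken halfway between two scan poses.
pose_mid = interpolate_pose(0.5, 0.0, (0.0, 0.0, 0.0), 1.0, (2.0, 0.0, math.pi / 2))
point_map = to_map_frame((1.0, 0.0), pose_mid)
```

Each measurement is placed using the pose at its own timestamp rather than the pose at the start of the scan, which is what removes the motion-induced distortion in this simplified setting.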
Complexity Analysis Of Next-Generation VVC Encoding and Decoding
While the next generation video compression standard, Versatile Video Coding
(VVC), provides a superior compression efficiency, its computational complexity
dramatically increases. This paper thoroughly analyzes this complexity for both
encoder and decoder of VVC Test Model 6, by quantifying the complexity
break-down for each coding tool and measuring the complexity and memory
requirements for VVC encoding/decoding. These extensive analyses are performed
for six video sequences of 720p, 1080p, and 2160p, under Low-Delay (LD),
Random-Access (RA), and All-Intra (AI) conditions (a total of 320
encodings/decodings). Results indicate that the VVC encoder and decoder are 5x
and 1.5x more complex than HEVC in LD, and 31x and 1.8x in AI,
respectively. Detailed analysis of coding tools reveals that in LD on average,
motion estimation tools with 53%, transformation and quantization with 22%, and
entropy coding with 7% dominate the encoding complexity. In decoding, loop
filters with 30%, motion compensation with 20%, and entropy decoding with 16%,
are the most complex modules. Moreover, the memory bandwidths required for VVC
encoding and decoding are measured through memory profiling and are 30x and 3x
those of HEVC. The reported results and insights are a guide for future research and
implementations of energy-efficient VVC encoders/decoders.
Comment: IEEE ICIP 202
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective Mapping
Many tasks performed by autonomous vehicles such as road marking detection,
object tracking, and path planning are simpler in bird's-eye view. Hence,
Inverse Perspective Mapping (IPM) is often applied to remove the perspective
effect from a vehicle's front-facing camera and to remap its images into a 2D
domain, resulting in a top-down view. Unfortunately, this leads to
unnatural blurring and stretching of objects at greater distances, due to the
limited resolution of the camera, which limits applicability. In this paper, we present an
adversarial learning approach for generating a significantly improved IPM from
a single camera image in real time. The generated bird's-eye-view images
contain sharper features (e.g. road markings) and a more homogeneous
illumination, while (dynamic) objects are automatically removed from the scene,
thus revealing the underlying road layout in an improved fashion. We
demonstrate our framework using real-world data from the Oxford RobotCar
Dataset and show that scene understanding tasks directly benefit from our
boosted IPM approach.
Comment: equal contribution of first two authors, 8 full pages, 6 figures,
accepted at IV 201
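For context, the classical geometric IPM that this abstract takes as its starting point can be sketched as a ray-to-ground intersection. This is not the adversarial approach the paper proposes. It assumes a pinhole camera, a perfectly flat road, and a camera pitched downward by a known angle. The function name and all parameters are hypothetical.

```python
import math

# Assumptions: pinhole camera with focal length f (pixels) and principal point
# (cx, cy), mounted `height` metres above a flat ground plane and pitched
# downward by `pitch` radians. This is classical IPM, not the learned method.

def pixel_to_ground(u, v, f, cx, cy, height, pitch):
    """Map pixel (u, v) to (lateral, forward) ground coordinates in metres.

    Returns None when the pixel's viewing ray does not hit the ground,
    i.e. the pixel lies on or above the horizon.
    """
    a = (u - cx) / f          # normalised image x (to the right)
    b = (v - cy) / f          # normalised image y (downward)
    denom = b * math.cos(pitch) + math.sin(pitch)
    if denom <= 0.0:
        return None
    t = height / denom        # ray scale at which the ground is reached
    lateral = t * a
    forward = t * (math.cos(pitch) - b * math.sin(pitch))
    return lateral, forward

centre = pixel_to_ground(320, 240, 500.0, 320, 240, 1.5, math.pi / 4)
above_horizon = pixel_to_ground(320, 100, 500.0, 320, 240, 1.5, 0.0)
```

Because the forward distance grows rapidly as a pixel row approaches the horizon, adjacent image rows far from the camera cover much more ground than rows close to it. This is the source of the stretching and blurring the abstract describes.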
Intra-WZ quantization mismatch in distributed video coding
During the past decade, Distributed Video Coding (DVC) has emerged as a new video coding paradigm, shifting the complexity from the encoder side to the decoder side. This paper addresses a problem of current DVC architectures that has not been studied in the literature so far, that is, the mismatch between the intra and Wyner-Ziv (WZ) quantization processes. Due to this mismatch, WZ rate is spent even for spatial regions that are accurately approximated by the side-information. As a solution, this paper proposes side-information generation using selective unidirectional motion compensation from temporally adjacent WZ frames. Experimental results show that the proposed approach yields promising WZ rate gains of up to 7% relative to the conventional method.