Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper serves
simultaneously as a position paper and as a tutorial for users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions at robotics conferences: Do robots need SLAM? And
is SLAM solved?
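The de-facto standard formulation the survey refers to is maximum-a-posteriori estimation over a factor graph; as a sketch (the symbols below — variables $\mathcal{X}$, measurements $z_k$, measurement models $h_k(\cdot)$, and information matrices $\Omega_k$ — are standard in the SLAM literature rather than quoted from this abstract), assuming Gaussian measurement noise so that MAP reduces to nonlinear least squares:

```latex
\mathcal{X}^{\star}
  = \operatorname*{arg\,max}_{\mathcal{X}} \, p(\mathcal{X} \mid \mathcal{Z})
  = \operatorname*{arg\,min}_{\mathcal{X}}
    \sum_{k} \big\lVert h_k(\mathcal{X}_k) - z_k \big\rVert^{2}_{\Omega_k}
```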
Bidirectional Propagation for Cross-Modal 3D Object Detection
Recent works have revealed the superiority of feature-level fusion for
cross-modal 3D object detection, where fine-grained feature propagation from 2D
image pixels to 3D LiDAR points has been widely adopted for performance
improvement. Still, the potential of heterogeneous feature propagation between
2D and 3D domains has not been fully explored. In this paper, in contrast to
existing pixel-to-point feature propagation, we investigate an opposite
point-to-pixel direction, allowing point-wise features to flow inversely into
the 2D image branch. Thus, when jointly optimizing the 2D and 3D streams, the
gradients back-propagated from the 2D image branch can boost the representation
ability of the 3D backbone network working on LiDAR point clouds. Then,
combining pixel-to-point and point-to-pixel information flow mechanisms, we
construct a bidirectional feature propagation framework, dubbed BiProDet. In
addition to the architectural design, we also propose normalized local
coordinates map estimation, a new 2D auxiliary task for the training of the 2D
image branch, which facilitates learning local spatial-aware features from the
image modality and implicitly enhances the overall 3D detection performance.
Extensive experiments and ablation studies validate the effectiveness of our
method. Notably, we rank 1st on the highly competitive
KITTI benchmark on the cyclist class at the time of submission. The source code
is available at https://github.com/Eaphan/BiProDet. (Accepted by ICLR 2023.)
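The point-to-pixel flow described above can be illustrated with a minimal, hypothetical sketch (the function name, pure-Python data layout, and averaging rule are illustrative assumptions, not BiProDet's actual implementation): point-wise features are scattered into an image-sized grid, so the 2D branch sees LiDAR-derived features and, during joint training, its gradients flow back into the 3D backbone.

```python
def point_to_pixel(point_feats, pixel_uv, image_hw):
    """Scatter point-wise features into an H x W x C feature map,
    averaging the features of points that project to the same pixel.

    point_feats: list of length-C feature vectors, one per LiDAR point
    pixel_uv:    list of (row, col) integer projections, one per point
    image_hw:    (H, W) size of the target feature map
    """
    H, W = image_hw
    C = len(point_feats[0])
    feat_sum = [[[0.0] * C for _ in range(W)] for _ in range(H)]
    count = [[0] * W for _ in range(H)]
    for feat, (r, c) in zip(point_feats, pixel_uv):
        for ch in range(C):
            feat_sum[r][c][ch] += feat[ch]
        count[r][c] += 1
    # divide accumulated features by the per-pixel point count
    for r in range(H):
        for c in range(W):
            n = max(count[r][c], 1)
            feat_sum[r][c] = [v / n for v in feat_sum[r][c]]
    return feat_sum
```

Pixels that receive no points keep zero features; in a real network a learned fusion layer would typically follow this scatter step.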
PointCaM: Cut-and-Mix for Open-Set Point Cloud Learning
Point cloud learning is receiving increasing attention; however, most
existing point cloud models lack the practical ability to deal with the
unavoidable presence of unknown objects. This paper mainly discusses point
cloud learning under open-set settings, where we train the model without data
from unknown classes and identify them at inference time. We
propose to solve open-set point cloud learning using a novel Point Cut-and-Mix
mechanism consisting of Unknown-Point Simulator and Unknown-Point Estimator
modules. Specifically, we use the Unknown-Point Simulator to simulate
out-of-distribution data in the training stage by manipulating the geometric
context of partial known data. Based on this, the Unknown-Point Estimator
module learns to exploit the point cloud's feature context for discriminating
the known and unknown data. Extensive experiments show the plausibility of
open-set point cloud learning and the effectiveness of our proposed solutions.
Our code is available at \url{https://github.com/ShiQiu0419/pointcam}
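The Unknown-Point Simulator manipulates the geometric context of partial known data; a minimal, hypothetical sketch of the idea (function name, shift rule, and labeling are illustrative assumptions, not the paper's implementation) cuts a random patch from one cloud, shifts it, and mixes it into another cloud as simulated out-of-distribution points:

```python
import random

def cut_and_mix(cloud_a, cloud_b, cut_ratio=0.25, shift=(0.5, 0.0, 0.0), seed=0):
    """Cut a random subset of points from cloud_b, shift them, and mix
    them into cloud_a to simulate out-of-distribution data.

    cloud_a, cloud_b: lists of (x, y, z) tuples
    Returns (mixed_cloud, labels), where label 1 marks simulated
    'unknown' points and 0 marks the original 'known' points.
    """
    rng = random.Random(seed)
    n_cut = max(1, int(len(cloud_b) * cut_ratio))
    patch = rng.sample(cloud_b, n_cut)
    # displace the patch so it breaks the local geometric context
    moved = [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in patch]
    mixed = list(cloud_a) + moved
    labels = [0] * len(cloud_a) + [1] * n_cut
    return mixed, labels
```

An Unknown-Point-Estimator-style module would then be trained on `labels` to discriminate known from simulated unknown points.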
Structured Indoor Modeling
In this dissertation, we propose data-driven approaches to reconstructing 3D models of indoor scenes in a structured representation (e.g., a wall is represented by a planar surface, and two rooms are connected via that wall). A structured representation is more application-ready than a dense one (e.g., a point cloud), but poses additional challenges for reconstruction, since extracting structure requires high-level understanding of geometry. To address this challenging problem, we exploit two common structural regularities of indoor scenes: 1) most indoor structures consist of planar surfaces (planarity), and 2) structural surfaces (e.g., walls and floors) can be represented by a 2D floorplan as a top-down view projection (orthogonality). Building on breakthroughs in data-capture techniques, we develop automated systems that tackle two structured modeling problems, piece-wise planar reconstruction and floorplan reconstruction, by learning shape priors (i.e., planarity and orthogonality) from data. With structured representations and production-level quality, the reconstructed models have an immediate impact on many industrial applications.
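The structured representation described above (walls as planes, rooms linked through shared walls) can be sketched as a minimal data model; the class and function names below are illustrative assumptions, not taken from the dissertation:

```python
from dataclasses import dataclass, field

@dataclass
class Wall:
    # plane in Hessian normal form: dot(normal, p) == offset
    normal: tuple
    offset: float

@dataclass
class Room:
    name: str
    walls: list = field(default_factory=list)

def shared_walls(room_a, room_b):
    """Two rooms are connected if they reference the same wall."""
    return [w for w in room_a.walls if w in room_b.walls]
```

Compared with a point cloud, this representation directly exposes adjacency (which rooms share a wall), which is what makes it "application ready".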
TriangleNet: Edge Prior Augmented Network for Semantic Segmentation through Cross-Task Consistency
This paper addresses the task of semantic segmentation in computer vision,
aiming to achieve precise pixel-wise classification. We investigate the joint
training of models for semantic edge detection and semantic segmentation, which
has shown promise. However, implicit cross-task consistency learning in
multi-task networks is limited. To address this, we propose a novel "decoupled
cross-task consistency loss" that explicitly enhances cross-task consistency.
Our semantic segmentation network, TriangleNet, achieves a substantial 2.88\%
improvement over the Baseline in mean Intersection over Union (mIoU) on the
Cityscapes test set. Notably, TriangleNet operates at 77.4\% mIoU/46.2 FPS on
Cityscapes, showcasing real-time inference capabilities at full resolution.
With multi-scale inference, performance is further enhanced to 77.8\%.
Furthermore, TriangleNet consistently outperforms the Baseline on the FloodNet
dataset, demonstrating its robust generalization capabilities. The proposed
method underscores the significance of multi-task learning and explicit
cross-task consistency enhancement for advancing semantic segmentation and
highlights the potential of multitasking in real-time semantic segmentation.
(Accepted for publication in the International Journal of Intelligent Systems.)
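A cross-task consistency loss of the kind described above can be sketched as follows (a hypothetical simplification, not TriangleNet's actual loss): derive a boundary map from the segmentation output and penalize its disagreement with the edge branch's prediction.

```python
def boundary_from_labels(seg):
    """Mark a pixel as boundary (1) if any 4-neighbour has a different label."""
    H, W = len(seg), len(seg[0])
    edge = [[0] * W for _ in range(H)]
    for r in range(H):
        for c in range(W):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W and seg[rr][cc] != seg[r][c]:
                    edge[r][c] = 1
    return edge

def consistency_loss(seg_pred, edge_pred):
    """Mean squared disagreement between the edges implied by the
    segmentation and the edge branch's own prediction."""
    implied = boundary_from_labels(seg_pred)
    diffs = [(implied[r][c] - edge_pred[r][c]) ** 2
             for r in range(len(seg_pred)) for c in range(len(seg_pred[0]))]
    return sum(diffs) / len(diffs)
```

The loss is zero exactly when the two tasks agree on where object boundaries lie, which is the consistency the joint training is meant to enforce.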
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied in buildings research for the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found that uses machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load-shape clustering, occupancy prediction, and learning occupant behaviors and energy-use patterns. None of the existing studies has been adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate models, (2) lack of model transferability, which prevents a model trained on one data-rich building from being used in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but fail to generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.