16,155 research outputs found
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM)consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues, that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: Do robots need SLAM? and
Is SLAM solved
Looking Fast and Slow: Memory-Guided Mobile Video Object Detection
With a single eye fixation lasting a fraction of a second, the human visual
system is capable of forming a rich representation of a complex environment,
reaching a holistic understanding which facilitates object recognition and
detection. This phenomenon is known as recognizing the "gist" of the scene and
is accomplished by relying on relevant prior knowledge. This paper addresses
the analogous question of whether using memory in computer vision systems can
not only improve the accuracy of object detection in video streams, but also
reduce the computation time. By interleaving conventional feature extractors
with extremely lightweight ones which only need to recognize the gist of the
scene, we show that minimal computation is required to produce accurate
detections when temporal memory is present. In addition, we show that the
memory contains enough information for deploying reinforcement learning
algorithms to learn an adaptive inference policy. Our model achieves
state-of-the-art performance among mobile methods on the Imagenet VID 2015
dataset, while running at speeds of up to 70+ FPS on a Pixel 3 phone
- …