Indexing large geographic datasets with compact qualitative representation
© 2015 Taylor & Francis. This paper develops a new mechanism to efficiently compute and compactly store qualitative spatial relations between spatial objects, focusing on topological and directional relations for large datasets of region objects. The central idea is to use minimum bounding rectangles (MBRs) to approximately represent region objects of arbitrary shape and complexity, and to store only those spatial relations that cannot be unambiguously inferred from the relations of the corresponding MBRs. We demonstrate, both in theory and in practice, that our approach requires considerably less construction time and storage space, and can answer queries more efficiently, than the state-of-the-art methods.
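The MBR idea in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it covers only the simplest inferable case (two regions whose MBRs are disjoint must themselves be disjoint), and the names `mbr`, `mbrs_disjoint`, and `needs_explicit_storage` are hypothetical helpers introduced for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: List[Point]) -> Box:
    """Minimum bounding rectangle of a region given by its boundary vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbrs_disjoint(a: Box, b: Box) -> bool:
    """True if the two rectangles share no point."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def needs_explicit_storage(region_a: List[Point], region_b: List[Point]) -> bool:
    """Assumption for this sketch: a relation is stored only when the MBRs
    alone cannot settle it. Disjoint MBRs imply disjoint regions, so that
    case is inferable; any MBR overlap is treated as ambiguous."""
    return not mbrs_disjoint(mbr(region_a), mbr(region_b))


# Three triangular regions (made-up coordinates).
tri_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
tri_b = [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)]
tri_c = [(1.5, 1.5), (3.0, 1.5), (1.5, 3.0)]

far_pair = needs_explicit_storage(tri_a, tri_b)   # MBRs disjoint: inferable
near_pair = needs_explicit_storage(tri_a, tri_c)  # MBRs overlap: ambiguous
```

In this example `tri_a` and `tri_c` do not actually intersect, yet their MBRs overlap, so the relation between them is exactly the kind that the MBRs cannot decide and that would have to be stored explicitly.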
LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation
Recent works in autonomous driving have widely adopted the bird's-eye-view
(BEV) semantic map as an intermediate representation of the world. Online
prediction of these BEV maps involves non-trivial operations such as
multi-camera data extraction as well as fusion and projection into a common
topview grid. This is usually done with error-prone geometric operations (e.g.,
homography or back-projection from monocular depth estimation) or expensive
direct dense mapping between image pixels and pixels in BEV (e.g., with MLP or
attention). In this work, we present 'LaRa', an efficient encoder-decoder,
transformer-based model for vehicle semantic segmentation from multiple
cameras. Our approach uses a system of cross-attention to aggregate information
over multiple sensors into a compact, yet rich, collection of latent
representations. These latent representations, after being processed by a
series of self-attention blocks, are then reprojected with a second
cross-attention in the BEV space. We demonstrate that our model outperforms the
best previous works using transformers on nuScenes. The code and trained models
are available at https://github.com/valeoai/LaR
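The two cross-attention stages described in the abstract above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the LaRa implementation: it uses a single attention head, omits learned projections, ray embeddings, and the intermediate self-attention blocks, and all names and toy values (`cross_attention`, `camera_features`, `latents`, `bev_queries`) are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries: List[Vector], keys: List[Vector],
                    values: List[Vector]) -> List[Vector]:
    """Scaled dot-product attention: each query returns a weighted
    average of the values, weighted by its similarity to the keys."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out


# Toy inputs: six feature tokens pooled from all cameras, two latent
# vectors, and four BEV grid-cell queries (all values are made up).
camera_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0],
                   [0.5, 0.2], [0.1, 0.9], [0.3, 0.3]]
latents = [[1.0, 0.0], [0.0, 1.0]]
bev_queries = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [0.0, 0.0]]

# Stage 1: the latents query the camera features (many tokens -> few latents).
latents = cross_attention(latents, camera_features, camera_features)
# Stage 2: the BEV cells query the latents (few latents -> BEV grid).
bev_features = cross_attention(bev_queries, latents, latents)
```

The point of the design is visible in the shapes: the first stage costs work proportional to (number of latents) × (number of camera tokens) and the second to (number of BEV cells) × (number of latents), rather than (number of BEV cells) × (number of camera tokens) as in a direct dense mapping.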
Long-term experiments with an adaptive spherical view representation for navigation in changing environments
Real-world environments such as houses and offices change over time, meaning that a mobile robot’s map will become out of date. In this work, we introduce a method to update the reference views in a hybrid metric-topological map so that a mobile robot can continue to localize itself in a changing environment. The updating mechanism, based on the multi-store model of human memory, incorporates a spherical metric representation of the observed visual features for each node in the map, which enables the robot to estimate its heading and navigate using multi-view geometry, as well as representing the local 3D geometry of the environment. A series of experiments demonstrates the persistence performance of the proposed system in real changing environments, including analysis of the long-term stability.
Surface Networks
We study data-driven representations for three-dimensional triangle meshes,
which are one of the prevalent objects used to represent 3D geometry. Recent
works have developed models that exploit the intrinsic geometry of manifolds
and graphs, namely Graph Neural Networks (GNNs) and their spectral variants,
which learn from the local metric tensor via the Laplacian operator. Despite
offering excellent sample complexity and built-in invariances, intrinsic
geometry alone is invariant to isometric deformations, making it unsuitable for
many applications. To overcome this limitation, we propose several upgrades to
GNNs to leverage extrinsic differential geometry properties of
three-dimensional surfaces, increasing their modeling power.
In particular, we propose to exploit the Dirac operator, whose spectrum
detects principal curvature directions; this is in stark contrast with the
classical Laplace operator, which directly measures mean curvature. We coin the
resulting models Surface Networks (SN). We prove that these models
define shape representations that are stable to deformation and to
discretization, and we demonstrate the efficiency and versatility of SNs on two
challenging tasks: temporal prediction of mesh deformations under non-linear
dynamics and generative models using a variational autoencoder framework with
encoders/decoders given by SNs.