Efficient moving point handling for incremental 3D manifold reconstruction
As incremental Structure from Motion algorithms become effective, a good
sparse point cloud representing the map of the scene becomes available
frame-by-frame. From the 3D Delaunay triangulation of these points,
state-of-the-art algorithms build a rough manifold model of the scene. These
algorithms incrementally integrate new points into the 3D reconstruction only
if their position estimates do not change. Indeed, whenever a point moves in a
3D Delaunay triangulation, for instance because its estimate gets refined, a
set of tetrahedra has to be removed and replaced with new ones to maintain the
Delaunay property; managing the manifold reconstruction thus becomes complex
and entails a potentially large overhead. In this paper we investigate
different approaches and propose an efficient policy to deal with moving
points in the manifold estimation process. We test our approach on four
sequences of the KITTI dataset and show the effectiveness of our proposal in
comparison with state-of-the-art approaches.
Comment: Accepted at the International Conference on Image Analysis and
Processing (ICIAP 2015)
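To make the moving-point problem concrete, the sketch below shows a minimal policy in Python: cheap incremental insertion for new points, and an update triggered only when a refined estimate actually moves a point beyond a tolerance. Since scipy's Delaunay supports insertion but not removal, a full retriangulation stands in for the paper's local replacement of the affected tetrahedra; the class, the `move_tol` threshold, and the method names are illustrative assumptions, not the authors' implementation.

```python
# A sketch, not the paper's implementation: scipy's Delaunay supports
# incremental insertion but not point removal, so a moved point triggers
# a full retriangulation here, whereas the paper replaces only the
# affected tetrahedra. `move_tol` is an assumed threshold.
import numpy as np
from scipy.spatial import Delaunay

class IncrementalMap:
    def __init__(self, points, move_tol=0.05):
        self.points = np.asarray(points, dtype=float)
        self.move_tol = move_tol  # metres; smaller motions are ignored
        self.tri = Delaunay(self.points, incremental=True)

    def add_point(self, p):
        """New, stable points are inserted cheaply."""
        p = np.asarray(p, dtype=float).reshape(1, -1)
        self.points = np.vstack([self.points, p])
        self.tri.add_points(p)

    def move_point(self, idx, new_pos):
        """Refined estimate of an existing point: update only if needed."""
        if np.linalg.norm(self.points[idx] - new_pos) < self.move_tol:
            return False  # negligible motion: keep the current tetrahedra
        self.points[idx] = new_pos
        self.tri = Delaunay(self.points, incremental=True)  # rebuild
        return True
```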
Mesh-based 3D Textured Urban Mapping
In the era of autonomous driving, urban mapping represents a core step in
letting vehicles interact with the urban context. Successful mapping
algorithms have been proposed in the last decade, building the map by
leveraging data from a single sensor. The focus of the system presented in
this paper is twofold: the joint estimation of a 3D map from lidar data and
images, based on a 3D mesh, and its texturing. Indeed, even though most
surveying vehicles for mapping are equipped with both cameras and lidar,
existing mapping algorithms usually rely on either images or lidar data;
moreover, both image-based and lidar-based systems often represent the map as
a point cloud, while a continuous textured mesh representation would be useful
for visualization and navigation purposes. In the proposed framework, we
combine the accuracy of 3D lidar data with the dense appearance information
carried by the images: we estimate a visibility-consistent map from the lidar
measurements and refine it photometrically through the acquired images. We
evaluate the proposed framework on the KITTI dataset and show the performance
improvement with respect to two state-of-the-art urban mapping algorithms and
two widely used surface
reconstruction algorithms from Computer Graphics.
Comment: accepted at IROS 2017
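As a rough illustration of the texturing step, the following Python sketch projects a mesh facet into a camera through a standard pinhole model and samples a colour; the function names, the vertex-averaging heuristic, and the calibration inputs are assumptions for illustration, not the paper's actual photometric refinement.

```python
# Hedged sketch: texture a facet by projecting its vertices into the
# image of a camera that sees it, then averaging the sampled colours.
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of 3D points X (N, 3) to pixel coordinates."""
    x = (K @ (R @ X.T + t.reshape(3, 1))).T   # (N, 3) homogeneous pixels
    return x[:, :2] / x[:, 2:3]               # perspective division

def facet_color(vertices, image, K, R, t):
    """Average the image colour at a facet's projected vertices."""
    uv = np.round(project(K, R, t, vertices)).astype(int)
    h, w = image.shape[:2]
    uv = np.clip(uv, [0, 0], [w - 1, h - 1])  # stay inside the image
    return image[uv[:, 1], uv[:, 0]].mean(axis=0)
```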
ART-SLAM: Accurate Real-Time 6DoF LiDAR SLAM
Real-time six degrees-of-freedom pose estimation with ground vehicles represents a relevant and well-studied topic in robotics due to its many applications, such as autonomous driving and 3D mapping. Although some systems already exist, they are either not accurate or they struggle in real-time settings. In this letter, we propose a fast, accurate, and modular LiDAR SLAM system for both batch and online estimation. We first apply downsampling and outlier removal to filter out noise and reduce the size of the input point clouds. Filtered clouds are then used for pose tracking, possibly aided by a pre-tracking module, and for floor detection, to ground-optimize the estimated trajectory. Efficient multi-step loop closure and pose optimization, achieved through a g2o pose graph, are the last steps of the proposed SLAM pipeline. We compare the performance of our system with state-of-the-art point cloud-based methods, LOAM, LeGO-LOAM, A-LOAM, LeGO-LOAM-BOR, LIO-SAM and HDL, and show that the proposed system achieves equal or better accuracy and can easily handle even cases without loops. The comparison is done by evaluating the estimated trajectory displacement using the KITTI (urban driving) and Chilean (underground mine) datasets.
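A minimal sketch of the front end described above, with Open3D standing in for the system's own filtering and tracking modules (the actual pipeline is a modular real-time implementation); the voxel size, outlier parameters, and point-to-point ICP choice are illustrative assumptions. The resulting relative poses would feed the g2o pose graph mentioned in the abstract.

```python
# Hedged sketch of downsampling, outlier removal, and pose tracking
# using Open3D as a stand-in for the system's own modules.
import numpy as np
import open3d as o3d

def filter_cloud(pcd, voxel=0.25, nb_neighbors=20, std_ratio=2.0):
    """Downsample a raw LiDAR scan and remove statistical outliers."""
    down = pcd.voxel_down_sample(voxel_size=voxel)
    filtered, _ = down.remove_statistical_outlier(nb_neighbors, std_ratio)
    return filtered

def track(source, target, init=np.eye(4), max_dist=1.0):
    """Estimate the relative pose between consecutive filtered scans."""
    result = o3d.pipelines.registration.registration_icp(
        source, target, max_dist, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return result.transformation  # would feed the g2o pose graph
```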
Multi-View Stereo with Single-View Semantic Mesh Refinement
While 3D reconstruction is a well-established and widely explored research
topic, semantic 3D reconstruction has only recently witnessed an increasing
share of attention from the Computer Vision community. Semantic annotations
indeed make it possible to enforce strong class-dependent priors, such as
planarity for ground and walls, which can be exploited to refine the
reconstruction, often resulting in non-trivial performance improvements.
State-of-the-art methods propose volumetric approaches to fuse RGB image data
with semantic labels; even if successful, they do not scale well and fail to
output high-resolution meshes. In this paper we propose a novel method to
refine both the geometry and the semantic labeling of a given mesh. We refine
the mesh geometry by applying a variational method that optimizes a composite
energy made of a state-of-the-art pairwise photometric term and a single-view
term that models the semantic consistency between the labels of the 3D mesh
and those of the segmented images. We also update the semantic labeling
through a novel Markov Random Field (MRF) formulation that, together with the
classical data and smoothness terms, takes into account class-specific priors
estimated directly from the annotated mesh. This is in contrast to
state-of-the-art methods, which are typically based on handcrafted or learned
priors. We are the first, jointly with the very recent and seminal work of
[M. Blaha et al., arXiv:1706.08336, 2017], to propose the use of semantics
inside a mesh refinement framework. Differently from [M. Blaha et al.,
arXiv:1706.08336, 2017], which adopts a more classical pairwise comparison to
estimate the flow of the mesh, we apply a single-view comparison between the
semantically annotated image and the current
3D mesh labels; this improves the robustness in the case of noisy
segmentations.
Comment: 3D Reconstruction Meets Semantics workshop, ICCV
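To give a flavour of the MRF label update, here is a hedged Python sketch of one iterated-conditional-modes sweep combining a data term, a pairwise smoothness term, and a class-specific prior cost; ICM, the array layout, and the weights `lam` and `mu` are illustrative assumptions, since the abstract does not commit to a particular solver.

```python
# Hedged sketch: one ICM sweep over mesh facets with data, smoothness,
# and class-prior terms; structures and weights are illustrative.
import numpy as np

def icm_step(labels, unary, adjacency, class_prior, lam=1.0, mu=0.5):
    """Pick, per facet, the label minimising the local energy.

    labels:      (F,) current label per facet
    unary:       (F, L) data cost of each label for each facet
    adjacency:   list of neighbour index arrays, one per facet
    class_prior: (L,) cost derived from priors on the annotated mesh
    """
    n_labels = unary.shape[1]
    new_labels = labels.copy()
    for f, neighbours in enumerate(adjacency):
        # smoothness: penalise disagreement with neighbouring facets
        smooth = lam * (labels[neighbours][None, :] !=
                        np.arange(n_labels)[:, None]).sum(axis=1)
        energy = unary[f] + smooth + mu * class_prior
        new_labels[f] = int(np.argmin(energy))
    return new_labels
```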
Attention Mechanisms for Object Recognition with Event-Based Cameras
Event-based cameras are neuromorphic sensors capable of efficiently encoding
visual information in the form of sparse sequences of events. Being
biologically inspired, they are commonly used to exploit some of the
computational and power-consumption benefits of biological vision. In this
paper we focus on a specific feature of vision: visual attention. We propose
two attentive models for event-based vision: an algorithm that tracks event
activity within the field of view to locate regions of interest, and a
fully-differentiable attention procedure based on the DRAW neural model. We
highlight the strengths and weaknesses of the proposed methods on four
datasets, the Shifted N-MNIST, Shifted MNIST-DVS, CIFAR10-DVS and N-Caltech101
collections, using the Phased LSTM recognition network as a baseline reference
model, and obtain improvements in terms of both translation and scale
invariance.
Comment: WACV 2019 camera-ready submission
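The first attentive model, tracking event activity to locate regions of interest, can be sketched as follows: a decaying 2D activity map plus a sliding-window search accelerated with an integral image. The decay rate and window size are illustrative assumptions, not the paper's parameters.

```python
# Hedged sketch of an event-activity tracker for region-of-interest
# extraction; decay and window size are assumed values.
import numpy as np

def update_activity(activity, events, decay=0.95):
    """Decay past activity and accumulate new (x, y) event coordinates."""
    activity *= decay
    for x, y in events:
        activity[y, x] += 1.0
    return activity

def region_of_interest(activity, win=32):
    """Return the top-left corner of the most active win x win window."""
    # integral image makes each sliding-window sum O(1)
    ii = np.pad(activity.cumsum(0).cumsum(1), ((1, 0), (1, 0)))
    sums = (ii[win:, win:] - ii[:-win, win:]
            - ii[win:, :-win] + ii[:-win, :-win])
    y, x = np.unravel_index(np.argmax(sums), sums.shape)
    return x, y
```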
Advancements in Radar Odometry
Radar odometry estimation has emerged as a critical technique in the field of
autonomous navigation, providing robust and reliable motion estimation under
various environmental conditions. Despite its potential, the complex nature of
radar signals and the inherent challenges associated with processing them have
limited the widespread adoption of this technology. This paper aims to address
these challenges by proposing novel improvements to an existing method for
radar odometry estimation, designed to enhance accuracy and reliability in
diverse scenarios. Our pipeline consists of filtering, motion compensation,
oriented surface point computation, smoothing, one-to-many radar scan
registration, and pose refinement. The developed method enforces a local
understanding of the scene by adding information through smoothing techniques
and by aligning consecutive scans as a refinement step after the one-to-many
registration. We present an in-depth investigation of the contribution of each
improvement to the localization accuracy, and we benchmark our system on the
sequences of the main datasets for radar understanding, i.e., the Oxford Radar
RobotCar, MulRan, and Boreas datasets. The proposed pipeline achieves superior
results in all the scenarios considered, including under harsh environmental
constraints.
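A hedged sketch of the one-to-many registration followed by the pose-refining alignment of consecutive scans, with Open3D's ICP standing in for the paper's registration machinery; the distances, aggregation strategy, and point-to-point estimator are illustrative assumptions.

```python
# Hedged sketch: align the current scan against an aggregate of past
# scans (one-to-many), then refine against the latest scan.
import numpy as np
import open3d as o3d

def one_to_many_register(current, history, init=np.eye(4), max_dist=2.0):
    """history: past scans already expressed in the map frame."""
    target = o3d.geometry.PointCloud()
    for scan in history:          # aggregate the local submap
        target += scan
    coarse = o3d.pipelines.registration.registration_icp(
        current, target, max_dist, init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    # pose refinement: one-to-one alignment against the latest scan
    fine = o3d.pipelines.registration.registration_icp(
        current, history[-1], max_dist / 2, coarse.transformation,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return fine.transformation
```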
On the precision of 6 DoF IMU-LiDAR based localization in GNSS-denied scenarios
Positioning and navigation represent relevant topics in the field of robotics, due to their multiple applications in real-world scenarios, ranging from autonomous driving to harsh-environment exploration. Although localization in outdoor environments is generally achieved using a Global Navigation Satellite System (GNSS) receiver, GNSS-denied environments are typical of many situations, especially indoors. Autonomous robots are commonly equipped with multiple sensors, including laser rangefinders, IMUs, and odometers, which can be used for mapping and localization, overcoming the need for GNSS data. In the literature, almost no information can be found on the positioning accuracy and precision of 6 Degrees of Freedom (DoF) Light Detection and Ranging (LiDAR) localization systems, especially in real-world scenarios. In this paper, we present a short review of state-of-the-art LiDAR localization methods in GNSS-denied environments, highlighting their advantages and disadvantages. Then, we evaluate two state-of-the-art Simultaneous Localization and Mapping (SLAM) systems that can also perform localization, one of which we implemented. We benchmark these two algorithms on a manually collected dataset, with the goal of providing insight into their attainable precision in real-world scenarios. In particular, we present two experimental campaigns, one indoor and one outdoor, to measure the precision of these algorithms. After creating a map for each of the two environments, using the SLAM part of the systems, we compute a custom localization error for multiple, different trajectories. Results show that the two algorithms are comparable in terms of precision, with similar mean translation and rotation errors of about 0.01 m and 0.6 degrees, respectively. Nevertheless, the system we implemented has the advantage of being modular, customizable, and able to achieve real-time performance.
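A minimal sketch of the kind of localization-error metric such evaluations rely on: mean translation and rotation errors computed from relative poses between estimated and reference 4x4 transforms. The paper's custom error may differ in detail; this is an illustrative reconstruction.

```python
# Hedged sketch: mean translation [m] and rotation [deg] error between
# estimated and reference poses given as 4x4 homogeneous matrices.
import numpy as np

def pose_errors(estimated, reference):
    """Return (mean translation error, mean rotation error)."""
    t_err, r_err = [], []
    for T_est, T_ref in zip(estimated, reference):
        delta = np.linalg.inv(T_ref) @ T_est   # relative pose error
        t_err.append(np.linalg.norm(delta[:3, 3]))
        # rotation angle from the trace of the relative rotation
        cos_a = (np.trace(delta[:3, :3]) - 1.0) / 2.0
        r_err.append(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
    return float(np.mean(t_err)), float(np.mean(r_err))
```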
On the utility and protection of optimization with differential privacy and classic regularization techniques
Nowadays, owners and developers of deep learning models must consider
stringent privacy-preservation rules for their training data, which is usually
crowd-sourced and may retain sensitive information. The most widely adopted
method to enforce privacy guarantees of a deep learning model currently relies
on optimization techniques enforcing differential privacy. According to the
literature, this approach has proven to be a successful defence against
several privacy attacks on models, but its downside is a substantial
degradation of the models' performance. In this work, we compare the
effectiveness of the differentially-private stochastic gradient descent
(DP-SGD) algorithm against standard optimization practices with regularization
techniques. We analyze the resulting models' utility, training performance,
and the effectiveness of membership inference and model inversion attacks
against the learned models. Finally, we discuss differential privacy's flaws
and limits and empirically demonstrate the often superior privacy-preserving
properties of dropout and L2 regularization.
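The two training regimes being compared can be sketched as follows, with the Opacus library standing in for a DP-SGD implementation and plain SGD with dropout and weight decay as the regularized baseline; the model, the hyperparameters, and the choice of Opacus itself are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: DP-SGD via Opacus vs. SGD with dropout and an L2
# penalty (weight decay); all hyperparameters are assumed values.
import torch
from torch import nn
from opacus import PrivacyEngine

def make_model(p_drop=0.0):
    return nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                         nn.Dropout(p_drop), nn.Linear(256, 10))

def dp_setup(train_loader):
    model = make_model()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    engine = PrivacyEngine()
    model, optimizer, train_loader = engine.make_private(
        module=model, optimizer=optimizer, data_loader=train_loader,
        noise_multiplier=1.1,   # privacy/utility trade-off knob
        max_grad_norm=1.0)      # per-sample gradient clipping bound
    return model, optimizer, train_loader

def baseline_setup():
    model = make_model(p_drop=0.5)                 # dropout
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                                weight_decay=1e-4)  # L2 regularization
    return model, optimizer
```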
Federated Survival Forests
Survival analysis is a subfield of statistics concerned with modeling the occurrence time of a particular event of interest for a population. Survival analysis has found widespread applications in healthcare, engineering, and the social sciences. However, real-world applications involve survival datasets that are distributed, incomplete, censored, and confidential. In this context, federated learning can tremendously improve the performance of survival analysis applications. Federated learning provides a set of privacy-preserving techniques to jointly train machine learning models on multiple datasets without compromising user privacy, leading to better generalization performance. However, despite the widespread development of federated learning in recent AI research, few studies focus on federated survival analysis. In this work, we present a novel federated algorithm for survival analysis based on one of the most successful survival models, the random survival forest. We call the proposed method Federated Survival Forest (FedSurF). With a single communication round, FedSurF obtains a discriminative power comparable to deep-learning-based federated models trained over hundreds of federated iterations. Moreover, FedSurF retains all the advantages of random forests, namely low computational cost and natural handling of missing values and incomplete datasets. These advantages are especially desirable in real-world federated environments with multiple small datasets stored on devices with low computational capabilities. Numerical experiments compare FedSurF with state-of-the-art survival models in federated networks, showing how FedSurF outperforms deep-learning-based federated algorithms in realistic environments with non-identically distributed data.
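The single-communication-round idea can be sketched as follows, with scikit-survival standing in for each client's local learner and uniform tree sampling at the server; the paper also considers performance-based sampling, and merging trees this way assumes the clients' forests are mutually compatible (e.g., shared event-time grids), so treat this purely as an illustration.

```python
# Hedged sketch of the FedSurF idea: clients fit local random survival
# forests; the server samples trees from them into one ensemble.
import random
from sksurv.ensemble import RandomSurvivalForest

def client_fit(X, y, n_trees=100):
    """y is a structured array of (event indicator, observed time)."""
    forest = RandomSurvivalForest(n_estimators=n_trees)
    return forest.fit(X, y)

def server_aggregate(client_forests, ensemble_size=100):
    """Uniformly sample trees from all clients (one round of messages)."""
    pool = [t for f in client_forests for t in f.estimators_]
    merged = client_forests[0]                 # reuse one forest as shell
    merged.estimators_ = random.sample(pool, ensemble_size)
    merged.n_estimators = ensemble_size
    return merged                              # the federated ensemble
```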
Facetwise Mesh Refinement for Multi-View Stereo
Mesh refinement is a fundamental step for accurate Multi-View Stereo. It
modifies the geometry of an initial manifold mesh to minimize the photometric
error induced in a set of camera pairs. This initial mesh is usually the
output of volumetric 3D reconstruction based on min-cut over Delaunay
Triangulations. Such methods produce a significant number of non-manifold
vertices; therefore, they require a vertex-split step to explicitly repair
them. In this paper, we extend this method to preemptively fix the
non-manifold vertices by reasoning directly on the Delaunay Triangulation,
avoiding most vertex splits. The main contribution of this paper addresses the
problem of choosing the camera pairs adopted by the refinement process. We
treat the problem as a mesh-labeling process, where each label corresponds to
a camera pair. Differently from state-of-the-art methods, which use each
camera pair to refine all the visible parts of the mesh, we choose, for each
facet, the best pair that enforces both overall visibility and coverage. The
refinement step is applied to each facet using only the selected camera pair.
This facetwise refinement helps the
process to be applied as evenly as possible.
Comment: Accepted as an oral at ICPR 2020
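The facet-labeling idea can be sketched as follows: score every camera pair per facet on visibility and coverage, pick the argmax as the facet's label, and refine each facet with its selected pair only; the scoring proxy, the weights, and the function signatures are illustrative assumptions rather than the paper's formulation.

```python
# Hedged sketch: per-facet camera-pair selection as a labeling step,
# followed by facetwise refinement with the selected pair only.
import numpy as np

def label_facets(visibility, coverage, alpha=0.5):
    """visibility, coverage: (F, P) per-facet scores of each camera pair.

    Returns a (F,) array mapping each facet to its selected pair label.
    """
    score = alpha * visibility + (1.0 - alpha) * coverage
    return np.argmax(score, axis=1)

def refine(mesh_facets, labels, camera_pairs, refine_step):
    """Apply photometric refinement facet-by-facet, one pair per facet."""
    for f, facet in enumerate(mesh_facets):
        left, right = camera_pairs[labels[f]]
        refine_step(facet, left, right)
```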