114 research outputs found
Occlusion-Robust MVO: Multimotion Estimation Through Occlusion Via Motion Closure
Visual motion estimation is an integral and well-studied challenge in
autonomous navigation. Recent work has focused on addressing multimotion
estimation, which is especially challenging in highly dynamic environments.
Such environments not only comprise multiple, complex motions but also tend to
exhibit significant occlusion.
Previous work in object tracking focuses on maintaining the integrity of
object tracks but usually relies on specific appearance-based descriptors or
constrained motion models. These approaches are very effective in specific
applications but do not generalize to the full multimotion estimation problem.
This paper presents a pipeline for estimating multiple motions, including the
camera egomotion, in the presence of occlusions. This approach uses an
expressive motion prior to estimate the SE (3) trajectory of every motion in
the scene, even during temporary occlusions, and identify the reappearance of
motions through motion closure. The performance of this occlusion-robust
multimotion visual odometry (MVO) pipeline is evaluated on real-world data and
the Oxford Multimotion Dataset.Comment: To appear at the 2020 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS). An earlier version of this work first
appeared at the Long-term Human Motion Planning Workshop (ICRA 2019). 8
pages, 5 figures. Video available at
https://www.youtube.com/watch?v=o_N71AA6FR
Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
We propose a novel method to fit and segment multi-structural data via convex
relaxation. Unlike greedy methods --which maximise the number of inliers-- this
approach efficiently searches for a soft assignment of points to models by
minimising the energy of the overall classification. Our approach is similar to
state-of-the-art energy minimisation techniques which use a global energy.
However, we deal with the scaling factor (as the number of models increases) of
the original combinatorial problem by relaxing the solution. This relaxation
brings two advantages: first, by operating in the continuous domain we can
parallelize the calculations. Second, it allows for the use of different
metrics which results in a more general formulation.
We demonstrate the versatility of our technique on two different problems of
estimating structure from images: plane extraction from RGB-D data and
homography estimation from pairs of images. In both cases, we report accurate
results on publicly available datasets, in most of the cases outperforming the
state-of-the-art
Multiple structure recovery with maximum coverage
We present a general framework for geometric model fitting based on a set coverage formulation that caters for intersecting structures and outliers in a simple and principled manner. The multi-model fitting problem is formulated in terms of the optimization of a consensus-based global cost function, which allows to sidestep the pitfalls of preference approaches based on clustering and to avoid the difficult trade-off between data fidelity and complexity of other optimization formulations. Two especially appealing characteristics of this method are the ease with which it can be implemented and its modularity with respect to the solver and to the sampling strategy. Few intelligible parameters need to be set and tuned, namely the inlier threshold and the number of desired models. The summary of the experiments is that our method compares favourably with its competitors overall, and it is always either the best performer or almost on par with the best performer in specific scenarios
Multiple structure recovery via robust preference analysis
2noThis paper address the extraction of multiple models from outlier-contaminated data by exploiting preference analysis and low rank approximation. First points are represented in the preference space, then Robust PCA (Principal Component Analysis) and Symmetric NMF (Non negative Matrix Factorization) are used to break the multi-model fitting problem into many single-model problems, which in turn are tackled with an approach inspired to MSAC (M-estimator SAmple Consensus) coupled with a model-specific scale estimate. Experimental validation on public, real data-sets demonstrates that our method compares favorably with the state of the art.openopenMagri, Luca; Fusiello, AndreaMagri, Luca; Fusiello, Andre
Accurate Calibration of Multi-LiDAR-Multi-Camera Systems
As autonomous driving attracts more and more attention these days, the algorithms and sensors used for machine perception become popular in research, as well. This paper investigates the extrinsic calibration of two frequently-applied sensors: the camera and Light Detection and Ranging (LiDAR). The calibration can be done with the help of ordinary boxes. It contains an iterative refinement step, which is proven to converge to the box in the LiDAR point cloud, and can be used for system calibration containing multiple LiDARs and cameras. For that purpose, a bundle adjustment-like minimization is also presented. The accuracy of the method is evaluated on both synthetic and real-world data, outperforming the state-of-the-art techniques. The method is general in the sense that it is both LiDAR and camera-type independent, and only the intrinsic camera parameters have to be known. Finally, a method for determining the 2D bounding box of the car chassis from LiDAR point clouds is also presented in order to determine the car body border with respect to the calibrated sensors
Human Motion Diffusion as a Generative Prior
Recent work has demonstrated the significant potential of denoising diffusion
models for generating human motion, including text-to-motion capabilities.
However, these methods are restricted by the paucity of annotated motion data,
a focus on single-person motions, and a lack of detailed control. In this
paper, we introduce three forms of composition based on diffusion priors:
sequential, parallel, and model composition. Using sequential composition, we
tackle the challenge of long sequence generation. We introduce DoubleTake, an
inference-time method with which we generate long animations consisting of
sequences of prompted intervals and their transitions, using a prior trained
only for short clips. Using parallel composition, we show promising steps
toward two-person generation. Beginning with two fixed priors as well as a few
two-person training examples, we learn a slim communication block, ComMDM, to
coordinate interaction between the two resulting motions. Lastly, using model
composition, we first train individual priors to complete motions that realize
a prescribed motion for a given joint. We then introduce DiffusionBlending, an
interpolation mechanism to effectively blend several such models to enable
flexible and efficient fine-grained joint and trajectory-level control and
editing. We evaluate the composition methods using an off-the-shelf motion
diffusion model, and further compare the results to dedicated models trained
for these specific tasks
Advances in Graph-Cut Optimization: Multi-Surface Models, Label Costs, and Hierarchical Costs
Computer vision is full of problems that are elegantly expressed in terms of mathematical optimization, or energy minimization. This is particularly true of low-level inference problems such as cleaning up noisy signals, clustering and classifying data, or estimating 3D points from images. Energies let us state each problem as a clear, precise objective function. Minimizing the correct energy would, hypothetically, yield a good solution to the corresponding problem. Unfortunately, even for low-level problems we are confronted by energies that are computationally hard—often NP-hard—to minimize. As a consequence, a rather large portion of computer vision research is dedicated to proposing better energies and better algorithms for energies. This dissertation presents work along the same line, specifically new energies and algorithms based on graph cuts.
We present three distinct contributions. First we consider biomedical segmentation where the object of interest comprises multiple distinct regions of uncertain shape (e.g. blood vessels, airways, bone tissue). We show that this common yet difficult scenario can be modeled as an energy over multiple interacting surfaces, and can be globally optimized by a single graph cut. Second, we introduce multi-label energies with label costs and provide algorithms to minimize them. We show how label costs are useful for clustering and robust estimation problems in vision. Third, we characterize a class of energies with hierarchical costs and propose a novel hierarchical fusion algorithm with improved approximation guarantees. Hierarchical costs are natural for modeling an array of difficult problems, e.g. segmentation with hierarchical context, simultaneous estimation of motions and homographies, or detecting hierarchies of patterns
- …