250 research outputs found

    Multi-Robot Relative Pose Estimation in SE(2) with Observability Analysis: A Comparison of Extended Kalman Filtering and Robust Pose Graph Optimization

    Full text link
    In this study, we address multi-robot localization issues, with a specific focus on cooperative localization and observability analysis of relative pose estimation. Cooperative localization involves enhancing each robot's information through a communication network and message passing. If odometry data from a target robot can be transmitted to the ego robot, observability of their relative pose estimation can be achieved through range-only or bearing-only measurements, provided both robots have non-zero linear velocities. In cases where odometry data from a target robot are not directly transmitted but estimated by the ego robot, both range and bearing measurements are necessary to ensure observability of relative pose estimation. For ROS/Gazebo simulations, we explore four sensing and communication structures. We compare extended Kalman filtering (EKF) and pose graph optimization (PGO) estimation using different robust loss functions (filtering and smoothing with varying batch sizes of sliding windows) in terms of estimation accuracy. In hardware experiments, two Turtlebot3 equipped with UWB modules are used for real-world inter-robot relative pose estimation, applying both EKF and PGO and comparing their performance.Comment: 20 pages, 21 figure

    Learning to Rank: Online Learning, Statistical Theory and Applications.

    Full text link
    Learning to rank is a supervised machine learning problem, where the output space is the special structured space of emph{permutations}. Learning to rank has diverse application areas, spanning information retrieval, recommendation systems, computational biology and others. In this dissertation, we make contributions to some of the exciting directions of research in learning to rank. In the first part, we extend the classic, online perceptron algorithm for classification to learning to rank, giving a loss bound which is reminiscent of Novikoff's famous convergence theorem for classification. In the second part, we give strategies for learning ranking functions in an online setting, with a novel, feedback model, where feedback is restricted to labels of top ranked items. The second part of our work is divided into two sub-parts; one without side information and one with side information. In the third part, we provide novel generalization error bounds for algorithms applied to various Lipschitz and/or smooth ranking surrogates. In the last part, we apply ranking losses to learn policies for personalized advertisement recommendations, partially overcoming the problem of click sparsity. We conduct experiments on various simulated and commercial datasets, comparing our strategies with baseline strategies for online learning to rank and personalized advertisement recommendation.PhDStatisticsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/133334/1/sougata_1.pd

    Consistent Right-Invariant Fixed-Lag Smoother with Application to Visual Inertial SLAM

    Full text link
    State estimation problems without absolute position measurements routinely arise in navigation of unmanned aerial vehicles, autonomous ground vehicles, etc., whose proper operation relies on accurate state estimates and reliable covariances. Unaware of absolute positions, these problems have immanent unobservable directions. Traditional causal estimators, however, usually gain spurious information on the unobservable directions, leading to over-confident covariance inconsistent with actual estimator errors. The consistency problem of fixed-lag smoothers (FLSs) has only been attacked by the first estimate Jacobian (FEJ) technique because of the complexity to analyze their observability property. But the FEJ has several drawbacks hampering its wide adoption. To ensure the consistency of a FLS, this paper introduces the right invariant error formulation into the FLS framework. To our knowledge, we are the first to analyze the observability of a FLS with the right invariant error. Our main contributions are twofold. As the first novelty, to bypass the complexity of analysis with the classic observability matrix, we show that observability analysis of FLSs can be done equivalently on the linearized system. Second, we prove that the inconsistency issue in the traditional FLS can be elegantly solved by the right invariant error formulation without artificially correcting Jacobians. By applying the proposed FLS to the monocular visual inertial simultaneous localization and mapping (SLAM) problem, we confirm that the method consistently estimates covariance similarly to a batch smoother in simulation and that our method achieved comparable accuracy as traditional FLSs on real data.Comment: 13 pages, 4 figures, AAAI 2021 Conferenc

    Towards Visual Localization, Mapping and Moving Objects Tracking by a Mobile Robot: a Geometric and Probabilistic Approach

    Get PDF
    Dans cette thĂšse, nous rĂ©solvons le problĂšme de reconstruire simultanĂ©ment une reprĂ©sentation de la gĂ©omĂ©trie du monde, de la trajectoire de l'observateur, et de la trajectoire des objets mobiles, Ă  l'aide de la vision. Nous divisons le problĂšme en trois Ă©tapes : D'abord, nous donnons une solution au problĂšme de la cartographie et localisation simultanĂ©es pour la vision monoculaire qui fonctionne dans les situations les moins bien conditionnĂ©es gĂ©omĂ©triquement. Ensuite, nous incorporons l'observabilitĂ© 3D instantanĂ©e en dupliquant le matĂ©riel de vision avec traitement monoculaire. Ceci Ă©limine les inconvĂ©nients inhĂ©rents aux systĂšmes stĂ©rĂ©o classiques. Nous ajoutons enfin la dĂ©tection et suivi des objets mobiles proches en nous servant de cette observabilitĂ© 3D. Nous choisissons une reprĂ©sentation Ă©parse et ponctuelle du monde et ses objets. La charge calculatoire des algorithmes de perception est allĂ©gĂ©e en focalisant activement l'attention aux rĂ©gions de l'image avec plus d'intĂ©rĂȘt. ABSTRACT : In this thesis we give new means for a machine to understand complex and dynamic visual scenes in real time. In particular, we solve the problem of simultaneously reconstructing a certain representation of the world's geometry, the observer's trajectory, and the moving objects' structures and trajectories, with the aid of vision exteroceptive sensors. We proceeded by dividing the problem into three main steps: First, we give a solution to the Simultaneous Localization And Mapping problem (SLAM) for monocular vision that is able to adequately perform in the most ill-conditioned situations: those where the observer approaches the scene in straight line. Second, we incorporate full 3D instantaneous observability by duplicating vision hardware with monocular algorithms. This permits us to avoid some of the inherent drawbacks of classic stereo systems, notably their limited range of 3D observability and the necessity of frequent mechanical calibration. Third, we add detection and tracking of moving objects by making use of this full 3D observability, whose necessity we judge almost inevitable. We choose a sparse, punctual representation of both the world and the moving objects in order to alleviate the computational payload of the image processing algorithms, which are required to extract the necessary geometrical information out of the images. This alleviation is additionally supported by active feature detection and search mechanisms which focus the attention to those image regions with the highest interest. This focusing is achieved by an extensive exploitation of the current knowledge available on the system (all the mapped information), something that we finally highlight to be the ultimate key to success

    Learned Monocular Depth Priors in Visual-Inertial Initialization

    Full text link
    Visual-inertial odometry (VIO) is the pose estimation backbone for most AR/VR and autonomous robotic systems today, in both academia and industry. However, these systems are highly sensitive to the initialization of key parameters such as sensor biases, gravity direction, and metric scale. In practical scenarios where high-parallax or variable acceleration assumptions are rarely met (e.g. hovering aerial robot, smartphone AR user not gesticulating with phone), classical visual-inertial initialization formulations often become ill-conditioned and/or fail to meaningfully converge. In this paper we target visual-inertial initialization specifically for these low-excitation scenarios critical to in-the-wild usage. We propose to circumvent the limitations of classical visual-inertial structure-from-motion (SfM) initialization by incorporating a new learning-based measurement as a higher-level input. We leverage learned monocular depth images (mono-depth) to constrain the relative depth of features, and upgrade the mono-depth to metric scale by jointly optimizing for its scale and shift. Our experiments show a significant improvement in problem conditioning compared to a classical formulation for visual-inertial initialization, and demonstrate significant accuracy and robustness improvements relative to the state-of-the-art on public benchmarks, particularly under motion-restricted scenarios. We further extend this improvement to implementation within an existing odometry system to illustrate the impact of our improved initialization method on resulting tracking trajectories

    Egocentric Planning for Scalable Embodied Task Achievement

    Full text link
    Embodied agents face significant challenges when tasked with performing actions in diverse environments, particularly in generalizing across object types and executing suitable actions to accomplish tasks. Furthermore, agents should exhibit robustness, minimizing the execution of illegal actions. In this work, we present Egocentric Planning, an innovative approach that combines symbolic planning and Object-oriented POMDPs to solve tasks in complex environments, harnessing existing models for visual perception and natural language processing. We evaluated our approach in ALFRED, a simulated environment designed for domestic tasks, and demonstrated its high scalability, achieving an impressive 36.07% unseen success rate in the ALFRED benchmark and winning the ALFRED challenge at CVPR Embodied AI workshop. Our method requires reliable perception and the specification or learning of a symbolic description of the preconditions and effects of the agent's actions, as well as what object types reveal information about others. It is capable of naturally scaling to solve new tasks beyond ALFRED, as long as they can be solved using the available skills. This work offers a solid baseline for studying end-to-end and hybrid methods that aim to generalize to new tasks, including recent approaches relying on LLMs, but often struggle to scale to long sequences of actions or produce robust plans for novel tasks
    • 

    corecore