Vision Based Collaborative Localization and Path Planning for Micro Aerial Vehicles
Autonomous micro aerial vehicles (MAVs) have gained immense popularity in both the commercial and research worlds over the last few years. Due to their small size and agility, MAVs are considered to have great potential for civil and industrial tasks such as photography, search and rescue, exploration, inspection and surveillance. Autonomy on MAVs usually involves solving the major problems of localization and path planning. While GPS is a popular localization choice for many MAV platforms today, it suffers from issues such as inaccurate estimation around large structures and complete unavailability in remote areas or indoor scenarios. Among the alternative sensing mechanisms, cameras are an attractive choice of onboard sensor due to the richness of the information they capture, along with their small size and low cost. Another consideration for micro aerial vehicles is that these small platforms cannot fly for long periods of time or carry heavy payloads, limitations that can be addressed by allocating a group, or swarm, of MAVs to a task rather than a single vehicle. Collaboration between multiple vehicles allows for better estimation accuracy, task distribution and mission efficiency. Combining these rationales, this dissertation presents collaborative vision-based localization and path planning frameworks. Although these were created as two separate steps, the ideal application would contain both of them as a loosely coupled localization and planning algorithm. A forward-facing monocular camera onboard each MAV is considered as the sole sensor for computing pose estimates. With this minimal setup, this dissertation first investigates methods to perform feature-based localization, with the possibility of fusing two types of localization data: one that is computed onboard each MAV, and the other that comes from relative measurements between the vehicles. Feature-based methods were preferred over direct methods for vision because of the relative ease with which tangible data packets can be transferred between vehicles, and because feature data allows for minimal data transfer compared to full images. Inspired by techniques from multiple view geometry and structure from motion, this localization algorithm presents a decentralized, full 6-degree-of-freedom pose estimation method, complete with a consistent fusion methodology that obtains robust estimates only at discrete instants and thus does not require constant communication between vehicles. This method was validated on image data obtained from high-fidelity simulations as well as real-life MAV tests. These vision-based collaborative constraints were also applied to the problem of path planning, with a focus on uncertainty-aware planning, where the algorithm is responsible not only for generating a valid, collision-free path, but also for making sure that this path allows for successful localization throughout. As joint multi-robot planning can be computationally intractable, planning was divided into two steps from a vision-aware perspective. Because the first step toward improving localization performance is having access to a better map of features, a next-best-multi-view algorithm was developed that computes the viewpoints for multiple vehicles that best improve an existing sparse reconstruction.
This algorithm uses a cost function built on vision-based heuristics that rates the quality of the images expected from any set of viewpoints; this cost is minimized through Covariance Matrix Adaptation Evolution Strategy (CMA-ES), an efficient evolutionary method that can handle very high-dimensional search spaces (a sketch of this step follows below). In the second step, a sampling-based planner called Vision-Aware RRT* (VA-RRT*) was developed, which applies similar vision heuristics in an information-gain-based framework in order to drive individual vehicles towards areas that benefit feature tracking and thus localization. Both steps of the planning framework were tested and validated using results from simulation.
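The dissertation's exact heuristics are not given in this abstract, but the optimization step can be sketched. Below is a minimal, hypothetical Python example using the open-source cma package: the candidate viewpoints of all vehicles are stacked into one high-dimensional vector and CMA-ES minimizes a stand-in cost with simple standoff and attitude terms. view_cost, the placeholder feature map, and all dimensions are illustrative assumptions, not the actual cost function.

import numpy as np
import cma  # pip install cma

NUM_MAVS = 3   # number of vehicles planned jointly
POSE_DIM = 6   # x, y, z, roll, pitch, yaw per vehicle

def view_cost(x, sparse_map):
    """Score one joint set of viewpoints (lower is better)."""
    poses = x.reshape(NUM_MAVS, POSE_DIM)
    cost = 0.0
    for p in poses:
        # Hypothetical heuristics: prefer a ~2 m standoff from the nearest
        # mapped feature and penalize aggressive attitudes.
        dists = np.linalg.norm(sparse_map - p[:3], axis=1)
        cost += abs(dists.min() - 2.0)
        cost += 0.1 * np.sum(p[3:] ** 2)
    return cost

sparse_map = np.random.rand(200, 3) * 10.0   # placeholder sparse feature map
x0 = np.zeros(NUM_MAVS * POSE_DIM)           # initial joint viewpoint guess

es = cma.CMAEvolutionStrategy(x0, 1.0, {'maxiter': 100})
while not es.stop():
    candidates = es.ask()                    # sample joint viewpoint vectors
    es.tell(candidates, [view_cost(c, sparse_map) for c in candidates])
best_views = es.result.xbest.reshape(NUM_MAVS, POSE_DIM)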
ChatGPT for Robotics: Design Principles and Model Abilities
This paper presents an experimental study regarding the use of OpenAI's
ChatGPT for robotics applications. We outline a strategy that combines design
principles for prompt engineering and the creation of a high-level function
library which allows ChatGPT to adapt to different robotics tasks, simulators,
and form factors. We focus our evaluations on the effectiveness of different
prompt engineering techniques and dialog strategies towards the execution of
various types of robotics tasks. We explore ChatGPT's ability to use free-form
dialog, parse XML tags, and synthesize code, in addition to the use of
task-specific prompting functions and closed-loop reasoning through dialogues.
Our study encompasses a range of tasks within the robotics domain, from basic
logical, geometrical, and mathematical reasoning all the way to complex domains
such as aerial navigation, manipulation, and embodied agents. We show that
ChatGPT can be effective at solving several such tasks, while allowing users
to interact with it primarily via natural language instructions. In addition to
these studies, we introduce an open-source research tool called PromptCraft,
which contains a platform where researchers can collaboratively upload and vote
on examples of good prompting schemes for robotics applications, as well as a
sample robotics simulator with ChatGPT integration, making it easier for users
to get started with using ChatGPT for robotics.
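To make the function-library idea concrete, here is a minimal, hypothetical sketch: low-level robot or simulator calls are wrapped behind descriptive high-level functions, the prompt describes that API to the model, and the model answers with code composed from it. The drone functions and prompt text are illustrative assumptions, not the paper's actual library; the client usage follows the openai Python package (v1.x) and assumes OPENAI_API_KEY is set in the environment.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical high-level API exposed to the model; not the paper's library.
ROBOT_API_PROMPT = """
You control a drone through these Python functions:
    get_drone_position() -> (x, y, z)   # current position in meters
    fly_to(x, y, z)                     # fly to an absolute position
    take_picture() -> str               # capture an image, return a file path
Respond only with Python code that calls these functions.
"""

def ask_robot(task: str) -> str:
    """Send the task plus the API description to the model; return its code."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": ROBOT_API_PROMPT},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

# The generated code would be reviewed by a human before being executed
# against the robot or simulator, with corrections fed back via dialog.
print(ask_robot("Take a picture of the top of the wind turbine at (10, 0, 50)."))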
Masked Autoencoders for Egocentric Video Understanding @ Ego4D Challenge 2022
In this report, we present our approach and empirical results of applying
masked autoencoders to two egocentric video understanding tasks of the Ego4D
Challenge 2022, namely Object State Change Classification and PNR Temporal
Localization. As team TheSSVL, we ranked 2nd in both tasks. Our code will be made
available.
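For context, the masked-autoencoder idea the report builds on can be sketched in a few lines: mask a large fraction of video patches, pass only the visible ones through the encoder, and train a decoder to reconstruct the rest. The shapes and mask ratio below are illustrative assumptions, not the team's actual model.

import torch

def random_patch_mask(num_patches: int, mask_ratio: float = 0.9):
    """Return indices of visible and masked patches for one video clip."""
    num_visible = int(num_patches * (1.0 - mask_ratio))
    perm = torch.randperm(num_patches)
    return perm[:num_visible], perm[num_visible:]  # visible, masked

patches = torch.randn(1568, 768)        # e.g. 8x14x14 video tubelets, 768-dim
visible_idx, masked_idx = random_patch_mask(patches.shape[0])
visible_tokens = patches[visible_idx]   # only these go through the encoder;
                                        # the decoder reconstructs masked_idx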
Learning to Simulate Realistic LiDARs
Simulating realistic sensors is a challenging part in data generation for
autonomous systems, often involving carefully handcrafted sensor design, scene
properties, and physics modeling. To alleviate this, we introduce a pipeline
for data-driven simulation of a realistic LiDAR sensor. We propose a model that
learns a mapping between RGB images and corresponding LiDAR features such as
raydrop or per-point intensities directly from real datasets. We show that our
model can learn to encode realistic effects such as dropped points on
transparent surfaces or high intensity returns on reflective materials. When
applied to naively raycasted point clouds provided by off-the-shelf simulator
software, our model enhances the data by predicting intensities and removing
points based on the scene's appearance to match a real LiDAR sensor. We use our
technique to learn models of two distinct LiDAR sensors and use them to improve
simulated LiDAR data accordingly. Through a sample task of vehicle
segmentation, we show that enhancing simulated point clouds with our technique
improves downstream task performance.
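The enhancement step can be illustrated with a minimal, hypothetical PyTorch sketch. It assumes a trained network, lidar_model, that maps per-point image features to a raydrop probability and an intensity; names, shapes, and the drop threshold are illustrative assumptions rather than the paper's actual interface.

import torch

def enhance_point_cloud(points, rgb_features, lidar_model, drop_threshold=0.5):
    """Enhance a naively raycasted point cloud to mimic a real LiDAR.

    points:       (N, 3) tensor of raycasted XYZ points from the simulator
    rgb_features: (N, C) tensor of image features sampled where each point
                  projects into the camera image
    Returns surviving points with a predicted per-point intensity column.
    """
    with torch.no_grad():
        raydrop_prob, intensity = lidar_model(rgb_features)  # (N,), (N,)

    # Remove points the model predicts a real sensor would not return,
    # e.g. rays hitting transparent surfaces.
    keep = raydrop_prob < drop_threshold
    enhanced = torch.cat(
        [points[keep], intensity[keep].unsqueeze(1)], dim=1
    )  # (M, 4): x, y, z, intensity
    return enhanced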