Auto-TransRL: Autonomous Composition of Vision Pipelines for Robotic Perception
Creating a vision pipeline for different datasets to solve a computer vision
task is a complex and time-consuming process. Currently, these pipelines are
developed with the help of domain experts, and there is no systematic way to
construct a vision pipeline beyond relying on experience, trial and error, or
template-based approaches. Because the search space of suitable algorithms for
a given vision task is large, human exploration for a good solution requires
considerable time and effort. To address these issues, we propose a dynamic,
data-driven way to identify an appropriate set of algorithms for building a
vision pipeline that achieves the goal task. We introduce a Transformer
architecture complemented with deep reinforcement learning to recommend
algorithms that can be incorporated at different stages of the vision workflow.
The system is both robust and adaptive to dynamic changes in the environment.
Experimental results further show that our method generalizes well to
recommend algorithms not seen during training, alleviating the need to retrain
the system on a new set of algorithms introduced at test time.
Comment: Presented at the IEEE ICRA 2022 Workshop in Robotic Perception and
Mapping: Emerging Technique
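The core idea of the abstract above, selecting one algorithm per stage of a vision workflow from a large candidate space, can be sketched in a few lines. This is an illustrative toy only: the stage names, candidate algorithms, and scores below are hypothetical placeholders, whereas in the paper the per-algorithm scores come from a Transformer policy trained with deep reinforcement learning.

```python
# Toy sketch of stage-wise algorithm selection for a vision pipeline.
# All names and scores are hypothetical; the actual system scores
# candidates with a Transformer policy trained via deep RL.

PIPELINE_STAGES = ["preprocessing", "feature_extraction", "classification"]

# Placeholder candidate algorithms with placeholder policy scores.
CANDIDATES = {
    "preprocessing": {"gaussian_blur": 0.4, "histogram_eq": 0.7},
    "feature_extraction": {"sift": 0.5, "hog": 0.3, "orb": 0.6},
    "classification": {"svm": 0.8, "random_forest": 0.65},
}

def compose_pipeline(stages, candidates):
    """Greedily pick the highest-scoring algorithm at each stage."""
    return [max(candidates[s], key=candidates[s].get) for s in stages]

pipeline = compose_pipeline(PIPELINE_STAGES, CANDIDATES)
print(pipeline)  # ['histogram_eq', 'orb', 'svm']
```

A greedy per-stage argmax ignores interactions between stages; the paper's RL formulation exists precisely because the quality of one stage's choice depends on the others, which a sequential policy can capture.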
Concept-based Anomaly Detection in Retail Stores for Automatic Correction using Mobile Robots
Tracking of inventory and rearrangement of misplaced items are some of the
most labor-intensive tasks in a retail environment. While there have been
attempts at using vision-based techniques for these tasks, they mostly rely on
planogram compliance to detect anomalies, a technique that has been
found lacking in robustness and scalability. Moreover, existing systems rely on
human intervention to perform corrective actions after detection. In this
paper, we present Co-AD, a Concept-based Anomaly Detection approach using a
Vision Transformer (ViT) that is able to flag misplaced objects without using a
prior knowledge base such as a planogram. It uses an auto-encoder architecture
followed by outlier detection in the latent space. Co-AD achieves a peak
success rate of 89.90% on anomaly-detection image sets of retail objects drawn
from the RP2K dataset, compared to 80.81% for the best-performing baseline, a
standard ViT auto-encoder. To demonstrate its utility, we describe a robotic mobile
manipulation pipeline to autonomously correct the anomalies flagged by Co-AD.
This work is ultimately aimed at developing autonomous mobile robot
solutions that reduce the need for human intervention in retail store
management.
Comment: 8 pages, 9 figures, 2 tables, IEEE Transactions on Systems, Man and
Cybernetic
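The "outlier detection in the latent space" step mentioned above can be illustrated independently of the ViT auto-encoder. The sketch below uses synthetic latent vectors and a simple distance-from-mean threshold; this is an assumption-laden stand-in, not Co-AD's actual detector, whose latents come from encoding retail-shelf images.

```python
import numpy as np

# Sketch of latent-space outlier detection: flag latent vectors whose
# distance from the latent mean exceeds k standard deviations of the
# distance distribution. The "latents" here are synthetic; the real
# system encodes images with a ViT auto-encoder first.

def flag_outliers(latents, k=2.0):
    """Return a boolean mask marking latent vectors far from the mean."""
    mean = latents.mean(axis=0)
    dists = np.linalg.norm(latents - mean, axis=1)
    return dists > dists.mean() + k * dists.std()

rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(100, 8))  # in-distribution latents
anomaly = np.full((1, 8), 6.0)                # one "misplaced item" latent
mask = flag_outliers(np.vstack([normal, anomaly]))
print(bool(mask[-1]))  # True
```

A fixed k-sigma threshold is the simplest choice; density-based or quantile-based criteria are common alternatives when the latent distribution is multi-modal.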
A Wearable Robotic Forearm for Human-Robot Collaboration
191 pages

The idea of extending and augmenting the capabilities of the human body has been an enduring area of exploration in fiction, research, and industry alike. The most concrete realizations of this idea have been wearable devices such as prostheses and exoskeletons, which replace or enhance existing human functions. With recent advances in sensing, actuation, and materials technology, we are witnessing the advent of a new class of wearable robots: Supernumerary Robotic (SR) devices that provide additional degrees of freedom to a user, typically in the form of extra limbs or fingers. The development, analysis, and experimental evaluation of one such SR device, a Wearable Robotic Forearm (WRF) for close-range collaborative tasks, forms the focus of this dissertation.

We initiated the design process with a basic prototype mounted on a user's elbow, and conducted an online survey, a contextual inquiry at a construction site, and an in-person usability study to identify usage contexts and functions for such a device and to form guidelines for improving the design. In the next WRF prototype, we added two more degrees of freedom, expanding its reachable workspace volume while remaining within acceptable human ergonomic load limits. We then developed the final prototype based on further feedback from a pilot interaction study, and found an analytical solution for its inverse kinematics. Going beyond static analyses with predefined robot trajectories, we further addressed the biomechanical effects of wearing the WRF using a detailed musculoskeletal model, and developed a motion planner that minimizes loads on the user's muscles. Looking at the other side of the physical interaction between the user and the WRF, we applied human motion prediction and feedback control to stabilize the robot's end-effector position when subjected to disturbances from the wearer's body movements.
Finally, we conducted a user study involving a collaborative pick-and-place task with the WRF acting in two conditions: responding to direct speech commands from the wearer, and predicting human intent using supervised learning models. We evaluated the quality of interaction in the two conditions through human-robot fluency metrics. The WRF and its associated systems described in this dissertation have limitations, particularly in terms of ergonomics, feedback control performance, and fluency of interaction. However, as a prototype, the WRF shows that SR devices can be effective agents in human-robot collaboration when they possess capabilities for mutual adaptation while reducing the cognitive load on the user.
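The dissertation mentions an analytical solution to the WRF's inverse kinematics. As a simplified illustration of what a closed-form IK solution looks like, here is the standard elbow-down solution for a 2-link planar arm; this is a textbook stand-in, not the WRF's actual kinematic chain, and the link lengths are arbitrary.

```python
import math

# Closed-form inverse kinematics for a 2-link planar arm (link lengths
# l1, l2), shown as a generic illustration of analytical IK. The WRF's
# real chain has more degrees of freedom.

def two_link_ik(x, y, l1, l2):
    """Return one (theta1, theta2) solution reaching (x, y), elbow-down."""
    d2 = x * x + y * y
    cos_t2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        raise ValueError("target out of reach")
    t2 = math.acos(cos_t2)
    t1 = math.atan2(y, x) - math.atan2(l2 * math.sin(t2),
                                       l1 + l2 * math.cos(t2))
    return t1, t2

# Forward-kinematics check: the angles should reproduce the target.
t1, t2 = two_link_ik(1.0, 1.0, 1.0, 1.0)
fx = math.cos(t1) + math.cos(t1 + t2)
fy = math.sin(t1) + math.sin(t1 + t2)
print(round(fx, 6), round(fy, 6))  # 1.0 1.0
```

The virtue of an analytical solution over numerical IK, and presumably why the dissertation derives one, is that it is exact and constant-time, which matters for real-time control on a wearable device.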
The Wearable Robotic Forearm: Design and Predictive Control of a Collaborative Supernumerary Robot
This article presents the design process of a supernumerary wearable robotic forearm (WRF), along with methods for stabilizing the robot’s end-effector using human motion prediction. The device acts as a lightweight “third arm” for the user, extending their reach during handovers and manipulation in close-range collaborative activities. It was developed iteratively, following a user-centered design process that included an online survey, contextual inquiry, and an in-person usability study. Simulations show that the WRF significantly enhances a wearer’s reachable workspace volume, while remaining within biomechanical ergonomic load limits during typical usage scenarios. While operating the device in such scenarios, the user introduces disturbances in its pose through their body movements. We present two methods to overcome these disturbances: an autoregressive (AR) time-series model and a recurrent neural network (RNN). These models forecast the wearer’s body movements to compensate for disturbances, with prediction horizons determined through linear system identification. The models were trained offline on a subset of the KIT Human Motion Database, and tested in five usage scenarios to keep the 3D pose of the WRF’s end-effector static. The addition of the predictive models reduced the end-effector position errors by up to 26% compared to direct feedback control.
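The AR forecasting idea above can be sketched compactly: fit AR(p) coefficients by least squares on a motion signal, then predict one step ahead. The sketch below uses a synthetic 1-D sinusoid as a stand-in for a periodic body movement; the real system forecasts multi-DoF human motion from the KIT database, and the order p here is an arbitrary choice, not the paper's.

```python
import numpy as np

# Fit an AR(p) model x_t = sum_i a_i * x_{t-i} by ordinary least
# squares, then forecast the next sample. Synthetic 1-D signal only.

def fit_ar(signal, p):
    """Fit AR(p) coefficients (lag-1 first) by least squares."""
    X = np.column_stack(
        [signal[p - i - 1 : len(signal) - i - 1] for i in range(p)]
    )
    y = signal[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_next(signal, coeffs):
    """One-step-ahead prediction from the last p samples."""
    p = len(coeffs)
    return float(coeffs @ signal[-1 : -p - 1 : -1])

t = np.arange(200)
motion = np.sin(0.1 * t)        # stand-in for periodic body movement
coeffs = fit_ar(motion, p=4)
pred = predict_next(motion, coeffs)
true_next = np.sin(0.1 * 200)
print(abs(pred - true_next) < 1e-6)  # True
```

A pure sinusoid satisfies an exact AR(2) recurrence, so the least-squares fit recovers it and the one-step forecast is essentially exact; real human motion is only locally predictable, which is why the article determines prediction horizons via system identification and also compares against an RNN.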