Understanding Of And Applications For Robot Vision Guidance At KSC
The primary thrust of robotics at KSC is the servicing of Space Shuttle remote umbilical docking functions. For this to occur, robots performing servicing operations must be capable of tracking a swaying Orbiter in six degrees of freedom (6-DOF). Currently, in NASA KSC's Robotic Applications Development Laboratory (RADL), an ASEA IRB-90 industrial robot is being equipped with a real-time computer vision (hardware and software) system that allows it to track a simulated Orbiter interface (target) in 6-DOF. The real-time computer vision system effectively becomes the eyes of the lab robot, guiding it through a closed-loop visual feedback system to move with the simulated Orbiter interface. This paper describes this vision guidance system and how it will be applied to remote umbilical servicing at KSC. In addition, other current and future applications are addressed.
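The closed-loop guidance described above amounts to a visual-servoing cycle: the vision system measures the target pose, and each control cycle the robot commands a fraction of the remaining 6-DOF error. The sketch below illustrates that idea only; the pose representation, gain and frame conventions are assumptions, not the RADL implementation.

# Minimal sketch of a proportional closed-loop visual-servoing step, assuming the
# vision system returns the target pose as a 4x4 homogeneous transform in the
# robot base frame. Names and the gain are illustrative, not the RADL system.
import numpy as np

def pose_error(T_current, T_target):
    """Return a 6-DOF error (3 translation, 3 rotation as a rotation vector)
    taking the end effector from its current pose toward the target pose."""
    dt = T_target[:3, 3] - T_current[:3, 3]          # translational error
    R_err = T_target[:3, :3] @ T_current[:3, :3].T   # relative rotation
    # rotation vector from rotation matrix (valid away from the pi singularity)
    angle = np.arccos(np.clip((np.trace(R_err) - 1.0) / 2.0, -1.0, 1.0))
    if angle < 1e-8:
        w = np.zeros(3)
    else:
        w = angle / (2.0 * np.sin(angle)) * np.array([
            R_err[2, 1] - R_err[1, 2],
            R_err[0, 2] - R_err[2, 0],
            R_err[1, 0] - R_err[0, 1],
        ])
    return np.concatenate([dt, w])

def servo_step(T_current, T_target, gain=0.2):
    """One control cycle: command a fraction of the measured 6-DOF error."""
    return gain * pose_error(T_current, T_target)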
Centroid Distance Keypoint Detector for Colored Point Clouds
Keypoint detection serves as the basis for many computer vision and robotics
applications. Despite the fact that colored point clouds can be readily
obtained, most existing keypoint detectors extract only geometry-salient
keypoints, which can impede the overall performance of systems that intend to
(or have the potential to) leverage color information. To promote advances in
such systems, we propose an efficient multi-modal keypoint detector that can
extract both geometry-salient and color-salient keypoints in colored point
clouds. The proposed CEntroid Distance (CED) keypoint detector comprises an
intuitive and effective saliency measure, the centroid distance, that can be
used in both 3D space and color space, and a multi-modal non-maximum
suppression algorithm that can select keypoints with high saliency in two or
more modalities. The proposed saliency measure directly leverages the
distribution of points in a local neighborhood and does not require normal
estimation or eigenvalue decomposition. We evaluate the proposed method in
terms of repeatability and computational efficiency (i.e. running time) against
state-of-the-art keypoint detectors on both synthetic and real-world datasets.
Results demonstrate that our proposed CED keypoint detector requires minimal
computational time while attaining high repeatability. To showcase one of the
potential applications of the proposed method, we further investigate the task
of colored point cloud registration. Results suggest that our proposed CED
detector outperforms state-of-the-art handcrafted and learning-based keypoint
detectors in the evaluated scenes. The C++ implementation of the proposed
method is made publicly available at
https://github.com/UCR-Robotics/CED_Detector.
Comment: Accepted to the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023; copyright will be transferred to IEEE upon publication.
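The following sketch shows one way the centroid-distance saliency described above could be computed in both modalities. It follows only the abstract's description (distance from a point to the centroid of its local neighborhood, in 3D space and in color space); the radius, data layout and library choices are assumptions, not the authors' C++ implementation.

# Illustrative sketch of centroid-distance saliency for a colored point cloud,
# computed independently in geometry and in color; both saliencies would then be
# fed into a multi-modal non-maximum suppression step (not shown here).
import numpy as np
from scipy.spatial import cKDTree

def centroid_distance_saliency(xyz, rgb, radius=0.05):
    """xyz: (N, 3) point coordinates; rgb: (N, 3) colors in [0, 1]."""
    tree = cKDTree(xyz)
    geo_sal = np.zeros(len(xyz))
    col_sal = np.zeros(len(xyz))
    for i, neighbors in enumerate(tree.query_ball_point(xyz, r=radius)):
        # distance from the point to the centroid of its neighborhood
        geo_sal[i] = np.linalg.norm(xyz[i] - xyz[neighbors].mean(axis=0))
        col_sal[i] = np.linalg.norm(rgb[i] - rgb[neighbors].mean(axis=0))
    return geo_sal, col_sal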
A Biologically Motivated Software Retina for Robotic Sensors Based on Smartphone Technology
A key issue in designing robotics systems is the cost of an integrated camera sensor that meets the bandwidth/processing requirements of many advanced robotics applications, especially lightweight ones such as visual surveillance or SLAM in autonomous aerial vehicles. There is currently much work going on to adapt smartphones to provide complete robot vision systems, as the phone is so tightly integrated, offering camera(s), inertial sensing, sound I/O and excellent wireless connectivity. Mass-market production makes this a very low-cost platform, and manufacturers ranging from quadrotor drone suppliers to makers of children's toys, such as the Meccanoid robot, employ a smartphone to provide a vision/control system.
Accordingly, many research groups are attempting to optimise image analysis, computer vision and machine learning libraries for the smartphone platform. However, current approaches to robot vision remain highly demanding for mobile processors such as the ARM, and while a number of algorithms have been developed, these are very stripped down, i.e. highly compromised in function or performance. For example, the semi-dense visual odometry implementation of [1] operates on images of only 320x240 pixels.
In our research we have been developing biologically motivated foveated vision algorithms, potentially some 100 times more efficient than their conventional counterparts, based on a model of the mammalian retina we have developed. Vision systems based on the foveated architectures found in mammals have the potential to reduce bandwidth and processing requirements by roughly a factor of 100; it has been estimated that our brains would weigh ~60 kg if we were to process all our visual input at uniform high resolution. We have reported a foveated visual architecture that implements a functional model of the retina-visual cortex to produce feature vectors that can be matched/classified using conventional methods, or could be adapted to employ Deep Convolutional Neural Nets for the classification/interpretation stage [2,3,4].
We are now at the early stages of investigating how best to port our foveated architecture onto a smartphone platform. To achieve the required levels of performance, we propose to optimise our retina model for the ARM processors used in smartphones, in conjunction with their integrated GPUs, to provide a foveated smart vision system on a smartphone. Our current goal is a foveated system running in real time that serves as a front-end robot sensor for tasks such as general-purpose object recognition and reliable dense SLAM, using a commercial off-the-shelf smartphone that communicates with conventional hardware performing back-end visual classification/interpretation. We believe that, as in Nature, space-variance is the key to achieving the data reduction necessary to implement the complete visual processing chain on the smartphone itself.
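To make the space-variance idea concrete, the sketch below resamples a full-resolution frame into a small log-polar image: resolution stays high near the fovea (image centre) and falls off toward the periphery, which is where the large data reduction comes from. This is a generic OpenCV remap with assumed parameters, not the authors' retina model.

# Hedged sketch of space-variant (log-polar) resampling as a foveated front end.
import cv2
import numpy as np

def foveate(image, output_size=(64, 128)):
    """Resample a frame into a small log-polar image centred on the image centre."""
    h, w = image.shape[:2]
    center = (w / 2.0, h / 2.0)
    max_radius = min(center)
    return cv2.warpPolar(image, output_size, center, max_radius,
                         cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)

frame = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)  # stand-in camera frame
retina_image = foveate(frame)  # far fewer pixels than the original frame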
Training a Convolutional Neural Network for Appearance-Invariant Place Recognition
Place recognition is one of the most challenging problems in computer vision,
and has become a key part in mobile robotics and autonomous driving
applications for performing loop closure in visual SLAM systems. Moreover, the
difficulty of recognizing a revisited location increases with appearance
changes caused, for instance, by weather or illumination variations, which
hinders the long-term application of such algorithms in real environments. In
this paper we present a convolutional neural network (CNN), trained for the
first time with the purpose of recognizing revisited locations under severe
appearance changes, which maps images to a low dimensional space where
Euclidean distances represent place dissimilarity. In order for the network to
learn the desired invariances, we train it with triplets of images selected
from datasets which present a challenging variability in visual appearance. The
triplets are selected in such a way that two samples are from the same location
and the third one is taken from a different place. We validate our system
through extensive experimentation, where we demonstrate better performance than
state-of-the-art algorithms on a number of popular datasets.
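A minimal sketch of the triplet-based training described above is shown below: the anchor and positive come from the same location under different appearance, the negative from a different place, and a triplet margin loss pushes Euclidean distance in the embedding to reflect place dissimilarity. The network architecture, margin and learning rate are placeholders, not the paper's configuration.

# Illustrative triplet training step for a place-recognition embedding (PyTorch).
import torch
import torch.nn as nn

embedding_net = nn.Sequential(                # stand-in CNN; not the paper's network
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 128),                       # low-dimensional place descriptor
)
criterion = nn.TripletMarginLoss(margin=1.0, p=2)   # Euclidean distance
optimizer = torch.optim.Adam(embedding_net.parameters(), lr=1e-4)

def train_step(anchor, positive, negative):
    """anchor/positive: same place under different appearance; negative: another place."""
    optimizer.zero_grad()
    loss = criterion(embedding_net(anchor),
                     embedding_net(positive),
                     embedding_net(negative))
    loss.backward()
    optimizer.step()
    return loss.item()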
On quaternion based parametrization of orientation in computer vision and robotics
The problem of orientation parameterization for applications in computer vision and robotics is examined in detail herein.
The necessary intuition and formulas are provided for direct practical use in any existing algorithm that seeks to
minimize a cost function in an iterative fashion. Two distinct schemes of parameterization are analyzed: The first scheme
concerns the traditional axis-angle approach, while the second employs stereographic projection from the unit quaternion
sphere to the 3D real projective space. Performance measurements are taken and a comparison is made between the two
approaches. Results suggest that the use of stereographic projection offers several benefits, including rational
expressions in the rotation matrix derivatives, improved accuracy, robustness to random starting points and accelerated
convergence.
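For intuition, the second scheme can be sketched as mapping three unconstrained parameters to a unit quaternion by inverse stereographic projection, which keeps the rotation matrix (and hence its derivatives) rational in the parameters. The pole choice and conventions below are assumptions and may differ from the paper's.

# Hedged sketch: three parameters v in R^3 -> unit quaternion via inverse
# stereographic projection (here from the pole (-1, 0, 0, 0)), then -> rotation.
import numpy as np

def params_to_quaternion(v):
    """Map v in R^3 to a unit quaternion (w, x, y, z); expressions are rational in v."""
    s = np.dot(v, v)
    return np.concatenate([[1.0 - s], 2.0 * np.asarray(v, dtype=float)]) / (1.0 + s)

def quaternion_to_rotation(q):
    """Standard unit-quaternion to rotation-matrix conversion."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])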
Past, Present, and Future of Simultaneous Localization And Mapping: Towards the Robust-Perception Age
Simultaneous Localization and Mapping (SLAM) consists in the concurrent
construction of a model of the environment (the map), and the estimation of the
state of the robot moving within it. The SLAM community has made astonishing
progress over the last 30 years, enabling large-scale real-world applications,
and witnessing a steady transition of this technology to industry. We survey
the current state of SLAM. We start by presenting what is now the de-facto
standard formulation for SLAM. We then review related work, covering a broad
set of topics including robustness and scalability in long-term mapping, metric
and semantic representations for mapping, theoretical performance guarantees,
active SLAM and exploration, and other new frontiers. This paper simultaneously
serves as a position paper and tutorial to those who are users of SLAM. By
looking at the published research with a critical eye, we delineate open
challenges and new research issues that still deserve careful scientific
investigation. The paper also contains the authors' take on two questions that
often animate discussions during robotics conferences: "Do robots need SLAM?" and
"Is SLAM solved?"
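For reference, the de-facto standard formulation the survey refers to is commonly written as maximum-a-posteriori estimation over a factor graph, which under Gaussian noise assumptions reduces to nonlinear least squares. The notation below is a common textbook form under those assumptions, not a quotation from the survey:

X^{\ast} = \arg\max_{X} \; p(X \mid Z)
         = \arg\max_{X} \prod_{k} p(z_k \mid X_k)\,p(X)
         = \arg\min_{X} \sum_{k} \lVert h_k(X_k) - z_k \rVert^{2}_{\Sigma_k}

where X collects the robot poses and map variables, each measurement z_k involves a subset X_k of the variables, h_k(\cdot) is its measurement model and \Sigma_k its noise covariance.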