Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives
The importance of depth perception in the interactions that humans have
within their nearby space is a well-established fact. Consequently, it is also
well known that the possibility of exploiting good stereo information would
ease and, in many cases, enable a large variety of attentional and interactive
behaviors on humanoid robotic platforms. However, the difficulty of computing
real-time and robust binocular disparity maps from moving stereo cameras often
prevents relying on this kind of cue to visually guide robots' attention
and actions in real-world scenarios.
two-fold: first, we show that the Efficient Large-scale Stereo Matching
algorithm (ELAS; Geiger et al., 2010) for computing the disparity map is well
suited for use on a humanoid robotic platform such as the iCub robot;
second, we show how, provided with a fast and reliable stereo system,
implementing relatively challenging visual behaviors in natural settings can
require much less effort. As a case study, we consider the common situation
in which the robot is asked to focus its attention on the object closest in
the scene, and we show how a simple but effective disparity-based segmentation
solves the problem in this case. Indeed, this example paves the way to a
variety of other similar applications.
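The abstract does not spell out the segmentation rule itself; as a rough illustration of the idea, segmenting the closest object from a disparity map can be as simple as thresholding near the peak (largest) disparity. The function name and the `band` parameter below are hypothetical, not taken from the paper:

```python
import numpy as np

def segment_closest(disparity, band=8.0):
    """Keep pixels whose disparity lies within `band` of the peak
    (largest) disparity, i.e. the object nearest to the cameras.
    Zero disparity marks unmatched pixels and is ignored."""
    valid = disparity[disparity > 0]
    if valid.size == 0:
        return np.zeros(disparity.shape, dtype=bool)
    peak = valid.max()
    return (disparity >= peak - band) & (disparity > 0)

# Toy map: background at disparity ~10, a nearby object at ~60.
dmap = np.full((4, 4), 10.0)
dmap[1:3, 1:3] = 60.0
mask = segment_closest(dmap)   # True exactly on the 2x2 "close" object
```

Because disparity is inversely proportional to depth, the largest disparity corresponds to the nearest surface, which is why a single threshold relative to the peak suffices for this scenario.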
Al-Robotics team: A cooperative multi-unmanned aerial vehicle approach for the Mohamed Bin Zayed International Robotic Challenge
The Al-Robotics team was selected as one of the 25 finalist teams out of the 143 applications received to participate in the first edition of the Mohamed Bin Zayed International Robotic Challenge (MBZIRC), held in 2017. In particular, one of the competition Challenges offered us the opportunity to develop a cooperative approach with multiple unmanned aerial vehicles (UAVs) searching, picking up, and dropping static and moving objects. This paper presents the approach that our team Al-Robotics followed to address Challenge 3 of the MBZIRC. First, we overview the overall architecture of the system, with the different modules involved. Second, we describe the procedure that we followed to design the aerial platforms, as well as all their onboard components. Then, we explain the techniques that we used to develop the software functionalities of the system. Finally, we discuss our experimental results and the lessons that we learned before and during the competition. The cooperative approach was validated with fully autonomous missions in experiments prior to the actual competition. We also analyze the results that we obtained during the competition trials.
Mapping Wide Row Crops with Video Sequences Acquired from a Tractor Moving at Treatment Speed
This paper presents a mapping method for wide row crop fields. The resulting map shows the crop rows and weeds present in the inter-row spacing. Because field videos are acquired with a camera mounted on top of an agricultural vehicle, a method for image sequence stabilization was needed and consequently designed and developed. The proposed stabilization method uses the centers of some crop rows in the image sequence as features to be tracked, which compensates for the lateral movement (sway) of the camera and leaves the pitch unchanged. A region of interest is selected using the tracked features, and an inverse perspective technique transforms the selected region into a bird’s-eye view that is centered on the image and that enables map generation. The algorithm developed has been tested on several video sequences of different fields recorded at different times and under different lighting conditions, with good initial results. Indeed, lateral displacements of up to 66% of the inter-row spacing were suppressed through the stabilization process, and crop rows in the resulting maps appear straight.
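As a loose sketch of the sway-compensation idea (not the authors' implementation), lateral camera movement can be cancelled by shifting image columns so the tracked crop-row centre lands on the image's vertical midline; pitch is untouched because rows are not shifted:

```python
import numpy as np

def compensate_sway(frame, row_center_x):
    """Shift columns so the tracked crop-row centre sits at the image's
    horizontal midpoint, cancelling lateral sway. np.roll wraps columns
    around the edge; a real implementation would pad instead, but the
    wrap keeps this sketch short."""
    shift = frame.shape[1] // 2 - int(round(row_center_x))
    return np.roll(frame, shift, axis=1)

frame = np.arange(12).reshape(3, 4)          # tiny 3x4 "image"
stab = compensate_sway(frame, row_center_x=3)  # row centre moves to x = 2
```

After this per-frame correction, a fixed region of interest can be warped with an inverse perspective transform to obtain the bird's-eye view used for map generation.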
Detection-assisted Object Tracking by Mobile Cameras
Tracking-by-detection is a class of new tracking approaches that utilizes recent developments in object detection algorithms. This type of approach performs object detection on each frame and uses data association algorithms to associate new observations with existing targets. Inspired by the core idea of the tracking-by-detection framework, we propose a new framework called detection-assisted tracking, in which the object detection algorithm assists the tracking algorithm only when such help is necessary; thus object detection, a very time-consuming task, is performed only when needed. The proposed framework is also able to handle complicated scenarios in which cameras are allowed to move and occlusion or multiple similar objects exist.
We also port the core component of the proposed framework, the detector, onto embedded smart cameras. Contrary to traditional scenarios in which smart cameras are assumed to be static, we allow the smart cameras to move around in the scene. Because traditional background subtraction methods are not suitable for mobile platforms, where the background changes constantly, our approach employs a histogram of oriented gradients (HOG) object detector for foreground detection, enabling more robust detection on mobile platforms.
Advisers: Senem Velipasalar and Mustafa Cenk Gurso
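The control flow of detection-assisted tracking can be sketched as follows. The `CheapTracker` interface, the confidence threshold, and the stub detector are hypothetical stand-ins for illustration, not the thesis's actual components:

```python
class CheapTracker:
    """Stand-in for a lightweight per-frame tracker that reports a
    bounding box and a confidence score."""
    def __init__(self):
        self.box, self.conf = None, 0.0
    def update(self, frame):
        return self.box, self.conf
    def reset(self, frame, box):
        self.box, self.conf = box, 1.0

def detection_assisted_track(frames, tracker, detect, conf_thresh=0.5):
    """Run the cheap tracker on every frame; invoke the expensive
    detector only when tracking confidence drops below `conf_thresh`."""
    detector_calls, boxes = 0, []
    for frame in frames:
        box, conf = tracker.update(frame)
        if conf < conf_thresh:        # tracker lost (or never had) the target
            box = detect(frame)       # costly HOG-style detection
            tracker.reset(frame, box)
            detector_calls += 1
        boxes.append(box)
    return boxes, detector_calls

# The detector fires once to initialize, then the tracker takes over.
boxes, calls = detection_assisted_track(
    range(5), CheapTracker(), lambda f: (0, 0, 10, 10))
```

The payoff is exactly the one the abstract claims: the time-consuming detection step runs only on the frames where the tracker actually needs help.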
Pseudo-labels for Supervised Learning on Dynamic Vision Sensor Data, Applied to Object Detection under Ego-motion
In recent years, dynamic vision sensors (DVS), also known as event-based
cameras or neuromorphic sensors, have seen increased use due to various
advantages over conventional frame-based cameras. Operating on principles
inspired by the retina, a DVS offers high temporal resolution that overcomes
motion blur, high dynamic range that copes with extreme illumination
conditions, and low power consumption that makes it ideal for embedded systems
on platforms such as drones and self-driving cars.
even rarer for tasks such as object detection. We transferred discriminative
knowledge from a state-of-the-art frame-based convolutional neural network
(CNN) to the event-based modality via intermediate pseudo-labels, which are
used as targets for supervised learning. We show, for the first time,
event-based car detection under ego-motion in a real environment at 100 frames
per second with a test average precision of 40.3% relative to our annotated
ground truth. The event-based car detector handles motion blur and poor
illumination conditions despite not being explicitly trained to do so, and
even complements frame-based CNN detectors, suggesting that it has learnt
generalized visual representations.
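One plausible reading of the pseudo-label step, sketched with hypothetical names and a made-up confidence threshold: keep only confident detections from the frame-based CNN and reuse them as supervised targets for the temporally aligned event-based samples.

```python
def make_pseudo_labels(frame_detections, score_thresh=0.7):
    """Turn confident frame-based CNN detections into pseudo-labels for
    temporally aligned event-based samples. `frame_detections` maps a
    timestamp to a list of (box, class, score) triples."""
    labels = {}
    for t, dets in frame_detections.items():
        kept = [(box, cls) for box, cls, score in dets if score >= score_thresh]
        if kept:                     # drop frames with no confident detection
            labels[t] = kept
    return labels

dets = {
    0.00: [((10, 10, 40, 30), "car", 0.92), ((5, 5, 8, 8), "car", 0.30)],
    0.01: [((11, 10, 41, 30), "car", 0.55)],
}
labels = make_pseudo_labels(dets)    # only the 0.92 detection survives
```

Filtering by score matters because every frame-CNN mistake that passes the threshold becomes a wrong training target for the event-based detector.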
Reliable vision-guided grasping
Automated assembly of truss structures in space requires vision-guided servoing for grasping a strut when its position and orientation are uncertain. This paper presents a methodology for efficient and robust vision-guided robot grasping alignment. The vision-guided grasping problem is related to vision-guided 'docking' problems. It differs from other hand-in-eye visual servoing problems, such as tracking, in that the distance from the target is a relevant servo parameter. The methodology described in this paper is a hierarchy of levels in which the vision/robot interface is decreasingly 'intelligent' and increasingly fast. Speed is achieved primarily by information reduction. This reduction exploits the use of region-of-interest windows in the image plane and feature motion prediction. These reductions invariably require stringent assumptions about the image. Therefore, at a higher level, these assumptions are verified using slower, more reliable methods. This hierarchy provides for robust error recovery in that when a lower-level routine fails, the next-higher routine will be called, and so on. A working system is described which visually aligns a robot to grasp a cylindrical strut. The system uses a single camera mounted on the end effector of a robot and requires only crude calibration parameters. The grasping procedure is fast and reliable, with a multi-level error recovery system.
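The feature-motion-prediction reduction can be illustrated as a constant-velocity guess that narrows the search to a small image window; the function name and window size here are illustrative, not the paper's:

```python
def predict_roi(prev_xy, velocity_xy, half_size=16):
    """Constant-velocity prediction of a feature's next image position,
    returning a small region-of-interest window (x0, y0, x1, y1) to
    search instead of scanning the whole frame."""
    px = prev_xy[0] + velocity_xy[0]
    py = prev_xy[1] + velocity_xy[1]
    return (px - half_size, py - half_size, px + half_size, py + half_size)

# Feature last seen at (100, 80), moving 5 px right and 2 px up per frame.
roi = predict_roi((100, 80), (5, -2))
```

When matching inside the predicted window fails, the hierarchy described above falls back to a slower, more reliable full-image method, which is what makes the speed-oriented assumptions safe.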
Deep Detection of People and their Mobility Aids for a Hospital Robot
Robots operating in populated environments encounter many different types of
people, some of whom might require especially cautious interaction because of
physical impairments or advanced age. Robots therefore need to recognize such
needs to provide appropriate assistance, guidance or other forms of
support. In this paper, we propose a depth-based perception
pipeline that estimates the position and velocity of people in the environment
and categorizes them according to the mobility aids they use: pedestrian,
person in wheelchair, person in a wheelchair with a person pushing them, person
with crutches and person using a walker. We present a fast region proposal
method that feeds a Region-based Convolutional Network (Fast R-CNN). With this,
we speed up the object detection process by a factor of seven compared to a
dense sliding window approach. We furthermore propose a probabilistic position,
velocity and class estimator to smooth the CNN's detections and account for
occlusions and misclassifications. In addition, we introduce a new hospital
dataset with over 17,000 annotated RGB-D images. Extensive experiments confirm
that our pipeline successfully keeps track of people and their mobility aids,
even in challenging situations with multiple people from different categories
and frequent occlusions. Videos of our experiments and the dataset are
available at http://www2.informatik.uni-freiburg.de/~kollmitz/MobilityAids
Comment: 7 pages, ECMR 2017
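A minimal sketch of the position/velocity smoothing component, assuming a standard 1-D constant-velocity Kalman filter; the paper's estimator is richer (it also maintains class probabilities to absorb misclassifications), so treat this as an illustration of the smoothing idea only:

```python
import numpy as np

def smooth_track(zs, dt=0.1, q=1e-2, r=0.25):
    """Smooth noisy 1-D position detections and estimate velocity with a
    constant-velocity Kalman filter. Returns the filtered positions."""
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity transition
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)                       # process noise
    R = np.array([[r]])                     # measurement noise
    x = np.array([[zs[0]], [0.0]])          # state: [position, velocity]
    P = np.eye(2)
    smoothed = []
    for z in zs:
        x = F @ x                           # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                 # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + K @ (np.array([[z]]) - H @ x)
        P = (np.eye(2) - K @ H) @ P
        smoothed.append(float(x[0, 0]))
    return smoothed

# A person walking at constant speed; the filter tracks the ramp.
track = smooth_track([0.1 * i for i in range(50)])
```

Running one such filter per tracked person carries identities through the brief occlusions the abstract mentions, since the motion model keeps predicting while detections are missing.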