VISION-BASED URBAN NAVIGATION PROCEDURES FOR VERBALLY INSTRUCTED ROBOTS
The work presented in this thesis is part of a project in instruction-based learning (IBL) for mobile
robots, where a robot is designed that can be instructed by its users through unconstrained natural
language. The robot uses vision guidance to follow route instructions in a miniature town model.
The aim of the work presented here was to determine the functional vocabulary of the robot in the
form of "primitive procedures". In contrast to previous work in the field of instructable robots, this
was done following a "user-centred" approach, where the main concern was to create primitive
procedures that can be directly associated with natural language instructions. To achieve this, a corpus
of human-to-human natural language instructions was collected and analysed. A set of primitive
actions was found with which the collected corpus could be represented. These primitive actions were
then implemented as robot-executable procedures.
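The mapping from corpus phrases to primitive procedures can be sketched as follows. This is a minimal illustration only: the primitive names, their parameters, and the phrase table are hypothetical, not the actual vocabulary derived from the thesis's corpus.

```python
# Hypothetical sketch: a functional vocabulary of primitive procedures and a
# simple phrase lookup that maps route instructions onto primitive calls.
# All names and phrases here are illustrative assumptions.

PRIMITIVES = {
    "turn left": ("turn", {"direction": "left"}),
    "turn right": ("turn", {"direction": "right"}),
    "go straight": ("follow_road", {}),
    "take the second left": ("take_nth_turn", {"n": 2, "direction": "left"}),
}

def parse_instruction(utterance):
    """Return primitive procedure calls for the phrases found in an
    utterance, ordered by where each phrase occurs in the sentence."""
    text = utterance.lower()
    found = [(text.index(p), call) for p, call in PRIMITIVES.items() if p in text]
    return [call for _, call in sorted(found)]
```

A real system would of course need full natural-language parsing rather than substring lookup; the point of the sketch is only the association between surface phrases and executable primitives.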
Natural language instructions are under-specified when they are intended for execution by a robot. This is
because instructors omit information that they consider to be "commonsense" and rely on the listener's
sensory-motor capabilities to determine the details of the task execution. In this thesis the under-specification
problem is solved by determining the missing information, either during the learning of
new routes or during their execution by the robot. During learning, the missing information is
determined by imitating the commonsense approach human listeners take to achieve the same
purpose. During execution, missing information, such as the location of road layout features
mentioned in route instructions, is determined from the robot's view by using image template
matching. The original contribution of this thesis, in both these methods, lies in the fact that they are
driven by the natural language examples found in the corpus collected for the IBL project.
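The template-matching step used to locate road-layout features can be illustrated with a minimal sum-of-squared-differences search. This is a sketch under stated assumptions: grayscale images as nested lists, an exhaustive SSD scan, and toy data; the thesis's actual matcher and feature templates are not specified here.

```python
# Illustrative sketch of image template matching for locating a road-layout
# feature (e.g. a junction template) in the robot's view. Pure-Python SSD
# over grayscale images given as nested lists of intensities.

def match_template(image, template):
    """Return (row, col) of the best match by sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_ssd = None, float("inf")
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            ssd = sum(
                (image[r + i][c + j] - template[i][j]) ** 2
                for i in range(th) for j in range(tw)
            )
            if ssd < best_ssd:
                best_pos, best_ssd = (r, c), ssd
    return best_pos
```

In practice a normalised correlation measure is usually preferred over raw SSD, since it is less sensitive to lighting changes across the scene.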
During the testing phase, a high success rate for primitive calls, when these were considered individually,
showed that the under-specification problem had overall been solved. A novel method for testing the
primitive procedures, as part of complete route descriptions, is also proposed in this thesis. This was
done by comparing the performance of human subjects when driving the robot, following route
descriptions, with the performance of the robot when executing the same route descriptions. The
results obtained from this comparison clearly indicated where errors occur from the time when a
human speaker gives a route description to the time when the task is executed by a human listener or
by the robot.
Finally, a software speed controller is proposed in this thesis in order to control the wheel speeds of
the robot used in this project. The controller employs PI (Proportional and Integral) and PID
(Proportional, Integral and Derivative) control and provides a good alternative to expensive hardware
controllers.
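A discrete PI/PID wheel-speed controller of the kind described can be sketched as below. The gains, sample time, and the simple motor model in the usage note are illustrative assumptions, not values from the thesis.

```python
# Sketch of a discrete PID speed controller (PI control is the special case
# kd = 0). Gains and sample time are hypothetical, for illustration only.

class PIDController:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0      # accumulated error (integral term)
        self.prev_error = 0.0    # last error (for the derivative term)

    def update(self, setpoint, measured):
        """Return a control output driving the measured speed to the setpoint."""
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Driving a simple first-order motor model `speed += (u - 0.2 * speed) * dt` with this controller brings the wheel speed to the setpoint; the integral term removes the steady-state error that a purely proportional controller would leave.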
Target tracking and image interpretation in natural open world scenes
This thesis is concerned with tracking man-made objects moving in natural open-world scenes and, based on the tracking data, constructing a structural representation of each scene, frame by frame. The system developed uses a static camera and a statistical frame-differencing technique for detecting motion in an image that has a relatively static background. Objects with a measured temporal consistency are tracked across successive image frames. Based on the tracking data, regions in the scene are associated with particular types of dynamic event: for example, regions containing movement (which could be roads) and regions where objects seem to disappear or partially disappear (which could be hedges).
Because of the sensitivity of the motion estimator to changes in scene illumination and environmental conditions, a tile-based method is used to detect scene motion based on estimates of the statistical variation within the tiles. An updating process ensures that a reliable estimate of the background reference image is maintained by the system. Motion cues are matched against tracked objects from the previous frame using an estimate of the temporal continuity of an object. A spatio-temporal reasoning process is used to infer the structure in the image; this inference mechanism is implemented using a semantic network.
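The tile-based detection and background updating described above can be sketched as follows. This is a minimal illustration, assuming grayscale frames as nested lists, a mean-absolute-difference statistic per tile, and a running-average background update; the thesis's actual statistics and thresholds are not given here.

```python
# Sketch: flag motion per tile by comparing each frame against a background
# reference, then update the reference so it tracks slow illumination changes.
# Tile size, threshold, and learning rate are hypothetical.

def detect_motion_tiles(frame, background, tile=4, thresh=20.0):
    """Return (tile_row, tile_col) indices whose mean absolute difference
    from the background reference exceeds the threshold."""
    moving = []
    for r in range(0, len(frame), tile):
        for c in range(0, len(frame[0]), tile):
            diffs = [abs(frame[i][j] - background[i][j])
                     for i in range(r, min(r + tile, len(frame)))
                     for j in range(c, min(c + tile, len(frame[0])))]
            if sum(diffs) / len(diffs) > thresh:
                moving.append((r // tile, c // tile))
    return moving

def update_background(frame, background, alpha=0.05):
    """Running-average update keeping the background reference current."""
    for i in range(len(frame)):
        for j in range(len(frame[0])):
            background[i][j] += alpha * (frame[i][j] - background[i][j])
```

Working per tile rather than per pixel makes the detector more robust to the illumination sensitivity mentioned above, since a tile statistic averages out isolated pixel noise.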
The system has been tested on several open world sequences and in each case has demonstrated that it can identify and track vehicles moving in the scene. Based on the motion of these vehicles regions in the image were identified and scene maps constructed for each scene. The map identified regions where vehicles can be expected to be observed moving and regions where they could become occluded.
A CD-ROM is included with this thesis that contains the results obtained by the system for the two image sequences used in Chapter 7. These results incorporate some of the enhancements outlined in Chapter 8, Section 8.3. A Windows movie player is included on the CD-ROM, and Appendix D provides information on the contents of the CD-ROM together with installation and operating instructions.
Pose and Structure Recovery Using Active Models
A new formulation of a pose refinement technique using "active" models is described. An error term derived from the detection of image derivatives close to an initial object hypothesis is linearised and solved by least squares. The method is particularly well suited to problems involving external geometrical constraints (such as the ground-plane constraint). We show that the method is able to recover both the pose of a rigid model and the structure of a deformable model. We report an initial assessment of the performance and cost of pose and structure recovery using the active model in comparison with our previously reported "passive" model-based techniques in the context of traffic surveillance. The new method is more stable, and requires fewer iterations, especially when the number of free parameters increases, but shows somewhat poorer convergence.
1 Introduction
We have previously demonstrated a system for recognising and tracking vehicles in complex traffic scenes, using model-..
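The linearise-and-solve-by-least-squares step at the heart of such pose refinement can be illustrated with Gauss-Newton iteration on a 2D rigid pose (rotation plus translation). This is a deliberate simplification of the paper's 3D active-model setting, with hypothetical point data; the residual, its Jacobian, and the normal-equation solve are the same pattern in any dimension.

```python
# Sketch: Gauss-Newton pose refinement for a 2D rigid transform
# (theta, tx, ty). The point-to-point error is linearised about the
# current pose and the update is solved by least squares. Illustrative only.
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    n = 3
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def refine_pose(model, observed, theta=0.0, tx=0.0, ty=0.0, iters=10):
    """Refine (theta, tx, ty) so the rotated+translated model points
    match the observed points, via Gauss-Newton least squares."""
    for _ in range(iters):
        JTJ = [[0.0] * 3 for _ in range(3)]
        JTr = [0.0] * 3
        c, s = math.cos(theta), math.sin(theta)
        for (mx, my), (px, py) in zip(model, observed):
            rx = c * mx - s * my + tx - px   # residual, x component
            ry = s * mx + c * my + ty - py   # residual, y component
            # Jacobian rows of (rx, ry) w.r.t. (theta, tx, ty)
            Jx = (-s * mx - c * my, 1.0, 0.0)
            Jy = (c * mx - s * my, 0.0, 1.0)
            for i in range(3):
                JTr[i] += Jx[i] * rx + Jy[i] * ry
                for j in range(3):
                    JTJ[i][j] += Jx[i] * Jx[j] + Jy[i] * Jy[j]
        d = solve3(JTJ, [-g for g in JTr])   # normal equations
        theta += d[0]; tx += d[1]; ty += d[2]
    return theta, tx, ty
```

On noise-free data this converges in a few iterations; in the active-model setting the residuals would instead come from image derivatives near the projected model edges, and geometrical constraints such as the ground plane reduce the number of free parameters being solved for.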