Grounding semantics in robots for Visual Question Answering
In this thesis I describe an operational implementation of an object detection and description system that is incorporated into an end-to-end Visual Question Answering system, and I evaluate it on two visual question answering datasets for compositional language and elementary visual reasoning.
Combining depth and intensity images to produce enhanced object detection for use in a robotic colony
Robotic colonies that can communicate with each other and interact with their ambient environments can be utilized for a wide range of research and industrial applications. However, one of the problems these colonies face is that of isolating objects within an environment. Robotic colonies that can isolate objects within the environment can not only map that environment in detail, but also interact with that ambient space. Many object recognition techniques exist; however, these are often complex and computationally expensive, leading to overly complex implementations. In this paper a simple model is proposed to isolate objects, which can then be recognized and tagged. The model uses 2D and 3D perspectives of the perceptual data to produce a probability map of the outline of an object, thereby addressing the defects that exist in 2D and 3D image techniques. Some of the defects addressed are low-level illumination and objects at similar depths. These issues may not be completely solved; however, the model provides results confident enough for use in a robotic colony.
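The abstract above does not say how the 2D and 3D cues are combined, so the following is only a minimal sketch of one plausible fusion, not the authors' model. It assumes that an intensity image and a depth image of the same size are available, treats the normalised gradient magnitude of each as a per-pixel edge probability, and merges the two with a noisy-OR rule. The function names `edge_strength` and `outline_probability` are hypothetical.

```python
import numpy as np

def edge_strength(img):
    """Gradient magnitude of a 2D image, scaled to [0, 1]."""
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    # A perfectly flat image has no edges; avoid dividing by zero.
    return mag / peak if peak > 0 else mag

def outline_probability(intensity, depth):
    """Noisy-OR fusion (an assumption): a pixel is likely on an outline
    if either the intensity cue or the depth cue suggests an edge."""
    p_intensity = edge_strength(intensity)
    p_depth = edge_strength(depth)
    return 1.0 - (1.0 - p_intensity) * (1.0 - p_depth)

# Low illumination: the intensity image is flat, but depth has a step.
intensity = np.zeros((8, 8))
depth = np.zeros((8, 8))
depth[:, 4:] = 1.0
prob_map = outline_probability(intensity, depth)
```

In this toy case the depth cue alone recovers the boundary around columns 3 and 4. Swapping the two inputs gives the "objects at similar depths" case, where the intensity cue carries the outline instead.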
Indoor wireless communications and applications
Chapter 3 addresses challenges in radio link and system design in indoor scenarios. Given that most human activities take place in indoor environments, the need to support ubiquitous indoor data connectivity and location/tracking services has become even more important than in previous decades. The specific technical challenges addressed in this section are (i) modelling complex indoor radio channels for effective antenna deployment, (ii) the potential of millimeter-wave (mm-wave) radios for supporting higher data rates, and (iii) feasible indoor localisation and tracking techniques, which are summarised in three dedicated sections of this chapter.
Role Playing Learning for Socially Concomitant Mobile Robot Navigation
In this paper, we present the Role Playing Learning (RPL) scheme for a mobile
robot to navigate socially with its human companion in populated environments.
Neural networks (NN) are constructed to parameterize a stochastic policy that
directly maps sensory data collected by the robot to its velocity outputs,
while respecting a set of social norms. An efficient simulative learning
environment is built with maps and pedestrian trajectories collected from a
number of real-world crowd data sets. In each learning iteration, a robot
equipped with the NN policy is created virtually in the learning environment to
play itself as a companied pedestrian and navigate towards a goal in a socially
concomitant manner. Thus, we call this process Role Playing Learning, which is
formulated under a reinforcement learning (RL) framework. The NN policy is
optimized end-to-end using Trust Region Policy Optimization (TRPO), with
consideration of the imperfectness of the robot's sensor measurements.
Simulative and experimental results are provided to demonstrate the efficacy
and superiority of our method.
Cognitive visual tracking and camera control
Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real time, high-level information from an observed scene, and to generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, serves to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario shows the effectiveness of our approach and its potential for real applications of cognitive vision.
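The idea of using SQL tables as virtual communication channels can be illustrated with a small sketch. It is not the authors' implementation. It assumes a single table of PTZ commands written by the high-level controller and polled by each camera process, and it uses an in-memory SQLite database for brevity. The table name `ptz_commands`, its columns, and the helper functions are all hypothetical.

```python
import sqlite3

def open_channel(path=":memory:"):
    """Create the (hypothetical) command table that acts as the channel."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ptz_commands ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "camera_id TEXT NOT NULL, pan REAL, tilt REAL, zoom REAL, "
        "consumed INTEGER NOT NULL DEFAULT 0)"
    )
    return conn

def send_command(conn, camera_id, pan, tilt, zoom):
    """Controller side: append a command row for one camera."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO ptz_commands (camera_id, pan, tilt, zoom) "
            "VALUES (?, ?, ?, ?)",
            (camera_id, pan, tilt, zoom),
        )

def receive_commands(conn, camera_id):
    """Camera side: fetch unread commands in order and mark them consumed."""
    with conn:
        rows = conn.execute(
            "SELECT id, pan, tilt, zoom FROM ptz_commands "
            "WHERE camera_id = ? AND consumed = 0 ORDER BY id",
            (camera_id,),
        ).fetchall()
        conn.executemany(
            "UPDATE ptz_commands SET consumed = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [(pan, tilt, zoom) for _, pan, tilt, zoom in rows]

conn = open_channel()
send_command(conn, "cam1", 10.0, -5.0, 2.0)
send_command(conn, "cam2", 0.0, 0.0, 1.0)
first_read = receive_commands(conn, "cam1")
second_read = receive_commands(conn, "cam1")
```

A real distributed deployment would point every process at a shared database server rather than an in-memory file. The table layout is what lets the controller and the cameras run independently of each other.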