6 research outputs found

    Color Optical Flow

    Grayscale images have long been the focus of methods for recovering optical flow. Optical flow can, however, be recovered from color images using direct methods, i.e. without computationally costly iterations or search strategies. The quality of the recovered flow can be assessed and tailored after processing, providing an effective, efficient tool for motion estimation. In this paper, a brief introduction to optical flow is given, the optical flow constraint equation is presented along with its extension to color images, and new methods for solving this extended equation are described. Results of applying these methods to two synthetic image sequences are presented.
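    The abstract does not state the extended equation explicitly; the sketch below illustrates the standard idea, assuming each color channel contributes its own brightness-constancy constraint Ix·u + Iy·v + It = 0, so three channels give an overdetermined linear system in (u, v) that can be solved per pixel without iteration. The function name and sample values are illustrative, not taken from the paper.

```python
import numpy as np

def color_flow_pixel(Ix, Iy, It):
    """Direct per-pixel flow from the color optical-flow constraint.

    Ix, Iy, It are length-3 arrays: the spatial and temporal
    derivatives of the R, G, B channels at one pixel.  Each channel
    gives one constraint  Ix*u + Iy*v + It = 0,  so three channels
    yield an overdetermined 3x2 system solvable in one step.
    """
    A = np.stack([Ix, Iy], axis=1)   # 3x2 coefficient matrix
    b = -np.asarray(It)              # right-hand side
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Example: derivatives measured at one pixel of a synthetic sequence
Ix = np.array([0.9, 1.1, 1.0])
Iy = np.array([0.5, 0.4, 0.6])
It = np.array([-1.4, -1.5, -1.6])
print(color_flow_pixel(Ix, Iy, It))  # recovered (u, v)
```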

    Collision Avoidance for UAVs Using Optic Flow Measurement with Line of Sight Rate Equalization and Looming

    A series of simplified scenarios is investigated whereby an optical-flow-balancing guidance law is used to avoid obstacles by steering an air vehicle between fixed objects/obstacles. These obstacles are registered as specific points that can be representative of features in a scene. The obstacles appear in the field of view of a single forward-looking camera. First, a 2-D analysis is presented where the rate of the line of sight (LOS) from the vehicle to each of the obstacles to be avoided is measured. The analysis proceeds by initially using no field of view (FOV) limitations, then applying FOV restrictions, and adding features or obstacles to the scene. These analyses show that a guidance law that equalizes the line of sight rates with no FOV limitations actually steers the vehicle into one of the objects for all initial conditions. The research next develops an obstacle avoidance strategy based on equilibrating the optic flow generated by the obstacles and presents an analysis that leads to a different conclusion, in which balancing the optic flows does avoid the obstacles. The paper then describes a set of guidance methods that, with realistic FOV limitations, produce a favorable result. Finally, the looming of an object in the camera's FOV can be measured and used for synthesizing a collision avoidance guidance law. For the simple 2-D case, looming is quantified as an increase in the LOS angle between two features on a wall in front of the air vehicle. The 2-D guidance law for equalizing the optic flow and looming detection is then extended to the 3-D case, and a set of 3-D scenarios is further explored using a decoupled two-channel approach. In addition, a comparison of two image segmentation techniques used to find optic flow vectors is presented.
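    As a minimal sketch of the two cues described above, and not the paper's actual guidance laws: a flow-balancing command can be formed from the difference between mean flow magnitudes in the left and right image halves, and looming from the growth rate of the angular separation between two tracked features. The function names and the gain k_p are assumptions for illustration.

```python
import numpy as np

def balance_flow_steering(flow_u, k_p=0.5):
    """Optic-flow-balancing guidance (2-D sketch).

    flow_u: 2-D array of horizontal flow from a forward camera.
    Nearby obstacles generate larger flow, so steering away from the
    side with more flow tends to center the vehicle between
    obstacles.  k_p is an illustrative proportional gain.
    """
    mid = flow_u.shape[1] // 2
    left = np.abs(flow_u[:, :mid]).mean()
    right = np.abs(flow_u[:, mid:]).mean()
    # Positive command steers right when left-side flow dominates.
    return k_p * (left - right)

def looming_rate(theta_prev, theta_now, dt):
    """Looming cue: growth rate of the angular separation between two
    tracked features on an approaching surface.  A sustained positive
    rate signals an impending collision."""
    return (theta_now - theta_prev) / dt
```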

    Human robot interaction in a crowded environment

    Human Robot Interaction (HRI) is the primary means of establishing natural and affective communication between humans and robots. HRI enables robots to act in a way similar to humans in order to assist in activities that are considered to be laborious, unsafe, or repetitive. Vision based human robot interaction is a major component of HRI, with which visual information is used to interpret how human interaction takes place. Common tasks of HRI include finding pre-trained static or dynamic gestures in an image, which involves localising different key parts of the human body such as the face and hands. This information is subsequently used to extract different gestures. After the initial detection process, the robot is required to comprehend the underlying meaning of these gestures [3].

    Thus far, most gesture recognition systems can only detect gestures and identify a person in relatively static environments. This is not realistic for practical applications, as difficulties may arise from people's movements and changing illumination conditions. Another issue to consider is that of identifying the commanding person in a crowded scene, which is important for interpreting navigation commands. To this end, it is necessary to associate the gesture with the correct person, and automatic reasoning is required to extract the most probable location of the person who initiated the gesture.

    In this thesis, we have proposed a practical framework for addressing the above issues. It attempts to achieve a coarse-level understanding of a given environment before engaging in active communication. This includes recognizing human robot interaction, where a person has the intention to communicate with the robot. In this regard, it is necessary to differentiate whether people present are engaged with each other or with their surrounding environment. The basic task is to detect and reason about the environmental context and different interactions so as to respond accordingly. For example, if individuals are engaged in conversation, the robot should realize it is best not to disturb; if an individual is receptive to the robot's interaction, it may approach the person. Finally, if the user is moving in the environment, the robot can analyse further to understand if any help can be offered in assisting this user.

    The method proposed in this thesis combines multiple visual cues in a Bayesian framework to identify people in a scene and determine potential intentions. For improving system performance, contextual feedback is used, which allows the Bayesian network to evolve and adjust itself according to the surrounding environment. The results achieved demonstrate the effectiveness of the technique in dealing with human-robot interaction in a relatively crowded environment [7].
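    The thesis's Bayesian network (with contextual feedback) is not reproduced in the abstract; as a heavily simplified sketch of the general idea, independent visual cues can be fused with Bayes' rule under a naive-independence assumption to score how likely a person intends to interact. The cue names and likelihood values below are illustrative assumptions, not the thesis's actual model.

```python
# Naive-Bayes fusion of binary visual cues into an "intends to
# interact" posterior.  Cues and likelihoods are illustrative.
PRIOR = 0.1  # assumed prior probability a person wants to interact

# P(cue observed | interacting), P(cue observed | not interacting)
LIKELIHOODS = {
    "facing_robot":   (0.9, 0.3),
    "waving_gesture": (0.7, 0.05),
    "approaching":    (0.6, 0.2),
}

def intention_posterior(observed_cues):
    """Combine observed cues assuming conditional independence."""
    p_yes, p_no = PRIOR, 1.0 - PRIOR
    for cue, seen in observed_cues.items():
        l_yes, l_no = LIKELIHOODS[cue]
        if not seen:                  # use complement likelihoods
            l_yes, l_no = 1 - l_yes, 1 - l_no
        p_yes *= l_yes
        p_no *= l_no
    return p_yes / (p_yes + p_no)

print(intention_posterior(
    {"facing_robot": True, "waving_gesture": True, "approaching": False}))
```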

    An Empirical Model of Area MT: Investigating the Link between Representation Properties and Function

    The middle temporal area (MT) is one of the visual areas of the primate brain where neurons have highly specialized representations of motion and binocular disparity. Other stimulus features such as contrast, size, and pattern can also modulate MT activity. Since MT has been studied intensively for decades, there is a rich literature on its response characteristics. Here, I present an empirical model that incorporates some of this literature into a statistical model of the population response. Specifically, the parameters of the model are drawn from distributions that I have estimated from data in the electrophysiology literature. The model accepts arbitrary stereo video as input and uses computer-vision methods to calculate dense flow, disparity, and contrast fields. MT activity is then predicted using a combination of tuning functions that have previously been used to describe data in a variety of experiments. The empirical model approximates a number of MT phenomena more closely than other models and reproduces three phenomena not addressed by past models. I present three applications of the model. First, I use it to examine the relationships between MT tuning features and behaviour in an ethologically relevant task. Second, I employ it to study the functional role of MT surrounds in motion-related tasks. Third, I use it to guide the internal activity of a deep convolutional network towards a more physiologically realistic representation.
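    The abstract does not specify the tuning functions used; the sketch below assumes functional forms commonly fit to MT data (von Mises direction tuning, log-Gaussian speed tuning, Gabor disparity tuning) combined multiplicatively. All parameter values are placeholders, not estimates from the paper.

```python
import numpy as np

def mt_response(direction, speed, disparity,
                pref_dir=0.0, kappa=2.0,
                pref_speed=8.0, speed_bw=1.5,
                pref_disp=0.0, disp_freq=2.0, disp_sigma=0.3,
                r_max=50.0):
    """Sketch of an MT-like tuning product (spikes/s).

    direction in radians, speed in deg/s, disparity in deg.  The
    forms (von Mises x log-Gaussian x Gabor) are commonly fit to MT
    data; the parameter values here are illustrative placeholders.
    """
    f_dir = np.exp(kappa * (np.cos(direction - pref_dir) - 1))
    f_spd = np.exp(-(np.log((speed + 1e-9) / pref_speed) ** 2)
                   / (2 * speed_bw ** 2))
    f_dsp = (np.exp(-((disparity - pref_disp) ** 2)
                    / (2 * disp_sigma ** 2))
             * np.cos(2 * np.pi * disp_freq * (disparity - pref_disp)))
    return r_max * f_dir * f_spd * f_dsp

print(mt_response(direction=0.1, speed=10.0, disparity=0.05))
```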