202,755 research outputs found

    Movement flow-based visual servoing to track moving objects

    Get PDF
    The purpose of this paper is to describe a new method for tracking trajectories specified in the image space. This method, called movement flow-based visual servoing system, is applied to an eye-in-hand robot and it is shown that it allows the correct tracking of a trajectory, not only in the image but also in the 3-D space. This method is also extended to the case in which the object from which the features are extracted, is in motion. To do so, the estimations obtained, using several Kalman filters, are integrated in the control action

    Linear Regression and Unsupervised Learning For Tracking and Embodied Robot Control.

    Get PDF
    Computer vision problems, such as tracking and robot navigation, tend to be solved using models of the objects of interest to the problem. These models are often either hard-coded, or learned in a supervised manner. In either case, an engineer is required to identify the visual information that is important to the task, which is both time consuming and problematic. Issues with these engineered systems relate to the ungrounded nature of the knowledge imparted by the engineer, where the systems have no meaning attached to the representations. This leads to systems that are brittle and are prone to failure when expected to act in environments not envisaged by the engineer. The work presented in this thesis removes the need for hard-coded or engineered models of either visual information representations or behaviour. This is achieved by developing novel approaches for learning from example, in both input (percept) and output (action) spaces. This approach leads to the development of novel feature tracking algorithms, and methods for robot control. Applying this approach to feature tracking, unsupervised learning is employed, in real time, to build appearance models of the target that represent the input space structure, and this structure is exploited to partition banks of computationally efficient, linear regression based target displacement estimators. This thesis presents the first application of regression based methods to the problem of simultaneously modeling and tracking a target object. The computationally efficient Linear Predictor (LP) tracker is investigated, along with methods for combining and weighting flocks of LP’s. The tracking algorithms developed operate with accuracy comparable to other state of the art online approaches and with a significant gain in computational efficiency. This is achieved as a result of two specific contributions. First, novel online approaches for the unsupervised learning of modes of target appearance that identify aspects of the target are introduced. Second, a general tracking framework is developed within which the identified aspects of the target are adaptively associated to subsets of a bank of LP trackers. This results in the partitioning of LP’s and the online creation of aspect specific LP flocks that facilitate tracking through significant appearance changes. Applying the approach to the percept action domain, unsupervised learning is employed to discover the structure of the action space, and this structure is used in the formation of meaningful perceptual categories, and to facilitate the use of localised input-output (percept-action) mappings. This approach provides a realisation of an embodied and embedded agent that organises its perceptual space and hence its cognitive process based on interactions with its environment. Central to the proposed approach is the technique of clustering an input-output exemplar set, based on output similarity, and using the resultant input exemplar groupings to characterise a perceptual category. All input exemplars that are coupled to a certain class of outputs form a category - the category of a given affordance, action or function. In this sense the formed perceptual categories have meaning and are grounded in the embodiment of the agent. The approach is shown to identify the relative importance of perceptual features and is able to solve percept-action tasks, defined only by demonstration, in previously unseen situations. Within this percept-action learning framework, two alternative approaches are developed. The first approach employs hierarchical output space clustering of point-to-point mappings, to achieve search efficiency and input and output space generalisation as well as a mechanism for identifying the important variance and invariance in the input space. The exemplar hierarchy provides, in a single structure, a mechanism for classifying previously unseen inputs and generating appropriate outputs. The second approach to a percept-action learning framework integrates the regression mappings used in the feature tracking domain, with the action space clustering and imitation learning techniques developed in the percept-action domain. These components are utilised within a novel percept-action data mining methodology, that is able to discover the visual entities that are important to a specific problem, and to map from these entities onto the action space. Applied to the robot control task, this approach allows for real-time generation of continuous action signals, without the use of any supervision or definition of representations or rules of behaviour

    Visual Learning in Multiple-Object Tracking

    Get PDF
    Tracking moving objects in space is important for the maintenance of spatiotemporal continuity in everyday visual tasks. In the laboratory, this ability is tested using the Multiple Object Tracking (MOT) task, where participants track a subset of moving objects with attention over an extended period of time. The ability to track multiple objects with attention is severely limited. Recent research has shown that this ability may improve with extensive practice (e.g., from action videogame playing). However, whether tracking also improves in a short training session with repeated trajectories has rarely been investigated. In this study we examine the role of visual learning in multiple-object tracking and characterize how varieties of attention interact with visual learning.Participants first conducted attentive tracking on trials with repeated motion trajectories for a short session. In a transfer phase we used the same motion trajectories but changed the role of tracking targets and nontargets. We found that compared with novel trials, tracking was enhanced only when the target subset was the same as that used during training. Learning did not transfer when the previously trained targets and nontargets switched roles or mixed up. However, learning was not specific to the trained temporal order as it transferred to trials where the motion was played backwards.These findings suggest that a demanding task of tracking multiple objects can benefit from learning of repeated motion trajectories. Such learning potentially facilitates tracking in natural vision, although learning is largely confined to the trajectories of attended objects. Furthermore, we showed that learning in attentive tracking relies on relational coding of all target trajectories. Surprisingly, learning was not specific to the trained temporal context, probably because observers have learned motion paths of each trajectory independently of the exact temporal order

    Analytical tools of strategic marketing management

    Get PDF
    This thesis deals with three topics; Bayesian tracking, shape matching and visual servoing. These topics are bound together by the goal of visual control of robotic systems. The work leading to this thesis was conducted within two European projects, COSPAL and DIPLECS, both with the stated goal of developing artificial cognitive systems. Thus, the ultimate goal of my research is to contribute to the development of artificial cognitive systems. The contribution to the field of Bayesian tracking is in the form of a framework called Channel Based Tracking (CBT). CBT has been proven to perform competitively with particle filter based approaches but with the added advantage of not having to specify the observation or system models. CBT uses channel representation and correspondence free learning in order to acquire the observation and system models from unordered sets of observations and states. We demonstrate how this has been used for tracking cars in the presence of clutter and noise. The shape matching part of this thesis presents a new way to match Fourier Descriptors (FDs). We show that it is possible to take rotation and index shift into account while matching FDs without explicitly de-rotate the contours or neglecting the phase. We also propose to use FDs for matching locally extracted shapes in contrast to the traditional way of using FDs to match the global outline of an object. We have in this context evaluated our matching scheme against the popular Affine Invariant FDs and shown that our method is clearly superior. In the visual servoing part we present a visual servoing method that is based on an action precedes perception approach. By applying random action with a system, e.g. a robotic arm, it is possible to learn a mapping between action space and percept space. In experiments we show that it is possible to achieve high precision positioning of a robotic arm without knowing beforehand how the robotic arm looks like or how it is controlled

    Distributional Feature Mapping in Data Classification

    Get PDF
    Performance of a machine learning algorithm depends on the representation of the input data. In computer vision problems, histogram based feature representation has significantly improved the classification tasks. L1 normalized histograms can be modelled by Dirichlet and related distributions to transform input space to feature space. We propose a mapping technique that contains prior knowledge about the distribution of the data and increases the discriminative power of the classifiers in supervised learning such as Support Vector Machine (SVM). The mapping technique for proportional data which is based on Dirichlet, Generalized Dirichlet, Beta Liouville, scaled Dirichlet and shifted scaled Dirichlet distributions can be incorporated with traditional kernels to improve the base kernels accuracy. Experimental results show that the proposed technique for proportional data increases accuracy for machine vision tasks such as natural scene recognition, satellite image classification, gender classification, facial expression recognition and human action recognition in videos. In addition, in object tracking, learning parametric features of the target object using Dirichlet and related distributions may help to capture representations invariant to noise. This further motivated our study of such distributions in object tracking. We propose a framework for feature representation on probability simplex for proportional data utilizing the histogram representation of the target object at initial frame. A set of parameter vectors determine the appearance features of the target object in the subsequent frames. Motivated by the success of distribution based feature mapping for proportional data, we extend this technique for semi-bounded data utilizing inverted Dirichlet, generalized inverted Dirichlet and inverted Beta Liouville distributions. Similar approach is taken into account for count data where Dirichlet multinomial and generalized Dirichlet multinomial distributions are used to map density features with input features

    USING HAND RECOGNITION IN TELEROBOTICS

    Get PDF
    The objective of this project is to recognize selected hand gestures and imitate the recognized hand gesture using a robot. A telerobotics system that relies on computer vision to create the human-machine interface was build. Hand tracking was used as an intuitive control interface, as it represents a natural interaction medium. The system tracks the hand of the operator and the gesture it represents, and relays the appropriate signal to the robot to perform the respective action, in real time. The study focuses on two gestures, open hand, and closed hand, as the NAO robot is not equipped with a dexterous hand. Numerous object recognition algorithms were compared and the SURF based object detector was used. The system was successfully implemented, and was able to recognise the two gestures in 3D space using images from a 2D video camera

    Sensor Scheduling for Energy-Efficient Target Tracking in Sensor Networks

    Full text link
    In this paper we study the problem of tracking an object moving randomly through a network of wireless sensors. Our objective is to devise strategies for scheduling the sensors to optimize the tradeoff between tracking performance and energy consumption. We cast the scheduling problem as a Partially Observable Markov Decision Process (POMDP), where the control actions correspond to the set of sensors to activate at each time step. Using a bottom-up approach, we consider different sensing, motion and cost models with increasing levels of difficulty. At the first level, the sensing regions of the different sensors do not overlap and the target is only observed within the sensing range of an active sensor. Then, we consider sensors with overlapping sensing range such that the tracking error, and hence the actions of the different sensors, are tightly coupled. Finally, we consider scenarios wherein the target locations and sensors' observations assume values on continuous spaces. Exact solutions are generally intractable even for the simplest models due to the dimensionality of the information and action spaces. Hence, we devise approximate solution techniques, and in some cases derive lower bounds on the optimal tradeoff curves. The generated scheduling policies, albeit suboptimal, often provide close-to-optimal energy-tracking tradeoffs

    Learning to track for spatio-temporal action localization

    Get PDF
    We propose an effective approach for spatio-temporal action localization in realistic videos. The approach first detects proposals at the frame-level and scores them with a combination of static and motion CNN features. It then tracks high-scoring proposals throughout the video using a tracking-by-detection approach. Our tracker relies simultaneously on instance-level and class-level detectors. The tracks are scored using a spatio-temporal motion histogram, a descriptor at the track level, in combination with the CNN features. Finally, we perform temporal localization of the action using a sliding-window approach at the track level. We present experimental results for spatio-temporal localization on the UCF-Sports, J-HMDB and UCF-101 action localization datasets, where our approach outperforms the state of the art with a margin of 15%, 7% and 12% respectively in mAP

    Sensor Management for Tracking in Sensor Networks

    Full text link
    We study the problem of tracking an object moving through a network of wireless sensors. In order to conserve energy, the sensors may be put into a sleep mode with a timer that determines their sleep duration. It is assumed that an asleep sensor cannot be communicated with or woken up, and hence the sleep duration needs to be determined at the time the sensor goes to sleep based on all the information available to the sensor. Having sleeping sensors in the network could result in degraded tracking performance, therefore, there is a tradeoff between energy usage and tracking performance. We design sleeping policies that attempt to optimize this tradeoff and characterize their performance. As an extension to our previous work in this area [1], we consider generalized models for object movement, object sensing, and tracking cost. For discrete state spaces and continuous Gaussian observations, we derive a lower bound on the optimal energy-tracking tradeoff. It is shown that in the low tracking error regime, the generated policies approach the derived lower bound
    corecore