
    Adaptive Framework for Robust Visual Tracking

    Visual tracking is a challenging problem for numerous reasons, such as small object size, pose variations, occlusion, and camera motion. Object tracking has many real-world applications, including surveillance systems, tracking moving organs in medical imaging, and robotics. Traditional tracking methods lack a recovery mechanism for situations in which the tracked object drifts away from the ground truth. In this paper, we propose a novel framework for tracking moving objects based on a composite framework and a reporter mechanism. The composite framework tracks moving objects using different trackers and produces pairs of forward/backward tracklets. A robustness score is then calculated for each tracker from its forward/backward tracklet pair to find the most reliable moving-object trajectory. The reporter serves as the recovery mechanism, correcting the moving-object trajectory when the robustness score is very low, mainly using a combination of a particle filter and template matching. The proposed framework can handle partial and heavy occlusions; moreover, its structure enables integration of other user-specific trackers. Extensive experiments on recent benchmarks show that the proposed framework outperforms current state-of-the-art trackers due to its powerful trajectory analysis and recovery mechanism; the framework improved the area under the curve from 68% to 70.8% on the OTB-100 benchmark.
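
    The forward/backward consistency idea above can be made concrete with a minimal sketch: track forward over a window of frames, track backward from the forward endpoint, and score how well the two trajectories agree. This is a hypothetical illustration, not the paper's implementation; the exponential mapping and the sigma tolerance are assumptions.

    ```python
    import numpy as np

    def robustness_score(forward, backward, sigma=10.0):
        """Score forward/backward tracklet consistency.

        forward:  (T, 2) array of object centers tracked over t = 0..T-1
        backward: (T, 2) array tracked back from the forward endpoint,
                  so backward[0] corresponds to forward[T-1]
        Returns a score in (0, 1]; 1 means the two passes agree exactly.
        """
        backward = backward[::-1]            # align with forward time order
        # Mean Euclidean distance between the two trajectories (drift).
        drift = np.linalg.norm(forward - backward, axis=1).mean()
        # Map drift to a bounded score; sigma is an assumed pixel tolerance.
        return float(np.exp(-drift / sigma))

    # Example: a tracker whose two passes disagree only by small noise.
    fwd = np.cumsum(np.ones((20, 2)), axis=0)           # straight path
    bwd = fwd[::-1] + np.random.normal(0, 1, (20, 2))   # noisy reverse pass
    print(robustness_score(fwd, bwd))                   # close to 1.0
    ```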

    Event-based Vision: A Survey

    Event cameras are bio-inspired sensors that differ from conventional frame cameras: instead of capturing images at a fixed rate, they asynchronously measure per-pixel brightness changes and output a stream of events that encode the time, location, and sign of the brightness changes. Event cameras offer attractive properties compared to traditional cameras: high temporal resolution (on the order of microseconds), very high dynamic range (140 dB vs. 60 dB), low power consumption, and high pixel bandwidth (on the order of kHz), resulting in reduced motion blur. Hence, event cameras have large potential for robotics and computer vision in scenarios that are challenging for traditional cameras, such as those requiring low latency, high speed, or high dynamic range. However, novel methods are required to process the unconventional output of these sensors in order to unlock their potential. This paper provides a comprehensive overview of the emerging field of event-based vision, with a focus on the applications and the algorithms developed to unlock the outstanding properties of event cameras. We present event cameras from their working principle, the actual sensors that are available, and the tasks they have been used for, from low-level vision (feature detection and tracking, optic flow, etc.) to high-level vision (reconstruction, segmentation, recognition). We also discuss the techniques developed to process events, including learning-based techniques, as well as specialized processors for these novel sensors, such as spiking neural networks. Additionally, we highlight the challenges that remain to be tackled and the opportunities that lie ahead in the search for a more efficient, bio-inspired way for machines to perceive and interact with the world.
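
    The per-pixel event encoding described above is easy to make concrete: each event carries a timestamp, pixel coordinates, and a polarity, and a common first processing step is to accumulate a time window of events into a 2D "event frame". The sketch below is a minimal illustration under assumed conventions; the structured dtype and the function name are not from the survey.

    ```python
    import numpy as np

    # One event: timestamp (seconds), pixel coordinates, polarity (+1/-1).
    # This structured dtype is an illustrative choice, not a standard.
    event_dtype = np.dtype([('t', 'f8'), ('x', 'u2'), ('y', 'u2'), ('p', 'i1')])

    def accumulate_events(events, width, height, t_start, t_end):
        """Sum signed events in [t_start, t_end) into a 2D 'event frame'."""
        frame = np.zeros((height, width), dtype=np.int32)
        window = events[(events['t'] >= t_start) & (events['t'] < t_end)]
        # Signed accumulation; np.add.at handles repeated pixels correctly.
        np.add.at(frame, (window['y'], window['x']), window['p'])
        return frame

    # Example: three synthetic events on a 4x4 sensor.
    evts = np.array([(0.001, 1, 2, 1), (0.002, 1, 2, 1), (0.003, 3, 0, -1)],
                    dtype=event_dtype)
    print(accumulate_events(evts, 4, 4, 0.0, 0.01))
    ```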

    e-TLD: Event-based Framework for Dynamic Object Tracking

    This paper presents a long-term object tracking framework with a moving event camera under general tracking conditions. A first of its kind for these revolutionary cameras, the tracking framework uses a discriminative representation for the object with online learning, and detects and re-tracks the object when it comes back into the field of view. One of the key novelties is the use of an event-based local sliding-window technique that tracks reliably in scenes with cluttered and textured backgrounds. In addition, Bayesian bootstrapping is used to assist real-time processing and boost the discriminative power of the object representation. When the object re-enters the field of view of the camera, a data-driven, global sliding-window detector locates the object for subsequent tracking. Extensive experiments demonstrate the ability of the proposed framework to track and detect arbitrary objects of various shapes and sizes, including dynamic objects such as a human. This is a significant improvement over earlier works that simply track objects as long as they are visible under simpler background settings. Using the ground-truth locations of five different objects under three motion settings, namely translation, rotation, and 6-DOF, quantitative measurements are reported for the event-based tracking framework with critical insights on various performance issues. Finally, a real-time implementation in C++ highlights tracking ability under scale, rotation, viewpoint, and occlusion scenarios in a lab setting.
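
    As a rough illustration of the local sliding-window idea, the sketch below scores candidate windows around the last known object location on an accumulated event frame. This is a deliberate simplification: the paper uses a learned discriminative representation with Bayesian bootstrapping, not the raw template correlation used here, and all names are assumptions.

    ```python
    import numpy as np

    def local_sliding_window(event_frame, template, last_xy, radius=8):
        """Search a local neighborhood of the previous object location.

        event_frame: 2D array of accumulated event counts
        template:    2D array, the current object model (same scale)
        last_xy:     (x, y) top-left corner of the previous detection
        Returns the best-matching top-left corner and its score.
        """
        th, tw = template.shape
        H, W = event_frame.shape
        lx, ly = last_xy
        best_score, best_xy = -np.inf, last_xy
        for y in range(max(0, ly - radius), min(H - th, ly + radius) + 1):
            for x in range(max(0, lx - radius), min(W - tw, lx + radius) + 1):
                patch = event_frame[y:y + th, x:x + tw]
                score = float((patch * template).sum())  # correlation score
                if score > best_score:
                    best_score, best_xy = score, (x, y)
        return best_xy, best_score
    ```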

    Mobile Robot Navigation for Person Following in Indoor Environments

    Service robotics is a rapidly growing area of interest in robotics research. Service robots inhabit human-populated environments and carry out specific tasks. The goal of this dissertation is to develop a service robot capable of following a human leader around populated indoor environments. A classification system for person followers is proposed that clearly defines the expected interaction between the leader and the robotic follower. In populated environments, the robot needs to be able to detect and identify its leader and track the leader through occlusions, a common characteristic of populated spaces. An appearance-based person descriptor, which augments the Kinect skeletal tracker, is developed, and its performance in detecting and overcoming short- and long-term leader occlusions is demonstrated. While following its leader, the robot has to ensure that it does not collide with stationary and moving obstacles, including other humans, in the environment. This requirement necessitates a systematic navigation algorithm. A modified version of navigation-function path planning, called the predictive fields path planner, is developed. This path planner models the motion of obstacles, uses a simplified representation of practical workspaces, and generates bounded, stable control inputs that guide the robot to its desired position without collisions with obstacles. The predictive fields path planner is experimentally verified on a non-person-follower system and then integrated into the robot navigation module of the person-follower system. To navigate the robot, it is necessary to localize it within its environment. A mapping approach based on depth data from the Kinect RGB-D sensor is used to generate a local map of the environment. The map is generated by combining inter-frame rotation and translation estimates based on scan generation and dead reckoning, respectively. Thus, a complete mobile robot navigation system for person following in indoor environments is presented.
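
    To make the idea of planning over predicted obstacle motion concrete, the sketch below computes one potential-field control step in which each obstacle's repulsion is evaluated at its constant-velocity predicted position. This is only a loose analogue of the predictive fields planner, which is built on navigation functions with stability guarantees; the gains, influence radius, and function names are assumptions.

    ```python
    import numpy as np

    def predictive_field_step(robot, goal, obstacles, dt=0.1,
                              k_att=1.0, k_rep=50.0, influence=2.0, v_max=0.5):
        """One control step of a potential field over *predicted* obstacles.

        robot:     (2,) current robot position
        goal:      (2,) desired position (e.g., behind the leader)
        obstacles: list of (position, velocity) pairs, each a (2,) array
        Returns a bounded velocity command.
        """
        robot, goal = np.asarray(robot, float), np.asarray(goal, float)
        force = k_att * (goal - robot)        # attractive pull toward goal
        for pos, vel in obstacles:
            # Predict where the obstacle will be one time step ahead.
            pred = np.asarray(pos, float) + dt * np.asarray(vel, float)
            diff = robot - pred
            d = np.linalg.norm(diff)
            if 1e-6 < d < influence:
                # Repulsion grows as the predicted obstacle gets closer.
                force += k_rep * (1.0 / d - 1.0 / influence) * diff / d**3
        speed = np.linalg.norm(force)
        if speed > v_max:                     # bound the command
            force *= v_max / speed
        return force

    # Example: one step with a pedestrian crossing the robot's path.
    cmd = predictive_field_step(robot=[0.0, 0.0], goal=[3.0, 0.0],
                                obstacles=[(np.array([1.5, 0.5]),
                                            np.array([0.0, -0.5]))])
    print(cmd)
    ```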

    Towards achieving convincing live interaction in a mixed reality environment for television studios

    The virtual studio is a form of mixed reality environment for creating television programmes, where the (real) actor appears to exist within an entirely virtual set. The work presented in this thesis evaluates the routes required towards developing a virtual studio that extends current architectures by allowing realistic interactions between the actor and the virtual set in real time. The methodologies and framework presented in this thesis are intended to support future work in this domain. Heuristic investigation is offered as a framework to analyse and provide the requirements for developing interaction within a virtual studio. In this framework, a group of experts participate in case study scenarios to generate a list of requirements that guide future development of the technology. It is also concluded that this method could be used in a cyclical manner to further refine systems post-development. This leads to the development of three key systems. Firstly, a feedback system is presented, which tracks actor head motion within the studio and provides dynamic visual feedback relative to the current gaze location. Secondly, a real-time actor/virtual set occlusion system is developed that uses skeletal tracking data and depth information to change the relative location of virtual set elements dynamically. Finally, an interaction system is presented that facilitates real-time interaction between an actor and the virtual set objects, providing both single-handed and bimanual interactions. Evaluation of this system highlights some common errors in mixed reality interaction, notably those arising from inaccurate hand placement when actors perform bimanual interactions. A novel two-stage framework is presented that measures both the magnitude of the errors in actor hand placement and the perceived fidelity of the interaction from a third-person viewer. The first stage of this framework quantifies the actor motion errors while completing a series of interaction tasks under varying controls. The second stage uses examples of these errors to measure the perceptual tolerance of a third-person viewer when viewing interaction errors in the end broadcast. The results from this two-stage evaluation lead to the development of three methods for mitigating the actor errors, each evaluated against its ability to aid the visual fidelity of the interaction. It was discovered that adapting the size of the virtual object was effective in improving the quality of the interaction, whereas adapting the colour of any exposed background did not have any apparent effect. Finally, a set of guidelines based on these findings is provided, recommending appropriate solutions for allowing interaction within live virtual studio environments that can easily be adapted for other mixed reality systems.
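
    At its core, the real-time actor/virtual set occlusion system described above reduces to a per-pixel depth test between the sensed actor depth and the virtual set's depth buffer. The sketch below illustrates that test under assumed inputs (a keying mask and camera-aligned depth maps); it is a minimal sketch, not the thesis implementation.

    ```python
    import numpy as np

    def composite_with_occlusion(actor_rgb, actor_depth,
                                 set_rgb, set_depth, actor_mask):
        """Per-pixel depth test between the real actor and the virtual set.

        actor_rgb / set_rgb:     (H, W, 3) colour images
        actor_depth / set_depth: (H, W) depth in metres, camera-aligned
        actor_mask:              (H, W) bool, True where the actor was keyed
        A pixel shows the actor only where the actor is both keyed in and
        nearer to the camera than the virtual set geometry, so set elements
        can pass in front of or behind the actor correctly.
        """
        actor_in_front = actor_mask & (actor_depth < set_depth)
        out = set_rgb.copy()
        out[actor_in_front] = actor_rgb[actor_in_front]
        return out
    ```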

    Novel Aggregated Solutions for Robust Visual Tracking in Traffic Scenarios

    This work proposes novel approaches for object tracking in challenging scenarios such as severe occlusion, deteriorated vision, and long-range multi-object re-identification. All of these solutions are based only on image sequences captured by a monocular camera and do not require additional sensors. Experiments on standard benchmarks demonstrate improved state-of-the-art performance for these approaches. Since all the presented approaches are designed with efficiency in mind, they can run at real-time speed.