    Active visual tracking in multi-agent scenarios

    PhD thesis
    Camera-equipped robots (agents) can autonomously follow people to provide continuous assistance in wide areas, e.g. museums and airports. Each agent serves one person (target) at a time and aims to keep its target centred on the camera’s image plane at a certain size (active visual tracking) without colliding with other agents and targets in its proximity. To perform collision-free active visual tracking, each agent must accurately estimate its own state and that of nearby targets and agents over time (i.e. tracking). Agents can track themselves with either on-board sensors (e.g. cameras or inertial sensors) or external tracking systems (e.g. multi-camera systems). However, on-board sensing alone is insufficient for tracking nearby targets because of occlusions in crowded scenes, where an external multi-camera system can help. To achieve accurate tracking that scales to wide-area applications, this thesis proposes a novel collaborative framework in which agents track nearby targets jointly with wireless ceiling-mounted static cameras in a distributed manner. Distributed tracking lets each agent reach agreed state estimates of targets by iteratively communicating with neighbouring static cameras. However, such iterative neighbourhood communication may suffer poor communication quality (i.e. packet loss/error) due to limited bandwidth, which degrades tracking accuracy. This thesis therefore proposes forming coalitions among static cameras prior to distributed tracking, based on a marginal information utility that accounts for both communication quality and local tracking confidence. Agents move on demand in response to requests from nearby static cameras. Each agent independently selects its target with limited scene knowledge and computes its robotic control for collision-free active visual tracking. Collision avoidance among robots and targets can be achieved by the Optimal Reciprocal Collision Avoidance (ORCA) method; to further maintain the camera view during avoidance manoeuvres, this thesis proposes an ORCA-based method with adaptive responsibility sharing and heading-aware robotic control mapping. Experimental results show that the proposed methods achieve higher tracking accuracy and better view maintenance than state-of-the-art methods.
    Queen Mary University of London and Chinese Scholarship Council
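    A minimal sketch may help make the adaptive responsibility sharing concrete. The Python below computes a pair-wise ORCA half-plane with an adjustable responsibility share alpha in place of the fixed 0.5 of plain ORCA; it handles only the truncation-disc case of the velocity obstacle, and the function names, the single-constraint projection, and the idea of varying alpha with how much an agent's camera view is at risk are illustrative assumptions, not the thesis's actual formulation.

        import numpy as np

        def orca_halfplane(p_a, v_a, p_b, v_b, r_sum, tau, alpha):
            # Relative position and velocity of agent A w.r.t. agent B.
            rel_p = p_b - p_a
            rel_v = v_a - v_b
            # Truncation disc of the velocity obstacle: centre rel_p/tau,
            # radius r_sum/tau. This sketch assumes rel_v lies inside the
            # disc (collision within tau); the cone-leg cases are omitted.
            w = rel_v - rel_p / tau
            w_norm = np.linalg.norm(w)
            normal = w / w_norm
            # Smallest change to rel_v that exits the velocity obstacle.
            u = (r_sum / tau - w_norm) * normal
            # Plain ORCA lets A take half of u; here A takes an alpha share.
            point = v_a + alpha * u
            return point, normal  # feasible v: (v - point) . normal >= 0

        def project(v_pref, point, normal):
            # Project the preferred velocity onto one ORCA half-plane.
            d = (point - v_pref).dot(normal)
            return v_pref + max(0.0, d) * normal

    With several neighbours the agent would solve a small linear program over the intersection of all half-planes; the single-constraint projection above only conveys the idea.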

    Cognitive visual tracking and camera control

    Cognitive visual tracking is the process of observing and understanding the behaviour of a moving person. This paper presents an efficient solution to extract, in real time, high-level information from an observed scene, and generate the most appropriate commands for a set of pan-tilt-zoom (PTZ) cameras in a surveillance scenario. Such a high-level feedback control loop, which is the main novelty of our work, will serve to reduce uncertainties in the observed scene and to maximize the amount of information extracted from it. It is implemented with a distributed camera system using SQL tables as virtual communication channels, and Situation Graph Trees for knowledge representation, inference and high-level camera control. A set of experiments in a surveillance scenario show the effectiveness of our approach and its potential for real applications of cognitive vision.
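    The "SQL tables as virtual communication channels" design can be sketched with Python's standard sqlite3 module: tracker processes insert scene-level events into one table, and the high-level controller polls it and writes PTZ commands into another. The table names, columns, and the placeholder control rule are assumptions for illustration, not the schema of the paper's system.

        import sqlite3, time

        conn = sqlite3.connect("surveillance.db")
        conn.execute("""CREATE TABLE IF NOT EXISTS events (
                            camera_id INTEGER, target_id INTEGER,
                            behaviour TEXT, ts REAL)""")
        conn.execute("""CREATE TABLE IF NOT EXISTS ptz_commands (
                            camera_id INTEGER, pan REAL, tilt REAL,
                            zoom REAL, ts REAL)""")

        def publish_event(camera_id, target_id, behaviour):
            # A per-camera tracker reports an inferred behaviour.
            conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
                         (camera_id, target_id, behaviour, time.time()))
            conn.commit()

        def control_step(last_ts):
            # The controller polls for new events and answers with PTZ
            # commands through the same database (placeholder rule only).
            rows = conn.execute(
                "SELECT camera_id, behaviour FROM events WHERE ts > ?",
                (last_ts,)).fetchall()
            for cam, behaviour in rows:
                if behaviour == "leaving_fov":
                    conn.execute(
                        "INSERT INTO ptz_commands VALUES (?, 0, 0, 0.5, ?)",
                        (cam, time.time()))
            conn.commit()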

    Learning Intelligent Dialogs for Bounding Box Annotation

    We introduce Intelligent Annotation Dialogs for bounding box annotation. We train an agent to automatically choose a sequence of actions for a human annotator to produce a bounding box in a minimal amount of time. Specifically, we consider two actions: box verification, where the annotator verifies a box generated by an object detector, and manual box drawing. We explore two kinds of agents, one based on predicting the probability that a box will be positively verified, and the other based on reinforcement learning. We demonstrate that (1) our agents are able to learn efficient annotation strategies in several scenarios, automatically adapting to the image difficulty, the desired quality of the boxes, and the detector strength; (2) in all scenarios the resulting annotation dialogs speed up annotation compared to manual box drawing alone and box verification alone, while also outperforming any fixed combination of verification and drawing in most scenarios; (3) in a realistic scenario where the detector is iteratively re-trained, our agents evolve a series of strategies that reflect the shifting trade-off between verification and drawing as the detector grows stronger.
    Comment: This paper appeared at CVPR 2018
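    The probability-based agent admits a simple worked decision rule: ask for verification only when its expected time beats drawing outright. The timing constants below are illustrative assumptions (the paper learns the acceptance probability from detector output); under them, verification wins exactly when p > T_VERIFY / T_DRAW.

        T_VERIFY = 1.8   # seconds per verification click (assumed)
        T_DRAW = 7.0     # seconds per manually drawn box (assumed)

        def choose_action(p_verified):
            # Expected time of verify-first: T_VERIFY + (1 - p) * T_DRAW,
            # since a rejected box must still be drawn by hand. Verify
            # iff this beats drawing directly, i.e. iff p > T_VERIFY / T_DRAW.
            expected_verify = T_VERIFY + (1.0 - p_verified) * T_DRAW
            return "verify" if expected_verify < T_DRAW else "draw"

        assert choose_action(0.9) == "verify"   # strong detector box
        assert choose_action(0.1) == "draw"     # weak detector box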

    Realtime Multilevel Crowd Tracking using Reciprocal Velocity Obstacles

    We present a novel, realtime algorithm to compute the trajectory of each pedestrian in moderately dense crowd scenes. Our formulation is based on an adaptive particle-filtering scheme that uses a velocity-obstacle multi-agent motion model and takes into account local interactions as well as physical and personal constraints of each pedestrian. Our method dynamically changes the number of particles allocated to each pedestrian based on different confidence metrics. Additionally, we present a new high-definition crowd video dataset for evaluating the performance of different pedestrian tracking algorithms. This dataset consists of videos of indoor and outdoor scenes, recorded at different locations with 30-80 pedestrians. We highlight the performance benefits of our algorithm over prior techniques using this dataset. In practice, our algorithm can compute trajectories of tens of pedestrians on a multi-core desktop CPU at interactive rates (27-30 frames per second). To the best of our knowledge, our approach is 4-5 times faster than prior methods, which provide similar accuracy.
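    The confidence-driven particle allocation can be sketched briefly; the inverse-confidence weighting, budget, and bounds below are assumptions for illustration, not the paper's exact confidence metrics.

        import numpy as np

        def allocate_particles(confidences, budget=5000, n_min=50, n_max=1000):
            # Pedestrians tracked with low confidence receive a larger
            # slice of the particle budget; the bounds keep every tracker
            # sampled while capping per-pedestrian cost.
            need = np.maximum(1.0 - np.asarray(confidences, dtype=float), 1e-3)
            counts = (need / need.sum() * budget).astype(int)
            return np.clip(counts, n_min, n_max)

        # Three pedestrians tracked with decreasing confidence:
        print(allocate_particles([0.9, 0.5, 0.2], budget=1000))  # [ 71 357 571]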

    Human Motion Trajectory Prediction: A Survey

    With growing numbers of intelligent autonomous systems in human environments, the ability of such systems to perceive, understand and anticipate human behavior becomes increasingly important. Specifically, predicting future positions of dynamic agents and planning considering such predictions are key tasks for self-driving vehicles, service robots and advanced surveillance systems. This paper provides a survey of human motion trajectory prediction. We review, analyze and structure a large selection of work from different communities and propose a taxonomy that categorizes existing methods based on the motion modeling approach and level of contextual information used. We provide an overview of the existing datasets and performance metrics. We discuss limitations of the state of the art and outline directions for further research.
    Comment: Submitted to the International Journal of Robotics Research (IJRR), 37 pages

    Aerial-Ground collaborative sensing: Third-Person view for teleoperation

    Rapid deployment and operation are key requirements in time-critical applications such as Search and Rescue (SaR). Efficiently teleoperated ground robots can support first responders in such situations. However, first-person-view teleoperation is sub-optimal in difficult terrain, while a third-person perspective can drastically increase teleoperation performance. Here, we propose a Micro Aerial Vehicle (MAV)-based system that can autonomously provide a third-person perspective to ground robots. While our approach is based on local visual servoing, it further leverages the global localization of several ground robots to seamlessly transfer between these robots in GPS-denied environments. Thereby, one MAV can support multiple ground robots on demand. Furthermore, our system enables different visual detection regimes, enhanced operability, and return-home functionality. We evaluate our system in real-world SaR scenarios.
    Comment: Accepted for publication in 2018 IEEE International Symposium on Safety, Security and Rescue Robotics (SSRR)
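    The local visual-servoing component can be illustrated with a proportional controller that keeps the ground robot centred in the MAV image at a desired apparent size. The gains, the normalised errors, and the camera-frame command below are assumptions for illustration; the actual system additionally fuses the ground robots' global localization for hand-overs between robots.

        def servo_command(bbox_cx, bbox_cy, bbox_h, img_w, img_h,
                          desired_h, k_xy=0.8, k_z=0.5):
            # Normalised image-plane errors: robot offset from the image
            # centre, and deviation of its apparent size from the target.
            ex = (bbox_cx - img_w / 2.0) / img_w
            ey = (bbox_cy - img_h / 2.0) / img_h
            ez = (desired_h - bbox_h) / desired_h   # too small -> fly closer
            # Proportional velocity command (vx, vy, vz) in the camera frame.
            return (-k_xy * ex, -k_xy * ey, k_z * ez)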