    Multi-Object Tracking by Flying Cameras Based on a Forward-Backward Interaction

    The automatic analysis of images acquired by cameras mounted on board drones (flying cameras) is attracting many scientists in the field of computer vision; the interest stems from the growing need for algorithms that can understand the scenes acquired by flying cameras by detecting the moving objects, computing their trajectories, and finally interpreting their activities. The problem is challenging because, in the most general case, the drone flies without any prior awareness of the environment; thus, no initial set-up based on the appearance of the area of interest can be used to simplify the task, as is commonly done with fixed cameras. Moreover, the apparent motion of the objects in the images is superimposed on the motion induced by the camera itself, which varies with the drone's altitude, speed, and yaw and pitch angles. Finally, the algorithm must rely on simple visual computational models, since a drone can only host embedded computers with limited computing resources. This paper proposes a detection and tracking algorithm based on a novel paradigm that combines forward tracking based on local data association with a backward chain that automatically tunes the operating parameters frame by frame, making the method entirely independent of the visual appearance of the flying area. This also eliminates any time-consuming manual configuration by a human operator. Although the method is self-configuring and requires little computational power, its accuracy on a wide data set of real videos demonstrates its applicability in real contexts, even when running on embedded platforms. Experimental results are reported on a set of 53 videos comprising more than 60,000 frames.
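
    The abstract describes the backward chain only at a high level, but the underlying forward-backward idea can be illustrated with a standard consistency check: track points forward between frames, re-track them backward, and keep only the points whose round trip returns close to where it started. The sketch below is a minimal, hypothetical illustration of that check using OpenCV's pyramidal Lucas-Kanade tracker; the function name and the fb_thresh parameter are assumptions, not taken from the paper.

        # Minimal forward-backward consistency sketch (assumed names/values).
        import cv2
        import numpy as np

        def forward_backward_track(prev_gray, curr_gray, points, fb_thresh=1.0):
            """points: float32 array of shape (N, 1, 2). Keeps consistent tracks."""
            fwd, st_f, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, points, None)
            bwd, st_b, _ = cv2.calcOpticalFlowPyrLK(curr_gray, prev_gray, fwd, None)
            # Forward-backward error: distance between a point and its round trip.
            fb_err = np.linalg.norm(points - bwd, axis=2).ravel()
            ok = (st_f.ravel() == 1) & (st_b.ravel() == 1) & (fb_err < fb_thresh)
            return fwd[ok], points[ok], fb_err[ok]

    A per-frame statistic of fb_err (for example its median) could then serve as the feedback signal for tuning detector thresholds frame by frame, in the spirit of the self-configuration the paper describes.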

    EVEN-VE: Eyes Visibility Based Egocentric Navigation for Virtual Environments

    Navigation is one of the 3D interactions often needed to interact with a synthetic world. The latest advancements in image processing have made gesture-based interaction with a virtual world possible. However, the speed with which a 3D virtual world responds to a user's gesture is far greater than the posing of the gesture itself. To incorporate faster and more natural postures in the realm of Virtual Environments (VEs), this paper presents a novel eyes-based interaction technique for navigation and panning. Dynamic wavering and positioning of the eyes are interpreted as interaction instructions by the system. Opening the eyes after they have been closed for a distinct time threshold activates forward or backward navigation. Panning over the xy-plane is performed by supporting 2-degree-of-freedom head gestures (rolling and pitching). The proposed technique was implemented in a case-study project, EWI (Eyes Wavering based Interaction). In EWI, real-time detection and tracking of the eyes are performed by OpenCV libraries at the back end. To interactively follow the trajectories of both eyes, dynamic mapping is performed in OpenGL. The technique was evaluated in two separate sessions by a total of 28 users to assess the accuracy, speed, and suitability of the system in Virtual Reality (VR). Using an ordinary camera, an average accuracy of 91% was achieved. Assessment with a high-quality camera showed that the accuracy of the system could be raised further, along with an increase in navigation speed. Results of the unbiased statistical evaluations demonstrate the applicability of the system in the emerging domains of virtual and augmented reality.
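
    The navigation trigger described above (eyes closed for a distinct time threshold, then reopened) is essentially a small state machine over per-frame eye-detection results. The following sketch shows one plausible way to implement it; the 0.8-second threshold and the event name are assumptions for illustration, and the per-frame eyes_open flag would come from an eye detector such as OpenCV's Haar cascades.

        # Hypothetical time-threshold trigger for eye-based navigation.
        import time

        class EyeTrigger:
            def __init__(self, close_threshold=0.8):
                self.close_threshold = close_threshold  # seconds eyes must stay shut
                self.closed_since = None

            def update(self, eyes_open, now=None):
                """Feed one detection per frame; returns 'navigate' on a trigger."""
                now = time.monotonic() if now is None else now
                if not eyes_open:
                    if self.closed_since is None:
                        self.closed_since = now  # closure just began
                    return None
                held = 0.0 if self.closed_since is None else now - self.closed_since
                self.closed_since = None
                return "navigate" if held >= self.close_threshold else None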

    MRSL: AUTONOMOUS NEURAL NETWORK-BASED SELF-STABILIZING SYSTEM

    Stabilizing and localizing positioning systems autonomously in areas without GPS accessibility is a difficult task. In this thesis we describe a methodology called Most Reliable Straight Line (MRSL) for stabilizing and positioning camera-based objects in 3D space. The camera-captured images are used to identify easy-to-track points ("interesting points") and track them across two consecutive images. The distances between corresponding interesting points on the two consecutive images are compared, and the one with the maximum length is designated the MRSL, which indicates the deviation from the original position. To correct this deviation, our trained algorithm issues the relevant commands; this action is repeated until the MRSL converges to zero. To test its accuracy and robustness, the algorithm was deployed to control the positioning of a quadcopter. It was demonstrated that the quadcopter (a) was highly robust to external forces, (b) could fly even after losing an engine, and (c) could fly smoothly and position itself at a desired location.
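
    The MRSL computation itself can be sketched directly from the description: detect easy-to-track points, match them across two consecutive frames, and take the longest displacement vector as the deviation estimate. The OpenCV calls and parameter values below are my assumptions about a plausible implementation, not the thesis's code.

        # Rough MRSL sketch: longest inter-frame point displacement.
        import cv2
        import numpy as np

        def mrsl(prev_gray, curr_gray):
            pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                          qualityLevel=0.01, minDistance=7)
            if pts is None:
                return None
            nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, pts, None)
            ok = status.ravel() == 1
            p0, p1 = pts[ok].reshape(-1, 2), nxt[ok].reshape(-1, 2)
            if len(p0) == 0:
                return None
            lengths = np.linalg.norm(p1 - p0, axis=1)
            i = int(np.argmax(lengths))  # longest line = deviation estimate
            return p0[i], p1[i], float(lengths[i])

    The returned endpoints and length would then drive the correction commands, repeated until the length converges to zero.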

    Occlusion reasoning for multiple object visual tracking

    Thesis (Ph.D.)--Boston University. Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging problem. It becomes significantly more difficult when dense groups of indistinguishable objects are present in the scene, causing frequent inter-object interactions and occlusions. We present several practical solutions that tackle inter-object occlusions for video surveillance applications. In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking," an online multi-camera spatial-temporal data association method for tracking large groups of objects imaged at low resolution. As a variant of the well-known Multiple Hypothesis Tracker, our approach localizes the positions of objects in 3D space from possibly occluded observations in multiple camera views and performs temporal data association in 3D. Second, we develop "track linking," a class of offline batch-processing algorithms for long-term occlusions, where the decision has to be made based on the observations from the entire tracking sequence. We construct a graph representation to characterize occlusion events and propose an efficient graph-based combinatorial algorithm to resolve occlusions. Third, we propose a novel Bayesian framework in which detection and data association are combined into a single module and solved jointly. Almost all traditional tracking systems address the detection and data association tasks separately and sequentially. Such a design implies that the output of the detector has to be reliable for the data association to work. Our framework exploits the often complementary nature of the two subproblems, which not only avoids the error-propagation issue from which traditional "detection-tracking" approaches suffer but also eschews common heuristics such as non-maximum suppression of hypotheses, by modeling the likelihood of the entire image. The thesis describes a substantial number of experiments involving challenging, notably distinct simulated and real data, including infrared and visible-light data sets that we recorded ourselves or took from publicly available sources. In these videos, the number of objects ranges from a dozen to a hundred per frame, in both monocular and multiple views. The experiments demonstrate that our approaches achieve results comparable to those of state-of-the-art methods.
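
    To make the "track linking" idea concrete: the end of one tracklet and the start of another become nodes, each candidate link gets a cost, and the links are chosen globally. The toy sketch below uses a simple spatial-plus-temporal gap cost and the Hungarian algorithm; it is an illustrative stand-in for the thesis's graph formulation, and the cost function and threshold are assumptions.

        # Toy tracklet linking via global assignment (illustrative cost only).
        import numpy as np
        from scipy.optimize import linear_sum_assignment

        def link_tracklets(ends, starts, max_cost=50.0):
            """ends/starts: lists of (t, x, y). Returns matched (i, j) pairs."""
            cost = np.full((len(ends), len(starts)), 1e6)  # 1e6 = infeasible
            for i, (te, xe, ye) in enumerate(ends):
                for j, (ts, xs, ys) in enumerate(starts):
                    if ts > te:  # a tracklet can only continue forward in time
                        cost[i, j] = np.hypot(xs - xe, ys - ye) + (ts - te)
            rows, cols = linear_sum_assignment(cost)
            return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < max_cost]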

    An Exploration Of Unmanned Aerial Vehicle Direct Manipulation Through 3d Spatial Interaction

    We present an exploration that surveys the strengths and weaknesses of various 3D spatial interaction techniques in the context of directly manipulating an Unmanned Aerial Vehicle (UAV). In particular, a study of touch- and device-free interfaces in this domain is provided. 3D spatial interaction can be achieved using hand-held motion control devices such as the Nintendo Wiimote, but computer vision systems offer a different and perhaps more natural method. In general, 3D user interfaces (3DUIs) enable a user to interact with a system on a more robust and potentially more meaningful scale. We discuss the design and development of various 3D interaction techniques using commercially available computer vision systems, and we explore the effects that these techniques have on the overall user experience in the UAV domain. Specific qualities of the user experience are targeted, including perceived intuitiveness, ease of use, and comfort, among others. We present a complete user study for upper-body gestures, and preliminary reactions to 3DUIs using hand-and-finger gestures are also discussed. The results provide evidence supporting the use of 3DUIs in this domain, as well as the use of certain styles of techniques over others.

    3D Object following based on visual information for Unmanned Aerial Vehicles

    This article presents a novel system and control strategy for visual following of a 3D moving object by an Unmanned Aerial Vehicle (UAV). The presented strategy is based only on the visual information given by an adaptive tracking method that relies on color information, which, together with the dynamics of a camera fixed to a rotary-wing UAV, is used to develop an image-based visual servoing (IBVS) system. The system continuously follows a 3D moving target object, keeping it at a fixed distance and centered on the image plane. The algorithm is validated in real outdoor flights, showing the robustness of the proposed system against wind perturbations, illumination changes, and weather changes, among others. The obtained results indicate that the proposed algorithm is suitable for complex control tasks, such as object following and pursuit and flying in formation, as well as for indoor navigation.
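
    The core IBVS loop the article describes (keep the target centered and at a fixed distance) reduces to turning pixel errors into velocity commands. The sketch below assumes the color tracker already supplies the target's centroid and apparent area; the gains, the area-as-distance proxy, and the command layout are illustrative assumptions rather than the article's controller.

        # Simplified IBVS command sketch (assumed gains and interfaces).
        import numpy as np

        def ibvs_command(centroid, area, frame_shape, ref_area,
                         k_yaw=0.002, k_alt=0.002, k_fwd=0.5):
            h, w = frame_shape[:2]
            ex = centroid[0] - w / 2.0  # horizontal pixel error -> yaw rate
            ey = centroid[1] - h / 2.0  # vertical pixel error -> climb rate
            # Apparent area stands in for distance: too small means move forward.
            ez = (ref_area - area) / float(ref_area)
            return np.array([k_fwd * ez, -k_yaw * ex, -k_alt * ey])  # vx, yaw, vz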

    Exploring Alternative Control Modalities for Unmanned Aerial Vehicles

    Unmanned aerial vehicles (UAVs), commonly known as drones, are defined by the International Civil Aviation Organization (ICAO) as aircraft without a human pilot on board. They are currently utilized primarily in the defense and security sectors but are moving towards the general market in surprisingly powerful and inexpensive forms. While drones are presently restricted to non-commercial recreational use in the USA, it is expected that they will soon be widely adopted for both commercial and consumer use. Potentially, UAVs can revolutionize various business sectors, including private security, agricultural practices, product transport, and perhaps even aerial advertising. Business Insider forecasts that 12% of the expected $98 billion in cumulative global spending on aerial drones over the following decade will be for business purposes [28]. At the moment, most drones are controlled by some sort of classic joystick or multi-touch remote controller. While drone manufacturers have improved the overall controllability of their products, most drones shipped today are still quite challenging for inexperienced users to pilot. To help mitigate these controllability challenges and flatten the learning curve, gesture controls can be used to improve the piloting of UAVs. The purpose of this study was to develop and evaluate an improved and more intuitive method of flying UAVs by supporting the use of hand gestures and other non-traditional control modalities. The goal was to deploy and test an end-to-end UAV system that provides an easy-to-use control interface for novice drone users. The expectation was that, with gesture-based navigation, a novice user would have an enjoyable and safe experience while quickly learning how to navigate a drone, avoiding the loss of or damage to the vehicle during the initial learning curve. Over the course of this study we learned that, while this approach offers much promise, a number of technical challenges make the problem considerably harder than anticipated. This thesis details our approach to the problem, analyzes the user data we collected, and summarizes the lessons learned.

    Exploring Motion Signatures for Vision-Based Tracking, Recognition and Navigation

    As cameras become more and more popular in intelligent systems, algorithms and systems for understanding video data become increasingly important. There is a broad range of applications, including object detection, tracking, scene understanding, and robot navigation. Besides stationary information, video data contains rich motion information about the environment. Biological visual systems, like human and animal eyes, are very sensitive to motion information, which has inspired active research on vision-based motion analysis in recent years. The main focus of motion analysis has been on low-level motion representations of pixels and image regions. However, motion signatures can benefit a broader range of applications if further in-depth analysis techniques are developed. In this dissertation, we discuss how to exploit motion signatures to solve problems in two applications: object recognition and robot navigation. First, we use bird species recognition as the application for exploring motion signatures for object recognition. We begin with a study of the periodic wingbeat motion of flying birds. To analyze the wing motion of a flying bird, we establish kinematic models for bird wings and obtain the wingbeat periodicity in image frames after perspective projection. Time series of salient extremities on bird images are extracted, and the wingbeat frequency is acquired for species classification. Physical experiments show that the frequency-based recognition method is robust to segmentation errors and to measurement loss of up to 30%. In addition to the wing motion, the body motion of the bird is analyzed to extract the flying velocity in 3D space. An interacting multiple-model approach is then designed to capture the combined object motion patterns under different environment conditions. The proposed systems and algorithms are tested in physical experiments, and the results show a false positive rate of around 20% with a false negative rate close to zero. Second, we explore motion signatures for vision-based vehicle navigation. We find that motion vectors (MVs) encoded in Moving Picture Experts Group (MPEG) videos provide rich information about motion in the environment, which can be used to reconstruct the vehicle's ego-motion and the structure of the scene. However, MVs suffer from a high noise level. To handle this challenge, an error propagation model for MVs is first proposed. Several steps, including MV merging, plane-at-infinity elimination, and planar region extraction, are designed to further reduce noise. The extracted planes are used as landmarks in an extended Kalman filter (EKF) for simultaneous localization and mapping. Results show that the algorithm performs localization and plane mapping with a relative trajectory error below 5.1%. Exploiting the fact that MVs encode both environment information and moving obstacles, we further propose to track moving objects simultaneously with localization and mapping. This enables the two critical navigation functionalities, localization and obstacle avoidance, to be performed in a single framework. MVs are labeled as stationary or moving according to their consistency with geometric constraints; the extracted planes are thereby separated into moving objects and the stationary scene. Multiple EKFs are used to track the static scene and the moving objects simultaneously. In physical experiments, we show a detection rate of 96.6% for moving objects and a mean absolute localization error below 3.5 meters.
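
    The wingbeat-frequency feature at the heart of the species-recognition part can be illustrated with a standard spectral estimate: given the per-frame vertical position of a salient wing extremity, the dominant FFT peak gives the wingbeat frequency. The sampling rate and preprocessing below are assumptions, not the dissertation's exact pipeline.

        # Hypothetical wingbeat-frequency estimate from a wing-tip time series.
        import numpy as np

        def wingbeat_frequency(y, fps=30.0):
            """y: per-frame vertical position of a wing tip; returns Hz."""
            y = np.asarray(y, dtype=float)
            y = y - y.mean()  # remove the slowly varying body-motion offset
            spectrum = np.abs(np.fft.rfft(y))
            freqs = np.fft.rfftfreq(len(y), d=1.0 / fps)
            spectrum[0] = 0.0  # ignore any residual DC component
            return float(freqs[np.argmax(spectrum)])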