33,649 research outputs found

    Moving Object Detection and Segmentation for Remote Aerial Video Surveillance

    Get PDF
    Unmanned Aerial Vehicles (UAVs) equipped with video cameras are a flexible support to ensure civil and military safety and security. In this thesis, a video processing chain is presented for moving object detection in aerial video surveillance. A Track-Before-Detect (TBD) algorithm is applied to detect motion that is independent of the camera motion. Novel robust and fast object detection and segmentation approaches improve the baseline TBD and outperform current state-of-the-art methods

    Video Processing with Additional Information

    Get PDF
    Cameras are frequently deployed along with many additional sensors in aerial and ground-based platforms. Many video datasets have metadata containing measurements from inertial sensors, GPS units, etc. Hence the development of better video processing algorithms using additional information attains special significance. We first describe an intensity-based algorithm for stabilizing low resolution and low quality aerial videos. The primary contribution is the idea of minimizing the discrepancy in the intensity of selected pixels between two images. This is an application of inverse compositional alignment for registering images of low resolution and low quality, for which minimizing the intensity difference over salient pixels with high gradients results in faster and better convergence than when using all the pixels. Secondly, we describe a feature-based method for stabilization of aerial videos and segmentation of small moving objects. We use the coherency of background motion to jointly track features through the sequence. This enables accurate tracking of large numbers of features in the presence of repetitive texture, lack of well conditioned feature windows etc. We incorporate the segmentation problem within the joint feature tracking framework and propose the first combined joint-tracking and segmentation algorithm. The proposed approach enables highly accurate tracking, and segmentation of feature tracks that is used in a MAP-MRF framework for obtaining dense pixelwise labeling of the scene. We demonstrate competitive moving object detection in challenging video sequences of the VIVID dataset containing moving vehicles and humans that are small enough to cause background subtraction approaches to fail. Structure from Motion (SfM) has matured to a stage, where the emphasis is on developing fast, scalable and robust algorithms for large reconstruction problems. The availability of additional sensors such as inertial units and GPS along with video cameras motivate the development of SfM algorithms that leverage these additional measurements. In the third part, we study the benefits of the availability of a specific form of additional information - the vertical direction (gravity) and the height of the camera both of which can be conveniently measured using inertial sensors, and a monocular video sequence for 3D urban modeling. We show that in the presence of this information, the SfM equations can be rewritten in a bilinear form. This allows us to derive a fast, robust, and scalable SfM algorithm for large scale applications. The proposed SfM algorithm is experimentally demonstrated to have favorable properties compared to the sparse bundle adjustment algorithm. We provide experimental evidence indicating that the proposed algorithm converges in many cases to solutions with lower error than state-of-art implementations of bundle adjustment. We also demonstrate that for the case of large reconstruction problems, the proposed algorithm takes lesser time to reach its solution compared to bundle adjustment. We also present SfM results using our algorithm on the Google StreetView research dataset, and several other datasets

    A fast and robust hand-driven 3D mouse

    Get PDF
    The development of new interaction paradigms requires a natural interaction. This means that people should be able to interact with technology with the same models used to interact with everyday real life, that is through gestures, expressions, voice. Following this idea, in this paper we propose a non intrusive vision based tracking system able to capture hand motion and simple hand gestures. The proposed device allows to use the hand as a "natural" 3D mouse, where the forefinger tip or the palm centre are used to identify a 3D marker and the hand gesture can be used to simulate the mouse buttons. The approach is based on a monoscopic tracking algorithm which is computationally fast and robust against noise and cluttered backgrounds. Two image streams are processed in parallel exploiting multi-core architectures, and their results are combined to obtain a constrained stereoscopic problem. The system has been implemented and thoroughly tested in an experimental environment where the 3D hand mouse has been used to interact with objects in a virtual reality application. We also provide results about the performances of the tracker, which demonstrate precision and robustness of the proposed syste

    Independent Motion Detection with Event-driven Cameras

    Full text link
    Unlike standard cameras that send intensity images at a constant frame rate, event-driven cameras asynchronously report pixel-level brightness changes, offering low latency and high temporal resolution (both in the order of micro-seconds). As such, they have great potential for fast and low power vision algorithms for robots. Visual tracking, for example, is easily achieved even for very fast stimuli, as only moving objects cause brightness changes. However, cameras mounted on a moving robot are typically non-stationary and the same tracking problem becomes confounded by background clutter events due to the robot ego-motion. In this paper, we propose a method for segmenting the motion of an independently moving object for event-driven cameras. Our method detects and tracks corners in the event stream and learns the statistics of their motion as a function of the robot's joint velocities when no independently moving objects are present. During robot operation, independently moving objects are identified by discrepancies between the predicted corner velocities from ego-motion and the measured corner velocities. We validate the algorithm on data collected from the neuromorphic iCub robot. We achieve a precision of ~ 90 % and show that the method is robust to changes in speed of both the head and the target.Comment: 7 pages, 6 figure

    Enabling Depth-driven Visual Attention on the iCub Humanoid Robot: Instructions for Use and New Perspectives

    Get PDF
    The importance of depth perception in the interactions that humans have within their nearby space is a well established fact. Consequently, it is also well known that the possibility of exploiting good stereo information would ease and, in many cases, enable, a large variety of attentional and interactive behaviors on humanoid robotic platforms. However, the difficulty of computing real-time and robust binocular disparity maps from moving stereo cameras often prevents from relying on this kind of cue to visually guide robots' attention and actions in real-world scenarios. The contribution of this paper is two-fold: first, we show that the Efficient Large-scale Stereo Matching algorithm (ELAS) by A. Geiger et al. 2010 for computation of the disparity map is well suited to be used on a humanoid robotic platform as the iCub robot; second, we show how, provided with a fast and reliable stereo system, implementing relatively challenging visual behaviors in natural settings can require much less effort. As a case of study we consider the common situation where the robot is asked to focus the attention on one object close in the scene, showing how a simple but effective disparity-based segmentation solves the problem in this case. Indeed this example paves the way to a variety of other similar applications
    • 

    corecore