
    Target following camera system based on real-time recognition and tracking

    A real-time moving-target-following camera system is presented in this study. The motion of the camera is controlled based on real-time recognition and tracking of the target object. A Scale Invariant Feature Transform (SIFT) based recognition system and a Kanade-Lucas-Tomasi (KLT) tracker based tracking system are presented to recognize and track the moving target. The SIFT algorithm is slow but effective at recognizing objects even when they have undergone affine transformations. The KLT tracker is simple and computationally cheap, which improves tracking performance. The analysis is performed on hardware consisting of a camera mounted on a two-servo-motor setup, one servo for pan and the other for tilt, and an Arduino board that drives the two servo motors. Because the system runs on this hardware, a computationally simplified technique is employed: since both SIFT and the KLT tracker are feature based, the features extracted by SIFT are passed to the KLT tracker to simplify the process. The recognition and tracking tasks are performed on a PC, and the corresponding PWM signals are generated and sent to the servo motors through the Arduino. The proposed algorithm can continue tracking an object even when it disappears from view for a short while.
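    The SIFT-to-KLT hand-off the abstract describes can be sketched with OpenCV: detect SIFT keypoints once, then track their locations frame to frame with pyramidal Lucas-Kanade. This is a minimal sketch, not the paper's implementation; the camera index is an assumption, and the servo/Arduino side is reduced to a comment.

```python
import cv2
import numpy as np

cap = cv2.VideoCapture(0)   # camera index is an assumption
sift = cv2.SIFT_create()

ok, frame = cap.read()
prev_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# Run SIFT once on the (slow) recognition side and hand its keypoint
# locations to the (fast) KLT tracker as initial track points.
keypoints = sift.detect(prev_gray, None)
points = np.float32([kp.pt for kp in keypoints]).reshape(-1, 1, 2)

while True:
    ok, frame = cap.read()
    if not ok or len(points) == 0:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Pyramidal Lucas-Kanade tracks the SIFT locations frame to frame.
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, points, None)
    points = new_pts[status.flatten() == 1].reshape(-1, 1, 2)

    if len(points) > 0:
        # The centroid of the tracked points approximates the target; its
        # offset from the frame centre would be converted into pan/tilt PWM
        # commands and sent to the Arduino over serial (omitted here).
        cx, cy = points.reshape(-1, 2).mean(axis=0)

    prev_gray = gray
```

    A real system would re-run SIFT whenever too few points survive, which is how the tracker can recover a target after it leaves the frame.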

    Flow-Guided Feature Aggregation for Video Object Detection

    Extending state-of-the-art object detectors from image to video is challenging. Detection accuracy suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, and rare poses. Existing work attempts to exploit temporal information at the box level, but such methods are not trained end-to-end. We present flow-guided feature aggregation, an accurate and end-to-end learning framework for video object detection. It instead leverages temporal coherence at the feature level: it improves the per-frame features by aggregating nearby features along motion paths, and thus improves video recognition accuracy. Our method significantly improves upon strong single-frame baselines on ImageNet VID, especially for the more challenging fast-moving objects. Our framework is principled and on par with the best engineered systems that won the ImageNet VID challenge 2016, without additional bells and whistles. The proposed method, together with Deep Feature Flow, powered the winning entry of the ImageNet VID challenge 2017. The code is available at https://github.com/msracver/Flow-Guided-Feature-Aggregation
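    The core aggregation step can be sketched as follows: warp the feature maps of nearby frames to the reference frame along optical flow, then average them with per-pixel weights. This is a simplified sketch, not the paper's model; the paper learns both the flow network and adaptive weights end-to-end, whereas here the flow tensors are assumed given and cosine similarity stands in for the learned weights.

```python
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Bilinearly warp a feature map (N,C,H,W) toward the reference frame
    using a pixel-displacement flow field (N,2,H,W)."""
    n, _, h, w = feat.shape
    ys, xs = torch.meshgrid(torch.arange(h, dtype=torch.float32),
                            torch.arange(w, dtype=torch.float32),
                            indexing="ij")
    # Absolute sampling coordinates, normalized to [-1, 1] for grid_sample.
    gx = 2.0 * (xs + flow[:, 0]) / (w - 1) - 1.0
    gy = 2.0 * (ys + flow[:, 1]) / (h - 1) - 1.0
    return F.grid_sample(feat, torch.stack((gx, gy), dim=-1),
                         align_corners=True)

def aggregate(ref_feat, nearby_feats, flows):
    """Aggregate nearby-frame features warped to the reference frame,
    weighted per pixel by similarity to the reference feature (a stand-in
    for the paper's learned adaptive weights)."""
    warped = [warp(f, fl) for f, fl in zip(nearby_feats, flows)]
    sims = torch.stack([F.cosine_similarity(ref_feat, w, dim=1).unsqueeze(1)
                        for w in warped])        # (K, N, 1, H, W)
    weights = torch.softmax(sims, dim=0)         # normalize over the K frames
    return (weights * torch.stack(warped)).sum(dim=0)  # (N, C, H, W)
```

    The aggregated feature map then replaces the single-frame feature map as input to the detection head, which is why degraded frames (blur, defocus) can borrow evidence from sharper neighbors.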

    A Unified Framework for Mutual Improvement of SLAM and Semantic Segmentation

    This paper presents a novel framework for simultaneously performing localization and segmentation, two of the most important vision-based tasks in robotics. While their goals and techniques were previously considered distinct, we show that by making use of the intermediate results of the two modules, their performance can be enhanced at the same time. Our framework handles both the instantaneous motion and the long-term changes of instances during localization with the help of the segmentation result, which in turn benefits from the refined 3D pose information. We conduct experiments on various datasets and show that our framework effectively improves the precision and robustness of both tasks, outperforming existing localization and segmentation algorithms. Comment: 7 pages, 5 figures. This work has been accepted by ICRA 2019. The demo video can be found at https://youtu.be/Bkt53dAehj
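    One direction of such a coupling can be illustrated in a few lines: use the segmentation mask to discard features that fall on potentially dynamic instances before pose estimation. This is a hypothetical sketch, not the paper's pipeline; the use of ORB, the class IDs, and the mask source are all assumptions, and the reverse direction (refined 3D poses improving segmentation) is omitted.

```python
import cv2
import numpy as np

# Hypothetical label IDs for dynamic classes (e.g. person, car, bicycle).
DYNAMIC_CLASSES = {11, 12, 13}

def static_features(gray, seg_mask):
    """Detect ORB features and keep only those lying on static-class pixels,
    so that moving instances do not corrupt the pose estimate."""
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints, descriptors = orb.detectAndCompute(gray, None)
    keep = [i for i, kp in enumerate(keypoints)
            if seg_mask[int(kp.pt[1]), int(kp.pt[0])] not in DYNAMIC_CLASSES]
    return [keypoints[i] for i in keep], descriptors[keep]
```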