    Efficiently Tracking Homogeneous Regions in Multichannel Images

    We present a method for tracking Maximally Stable Homogeneous Regions (MSHR) in images with an arbitrary number of channels. MSHR are conceptually very similar to Maximally Stable Extremal Regions (MSER) and Maximally Stable Color Regions (MSCR), but can also be applied to hyperspectral and color images while remaining extremely efficient. The presented approach makes use of the edge-based component-tree, which can be calculated in linear time. In the tracking step, the MSHR are localized by matching them to nodes in the component-tree. We use rotationally invariant region and gray-value features that can be calculated from first- and second-order moments at low computational complexity. Furthermore, we use a weighted feature vector to improve the data association in the tracking step. The algorithm is evaluated on a collection of different tracking scenes from the literature, and we present two applications: 2D object tracking and the 3D segmentation of organs. (Comment: to be published in the ICPRS 2017 proceedings.)
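    The rotationally invariant features mentioned above can be illustrated with a short sketch. The Python function below is an assumed, minimal version (not the authors' code): it takes a binary mask for one homogeneous region, derives the centroid from the first-order moments, and uses the eigenvalues of the second-order central moment (covariance) matrix, which are invariant to rotation, together with simple gray-value statistics.

        import numpy as np

        def moment_features(mask, gray):
            """Rotation-invariant region descriptors from first- and
            second-order moments. A minimal sketch: `mask` is a boolean array
            selecting one homogeneous region, `gray` holds pixel intensities."""
            ys, xs = np.nonzero(mask)
            area = xs.size
            cx, cy = xs.mean(), ys.mean()        # first-order moments -> centroid
            dx, dy = xs - cx, ys - cy
            mu20, mu02 = np.mean(dx * dx), np.mean(dy * dy)
            mu11 = np.mean(dx * dy)              # second-order central moments
            # Eigenvalues of the 2x2 covariance matrix do not change under rotation.
            tr, det = mu20 + mu02, mu20 * mu02 - mu11 ** 2
            disc = np.sqrt(max(tr * tr / 4.0 - det, 0.0))
            lam1, lam2 = tr / 2.0 + disc, tr / 2.0 - disc
            vals = gray[mask]
            return np.array([area, lam1, lam2, vals.mean(), vals.std()])

    A weighted distance between such feature vectors, in the spirit of the abstract's weighted data-association step, could then score matches between tracked regions and component-tree nodes.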

    Deep reinforcement learning for subpixel neural tracking

    Automatically tracing elongated structures, such as axons and blood vessels, is a challenging problem in biomedical imaging, but one with many downstream applications. Real labelled data is sparse, and existing algorithms either lack robustness across datasets or require significant manual tuning. Here, we instead learn a tracking algorithm in a synthetic environment and apply it to tracing axons. To do so, we formulate tracking as a reinforcement learning problem and apply deep reinforcement learning techniques with a continuous action space to learn how to track at the subpixel level. We train our model on simple synthetic data and test it on mouse cortical two-photon microscopy images. Despite the domain gap, our model approaches the performance of a heavily engineered tracker from a standard analysis suite for neuronal microscopy. We show that fine-tuning on real data improves performance, allowing better transfer when real labelled data is available. Finally, we demonstrate that our model's uncertainty measure, a feature lacking in hand-engineered trackers, corresponds with how well it tracks the structure.
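    As a rough illustration of the continuous-action formulation, the toy environment below (all names and parameters are hypothetical, not taken from the paper) lets an agent emit a clipped 2D displacement at each step, so positions are refined at the subpixel level, and rewards it for staying close to the traced structure.

        import numpy as np

        class AxonTraceEnv:
            """Toy tracing environment in the spirit of the abstract."""

            def __init__(self, image, true_path, patch=15, max_step=2.0):
                self.image = image                # 2D microscopy image
                self.path = true_path             # (N, 2) float subpixel positions
                self.patch, self.max_step = patch, max_step

            def reset(self):
                self.pos = self.path[0].copy()    # start on the structure
                self.t = 0
                return self._observe()

            def step(self, action):
                # Continuous action: a 2D displacement, clipped to a maximum step.
                self.pos = self.pos + np.clip(action, -self.max_step, self.max_step)
                self.t += 1
                target = self.path[min(self.t, len(self.path) - 1)]
                reward = -np.linalg.norm(self.pos - target)   # closer -> higher
                done = self.t >= len(self.path) - 1
                return self._observe(), reward, done

            def _observe(self):
                # Crop a patch around the rounded position; a fuller version
                # would sample bilinearly to preserve subpixel information.
                r, c = np.round(self.pos).astype(int)
                h = self.patch // 2
                return self.image[max(r - h, 0):r + h + 1,
                                  max(c - h, 0):c + h + 1]

    An actor-critic method with a Gaussian policy head would be a natural fit for such a continuous action space.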

    Markerless Motion Capture in the Crowd

    This work uses crowdsourcing to obtain motion capture data from video recordings. The data are obtained from information workers who click repeatedly to indicate body configurations in the frames of a video, resulting in a model of 2D structure over time. We discuss techniques to optimize the tracking task and strategies for maximizing accuracy and efficiency. We show visualizations of a variety of motions captured with our pipeline, then apply reconstruction techniques to derive 3D structure. (Comment: presented at the Collective Intelligence conference, 2012; arXiv:1204.2991.)
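    One concrete step such a pipeline needs is fusing the redundant clicks of several workers into a single 2D joint estimate per frame. A minimal sketch, assuming a coordinate-wise median for robustness to occasional careless clicks (the function is illustrative, not the authors' code):

        import numpy as np

        def aggregate_clicks(clicks):
            """Fuse worker clicks into one 2D position per joint and frame.
            `clicks` has shape (n_workers, n_frames, n_joints, 2); the median
            down-weights outlier clicks better than the mean would."""
            return np.median(clicks, axis=0)      # -> (n_frames, n_joints, 2)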

    Robust object tracking based on weighted subspace reconstruction error with forward-backward tracking criterion

    It is a challenging task to develop an effective and robust object tracking method due to factors such as severe occlusion, background clutter, abrupt motion, and illumination variation. A tracking algorithm based on weighted subspace reconstruction error is proposed. Discriminative weights are defined by minimizing the reconstruction error with a positive dictionary while maximizing the reconstruction error with a negative dictionary. A confidence map for the candidates is then computed through the subspace reconstruction error. Finally, the location of the target object is estimated by maximizing the decision map, which combines the discriminative weights and the subspace reconstruction error. Furthermore, a new evaluation method based on a forward-backward tracking criterion is used to verify the proposed method; it demonstrates robustness in the updating stage and effectiveness in reducing accumulated errors. Experimental results on 12 challenging video sequences show that the proposed algorithm performs favorably against 12 state-of-the-art methods in terms of accuracy and robustness.
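    A minimal sketch of the reconstruction-error machinery, assuming PCA subspaces for the positive (target) and negative (background) dictionaries; the paper's exact construction may differ, and all names below are illustrative:

        import numpy as np

        def fit_subspace(templates, k=16):
            """PCA basis from vectorized templates stacked as columns (d x n)."""
            mean = templates.mean(axis=1, keepdims=True)
            u, _, _ = np.linalg.svd(templates - mean, full_matrices=False)
            return mean.ravel(), u[:, :k]

        def recon_error(x, mean, basis):
            """Squared reconstruction error of candidate x in the subspace."""
            d = x - mean
            proj = basis @ (basis.T @ d)
            return float(np.sum((d - proj) ** 2))

        def confidence(x, pos, neg):
            """High when x is well explained by the positive subspace and
            poorly explained by the negative one, echoing the weighting idea
            described in the abstract."""
            e_pos, e_neg = recon_error(x, *pos), recon_error(x, *neg)
            weight = np.exp(e_neg - e_pos)        # discriminative weight
            return weight * np.exp(-e_pos)        # combined decision score

    The candidate maximizing this score over a sampled search region would give the estimated target location.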

    3D head motion, point-of-regard and encoded gaze fixations in real scenes: next-generation portable video-based monocular eye tracking

    Portable eye trackers allow us to see where a subject is looking when performing a natural task with free head and body movements. These eye trackers include headgear containing a camera directed at one of the subject's eyes (the eye camera) and another camera (the scene camera) positioned above the same eye directed along the subject's line-of-sight. The output video includes the scene video with a crosshair depicting where the subject is looking -- the point-of-regard (POR) -- that is updated for each frame. This video may be the desired final result or it may be further analyzed to obtain more specific information about the subject's visual strategies. A list of the calculated POR positions in the scene video can also be analyzed.

    The goals of this project are to expand the information that we can obtain from a portable video-based monocular eye tracker and to minimize the amount of user interaction required to obtain and analyze this information. This work includes offline processing of both the eye and scene videos to obtain robust 2D PORs in scene video frames, identify gaze fixations from these PORs, obtain 3D head motion, and ray trace fixations through volumes-of-interest (VOIs) to determine what is being fixated, when, and where (the 3D POR).

    To avoid the redundancy of ray tracing a 2D POR in every video frame and to group these POR data meaningfully, a fixation-identification algorithm is employed to simplify the long list of 2D POR data into gaze fixations. In order to ray trace these fixations, the 3D motion -- position and orientation over time -- of the scene camera is computed. This camera motion is determined via an iterative structure and motion recovery algorithm that requires a calibrated camera and knowledge of the 3D location of at least four points in the scene (which can be selected from premeasured VOI vertices). The subject's 3D head motion is obtained directly from this camera motion.

    For the final stage of the algorithm, the 3D locations and dimensions of VOIs in the scene are required. This VOI information in world coordinates is converted to camera coordinates for ray tracing. A representative 2D POR position for each fixation is converted from image coordinates to the same camera coordinate system. Then, a ray is traced from the camera center through this position to determine which (if any) VOI is being fixated and where it is being fixated -- the 3D POR in the world. Results are presented for various real scenes. Novel visualizations of portable eye tracker data created using the results of our algorithm are also presented.
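    The final ray-tracing step can be sketched as follows, assuming for illustration that each VOI is an axis-aligned box already converted to camera coordinates (the abstract specifies the coordinate conversion but not the box orientation). A ray from the camera center through the fixation's position is tested against each box with the standard slab method, and the nearest VOI entered is reported as the fixated volume, with the entry point as the 3D POR.

        import numpy as np

        def fixated_voi(cam_center, por_cam, vois):
            """Return (name, 3D point) of the nearest VOI hit by the gaze ray.
            `vois` is a list of (name, lo, hi) corner pairs in camera
            coordinates; names and layout here are illustrative."""
            d = por_cam - cam_center
            d = d / np.linalg.norm(d)             # gaze ray direction
            hit, best_t = None, np.inf
            for name, lo, hi in vois:
                # Slab test for ray / axis-aligned box intersection.
                with np.errstate(divide="ignore", invalid="ignore"):
                    t1 = (lo - cam_center) / d
                    t2 = (hi - cam_center) / d
                tmin = np.nanmax(np.minimum(t1, t2))
                tmax = np.nanmin(np.maximum(t1, t2))
                if tmax >= max(tmin, 0.0) and tmin < best_t:
                    hit, best_t = name, tmin
            point = cam_center + best_t * d if hit else None
            return hit, point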