73,486 research outputs found

    Simple Online and Realtime Tracking with a Deep Association Metric

    Full text link
    Simple Online and Realtime Tracking (SORT) is a pragmatic approach to multiple object tracking with a focus on simple, effective algorithms. In this paper, we integrate appearance information to improve the performance of SORT. Due to this extension we are able to track objects through longer periods of occlusions, effectively reducing the number of identity switches. In spirit of the original framework we place much of the computational complexity into an offline pre-training stage where we learn a deep association metric on a large-scale person re-identification dataset. During online application, we establish measurement-to-track associations using nearest neighbor queries in visual appearance space. Experimental evaluation shows that our extensions reduce the number of identity switches by 45%, achieving overall competitive performance at high frame rates.Comment: 5 pages, 1 figur

    A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

    Full text link
    Tracking humans that are interacting with the other subjects or environment remains unsolved in visual tracking, because the visibility of the human of interests in videos is unknown and might vary over time. In particular, it is still difficult for state-of-the-art human trackers to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we consider the visibility status of a subject as a fluent variable, whose change is mostly attributed to the subject's interaction with the surrounding, e.g., crossing behind another object, entering a building, or getting into a vehicle, etc. We introduce a Causal And-Or Graph (C-AOG) to represent the causal-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graph model to jointly reason the visibility fluent change (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search of a feasible causal graph structure that enables fast search algorithm, e.g., dynamic programming method. We apply the proposed method on challenging video sequences to evaluate its capabilities of estimating visibility fluent changes of subjects and tracking subjects of interests over time. Results with comparisons demonstrate that our method outperforms the alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.Comment: accepted by CVPR 201

    MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation

    Full text link
    We address the problem of semi-supervised video object segmentation (VOS), where the masks of objects of interests are given in the first frame of an input video. To deal with challenging cases where objects are occluded or missing, previous work relies on greedy data association strategies that make decisions for each frame individually. In this paper, we propose a novel approach to defer the decision making for a target object in each frame, until a global view can be established with the entire video being taken into consideration. Our approach is in the same spirit as Multiple Hypotheses Tracking (MHT) methods, making several critical adaptations for the VOS problem. We employ the bounding box (bbox) hypothesis for tracking tree formation, and the multiple hypotheses are spawned by propagating the preceding bbox into the detected bbox proposals within a gated region starting from the initial object mask in the first frame. The gated region is determined by a gating scheme which takes into account a more comprehensive motion model rather than the simple Kalman filtering model in traditional MHT. To further design more customized algorithms tailored for VOS, we develop a novel mask propagation score instead of the appearance similarity score that could be brittle due to large deformations. The mask propagation score, together with the motion score, determines the affinity between the hypotheses during tree pruning. Finally, a novel mask merging strategy is employed to handle mask conflicts between objects. Extensive experiments on challenging datasets demonstrate the effectiveness of the proposed method, especially in the case of object missing.Comment: accepted to CVPR 2019 as oral presentatio

    Poisson multi-Bernoulli mixture trackers: continuity through random finite sets of trajectories

    Full text link
    The Poisson multi-Bernoulli mixture (PMBM) is an unlabelled multi-target distribution for which the prediction and update are closed. It has a Poisson birth process, and new Bernoulli components are generated on each new measurement as a part of the Bayesian measurement update. The PMBM filter is similar to the multiple hypothesis tracker (MHT), but seemingly does not provide explicit continuity between time steps. This paper considers a recently developed formulation of the multi-target tracking problem as a random finite set (RFS) of trajectories, and derives two trajectory RFS filters, called PMBM trackers. The PMBM trackers efficiently estimate the set of trajectories, and share hypothesis structure with the PMBM filter. By showing that the prediction and update in the PMBM filter can be viewed as an efficient method for calculating the time marginals of the RFS of trajectories, continuity in the same sense as MHT is established for the PMBM filter

    Presenting GECO : an eyetracking corpus of monolingual and bilingual sentence reading

    Get PDF
    This paper introduces GECO, the Ghent Eye-tracking Corpus, a monolingual and bilingual corpus of eye-tracking data of participants reading a complete novel. English monolinguals and Dutch-English bilinguals read an entire novel, which was presented in paragraphs on the screen. The bilinguals read half of the novel in their first language, and the other half in their second language. In this paper we describe the distributions and descriptive statistics of the most important reading time measures for the two groups of participants. This large eye-tracking corpus is perfectly suited for both exploratory purposes as well as more directed hypothesis testing, and it can guide the formulation of ideas and theories about naturalistic reading processes in a meaningful context. Most importantly, this corpus has the potential to evaluate the generalizability of monolingual and bilingual language theories and models to reading of long texts and narratives

    Online Visual Robot Tracking and Identification using Deep LSTM Networks

    Full text link
    Collaborative robots working on a common task are necessary for many applications. One of the challenges for achieving collaboration in a team of robots is mutual tracking and identification. We present a novel pipeline for online visionbased detection, tracking and identification of robots with a known and identical appearance. Our method runs in realtime on the limited hardware of the observer robot. Unlike previous works addressing robot tracking and identification, we use a data-driven approach based on recurrent neural networks to learn relations between sequential inputs and outputs. We formulate the data association problem as multiple classification problems. A deep LSTM network was trained on a simulated dataset and fine-tuned on small set of real data. Experiments on two challenging datasets, one synthetic and one real, which include long-term occlusions, show promising results.Comment: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, Canada, 2017. IROS RoboCup Best Paper Awar

    Mediating between AI and highly specialized users

    Get PDF
    We report part of the design experience gained in X-Media, a system for knowledge management and sharing. Consolidated techniques of interaction design (scenario-based design) had to be revisited to capture the richness and complexity of intelligent interactive systems. We show that the design of intelligent systems requires methodologies (faceted scenarios) that support the investigation of intelligent features and usability factors simultaneously. Interaction designers become mediators between intelligent technology and users, and have to facilitate reciprocal understanding
    • …
    corecore