3 research outputs found

    A Multi-View Annotation Tool for People Detection Evaluation

    Get PDF
    In this paper we introduce a novel multi-view annotation tool for generating 3D ground truth data of the real loca-tion of people in the scene. The proposed tool allows the user to accurately select the ground occupancy of people by aligning an oriented rectangle on the ground plane. In ad-dition, the height of the people can also be adjusted. In order to achieve precise ground truth data the user is aided by the video frames of multiple synchronized and calibrated cameras. Finally, the 3D annotation data can be easily con-verted to 2D image positions using the available calibration matrices. One key advantage of the proposed technique is that different methods can be compared against each other, whether they estimate the real world ground position of peo-ple or the 2D position on the camera images. Therefore, we defined two different error metrics, which quantitatively evaluate the estimated positions. We used the proposed tool to annotate two publicly available datasets, and evaluated the metrics on two state of the art algorithms

    Energy Minimization for Multiple Object Tracking

    Get PDF
    Multiple target tracking aims at reconstructing trajectories of several moving targets in a dynamic scene, and is of significant relevance for a large number of applications. For example, predicting a pedestrian’s action may be employed to warn an inattentive driver and reduce road accidents; understanding a dynamic environment will facilitate autonomous robot navigation; and analyzing crowded scenes can prevent fatalities in mass panics. The task of multiple target tracking is challenging for various reasons: First of all, visual data is often ambiguous. For example, the objects to be tracked can remain undetected due to low contrast and occlusion. At the same time, background clutter can cause spurious measurements that distract the tracking algorithm. A second challenge arises when multiple measurements appear close to one another. Resolving correspondence ambiguities leads to a combinatorial problem that quickly becomes more complex with every time step. Moreover, a realistic model of multi-target tracking should take physical constraints into account. This is not only important at the level of individual targets but also regarding interactions between them, which adds to the complexity of the problem. In this work the challenges described above are addressed by means of energy minimization. Given a set of object detections, an energy function describing the problem at hand is minimized with the goal of finding a plausible solution for a batch of consecutive frames. Such offline tracking-by-detection approaches have substantially advanced the performance of multi-target tracking. Building on these ideas, this dissertation introduces three novel techniques for multi-target tracking that extend the state of the art as follows: The first approach formulates the energy in discrete space, building on the work of Berclaz et al. (2009). All possible target locations are reduced to a regular lattice and tracking is posed as an integer linear program (ILP), enabling (near) global optimality. Unlike prior work, however, the proposed formulation includes a dynamic model and additional constraints that enable performing non-maxima suppression (NMS) at the level of trajectories. These contributions improve the performance both qualitatively and quantitatively with respect to annotated ground truth. The second technical contribution is a continuous energy function for multiple target tracking that overcomes the limitations imposed by spatial discretization. The continuous formulation is able to capture important aspects of the problem, such as target localization or motion estimation, more accurately. More precisely, the data term as well as all phenomena including mutual exclusion and occlusion, appearance, dynamics and target persistence are modeled by continuous differentiable functions. The resulting non-convex optimization problem is minimized locally by standard conjugate gradient descent in combination with custom discontinuous jumps. The more accurate representation of the problem leads to a powerful and robust multi-target tracking approach, which shows encouraging results on particularly challenging video sequences. Both previous methods concentrate on reconstructing trajectories, while disregarding the target-to-measurement assignment problem. To unify both data association and trajectory estimation into a single optimization framework, a discrete-continuous energy is presented in Part III of this dissertation. Leveraging recent advances in discrete optimization (Delong et al., 2012), it is possible to formulate multi-target tracking as a model-fitting approach, where discrete assignments and continuous trajectory representations are combined into a single objective function. To enable efficient optimization, the energy is minimized locally by alternating between the discrete and the continuous set of variables. The final contribution of this dissertation is an extensive discussion on performance evaluation and comparison of tracking algorithms, which points out important practical issues that ought not be ignored
    corecore