9 research outputs found

    Tracking Multiple Players using a Single Camera

    It has been shown that multi-people tracking can be successfully formulated as a Linear Program that processes the output of multiple fixed, synchronized cameras with overlapping fields of view. In this paper, we extend this approach to the more challenging single-camera case and show that it yields excellent performance, even when the camera moves. We validate our approach on a number of basketball matches and argue that using a properly retrained people detector is key to producing the probabilities of presence that are used as input to the Linear Program.

    Histogram of oriented gradients applied to video-based multiple-person tracking (Histograma de orientación de gradientes aplicado al seguimiento múltiple de personas basado en video)

    Multiple-person tracking in real scenes is an important topic in Computer Vision given its many applications in areas such as surveillance systems, robotics, pedestrian safety, and marketing, as well as the inherent challenges of identifying people in real scenes: the complexity of the scene itself, crowding, and the occlusions within the video that crowding causes. Various techniques address image segmentation, and person identification in particular, from different perspectives; this work aims to develop a proposal based on the Histogram of Oriented Gradients (HOG) for video-based multiple-person tracking. The proposed procedure is broken into the following stages. Video Processing captures the frames that make up the video sequence; the OpenCV library is used for this so that the sequence can be captured from any source. Candidate Classification groups together the description of the object of interest, which in this work is people, and the selection of candidates, using an implementation of the HOG algorithm. Finally, Tracking and Association uses the Kalman filter algorithm to determine the associations between the sequences of previously detected objects. The proposal was applied to three datasets, TownCentre (960x540 px), TownCentre (1920x1080 px), and PETS 2009, obtaining precision of 94.47%, 90.63%, and 97.30%, respectively.
The results obtained in the experiments validate the proposed model, making it a tool that can find many fields of application, as well as being an innovative proposal at the national level within the field of Computer Vision.
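
    The Tracking and Association stage described above relies on a Kalman filter. As a rough illustration only, the sketch below runs a constant-velocity Kalman filter over a sequence of 2D detection centres. The state layout, the noise values `q` and `r`, and the function names are assumptions made for this example; the abstract does not specify the thesis's actual motion model or parameters.

```python
import numpy as np

def make_cv_model(dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity model; state is [x, y, vx, vy] (assumed layout)."""
    F = np.array([[1, 0, dt, 0],
                  [0, 1, 0, dt],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]], dtype=float)   # state transition
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)   # only position is observed
    Q = q * np.eye(4)                           # process noise (illustrative)
    R = r * np.eye(2)                           # measurement noise (illustrative)
    return F, H, Q, R

def kalman_step(x, P, z, F, H, Q, R):
    """One predict + update cycle for a single detection z = [x, y]."""
    # Predict the state and its covariance one frame ahead.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct the prediction with the new detection.
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

def track(detections):
    """Filter a list of (x, y) detections that belong to one person."""
    F, H, Q, R = make_cv_model()
    x = np.array([detections[0][0], detections[0][1], 0.0, 0.0])
    P = 10.0 * np.eye(4)                        # vague initial uncertainty
    for z in detections[1:]:
        x, P = kalman_step(x, P, np.asarray(z, dtype=float), F, H, Q, R)
    return x

# Usage: a person moving 2 px right and 1 px down per frame.
path = [(10.0 + 2.0 * k, 5.0 + 1.0 * k) for k in range(20)]
state = track(path)
```

    In a full pipeline the detections would come from a HOG-based person detector, and an association step would decide which detection feeds which filter; both are left out here.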

    Occlusion reasoning for multiple object visual tracking

    Thesis (Ph.D.)--Boston University. Occlusion reasoning for visual object tracking in uncontrolled environments is a challenging problem. It becomes significantly more difficult when dense groups of indistinguishable objects are present in the scene that cause frequent inter-object interactions and occlusions. We present several practical solutions that tackle the inter-object occlusions for video surveillance applications. In particular, this thesis proposes three methods. First, we propose "reconstruction-tracking," an online multi-camera spatial-temporal data association method for tracking large groups of objects imaged with low resolution. As a variant of the well-known Multiple-Hypothesis-Tracker, our approach localizes the positions of objects in 3D space with possibly occluded observations from multiple camera views and performs temporal data association in 3D. Second, we develop "track linking," a class of offline batch processing algorithms for long-term occlusions, where the decision has to be made based on the observations from the entire tracking sequence. We construct a graph representation to characterize occlusion events and propose an efficient graph-based/combinatorial algorithm to resolve occlusions. Third, we propose a novel Bayesian framework where detection and data association are combined into a single module and solved jointly. Almost all traditional tracking systems address the detection and data association tasks separately in sequential order. Such a design implies that the output of the detector has to be reliable in order to make the data association work. Our framework takes advantage of the often complementary nature of the two subproblems, which not only avoids the error propagation issue from which traditional "detection-tracking approaches" suffer but also eschews common heuristics such as "nonmaximum suppression" of hypotheses by modeling the likelihood of the entire image.
The thesis describes a substantial number of experiments involving challenging, notably distinct simulated and real data, including infrared and visible-light data sets that we recorded ourselves or took from publicly available data sets. In these videos, the number of objects ranges from a dozen to a hundred per frame in both monocular and multiple views. The experiments demonstrate that our approaches achieve results comparable to those of state-of-the-art approaches.

    An analytical formulation of global occlusion reasoning for multi-target tracking

    We present a principled model for occlusion reasoning in complex scenarios with frequent inter-object occlusions, and its application to multi-target tracking. To compute the putative overlap between pairs of targets, we represent each target with a Gaussian. Conveniently, this leads to an analytical form for the relative overlap - another Gaussian - which is combined with a sigmoidal term for modeling depth relations. Our global occlusion model bears several advantages: Global target visibility can be computed efficiently in closed-form, and varying degrees of partial occlusion can be naturally accounted for. Moreover, the dependence of the occlusion on the target locations - i.e. the gradient of the overlap - can also be computed in closed-form, which makes it possible to efficiently include the proposed occlusion model in a continuous energy minimization framework. Experimental results on seven datasets confirm that the proposed formulation consistently reduces missed targets and lost trajectories, especially in challenging scenarios with crowds and severe inter-object occlusions.
    Anton Andriyenko, Stefan Roth, Konrad Schindle
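
    The statement that the overlap of two Gaussian targets is "another Gaussian" follows from a standard identity for the integral of a product of two normal densities. The symbols below (target means \mu_i, \mu_j and covariances \Sigma_i, \Sigma_j) are notation assumed for this note, and the paper's own normalisation of the overlap may differ:

```latex
\int \mathcal{N}(x;\,\mu_i,\Sigma_i)\,\mathcal{N}(x;\,\mu_j,\Sigma_j)\,dx
  \;=\; \mathcal{N}(\mu_i;\,\mu_j,\,\Sigma_i+\Sigma_j)
  \;=\; \frac{\exp\!\left(-\tfrac{1}{2}(\mu_i-\mu_j)^{\top}(\Sigma_i+\Sigma_j)^{-1}(\mu_i-\mu_j)\right)}
             {\sqrt{\det\!\big(2\pi(\Sigma_i+\Sigma_j)\big)}}
```

    Because the right-hand side is a smooth function of \mu_i and \mu_j, its gradient with respect to the target locations is also available in closed form, which is what a gradient-based energy minimisation needs.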

    Localisation and tracking of people using distributed UWB sensors

    Indoor localisation and tracking of people in a non-cooperative manner is important in many surveillance and rescue applications. Ultra wideband (UWB) radar technology is promising for through-wall detection of objects at short to medium distances due to its high temporal resolution and penetration capability. This thesis tackles the problem of localising people in indoor scenarios using UWB sensors. It follows the process from measurement acquisition, multiple target detection and range estimation to multiple target localisation and tracking. Because people reflect weakly compared to the rest of the environment, a background subtraction method is initially used for the detection of people. Subsequently, a constant false alarm rate method is applied for detection and range estimation of multiple persons.
For multiple target localisation using a single UWB sensor, an association method is developed to assign target range estimates to the correct targets. In the presence of multiple targets, targets closer to the sensor can induce shadowing over the environment, hindering the detection of other targets. A concept for a distributed UWB sensor network is presented that aims to extend the field of view of the system by using several sensors with different fields of view. A real-time operational prototype has been developed, taking into consideration sensor cooperation and synchronisation aspects as well as fusion of the information provided by all sensors. Sensor data may be erroneous due to sensor bias and time offset. Incorrect measurements and measurement noise influence the accuracy of the estimation results. Additional insight into the target states can be gained by exploiting temporal information. A multiple person tracking framework is developed based on the probability hypothesis density filter, and the differences in system performance are highlighted with respect to the information provided by the sensors, i.e. location information fusion versus range information fusion. The information that a target should have been detected, when it was not because of shadowing induced by other targets, is described as the dynamic occlusion probability. The dynamic occlusion probability is incorporated into the tracking framework, allowing fewer sensors to be used while improving the tracker performance in the scenario. The method selection and development has taken into consideration real-time application requirements for unknown scenarios at every step. Each investigated aspect of multiple person localisation within the scope of this thesis has been verified using simulations and measurements in a realistic environment using M-sequence UWB sensors.
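
    The abstract mentions a constant false alarm rate (CFAR) method without saying which variant is used. As one common possibility, the sketch below implements cell-averaging CFAR over a 1D range profile. The window sizes, the threshold `scale`, and the function name are assumptions chosen for illustration, not values taken from the thesis.

```python
import numpy as np

def ca_cfar(power, num_train=8, num_guard=2, scale=5.0):
    """Cell-averaging CFAR: flag cells whose power exceeds `scale` times
    the mean of the surrounding training cells (guard cells excluded)."""
    power = np.asarray(power, dtype=float)
    n = len(power)
    hits = []
    for i in range(n):
        # Training cells on each side, skipping the guard cells next to i.
        left = power[max(0, i - num_guard - num_train):max(0, i - num_guard)]
        right = power[min(n, i + num_guard + 1):min(n, i + num_guard + num_train + 1)]
        train = np.concatenate([left, right])
        if train.size == 0:
            continue  # not enough context to estimate the noise level
        if power[i] > scale * train.mean():
            hits.append(i)
    return hits

# Usage: flat noise floor with two strong echoes at range bins 30 and 70.
profile = np.ones(100)
profile[30] = 50.0
profile[70] = 40.0
detections = ca_cfar(profile)
```

    The threshold adapts to the local noise estimate, so the false alarm rate stays roughly constant when the background level varies along the range axis. In the pipeline described above this step would follow background subtraction.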

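    Online multi-camera multi-object tracking using dynamic formulation of the maximum weighted clique problem (최대 가중 클릭 문제의 동적 생성법을 이용한 온라인 다중 카메라 다중 물체 추적 기법)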

    Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, August 2016. 최진영. In this dissertation, we propose an online, real-time algorithm for tracking multiple targets with multiple cameras that have overlapping fields of view. Because of its applicability, multiple target tracking with visual sensors has been studied intensively in recent decades. In particular, algorithms using multiple overlapping cameras have been proposed to overcome the occlusion and missed-detection problems that cannot be resolved by a single camera. Since the multiple camera multiple target tracking (MCMTT) problem is more complicated than the single camera multiple target tracking (SCMTT) problem, most MCMTT algorithms are based on a batch process that considers a whole sequence at a time. Although batch-based algorithms have achieved robust performance, their usability is limited because many practical applications need an instantaneous result. The objective of this dissertation is to develop an online MCMTT algorithm whose tracking performance is comparable to that of the batch-based algorithms but which requires only a small amount of computation. The proposed algorithm generates track hypotheses (simply called 'tracks') from all possible data associations between object detections from multiple cameras across frames. Then, it picks the set of tracks that best describes the tracking of targets. To identify a good track, the quality of each track is measured by our score function. The tracking solution is then the set of tracks that has the maximum total score. To obtain the solution, we formulate the problem of finding that track set as the maximum weighted clique problem (MWCP), which is one of the widely adopted formulations for combinatorial problems with pairwise compatibility relationships among the variables. MWCP is a well-known NP-complete problem, and its worst-case computation time grows exponentially with the number of tracks.
Thus, solving MWCP is intractable because the number of candidate tracks increases exponentially as tracking progresses. To alleviate the huge computational load, we propose an online scheme that dynamically formulates multiple MWCPs with small subsets of candidate tracks in every frame. The scheme is motivated by the observation that the tracking solutions from consecutive frames are very similar, because the status of each target does not change abruptly from one frame to the next. If we assume that a specific track set is the actual solution of the previous frame, only a small number of tracks can become solution tracks of the current frame. Thus, we can narrow down the candidate track set using the previous solution. However, propagating only the best solution of each frame can cause irreducible error when a wrong track set is chosen as the solution because of tracking ambiguity. To hedge against this error, we find multiple good solutions at each frame and propagate the K-best solutions among them to the next frame instead of a single solution. When the candidate tracks are updated and generated with newly obtained detections at the next frame, we generate multiple subsets of the entire candidate track set using the K-best previous solutions. Each subset consists of candidate solution tracks with respect to one of the previous solutions, and a small MWCP is formulated from the subset. Then, our algorithm finds multiple solutions from each MWCP and repeats the above procedure until tracking terminates. Even though the proposed algorithm solves multiple MWCPs, it has lower computational complexity than solving a single MWCP over the entire candidate track set, because the overall computational load is mainly determined by the size of the largest MWCP. Moreover, when an instantaneous result is demanded, our algorithm finds a better solution than solving a single large MWCP because it finds more diverse solutions under a limited solving time.
Although our dynamic formulation substantially reduces the overall computational complexity, it is still challenging to achieve real-time operation of the tracking system. Thus, we apply three more strategies to reduce the computation time. First, we generate tracklets, robust fragments of a target's trajectory, at each camera and generate candidate tracks from those tracklets instead of from detections. This prevents the generation of many absurd tracks. Second, we adopt a heuristic algorithm called breakout local search (BLS) to solve each MWCP. With BLS, multiple suboptimal solutions can be found efficiently within a short time. Last, we prune the candidate tracks using a probability that is calculated from the K-best solutions. The probability represents the quality of each track with respect to the overall tracking situation rather than the individual track alone, so it gives a sound basis for pruning candidate tracks. In experiments with a public benchmark dataset, our algorithm shows performance comparable to the state-of-the-art batch-based MCMTT algorithms. Moreover, our algorithm shows real-time capability by achieving satisfactory performance within a reasonable computation time. We also conduct a self-comparison to verify our dynamic MWCP formulation with respect to tracking performance and solving time.
When a sufficient number of solutions are propagated, our algorithm performs better and takes less time than solving a single MWCP over the entire candidate track set.
    Contents:
    Chapter 1 Introduction: 1.1 Background; 1.2 Related Works (1.2.1 Reconstruction-and-tracking methods; 1.2.2 Tracking-and-reconstruction methods; 1.2.3 Unified frameworks); 1.3 Contents of the Research; 1.4 Thesis Organization
    Chapter 2 Preliminaries: 2.1 Bayesian Tracking (2.1.1 Recursive Bayesian Tracking; 2.1.2 Bayesian Tracking for Multiple Targets; 2.1.3 Multiple Hypothesis Tracking (MHT)); 2.2 Maximum Weighted Clique Problem (MWCP) (2.2.1 Clique Problems; 2.2.2 Solving MWCP); 2.3 Breakout Local Search (BLS) (2.3.1 Solution exploration; 2.3.2 Perturbation Strategies; 2.3.3 Initial Solution and Termination Condition)
    Chapter 3 Proposed Approach: 3.1 Problem Statements; 3.2 Tracklet Generation (3.2.1 Detection-to-tracklet Matching; 3.2.2 Matching Score with Motion Estimation; 3.2.3 Matching Validation); 3.3 Track Hypothesis (3.3.1 Tracklet Association; 3.3.2 Online Generation of Association Sets; 3.3.3 Track Generation; 3.3.4 Track Score); 3.4 Global Hypothesis (3.4.1 MWCP for MCMTT; 3.4.2 BLS for MCMTT); 3.5 Pruning (3.5.1 Approximated Global Track Probability; 3.5.2 Track Pruning Scheme)
    Chapter 4 Experiments: 4.1 Comparison with the State-of-the-art Methods; 4.2 Influence of Parameters; 4.3 Score Function Analysis; 4.4 Solving Scheme Analysis; 4.5 Qualitative Results
    Chapter 5 Concluding Remarks: 5.1 Conclusions; 5.2 Future Works
    Abstract in Korean (초록)
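
    To make the MWCP formulation concrete: each candidate track is a node weighted by its score, two nodes are joined when the tracks are compatible (for example, they do not claim the same detection), and the tracking solution is the clique with the largest total weight. The sketch below solves a tiny instance by exhaustive branching. It is an illustration only and is not the breakout local search used in the dissertation; the track names, scores, and compatibility pairs are made up for the example.

```python
def max_weight_clique(weights, compatible):
    """Exhaustive maximum weighted clique for small graphs.

    weights:    dict mapping node -> score
    compatible: set of frozenset({u, v}) pairs that may appear together
    Returns (set_of_nodes, total_score).
    """
    nodes = sorted(weights)
    best = {"set": (), "score": 0.0}

    def extend(clique, candidates, score):
        # Every clique reached here is valid; keep the heaviest one.
        if score > best["score"]:
            best["set"], best["score"] = tuple(clique), score
        for idx, v in enumerate(candidates):
            # Only nodes compatible with v (and, by construction, with the
            # whole current clique) remain candidates.
            rest = [u for u in candidates[idx + 1:]
                    if frozenset((u, v)) in compatible]
            extend(clique + [v], rest, score + weights[v])

    extend([], nodes, 0.0)
    return set(best["set"]), best["score"]

# Usage: four hypothetical candidate tracks with illustrative scores.
scores = {"A": 3.0, "B": 2.0, "C": 2.5, "D": 1.0}
pairs = {frozenset(p) for p in [("A", "C"), ("A", "D"), ("C", "D"), ("B", "D")]}
solution, total = max_weight_clique(scores, pairs)
```

    Exhaustive search is exponential in the number of tracks, which is the cost that motivates both the small per-frame subproblems and the heuristic solver described in the abstract.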

    Energy Minimization for Multiple Object Tracking

    Multiple target tracking aims at reconstructing trajectories of several moving targets in a dynamic scene, and is of significant relevance for a large number of applications. For example, predicting a pedestrian’s action may be employed to warn an inattentive driver and reduce road accidents; understanding a dynamic environment will facilitate autonomous robot navigation; and analyzing crowded scenes can prevent fatalities in mass panics. The task of multiple target tracking is challenging for various reasons: First of all, visual data is often ambiguous. For example, the objects to be tracked can remain undetected due to low contrast and occlusion. At the same time, background clutter can cause spurious measurements that distract the tracking algorithm. A second challenge arises when multiple measurements appear close to one another. Resolving correspondence ambiguities leads to a combinatorial problem that quickly becomes more complex with every time step. Moreover, a realistic model of multi-target tracking should take physical constraints into account. This is not only important at the level of individual targets but also regarding interactions between them, which adds to the complexity of the problem. In this work the challenges described above are addressed by means of energy minimization. Given a set of object detections, an energy function describing the problem at hand is minimized with the goal of finding a plausible solution for a batch of consecutive frames. Such offline tracking-by-detection approaches have substantially advanced the performance of multi-target tracking. Building on these ideas, this dissertation introduces three novel techniques for multi-target tracking that extend the state of the art as follows: The first approach formulates the energy in discrete space, building on the work of Berclaz et al. (2009). 
All possible target locations are reduced to a regular lattice and tracking is posed as an integer linear program (ILP), enabling (near) global optimality. Unlike prior work, however, the proposed formulation includes a dynamic model and additional constraints that enable performing non-maxima suppression (NMS) at the level of trajectories. These contributions improve the performance both qualitatively and quantitatively with respect to annotated ground truth. The second technical contribution is a continuous energy function for multiple target tracking that overcomes the limitations imposed by spatial discretization. The continuous formulation is able to capture important aspects of the problem, such as target localization or motion estimation, more accurately. More precisely, the data term as well as all phenomena including mutual exclusion and occlusion, appearance, dynamics and target persistence are modeled by continuous differentiable functions. The resulting non-convex optimization problem is minimized locally by standard conjugate gradient descent in combination with custom discontinuous jumps. The more accurate representation of the problem leads to a powerful and robust multi-target tracking approach, which shows encouraging results on particularly challenging video sequences. Both previous methods concentrate on reconstructing trajectories, while disregarding the target-to-measurement assignment problem. To unify both data association and trajectory estimation into a single optimization framework, a discrete-continuous energy is presented in Part III of this dissertation. Leveraging recent advances in discrete optimization (Delong et al., 2012), it is possible to formulate multi-target tracking as a model-fitting approach, where discrete assignments and continuous trajectory representations are combined into a single objective function. 
To enable efficient optimization, the energy is minimized locally by alternating between the discrete and the continuous set of variables. The final contribution of this dissertation is an extensive discussion on performance evaluation and comparison of tracking algorithms, which points out important practical issues that ought not to be ignored.

    Tracking Interacting Objects in Image Sequences

    Object tracking in image sequences is a key challenge in computer vision. Its goal is to follow objects that move or evolve over time while preserving the identity of each object. However, most existing approaches focus on one class of objects and model only very simple interactions, such as the fact that different objects do not occupy the same spatial location at a given time instance. They ignore that objects may interact in more complex ways. For example, in a parking lot, a person may get in a car and become invisible in the scene. In this thesis, we focus on tracking interacting objects in image sequences. We show that by exploiting the relationship between different objects, we can achieve more reliable tracking results. We explore a wide range of applications, such as tracking players and the ball in team sports, tracking cars and people in a parking lot and tracking dividing cells in biomedical imagery. We start by tracking the ball in team sports, which is a very challenging task because the ball is often occluded by the players. We propose a sequential approach that tracks the players first, and then tracks the ball by deciding which player, if any, is in possession of the ball at any given time. This is very different from standard approaches that first attempt to track the ball and only then to assign possession. We show that our method substantially increases performance when applied to long basketball and soccer sequences. We then focus on simultaneously tracking interacting objects. We achieve this by formulating the tracking problem as a network-flow Mixed Integer Program, and expressing the fact that one object can appear or disappear at locations of another in terms of linear flow constraints. We demonstrate our method on scenes involving cars and passengers, bags being carried and dropped by people, and balls being passed from one player to the next in team sports. 
In particular, we show that by estimating jointly and globally the trajectories of different types of objects, the presence of the ones which were not initially detected based solely on image evidence can be inferred from the detections of the others. We finally extend our approach to dividing cells in biomedical imagery. In this case, cells interact by overlapping with each other and giving birth to daughter cells. We propose a novel approach to automatically detecting and tracking cell populations in time-lapse images. Unlike earlier approaches that rely on linking a predetermined and potentially incomplete set of detections, we generate an overcomplete set of competing detection hypotheses. We then perform detection and tracking simultaneously by solving an integer program to find the optimal and consistent subset. This eliminates the need for heuristics to handle missed detections due to occlusions and complex morphology. We demonstrate the effectiveness of our approach on a range of challenging image sequences consisting of clumped cells and show that it outperforms the state-of-the-art techniques

    From motion capture to interactive virtual worlds : towards unconstrained motion-capture algorithms for real-time performance-driven character animation

    This dissertation takes performance-driven character animation as a representative application and advances motion capture algorithms and animation methods to meet its high demands. Existing approaches have either coarse resolution and restricted capture volume, require expensive and complex multi-camera systems, or use intrusive suits and controllers. For motion capture, set-up time is reduced using fewer cameras, accuracy is increased despite occlusions and general environments, initialization is automated, and free roaming is enabled by egocentric cameras. For animation, increased robustness enables the use of low-cost sensor input, custom control gesture definition is guided to support novice users, and animation expressiveness is increased. The important contributions are: 1) an analytic and differentiable visibility model for pose optimization under strong occlusions, 2) a volumetric contour model for automatic actor initialization in general scenes, 3) a method to annotate and augment image-pose databases automatically, 4) the utilization of unlabeled examples for character control, and 5) the generalization and disambiguation of cyclical gestures for faithful character animation. In summary, the whole process of human motion capture, processing, and application to animation is advanced. These advances on the state of the art have the potential to improve many interactive applications, within and outside virtual reality.