So you think you can track?
This work introduces a multi-camera tracking dataset consisting of 234 hours
of video data recorded concurrently from 234 overlapping HD cameras covering a
4.2 mile stretch of 8-10 lane interstate highway near Nashville, TN. The video
is recorded during a period of high traffic density with 500+ objects typically
visible within the scene and typical object longevities of 3-15 minutes. GPS
trajectories from 270 vehicle passes through the scene are manually corrected
in the video data to provide a set of ground-truth trajectories for
recall-oriented tracking metrics, and object detections are provided for each
camera in the scene (159 million total before cross-camera fusion). Initial
benchmarking of tracking-by-detection algorithms is performed against the GPS
trajectories, and a best HOTA of only 9.5% is obtained (best recall 75.9% at
IOU 0.1, 47.9 average IDs per ground truth object), indicating the benchmarked
trackers do not perform sufficiently well at the long temporal and spatial
durations required for traffic scene understanding.
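To make the "recall at IOU 0.1" figure concrete, the following is a minimal sketch of an IoU-thresholded recall computation for a single frame. It is not the benchmark's evaluation code: the function names `iou` and `recall_at_iou` are hypothetical, boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples, and the matching is a simple "any overlapping prediction" check rather than the one-to-one assignment a full tracking metric would use.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(gt_boxes, pred_boxes, threshold=0.1):
    """Fraction of ground-truth boxes covered by at least one prediction
    whose IoU with that box is >= threshold (simplified, not one-to-one)."""
    if not gt_boxes:
        return 0.0
    hits = sum(
        1 for g in gt_boxes
        if any(iou(g, p) >= threshold for p in pred_boxes)
    )
    return hits / len(gt_boxes)


# Illustrative boxes only (not taken from the dataset).
gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
pred = [(5, 0, 15, 10)]
loose = recall_at_iou(gt, pred, threshold=0.1)   # first box matches (IoU = 1/3)
strict = recall_at_iou(gt, pred, threshold=0.5)  # no box matches
```

A low threshold such as 0.1 counts loosely localized detections as hits, which is why recall can be high while identity-aware scores such as HOTA stay low.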
Exploitation of Geographic Information Systems for Vehicular Destination Prediction
Much of the recent success in the Iraqi theater has been achieved with the aid of technology so advanced that celebrated journalist Bob Woodward recently compared it to the Manhattan Project of WWII. Intelligence, Surveillance, and Reconnaissance (ISR) platforms have emerged as the rising star of Air Force operational capabilities, as they are enablers in the quest to track and disrupt terrorist and insurgent forces. This thesis argues that ISR systems have been severely under-exploited. The proposals herein seek to improve the machine-human interface of current ISR systems so that predictive battle-space awareness may be achieved, leading to shorter kill-chains and better utilization of high-demand assets. This thesis shows that, if a vehicle is being tracked by an ISR platform, it is possible to predict where it might go within a Time Horizon. This predictive knowledge is represented graphically to enable quick decision-making. This is accomplished by using Geo-Spatial Information Systems (GIS) data obtained from municipal, commercial, or other ISR sources (e.g., hyperspectral) to model an urban grid. The approach then employs graph-theoretic search algorithms that prune the future state-space of that vehicle's environment, resulting in an envelope that constricts around all possible destinations. This thesis demonstrates an 81% success rate for predictions carried out during experimentation. It further demonstrates a 97% improvement over predictions made solely with models based on vehicular motion. This thesis reveals that the predictive envelopes show immense promise in improving ISR asset management, offering more intelligent interdiction of targets, and enabling ground sensor-cueing. Moreover, these predictive capabilities allow an operator to assign assets to make precise perturbations on the battle-space for true event-shaping.
Finally, this thesis shows that the proposed methodologies are easily and cost-effectively deployed over existing Air Force architectures using the Software as a Service business model.
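The idea of pruning a road graph to the destinations reachable within a Time Horizon can be illustrated with a bounded shortest-path search. This is only a sketch under stated assumptions, not the thesis's algorithm: the road network is assumed to be an adjacency dictionary with travel times in seconds, and the function name `reachable_within` and the toy graph are hypothetical.

```python
import heapq


def reachable_within(graph, start, horizon):
    """Dijkstra-style search that discards any path whose travel time
    exceeds `horizon`. Returns {node: earliest arrival time}."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        t, node = heapq.heappop(heap)
        if t > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in graph.get(node, []):
            nt = t + cost
            # Prune: ignore successors that cannot be reached in time.
            if nt <= horizon and nt < best.get(nxt, float("inf")):
                best[nxt] = nt
                heapq.heappush(heap, (nt, nxt))
    return best


# Hypothetical intersections with travel times in seconds.
roads = {
    "A": [("B", 30), ("C", 90)],
    "B": [("C", 40), ("D", 100)],
    "C": [("D", 20)],
}
envelope_80 = reachable_within(roads, "A", horizon=80)
envelope_100 = reachable_within(roads, "A", horizon=100)
```

The set of keys returned for a given horizon plays the role of the predictive envelope: as the horizon shrinks, fewer intersections remain candidate destinations.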
Detecting, segmenting and tracking bio-medical objects
Studying the behavior patterns of biomedical objects helps scientists understand the underlying mechanisms. With computer vision techniques, automated monitoring can be implemented for efficient and effective analysis in biomedical studies. Promising applications have been carried out in various research topics, including insect group monitoring, malignant cell detection and segmentation, human organ segmentation and nano-particle tracking.
In general, applications of computer vision techniques to monitoring biomedical objects involve the following stages: detection, segmentation, and tracking. Challenges in each stage can lead to unsatisfactory automated monitoring results. These challenges include varying foreground-background contrast, fast motion blur, clutter, and object overlap. In this thesis, we investigate the challenges in each stage and propose novel computer vision solutions that overcome them and help automatically monitor biomedical objects with high accuracy in different cases.
Novel Aggregated Solutions for Robust Visual Tracking in Traffic Scenarios
This work proposes novel approaches for object tracking in challenging scenarios such as severe occlusion, deteriorated vision, and long-range multi-object re-identification. All of these solutions rely only on image sequences captured by a monocular camera and do not require additional sensors. Experiments on standard benchmarks demonstrate that these approaches improve on state-of-the-art performance. Because the presented approaches are efficiently designed, they can run at real-time speed.
People detection and tracking in crowded scenes
People are often a central element of visual scenes, particularly in real-world street scenes. Thus it has been a long-standing goal in Computer Vision to develop methods for analyzing humans in visual data. Due to the complexity of real-world scenes, visual understanding of people remains challenging for machine perception. In this thesis we focus on advancing the techniques for people detection and tracking in crowded street scenes. We also propose new models for human pose estimation and motion segmentation in realistic images and videos. First, we propose detection models that are jointly trained to detect single persons as well as pairs of people under varying degrees of occlusion. The learning algorithm of our joint detector facilitates a tight integration of tracking and detection, because it is designed to address common failure cases during tracking due to long-term inter-object occlusions. Second, we propose novel multi-person tracking models that formulate tracking as a graph partitioning problem. Our models jointly cluster detection hypotheses in space and time, eliminating the need for heuristic non-maximum suppression. Furthermore, for crowded scenes, our tracking model encodes long-range person re-identification information into the detection clustering process in a unified and rigorous manner. Third, we explore the visual tracking task at different granularities. We present a tracking model that simultaneously clusters object bounding boxes and pixel-level trajectories over time. This approach provides a rich understanding of the motion of objects in the scene. Last, we extend our tracking model to the multi-person pose estimation task. We introduce a joint subset partitioning and labelling model in which we simultaneously estimate the poses of all the people in the scene. In summary, this thesis addresses a number of diverse tasks that aim to enable vision systems to analyze people in realistic images and videos.
In particular, the thesis proposes several novel ideas and rigorous mathematical formulations, pushes the boundary of the state of the art, and achieves superior performance.
The Interstate-24 3D Dataset: a new benchmark for 3D multi-camera vehicle tracking
This work presents a novel video dataset recorded from overlapping highway
traffic cameras along an urban interstate, enabling multi-camera 3D object
tracking in a traffic monitoring context. Data is released from 3 scenes
containing video from at least 16 cameras each, totaling 57 minutes in length.
877,000 3D bounding boxes and corresponding object tracklets are fully and
accurately annotated for each camera field of view and are combined into a
spatially and temporally continuous set of vehicle trajectories for each scene.
Lastly, existing algorithms are combined to benchmark a number of 3D
multi-camera tracking pipelines on the dataset, with results indicating that
the dataset is challenging due to the difficulty of matching objects traveling
at high speeds across cameras and heavy object occlusion, potentially for
hundreds of frames, during congested traffic. This work aims to enable the
development of accurate and automatic vehicle trajectory extraction algorithms,
which will play a vital role in understanding impacts of autonomous vehicle
technologies on the safety and efficiency of traffic.