1,191 research outputs found

    A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects

    Full text link
    Recently, Minimum Cost Multicut Formulations have been proposed and proven to be successful in both motion trajectory segmentation and multi-target tracking scenarios. Both tasks benefit from decomposing a graphical model into an optimal number of connected components based on attractive and repulsive pairwise terms. The two tasks are formulated on different levels of granularity and, accordingly, leverage mostly local information for motion segmentation and mostly high-level information for multi-target tracking. In this paper we argue that point trajectories and their local relationships can contribute to the high-level task of multi-target tracking and also argue that high-level cues from object detection and tracking are helpful to solve motion segmentation. We propose a joint graphical model for point trajectories and object detections whose Multicuts are solutions to motion segmentation {\it and} multi-target tracking problems at once. Results on the FBMS59 motion segmentation benchmark as well as on pedestrian tracking sequences from the 2D MOT 2015 benchmark demonstrate the promise of this joint approach

    Evaluating the influence of environmental R&D on the SO2 intensity in China: evidence from dynamic spatial Durbin model analysis

    Get PDF
    Green technology is a significant means to improve the environment and achieve sustainable development goals. According to the data of Chinese provincial panel from 2000 to 2016, our study investigated the spatial effect of environmental research and development (R&D) activities on SO2 intensity using the dynamic spatial Durbin model. First, SO2 intensity in China was shown to have obvious spatial correlation, strong path dependence, and spatial agglomeration features of ‘high-high’ as well as ‘low-low’. Second, both in the short- and long-term, environmental R&D activities had an essential negative influence on local SO2 intensity, but no significant effect on SO2 intensity in the neighbouring areas, indicating that the SO2 intensity reduction effect of environmental R&D activities was confined to local areas. Moreover, the long-term effect of environmental R&D activities on SO2 intensity was not enhanced, indicating that China’s existing green technology is insufficient, which hinders the spillover influences of environmental R&D activities. Third, the short- as well as long-term effects of practical-type R&D on SO2 intensity were significantly negative, indicating that practical-type R&D can effectively reduce SO2 intensity. Inventiontype R&D had a significant negative effect on local SO2 intensity, but no significant effect on neighbouring areas

    People detection and tracking in crowded scenes

    Get PDF
    People are often a central element of visual scenes, particularly in real-world street scenes. Thus it has been a long-standing goal in Computer Vision to develop methods aiming at analyzing humans in visual data. Due to the complexity of real-world scenes, visual understanding of people remains challenging for machine perception. In this thesis we focus on advancing the techniques for people detection and tracking in crowded street scenes. We also propose new models for human pose estimation and motion segmentation in realistic images and videos. First, we propose detection models that are jointly trained to detect single person as well as pairs of people under varying degrees of occlusion. The learning algorithm of our joint detector facilitates a tight integration of tracking and detection, because it is designed to address common failure cases during tracking due to long-term inter-object occlusions. Second, we propose novel multi person tracking models that formulate tracking as a graph partitioning problem. Our models jointly cluster detection hypotheses in space and time, eliminating the need for a heuristic non-maximum suppression. Furthermore, for crowded scenes, our tracking model encodes long-range person re-identification information into the detection clustering process in a unified and rigorous manner. Third, we explore the visual tracking task in different granularity. We present a tracking model that simultaneously clusters object bounding boxes and pixel level trajectories over time. This approach provides a rich understanding of the motion of objects in the scene. Last, we extend our tracking model for the multi person pose estimation task. We introduce a joint subset partitioning and labelling model where we simultaneously estimate the poses of all the people in the scene. In summary, this thesis addresses a number of diverse tasks that aim to enable vision systems to analyze people in realistic images and videos. In particular, the thesis proposes several novel ideas and rigorous mathematical formulations, pushes the boundary of state-of-the-arts and results in superior performance.Personen sind oft ein zentraler Bestandteil visueller Szenen, besonders in natĂŒrlichen Straßenszenen. Daher ist es seit langem ein Ziel der Computer Vision, Methoden zu entwickeln, um Personen in einer Szene zu analysieren. Aufgrund der KomplexitĂ€t natĂŒrlicher Szenen bleibt das visuelle VerstĂ€ndnis von Personen eine Herausforderung fĂŒr die maschinelle Wahrnehmung. Im Zentrum dieser Arbeit steht die Weiterentwicklung von Verfahren zur Detektion und zum Tracking von Personen in Straßenszenen mit Menschenmengen. Wir erforschen darĂŒber hinaus neue Methoden zur menschlichen PosenschĂ€tzung und Bewegungssegmentierung in realistischen Bildern und Videos. ZunĂ€chst schlagen wir Detektionsmodelle vor, die gemeinsam trainiert werden, um sowohl einzelne Personen als auch Personenpaare bei verschiedener Verdeckung zu detektieren. Der Lernalgorithmus unseres gemeinsamen Detektors erleichtert eine enge Integration von Tracking und Detektion, da er darauf konzipiert ist, hĂ€ufige FehlerfĂ€lle aufgrund langfristiger Verdeckungen zwischen Objekten wĂ€hrend des Tracking anzugehen. Zweitens schlagen wir neue Modelle fĂŒr das Tracking mehrerer Personen vor, die das Tracking als Problem der Graphenpartitionierung formulieren. Unsere Mod- elle clustern Detektionshypothesen gemeinsam in Raum und Zeit und eliminieren dadurch die Notwendigkeit einer heuristischen UnterdrĂŒckung nicht maximaler De- tektionen. Bei Szenen mit Menschenmengen kodiert unser Trackingmodell darĂŒber hinaus einheitlich und genau Informationen zur langfristigen Re-Identifizierung in den Clusteringprozess der Detektionen. Drittens untersuchen wir die visuelle Trackingaufgabe bei verschiedener Gran- ularitĂ€t. Wir stellen ein Trackingmodell vor, das im Zeitablauf gleichzeitig Begren- zungsrahmen von Objekten und Trajektorien auf Pixelebene clustert. Diese Herange- hensweise ermöglicht ein umfassendes VerstĂ€ndnis der Bewegung der Objekte in der Szene. Schließlich erweitern wir unser Trackingmodell fĂŒr die PosenschĂ€tzung mehrerer Personen. Wir fĂŒhren ein Modell zur gemeinsamen Graphzerlegung und Knoten- klassifikation ein, mit dem wir gleichzeitig die Posen aller Personen in der Szene schĂ€tzen. Zusammengefasst widmet sich diese Arbeit einer Reihe verschiedener Aufgaben mit dem gemeinsamen Ziel, Bildverarbeitungssystemen die Analyse von Personen in realistischen Bildern und Videos zu ermöglichen. Insbesondere schlĂ€gt die Arbeit mehrere neue AnsĂ€tze und genaue mathematische Formulierungen vor, und sie zeigt Methoden, welche die Grenze des neuesten Stands der Technik ĂŒberschreiten und eine höhere Leistung von Bildverarbeitungssystemen ermöglichen

    DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation

    Full text link
    This paper considers the task of articulated human pose estimation of multiple people in real world images. We propose an approach that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity of each other. This joint formulation is in contrast to previous strategies, that address the problem by first detecting people and subsequently estimating their body pose. We propose a partitioning and labeling formulation of a set of body-part hypotheses generated with CNN-based part detectors. Our formulation, an instance of an integer linear program, implicitly performs non-maximum suppression on the set of part candidates and groups them to form configurations of body parts respecting geometric and appearance constraints. Experiments on four different datasets demonstrate state-of-the-art results for both single person and multi person pose estimation. Models and code available at http://pose.mpi-inf.mpg.de.Comment: Accepted at IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2016
    • 

    corecore