4 research outputs found

    Image recognition of multi-perspective data for intelligent analysis of gestures and actions

    The BERMUDA project started in January 2015 and was successfully completed after less than three years, in August 2017. A technical set-up and image processing and analysis software were developed to record and evaluate multi-perspective videos. Using two cameras positioned relatively far apart with tilted axes, synchronized videos were recorded both in the laboratory and in real-life settings. The evaluation pipeline comprised background elimination, body part classification, clustering, assignment to persons and, finally, reconstruction of the skeletons. Based on the skeletons, machine learning techniques were developed to recognize the persons' poses and, building on these, the actions performed. For example, the action of a punch, which is relevant in security contexts, could be detected with a precision of 51.3 % and a recall of 60.6 %.
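    The reported detection quality can be illustrated by how precision and recall are computed from detection counts. The counts below are hypothetical, chosen only so that the formulas reproduce the figures quoted in the abstract:

    ```python
    # Sketch of the precision/recall computation behind the reported
    # punch-detection result (51.3 % precision, 60.6 % recall).
    # The counts used here are illustrative, not project data.

    def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
        """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return precision, recall

    # Hypothetical counts that yield the quoted percentages.
    p, r = precision_recall(tp=20, fp=19, fn=13)
    print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.513 recall=0.606
    ```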

    Multiple Human Pose Estimation with Temporally Consistent 3D Pictorial Structures

    Multiple human 3D pose estimation from multiple camera views is a challenging task in unconstrained environments. Each individual has to be matched across the views, and then the body pose has to be estimated. Additionally, the body pose of every individual changes in a consistent manner over time. To address these challenges, we propose a temporally consistent 3D Pictorial Structures model (3DPS) for multiple human pose estimation from multiple camera views. Our model builds on the 3D Pictorial Structures to introduce the notion of temporal consistency between the inferred body poses. We derive this property by relying on multi-view human tracking. Identifying each individual before inference significantly reduces the size of the state space and also improves performance. To evaluate our method, we use two challenging multiple-human datasets recorded in unconstrained environments. We compare our method with state-of-the-art approaches and achieve better results.
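    The tracking step that fixes identities before inference can be sketched as an assignment problem: match each skeleton in the previous frame to the closest skeleton in the current frame. This is a minimal illustrative sketch, not the authors' implementation; for the handful of people in a typical scene, brute force over permutations finds the optimal assignment:

    ```python
    # Hedged sketch: identity association across consecutive frames by
    # minimising total mean joint distance between skeletons.
    import itertools
    import math

    def match_skeletons(prev, curr):
        """prev, curr: lists of skeletons, each a list of (x, y, z) joints.
        Returns a dict mapping prev index -> curr index, assuming equal
        person counts in both frames."""
        def dist(a, b):
            # Mean Euclidean distance over corresponding joints.
            return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

        n = len(prev)
        best = min(
            itertools.permutations(range(n)),
            key=lambda perm: sum(dist(prev[i], curr[perm[i]]) for i in range(n)),
        )
        return {i: best[i] for i in range(n)}
    ```

    With identities resolved this way, the pose of each person can be inferred in a per-person state space rather than over all detections jointly.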

    Parsing human skeletons in an operating room

    Multiple human pose estimation is an important yet challenging problem. In an Operating Room (OR) environment, the 3D body poses of surgeons and medical staff can provide important clues for surgical workflow analysis. For that purpose, we propose an algorithm for localizing and recovering the body poses of multiple humans in an OR environment under a multi-camera setup. Our model builds on 3D Pictorial Structures (3DPS) and 2D body part localization across all camera views using Convolutional Neural Networks (ConvNets). To evaluate our algorithm, we introduce a dataset captured in a real OR environment. Our dataset is unique, challenging and publicly available with annotated ground truth. Our proposed algorithm yields promising pose estimation results on this dataset.
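    One building block of such a multi-camera pipeline is lifting per-view 2D body part detections to a 3D joint position. The sketch below shows standard linear triangulation (DLT) via SVD, assuming known 3x4 projection matrices for each camera; it is an illustrative assumption about the geometry step, not the paper's code:

    ```python
    # Hedged sketch: linear triangulation (DLT) of one body joint from
    # 2D detections in several calibrated views.
    import numpy as np

    def triangulate(points_2d, projections):
        """points_2d: list of (u, v) pixel coordinates, one per camera.
        projections: list of 3x4 camera projection matrices P.
        Returns the least-squares 3D point (x, y, z)."""
        rows = []
        for (u, v), P in zip(points_2d, projections):
            # Each view contributes two linear constraints on the
            # homogeneous 3D point X: u*(P3.X) = P1.X, v*(P3.X) = P2.X.
            rows.append(u * P[2] - P[0])
            rows.append(v * P[2] - P[1])
        _, _, vt = np.linalg.svd(np.asarray(rows))
        X = vt[-1]               # null-space vector = homogeneous solution
        return X[:3] / X[3]      # dehomogenize
    ```

    Triangulated joints from the ConvNet detections can then serve as candidates for the 3DPS inference.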