4 research outputs found

    Image recognition of multi-perspective data for intelligent analysis of gestures and actions

    The BERMUDA project started in January 2015 and was successfully completed after less than three years, in August 2017. A technical set-up and image processing and analysis software were developed to record and evaluate multi-perspective videos. Using two cameras positioned relatively far apart with tilted axes, synchronized videos were recorded both in the laboratory and in real-life settings. The evaluation pipeline comprised background elimination, body part classification, clustering, assignment to persons and, finally, reconstruction of the skeletons. Based on the skeletons, machine learning techniques were developed to recognize the persons' poses and, building on these, the actions performed. For example, the action of a punch, which is relevant in security contexts, could be detected with a precision of 51.3 % and a recall of 60.6 %.
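    The reported detection quality can be illustrated by how precision and recall are computed from detection counts. The counts below are hypothetical, chosen only so that the formulas reproduce the figures quoted in the abstract:

    ```python
    # Sketch of the precision/recall computation behind the reported
    # punch-detection result (51.3 % precision, 60.6 % recall).
    # The counts used here are illustrative, not project data.

    def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
        """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return precision, recall

    # Hypothetical counts that yield the quoted percentages.
    p, r = precision_recall(tp=20, fp=19, fn=13)
    print(f"precision={p:.3f} recall={r:.3f}")  # precision=0.513 recall=0.606
    ```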

    Multiple Human Pose Estimation with Temporally Consistent 3D Pictorial Structures

    Multiple human 3D pose estimation from multiple camera views is a challenging task in unconstrained environments. Each individual has to be matched across the views, and then the body pose has to be estimated. Additionally, the body pose of every individual changes in a consistent manner over time. To address these challenges, we propose a temporally consistent 3D Pictorial Structures model (3DPS) for multiple human pose estimation from multiple camera views. Our model builds on the 3D Pictorial Structures to introduce the notion of temporal consistency between the inferred body poses. We derive this property by relying on multi-view human tracking. Identifying each individual before inference significantly reduces the size of the state space and also improves performance. To evaluate our method, we use two challenging multiple-human datasets recorded in unconstrained environments. We compare our method with state-of-the-art approaches and achieve better results.
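    The tracking step that fixes identities before inference can be sketched as an assignment problem: match each skeleton in the previous frame to the closest skeleton in the current frame. This is a minimal illustrative sketch, not the authors' implementation; for the handful of people in a typical scene, brute force over permutations finds the optimal assignment:

    ```python
    # Hedged sketch: identity association across consecutive frames by
    # minimising total mean joint distance between skeletons.
    import itertools
    import math

    def match_skeletons(prev, curr):
        """prev, curr: lists of skeletons, each a list of (x, y, z) joints.
        Returns a dict mapping prev index -> curr index, assuming equal
        person counts in both frames."""
        def dist(a, b):
            # Mean Euclidean distance over corresponding joints.
            return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

        n = len(prev)
        best = min(
            itertools.permutations(range(n)),
            key=lambda perm: sum(dist(prev[i], curr[perm[i]]) for i in range(n)),
        )
        return {i: best[i] for i in range(n)}
    ```

    With identities resolved this way, the pose of each person can be inferred in a per-person state space rather than over all detections jointly.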

    Parsing human skeletons in an operating room

    Multiple human pose estimation is an important yet challenging problem. In an Operating Room (OR) environment, the 3D body poses of surgeons and medical staff can provide important clues for surgical workflow analysis. For that purpose, we propose an algorithm for localizing and recovering the body poses of multiple humans in an OR environment under a multi-camera setup. Our model builds on 3D Pictorial Structures (3DPS) and 2D body part localization across all camera views using Convolutional Neural Networks (ConvNets). To evaluate our algorithm, we introduce a dataset captured in a real OR environment. Our dataset is unique, challenging and publicly available with annotated ground truth. Our proposed algorithm yields promising pose estimation results on this dataset.
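    One building block of such a multi-camera pipeline is lifting per-view 2D body part detections to a 3D joint position. The sketch below shows standard linear triangulation (DLT) via SVD, assuming known 3x4 projection matrices for each camera; it is an illustrative assumption about the geometry step, not the paper's code:

    ```python
    # Hedged sketch: linear triangulation (DLT) of one body joint from
    # 2D detections in several calibrated views.
    import numpy as np

    def triangulate(points_2d, projections):
        """points_2d: list of (u, v) pixel coordinates, one per camera.
        projections: list of 3x4 camera projection matrices P.
        Returns the least-squares 3D point (x, y, z)."""
        rows = []
        for (u, v), P in zip(points_2d, projections):
            # Each view contributes two linear constraints on the
            # homogeneous 3D point X: u*(P3.X) = P1.X, v*(P3.X) = P2.X.
            rows.append(u * P[2] - P[0])
            rows.append(v * P[2] - P[1])
        _, _, vt = np.linalg.svd(np.asarray(rows))
        X = vt[-1]               # null-space vector = homogeneous solution
        return X[:3] / X[3]      # dehomogenize
    ```

    Triangulated joints from the ConvNet detections can then serve as candidates for the 3DPS inference.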