916 research outputs found

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of both Computer Vision (CV) and Artificial Neural Networks (ANNs) has enabled the development of efficient automatic systems for analysing people's behaviour. By studying hand movements it is possible to recognize gestures, which people often use to communicate information non-verbally. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields: the former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions from their poses and movements. Both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: the first proposes a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures; the second presents a solution based on a 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; the last provides a solution for basic non-acted emotion recognition using a 3D skeleton and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, since their ability to model the long-term contextual information of temporal sequences makes them suitable for analysing body movements. All the modules were tested on challenging datasets, well known in the state of the art, showing remarkable results compared to current literature methods.
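The two-branch stacked LSTM idea from the second module can be sketched in NumPy. Everything below is an illustrative assumption, not the thesis's actual architecture: random weights, a hypothetical 18-joint 2D skeleton, and a position branch plus a velocity branch fused by concatenation before a linear classifier.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates computed from input x and previous hidden state h."""
    H = h.shape[0]
    z = W @ x + U @ h + b              # (4H,) pre-activations for i, f, o, g
    i = 1 / (1 + np.exp(-z[:H]))       # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))    # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))  # output gate
    g = np.tanh(z[3*H:])               # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_branch(seq, W, U, b, hidden):
    """Run one LSTM branch over a (T, features) sequence; return last hidden state."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return h

rng = np.random.default_rng(0)
T, J, hidden = 30, 36, 16                  # 30 frames, 18 joints x 2 coords (assumed)
pos_seq = rng.normal(size=(T, J))          # joint positions per frame
vel_seq = np.diff(pos_seq, axis=0, prepend=pos_seq[:1])  # frame-to-frame velocities

def init(in_dim, H):
    """Random (untrained) LSTM parameters for a branch."""
    return (rng.normal(scale=0.1, size=(4*H, in_dim)),
            rng.normal(scale=0.1, size=(4*H, H)),
            np.zeros(4*H))

h_pos = run_branch(pos_seq, *init(J, hidden), hidden)
h_vel = run_branch(vel_seq, *init(J, hidden), hidden)
fused = np.concatenate([h_pos, h_vel])     # late fusion of the two branches
logits = rng.normal(scale=0.1, size=(10, 2*hidden)) @ fused  # 10 action classes
```

The key design point this sketches is that each branch summarizes the whole temporal sequence in its final hidden state, so the classifier sees one fixed-size vector per video regardless of clip length.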

    A multimodal corpus for the evaluation of computational models for (grounded) language acquisition

    Gaspers J, Panzner M, Lemme A, Cimiano P, Rohlfing K, Wrede S. A multimodal corpus for the evaluation of computational models for (grounded) language acquisition. In: EACL Workshop on Cognitive Aspects of Computational Language Learning. 2014

    Driver Distraction Identification with an Ensemble of Convolutional Neural Networks

    The World Health Organization (WHO) reported 1.25 million deaths yearly due to road traffic accidents worldwide, and the number has been continuously increasing over the last few years. Nearly a fifth of these accidents are caused by distracted drivers. Existing work on distracted-driver detection is concerned with a small set of distractions (mostly cell phone usage), and unreliable ad-hoc methods are often used. In this paper, we present the first publicly available dataset for driver distraction identification with more distraction postures than existing alternatives. In addition, we propose a reliable deep-learning-based solution that achieves 90% accuracy. The system consists of a genetically weighted ensemble of convolutional neural networks: we show that weighting an ensemble of classifiers with a genetic algorithm yields better classification confidence. We also study the effect of different visual elements on distraction detection by means of face and hand localization and skin segmentation. Finally, we present a thinned version of our ensemble that achieves 84.64% classification accuracy and can operate in a real-time environment. Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
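A genetically weighted ensemble can be sketched as follows. This is a minimal illustration, not the paper's method: the validation "predictions" are synthetic (with one classifier made deliberately informative), and the GA uses simple truncation selection, blend crossover, and Gaussian mutation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical validation data: softmax outputs of 5 CNNs on 200 samples, 10 classes.
n_clf, n_val, n_cls = 5, 200, 10
labels = rng.integers(0, n_cls, n_val)
probs = rng.dirichlet(np.ones(n_cls), size=(n_clf, n_val))
probs[0, np.arange(n_val), labels] += 1.0        # make classifier 0 informative
probs[0] /= probs[0].sum(axis=1, keepdims=True)  # renormalize to valid softmax rows

def fitness(w):
    """Validation accuracy of the ensemble weighted by w (normalized to sum to 1)."""
    ens = np.tensordot(w / w.sum(), probs, axes=1)   # (n_val, n_cls) weighted average
    return (ens.argmax(axis=1) == labels).mean()

# Minimal genetic algorithm over the weight vectors.
pop = rng.random((30, n_clf))
best_w, best_fit = None, -1.0
for gen in range(40):
    scores = np.array([fitness(w) for w in pop])
    if scores.max() > best_fit:                      # track the best weights seen
        best_fit = scores.max()
        best_w = pop[scores.argmax()].copy()
    elite = pop[np.argsort(scores)[-10:]]            # truncation: keep the 10 fittest
    parents = elite[rng.integers(0, 10, size=(30, 2))]
    pop = parents.mean(axis=1) + rng.normal(scale=0.05, size=(30, n_clf))  # blend + mutate
    pop = np.clip(pop, 1e-6, None)                   # keep weights positive
```

Because the search operates only on the small weight vector, the expensive CNNs are evaluated once on the validation set and the GA just re-weights their cached probabilities.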

    Vision based referee sign language recognition system for the RoboCup MSL league

    In the RoboCup Middle Size League (MSL), the main referee uses assisting technology, controlled by a second referee, to support him, in particular for conveying referee decisions to the robot players with the help of a wireless communication system. In this paper a vision-based system is introduced that is able to interpret dynamic and static gestures of the referee, thus eliminating the need for a second one. The referee's gestures are interpreted by the system and sent directly to the Referee Box, which sends the proper commands to the robots. The system is divided into four modules: real-time hand tracking and feature extraction, an SVM (Support Vector Machine) for static hand posture identification, an HMM (Hidden Markov Model) for dynamic unistroke hand gesture recognition, and an FSM (Finite State Machine) to control the transitions between system states. The experimental results showed that the system works very reliably, being able to recognize combinations of gestures and hand postures in real time. For hand posture recognition, the SVM model trained with the selected features achieved an accuracy of 98.2%. The system also has many advantages over the currently implemented one, such as avoiding the need for a second referee and working in noisy environments and wireless-jammed situations. The system is easy to implement and train and may be an inexpensive solution.
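The FSM module's role can be sketched with a table-driven state machine. The state and event names below are hypothetical, chosen only to illustrate how such a controller routes work between the tracker, the SVM, the HMM, and the Referee Box; they are not the paper's actual states.

```python
# Hypothetical states/events for a referee-gesture controller (illustrative only).
TRANSITIONS = {
    ("IDLE", "hand_detected"): "TRACKING",
    ("TRACKING", "hand_lost"): "IDLE",
    ("TRACKING", "static_posture"): "CLASSIFY_POSTURE",   # handed to the SVM
    ("TRACKING", "motion_started"): "CLASSIFY_GESTURE",   # handed to the HMM
    ("CLASSIFY_POSTURE", "posture_recognized"): "SEND_COMMAND",
    ("CLASSIFY_GESTURE", "gesture_recognized"): "SEND_COMMAND",
    ("SEND_COMMAND", "command_sent"): "IDLE",             # command went to the Referee Box
}

def step(state, event):
    """Advance the FSM; events with no defined transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "IDLE"
trace = []
for ev in ["hand_detected", "static_posture", "posture_recognized", "command_sent"]:
    state = step(state, ev)
    trace.append(state)
# trace is now ['TRACKING', 'CLASSIFY_POSTURE', 'SEND_COMMAND', 'IDLE']
```

Keeping the transitions in a lookup table rather than nested conditionals makes it easy to verify that every recognized gesture eventually returns the controller to the idle state.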

    RGB-D datasets using microsoft kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, initially designed for gaming and later a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.

    Preliminary Validation of a Low-Cost Motion Analysis System Based on RGB Cameras to Support the Evaluation of Postural Risk Assessment

    This paper introduces a low-cost, low-computation marker-less motion capture system based on the acquisition of frame images through standard RGB cameras. It exploits the open-source deep learning model CMU, from the tf-pose-estimation project. Its numerical accuracy and usefulness for ergonomic assessment were evaluated through an experiment designed and performed to: (1) compare the data it provides with those collected from a gold-standard motion capture system; (2) compare the RULA scores obtained with its data against those obtained with data from the Vicon Nexus system and those estimated through video analysis by a team of three expert ergonomists. Tests were conducted in standardized laboratory conditions and involved a total of six subjects. Results suggest that the proposed system can predict angles with good consistency and give evidence of the tool's usefulness for ergonomists.
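The angle prediction such a system performs reduces to computing joint angles from 2D keypoints. A minimal sketch, assuming hypothetical pixel coordinates for three keypoints (the actual keypoint layout depends on the pose model used):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c, e.g. shoulder-elbow-wrist."""
    u, v = np.asarray(a, float) - b, np.asarray(c, float) - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding error

# Hypothetical 2D keypoints in pixels (image y grows downward).
shoulder, elbow, wrist = (100, 100), (100, 160), (150, 160)
elbow_angle = joint_angle(shoulder, elbow, wrist)  # 90.0 for this right-angle pose
```

Note that angles measured from a single 2D view are distorted by perspective whenever the limb plane is not parallel to the image plane, which is one reason such systems are validated against a marker-based reference.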

    Development of a video-based work posture entry system for ergonomic posture assessment

    Master's thesis -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, August 2022. Advisor: 윤명환. Work-related musculoskeletal disorders are a crucial problem for workers' safety and workplace productivity. The purpose of this study is to propose and develop a video-based work pose entry system for the ergonomic postural assessment methods Rapid Upper Limb Assessment (RULA) and Rapid Entire Body Assessment (REBA). This study developed a work pose entry system using the YOLOv3 algorithm for human tracking and the SPIN approach for 3D human pose estimation. The system takes a 2D video and the scores of a few evaluation items as input, and outputs a final RULA or REBA score together with the corresponding action level. A validation experiment was conducted with 20 evaluators, classified into two groups, experienced and novice, based on their level of knowledge of or experience with ergonomics and musculoskeletal disorders. Participants were asked to manually evaluate the working postures in 20 videos taken at an automobile assembly plant, recording their scores on an Excel worksheet. Scores were then generated by the work pose entry system from the individual items that must be entered manually, and the results of the manual evaluation were compared with those from the system. Descriptive statistics and the Mann-Whitney U test showed that using the proposed work pose entry system decreased the difference and the standard deviation between the groups. The findings also showed that experienced evaluators tend to score higher than novice evaluators. Fisher's exact test was conducted on the evaluation items entered into the work pose entry system, and the results showed that even items that may seem apparent can be perceived differently between the groups.
The work pose entry system developed in this study can contribute to increasing the consistency of ergonomic risk assessment and to reducing the time and effort of ergonomic practitioners during the process. Directions for future research on developing work pose entry systems for ergonomic posture assessment using computer vision are also suggested.
Contents: Chapter 1 Introduction (1.1 Background; 1.2 Research Objectives; 1.3 Organization of the Thesis); Chapter 2 Literature Review (2.1 Overview; 2.2 Work-related Musculoskeletal Disorders; 2.3 Ergonomic Posture Analysis: 2.3.1 Self-reports, 2.3.2 Observational Methods, 2.3.3 Direct Methods, 2.3.4 Vision-based Methods; 2.4 3D Human Pose Estimation: 2.4.1 Model-free Approaches, 2.4.2 Model-based Approaches); Chapter 3 Proposed System Design (3.1 Overview; 3.2 Human Tracking; 3.3 3D Human Pose Estimation; 3.4 Score Calculation: 3.4.1 Posture Score Calculation, 3.4.2 Output of the Proposed System); Chapter 4 Validation Experiment (4.1 Hypotheses; 4.2 Methods: 4.2.1 Participants, 4.2.2 Apparatus, 4.2.3 Procedure, 4.2.4 Data Analysis; 4.3 Results: 4.3.1 RULA, 4.3.2 REBA, 4.3.3 Evaluation Items for Manual Input); Chapter 5 Discussion (5.1 Group Difference: 5.1.1 RULA, 5.1.2 REBA; 5.2 Evaluation Items for Manual Input; 5.3 Proposed Work Pose Entry System); Chapter 6 Conclusion (6.1 Conclusion; 6.2 Limitation, Contribution, and Future Direction); Bibliography; Abstract in Korean
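The score-calculation step such a system automates can be sketched for one body segment. The angle cut-offs below follow the published RULA worksheet (upper-arm base score and final action levels); the function names and the example input are my own illustration, and the real system combines many more segment scores through lookup tables.

```python
def rula_upper_arm_score(flexion_deg):
    """RULA upper-arm base score from shoulder flexion(+)/extension(-) in degrees."""
    if -20 <= flexion_deg <= 20:
        return 1                      # 20 deg extension to 20 deg flexion
    if flexion_deg < -20 or flexion_deg <= 45:
        return 2                      # extension beyond 20 deg, or 20-45 deg flexion
    if flexion_deg <= 90:
        return 3                      # 45-90 deg flexion
    return 4                          # more than 90 deg flexion

def action_level(grand_score):
    """RULA action level from the final grand score (1-7 scale)."""
    if grand_score <= 2:
        return 1                      # posture acceptable
    if grand_score <= 4:
        return 2                      # further investigation, change may be needed
    if grand_score <= 6:
        return 3                      # investigate and change soon
    return 4                          # investigate and change immediately

example = rula_upper_arm_score(30), action_level(5)  # (2, 3)
```

Encoding the worksheet as plain threshold functions is what lets an automated pipeline turn estimated joint angles into the same scores an ergonomist would record by observation.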