916 research outputs found

    An original framework for understanding human actions and body language by using deep neural networks

    The evolution of both Computer Vision (CV) and Artificial Neural Networks (ANNs) has enabled the development of efficient automatic systems for analysing people's behaviour. By studying hand movements it is possible to recognize gestures, which people often use to communicate information non-verbally. These gestures can also be used to control or interact with devices without physically touching them. In particular, sign language and semaphoric hand gestures are the two foremost areas of interest due to their importance in Human-Human Communication (HHC) and Human-Computer Interaction (HCI), respectively. The processing of body movements, meanwhile, plays a key role in the action recognition and affective computing fields: the former is essential to understand how people act in an environment, while the latter tries to interpret people's emotions from their poses and movements. Both are essential tasks in many computer vision applications, including event recognition and video surveillance. In this Ph.D. thesis, an original framework for understanding actions and body language is presented. The framework is composed of three main modules: the first proposes a method based on Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs) for the recognition of sign language and semaphoric hand gestures; the second presents a solution based on a 2D skeleton and two-branch stacked LSTM-RNNs for action recognition in video sequences; the last provides a solution for basic non-acted emotion recognition using a 3D skeleton and Deep Neural Networks (DNNs). The performance of LSTM-RNNs is explored in depth, since their ability to model the long-term contextual information of temporal sequences makes them suitable for analysing body movements. All the modules were tested on challenging datasets, well known in the state of the art, showing remarkable results compared to current literature methods.
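The two-branch stacked LSTM idea from the second module can be sketched in NumPy. Everything below is an illustrative assumption, not the thesis's actual architecture: random weights, a hypothetical 18-joint 2D skeleton, and a position branch plus a velocity branch fused by concatenation before a linear classifier.

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates computed from input x and previous hidden state h."""
    H = h.shape[0]
    z = W @ x + U @ h + b              # (4H,) pre-activations for i, f, o, g
    i = 1 / (1 + np.exp(-z[:H]))       # input gate
    f = 1 / (1 + np.exp(-z[H:2*H]))    # forget gate
    o = 1 / (1 + np.exp(-z[2*H:3*H]))  # output gate
    g = np.tanh(z[3*H:])               # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def run_branch(seq, W, U, b, hidden):
    """Run one LSTM branch over a (T, features) sequence; return last hidden state."""
    h, c = np.zeros(hidden), np.zeros(hidden)
    for x in seq:
        h, c = lstm_step(x, h, c, W, U, b)
    return h

rng = np.random.default_rng(0)
T, J, hidden = 30, 36, 16                  # 30 frames, 18 joints x 2 coords (assumed)
pos_seq = rng.normal(size=(T, J))          # joint positions per frame
vel_seq = np.diff(pos_seq, axis=0, prepend=pos_seq[:1])  # frame-to-frame velocities

def init(in_dim, H):
    """Random (untrained) LSTM parameters for a branch."""
    return (rng.normal(scale=0.1, size=(4*H, in_dim)),
            rng.normal(scale=0.1, size=(4*H, H)),
            np.zeros(4*H))

h_pos = run_branch(pos_seq, *init(J, hidden), hidden)
h_vel = run_branch(vel_seq, *init(J, hidden), hidden)
fused = np.concatenate([h_pos, h_vel])     # late fusion of the two branches
logits = rng.normal(scale=0.1, size=(10, 2*hidden)) @ fused  # 10 action classes
```

The key design point this sketches is that each branch summarizes the whole temporal sequence in its final hidden state, so the classifier sees one fixed-size vector per video regardless of clip length.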

    A multimodal corpus for the evaluation of computational models for (grounded) language acquisition

    Gaspers J, Panzner M, Lemme A, Cimiano P, Rohlfing K, Wrede S. A multimodal corpus for the evaluation of computational models for (grounded) language acquisition. In: EACL Workshop on Cognitive Aspects of Computational Language Learning. 2014

    Driver Distraction Identification with an Ensemble of Convolutional Neural Networks

    The World Health Organization (WHO) reported 1.25 million deaths yearly due to road traffic accidents worldwide, and the number has been continuously increasing over the last few years. Nearly a fifth of these accidents are caused by distracted drivers. Existing work on distracted-driver detection is concerned with a small set of distractions (mostly cell phone usage), and unreliable ad-hoc methods are often used. In this paper, we present the first publicly available dataset for driver distraction identification with more distraction postures than existing alternatives. In addition, we propose a reliable deep-learning-based solution that achieves 90% accuracy. The system consists of a genetically weighted ensemble of convolutional neural networks: we show that weighting an ensemble of classifiers with a genetic algorithm yields better classification confidence. We also study the effect of different visual elements on distraction detection by means of face and hand localization and skin segmentation. Finally, we present a thinned version of our ensemble that achieves 84.64% classification accuracy and can operate in a real-time environment. Comment: arXiv admin note: substantial text overlap with arXiv:1706.0949
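A genetically weighted ensemble can be sketched as follows. This is a minimal illustration, not the paper's method: the validation "predictions" are synthetic (with one classifier made deliberately informative), and the GA uses simple truncation selection, blend crossover, and Gaussian mutation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical validation data: softmax outputs of 5 CNNs on 200 samples, 10 classes.
n_clf, n_val, n_cls = 5, 200, 10
labels = rng.integers(0, n_cls, n_val)
probs = rng.dirichlet(np.ones(n_cls), size=(n_clf, n_val))
probs[0, np.arange(n_val), labels] += 1.0        # make classifier 0 informative
probs[0] /= probs[0].sum(axis=1, keepdims=True)  # renormalize to valid softmax rows

def fitness(w):
    """Validation accuracy of the ensemble weighted by w (normalized to sum to 1)."""
    ens = np.tensordot(w / w.sum(), probs, axes=1)   # (n_val, n_cls) weighted average
    return (ens.argmax(axis=1) == labels).mean()

# Minimal genetic algorithm over the weight vectors.
pop = rng.random((30, n_clf))
best_w, best_fit = None, -1.0
for gen in range(40):
    scores = np.array([fitness(w) for w in pop])
    if scores.max() > best_fit:                      # track the best weights seen
        best_fit = scores.max()
        best_w = pop[scores.argmax()].copy()
    elite = pop[np.argsort(scores)[-10:]]            # truncation: keep the 10 fittest
    parents = elite[rng.integers(0, 10, size=(30, 2))]
    pop = parents.mean(axis=1) + rng.normal(scale=0.05, size=(30, n_clf))  # blend + mutate
    pop = np.clip(pop, 1e-6, None)                   # keep weights positive
```

Because the search operates only on the small weight vector, the expensive CNNs are evaluated once on the validation set and the GA just re-weights their cached probabilities.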

    Vision based referee sign language recognition system for the RoboCup MSL league

    In the RoboCup Middle Size League (MSL), the main referee uses assisting technology, controlled by a second referee, to support him, in particular for conveying referee decisions to the robot players with the help of a wireless communication system. In this paper a vision-based system is introduced that is able to interpret dynamic and static gestures of the referee, thus eliminating the need for a second one. The referee's gestures are interpreted by the system and sent directly to the Referee Box, which sends the proper commands to the robots. The system is divided into four modules: real-time hand tracking and feature extraction, an SVM (Support Vector Machine) for static hand posture identification, an HMM (Hidden Markov Model) for dynamic unistroke hand gesture recognition, and an FSM (Finite State Machine) to control the transitions between system states. The experimental results showed that the system works very reliably, being able to recognize combinations of gestures and hand postures in real time. For hand posture recognition, the SVM model trained with the selected features achieved an accuracy of 98.2%. The system also has many advantages over the currently implemented one, such as avoiding the need for a second referee and working in noisy environments and wireless-jammed situations. The system is easy to implement and train and may be an inexpensive solution.
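The FSM module's role can be sketched with a table-driven state machine. The state and event names below are hypothetical, chosen only to illustrate how such a controller routes work between the tracker, the SVM, the HMM, and the Referee Box; they are not the paper's actual states.

```python
# Hypothetical states/events for a referee-gesture controller (illustrative only).
TRANSITIONS = {
    ("IDLE", "hand_detected"): "TRACKING",
    ("TRACKING", "hand_lost"): "IDLE",
    ("TRACKING", "static_posture"): "CLASSIFY_POSTURE",   # handed to the SVM
    ("TRACKING", "motion_started"): "CLASSIFY_GESTURE",   # handed to the HMM
    ("CLASSIFY_POSTURE", "posture_recognized"): "SEND_COMMAND",
    ("CLASSIFY_GESTURE", "gesture_recognized"): "SEND_COMMAND",
    ("SEND_COMMAND", "command_sent"): "IDLE",             # command went to the Referee Box
}

def step(state, event):
    """Advance the FSM; events with no defined transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)

state = "IDLE"
trace = []
for ev in ["hand_detected", "static_posture", "posture_recognized", "command_sent"]:
    state = step(state, ev)
    trace.append(state)
# trace is now ['TRACKING', 'CLASSIFY_POSTURE', 'SEND_COMMAND', 'IDLE']
```

Keeping the transitions in a lookup table rather than nested conditionals makes it easy to verify that every recognized gesture eventually returns the controller to the idle state.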

    RGB-D datasets using microsoft kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, initially designed for gaming and later a popular device for computer vision, high-quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications, including object recognition, scene classification, hand gesture recognition, 3D simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in selecting suitable datasets for evaluating their algorithms.

    Preliminary Validation of a Low-Cost Motion Analysis System Based on RGB Cameras to Support the Evaluation of Postural Risk Assessment

    This paper introduces a low-cost, low-computation marker-less motion capture system based on the acquisition of frame images through standard RGB cameras. It exploits the open-source deep learning model CMU, from the tf-pose-estimation project. Its numerical accuracy and usefulness for ergonomic assessment were evaluated through an experiment designed and performed to: (1) compare the data it provides with those collected from a gold-standard motion capture system; (2) compare the RULA scores obtained with its data against those obtained with data from the Vicon Nexus system and those estimated through video analysis by a team of three expert ergonomists. Tests were conducted in standardized laboratory conditions and involved a total of six subjects. Results suggest that the proposed system can predict angles with good consistency and give evidence of the tool's usefulness for ergonomists.
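The angle prediction such a system performs reduces to computing joint angles from 2D keypoints. A minimal sketch, assuming hypothetical pixel coordinates for three keypoints (the actual keypoint layout depends on the pose model used):

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (degrees) formed by points a-b-c, e.g. shoulder-elbow-wrist."""
    u, v = np.asarray(a, float) - b, np.asarray(c, float) - b
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))  # clip guards rounding error

# Hypothetical 2D keypoints in pixels (image y grows downward).
shoulder, elbow, wrist = (100, 100), (100, 160), (150, 160)
elbow_angle = joint_angle(shoulder, elbow, wrist)  # 90.0 for this right-angle pose
```

Note that angles measured from a single 2D view are distorted by perspective whenever the limb plane is not parallel to the image plane, which is one reason such systems are validated against a marker-based reference.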

    Development of a video-based work posture entry system for ergonomic posture assessment

    Master's thesis -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, August 2022. Advisor: 윤명환. Work-related musculoskeletal disorders are a crucial problem for workers' safety and workplace productivity. The purpose of this study is to propose and develop a video-based work pose entry system for the ergonomic postural assessment methods Rapid Upper Limb Assessment (RULA) and Rapid Entire Body Assessment (REBA). This study developed a work pose entry system using the YOLOv3 algorithm for human tracking and the SPIN approach for 3D human pose estimation. The system takes a 2D video and the scores of a few evaluation items as input, and outputs a final RULA or REBA score together with the corresponding action level. A validation experiment was conducted with 20 evaluators, classified into two groups, experienced and novice, based on their level of knowledge of or experience with ergonomics and musculoskeletal disorders. Participants were asked to manually evaluate the working postures in 20 videos taken at an automobile assembly plant, recording their scores on an Excel worksheet. Scores were then generated by the work pose entry system from the individual items that must be entered manually, and the results of the manual evaluation were compared with those from the system. Descriptive statistics and the Mann-Whitney U test showed that using the proposed work pose entry system decreased the difference and the standard deviation between the groups. The findings also showed that experienced evaluators tend to score higher than novice evaluators. Fisher's exact test was conducted on the evaluation items entered into the work pose entry system, and the results showed that even items that may seem apparent can be perceived differently between the groups.
The work pose entry system developed in this study can contribute to increasing the consistency of ergonomic risk assessment and to reducing the time and effort of ergonomic practitioners during the process. Directions for future research on developing work pose entry systems for ergonomic posture assessment using computer vision are also suggested.
Contents: Chapter 1 Introduction (1.1 Background; 1.2 Research Objectives; 1.3 Organization of the Thesis); Chapter 2 Literature Review (2.1 Overview; 2.2 Work-related Musculoskeletal Disorders; 2.3 Ergonomic Posture Analysis: 2.3.1 Self-reports, 2.3.2 Observational Methods, 2.3.3 Direct Methods, 2.3.4 Vision-based Methods; 2.4 3D Human Pose Estimation: 2.4.1 Model-free Approaches, 2.4.2 Model-based Approaches); Chapter 3 Proposed System Design (3.1 Overview; 3.2 Human Tracking; 3.3 3D Human Pose Estimation; 3.4 Score Calculation: 3.4.1 Posture Score Calculation, 3.4.2 Output of the Proposed System); Chapter 4 Validation Experiment (4.1 Hypotheses; 4.2 Methods: 4.2.1 Participants, 4.2.2 Apparatus, 4.2.3 Procedure, 4.2.4 Data Analysis; 4.3 Results: 4.3.1 RULA, 4.3.2 REBA, 4.3.3 Evaluation Items for Manual Input); Chapter 5 Discussion (5.1 Group Difference: 5.1.1 RULA, 5.1.2 REBA; 5.2 Evaluation Items for Manual Input; 5.3 Proposed Work Pose Entry System); Chapter 6 Conclusion (6.1 Conclusion; 6.2 Limitation, Contribution, and Future Direction); Bibliography; Abstract in Korean
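The score-calculation step such a system automates can be sketched for one body segment. The angle cut-offs below follow the published RULA worksheet (upper-arm base score and final action levels); the function names and the example input are my own illustration, and the real system combines many more segment scores through lookup tables.

```python
def rula_upper_arm_score(flexion_deg):
    """RULA upper-arm base score from shoulder flexion(+)/extension(-) in degrees."""
    if -20 <= flexion_deg <= 20:
        return 1                      # 20 deg extension to 20 deg flexion
    if flexion_deg < -20 or flexion_deg <= 45:
        return 2                      # extension beyond 20 deg, or 20-45 deg flexion
    if flexion_deg <= 90:
        return 3                      # 45-90 deg flexion
    return 4                          # more than 90 deg flexion

def action_level(grand_score):
    """RULA action level from the final grand score (1-7 scale)."""
    if grand_score <= 2:
        return 1                      # posture acceptable
    if grand_score <= 4:
        return 2                      # further investigation, change may be needed
    if grand_score <= 6:
        return 3                      # investigate and change soon
    return 4                          # investigate and change immediately

example = rula_upper_arm_score(30), action_level(5)  # (2, 3)
```

Encoding the worksheet as plain threshold functions is what lets an automated pipeline turn estimated joint angles into the same scores an ergonomist would record by observation.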