683 research outputs found
Depth Images-based Human Detection, Tracking and Activity Recognition Using Spatiotemporal Features and Modified HMM
Human activity recognition using depth information is an emerging and challenging technology in computer vision due to its considerable attention by many practical applications such as smart home/office system, personal health care and 3D video games. This paper presents a novel framework of 3D human body detection, tracking and recognition from depth video sequences using spatiotemporal features and modified HMM. To detect human silhouette, raw depth data is examined to extract human silhouette by considering spatial continuity and constraints of human motion information. While, frame differentiation is used to track human movements. Features extraction mechanism consists of spatial depth shape features and temporal joints features are used to improve classification performance. Both of'these features are fused together to recognize different activities using the modified hidden Markov model (M-HMM). The proposed approach is evaluated on two challenging depth video datasets. Moreover, our system has significant abilities to handle subject's body parts rotation and body parts missing which provide major contributions in human activity recognition.1165Ysciescopuskc
A Depth Video-based Human Detection and Activity Recognition using Multi-features and Embedded Hidden Markov Models for Health Care Monitoring Systems
Increase in number of elderly people who are living independently needs especial care in the form of healthcare monitoring systems. Recent advancements in depth video technologies have made human activity recognition (HAR) realizable for elderly healthcare applications. In this paper, a depth video-based novel method for HAR is presented using robust multi-features and embedded Hidden Markov Models (HMMs) to recognize daily life activities of elderly people living alone in indoor environment such as smart homes. In the proposed HAR framework, initially, depth maps are analyzed by temporal motion identification method to segment human silhouettes from noisy background and compute depth silhouette area for each activity to track human movements in a scene. Several representative features, including invariant, multi-view differentiation and spatiotemporal body joints features were fused together to explore gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. Then, these features are processed by the dynamics of their respective class and learned, modeled, trained and recognized with specific embedded HMM having active feature values. Furthermore, we construct a new online human activity dataset by a depth sensor to evaluate the proposed features. Our experiments on three depth datasets demonstrated that the proposed multi-features are efficient and robust over the state of the art features for human action and activity recognition
Improved Behavior Monitoring and Classification Using Cues Parameters Extraction from Camera Array Images
Behavior monitoring and classification is a mechanism used to automatically identify or verify individual based on their human detection, tracking and behavior recognition from video sequences captured by a depth camera. In this paper, we designed a system that precisely classifies the nature of 3D body postures obtained by Kinect using an advanced recognizer. We proposed novel features that are suitable for depth data. These features are robust to noise, invariant to translation and scaling, and capable of monitoring fast human bodyparts movements. Lastly, advanced hidden Markov model is used to recognize different activities. In the extensive experiments, we have seen that our system consistently outperforms over three depth-based behavior datasets, i.e., IM-DailyDepthActivity, MSRDailyActivity3D and MSRAction3D in both posture classification and behavior recognition. Moreover, our system handles subject's body parts rotation, self-occlusion and body parts missing which significantly track complex activities and improve recognition rate. Due to easy accessible, low-cost and friendly deployment process of depth camera, the proposed system can be applied over various consumer-applications including patient-monitoring system, automatic video surveillance, smart homes/offices and 3D games
Two Hand Gesture Based 3D Navigation in Virtual Environments
Natural interaction is gaining popularity due to its simple, attractive, and realistic nature, which realizes direct Human Computer Interaction (HCI). In this paper, we presented a novel two hand gesture based interaction technique for 3 dimensional (3D) navigation in Virtual Environments (VEs). The system used computer vision techniques for the detection of hand gestures (colored thumbs) from real scene and performed different navigation (forward, backward, up, down, left, and right) tasks in the VE. The proposed technique also allow users to efficiently control speed during navigation. The proposed technique is implemented via a VE for experimental purposes. Forty (40) participants performed the experimental study. Experiments revealed that the proposed technique is feasible, easy to learn and use, having less cognitive load on users. Finally gesture recognition engines were used to assess the accuracy and performance of the proposed gestures. kNN achieved high accuracy rates (95.7%) as compared to SVM (95.3%). kNN also has high performance rates in terms of training time (3.16 secs) and prediction speed (6600 obs/sec) as compared to SVM with 6.40 secs and 2900 obs/sec
GIFT: Gesture-Based Interaction by Fingers Tracking, an Interaction Technique for Virtual Environment
Three Dimensional (3D) interaction is the plausible human interaction inside a Virtual Environment (VE). The rise of the Virtual Reality (VR) applications in various domains demands for a feasible 3D interface. Ensuring immersivity in a virtual space, this paper presents an interaction technique where manipulation is performed by the perceptive gestures of the two dominant fingers; thumb and index. The two fingertip-thimbles made of paper are used to trace states and positions of the fingers by an ordinary camera. Based on the positions of the fingers, the basic interaction tasks; selection, scaling, rotation, translation and navigation are performed by intuitive gestures of the fingers. Without keeping a gestural database, the features-free detection of the fingers guarantees speedier interactions. Moreover, the system is user-independent and depends neither on the size nor on the color of the users’ hand. With a case-study project; Interactions by the Gestures of Fingers (IGF) the technique is implemented for evaluation. The IGF application traces gestures of the fingers using the libraries of OpenCV at the back-end. At the front-end, the objects of the VE are rendered accordingly using the Open Graphics Library; OpenGL. The system is assessed in a moderate lighting condition by a group of 15 users. Furthermore, usability of the technique is investigated in games. Outcomes of the evaluations revealed that the approach is suitable for VR applications both in terms of cost and accuracy
State of the art of audio- and video based solutions for AAL
Working Group 3. Audio- and Video-based AAL ApplicationsIt is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help facing these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness. Indeed, cameras and microphones are far less obtrusive with respect to the hindrance other wearable sensors may cause to one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals as well as to assess their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary 4 debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach.
This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL. It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users.
The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research project. The open challenges are also highlighted.
The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake in real world settings of AAL technologies. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.publishedVersio
VGAN-Based Image Representation Learning for Privacy-Preserving Facial Expression Recognition
Reliable facial expression recognition plays a critical role in human-machine
interactions. However, most of the facial expression analysis methodologies
proposed to date pay little or no attention to the protection of a user's
privacy. In this paper, we propose a Privacy-Preserving Representation-Learning
Variational Generative Adversarial Network (PPRL-VGAN) to learn an image
representation that is explicitly disentangled from the identity information.
At the same time, this representation is discriminative from the standpoint of
facial expression recognition and generative as it allows expression-equivalent
face image synthesis. We evaluate the proposed model on two public datasets
under various threat scenarios. Quantitative and qualitative results
demonstrate that our approach strikes a balance between the preservation of
privacy and data utility. We further demonstrate that our model can be
effectively applied to other tasks such as expression morphing and image
completion
- …