1,186 research outputs found

    RGB-D datasets using microsoft kinect or similar sensors: a survey

    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It combines the advantages of the color image, which provides appearance information about an object, with those of the depth image, which is immune to variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance for benchmarking the state of the art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description of the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms.

    Biometric Spoofing: A JRC Case Study in 3D Face Recognition

    Based on newly available and affordable off-the-shelf 3D sensing, processing and printing technologies, the JRC has conducted a comprehensive study on the feasibility of spoofing 3D and 2.5D face recognition systems with low-cost self-manufactured models. This report presents a systematic and rigorous evaluation of the real risk posed by such an attack approach, complemented by a test campaign. The work presented in this report covers theories, methodologies, state-of-the-art techniques and evaluation databases, and also aims at providing an outlook into the future of this extremely active field of research. JRC.G.6-Digital Citizen Securit

    Multimodal Affective State Recognition in Serious Games Applications

    A challenging research issue, which has recently attracted a lot of attention, is the incorporation of emotion recognition technology in serious games applications, in order to improve the quality of interaction and enhance the gaming experience. To this end, in this paper, we present an emotion recognition methodology that utilizes information extracted from multimodal fusion analysis to identify the affective state of players during gameplay scenarios. More specifically, two monomodal classifiers have been designed for extracting affective state information based on facial expression and body motion analysis. For the combination of the different modalities, a deep model is proposed that is able to make a decision about the player’s affective state, while also being robust in the absence of one information cue. In order to evaluate the performance of our methodology, a bimodal database was created using Microsoft’s Kinect sensor, containing feature vectors extracted from users' facial expressions and body gestures. The proposed method achieved a higher recognition rate than the monomodal classifiers, as well as early-fusion algorithms. Our methodology outperforms all other classifiers, achieving an overall recognition rate of 98.3%.

    Towards Reversible De-Identification in Video Sequences Using 3D Avatars and Steganography

    We propose a de-identification pipeline that protects the privacy of humans in video sequences by replacing them with rendered 3D human models, hence concealing their identity while retaining the naturalness of the scene. The original images of humans are steganographically encoded in the carrier image, i.e. the image containing the original scene and the rendered 3D human models. We qualitatively explore the feasibility of our approach, utilizing the Kinect sensor and its libraries to detect and localize human joints. A 3D avatar is rendered into the scene using the obtained joint positions, and the original human image is steganographically encoded in the new scene. Our qualitative evaluation shows reasonably good results that merit further exploration. Comment: Part of the Proceedings of the Croatian Computer Vision Workshop, CCVW 2015, Year

    Hand gesture recognition using Kinect.

    Hand gesture recognition (HGR) is an important research topic because some situations require silent communication with sign languages. Computational HGR systems assist silent communication, and help people learn a sign language. In this thesis, a novel method for contact-less HGR using Microsoft Kinect for Xbox is described, and a real-time HGR system is implemented with Microsoft Visual Studio 2010. Two different scenarios for HGR are provided: the Popular Gesture with nine gestures, and the Numbers with nine gestures. The system allows the users to select a scenario, and it is able to detect hand gestures made by users, to identify fingers, to recognize the meanings of gestures, and to display the meanings and pictures on screen. The accuracy of the HGR system is from 84% to 99% with single hand gestures, and from 90% to 100% if both hands perform the same gesture at the same time. Because the depth sensor of Kinect is an infrared camera, the lighting conditions, signers' skin colors and clothing, and background have little impact on the performance of this system. The accuracy and the robustness make this system a versatile component that can be integrated in a variety of applications in daily life.

    State of the art of audio- and video based solutions for AAL

    Working Group 3. Audio- and Video-based AAL Applications. It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to the demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, thus highlighting the need for taking action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to the high potential they have in enabling remote care and support. Broadly speaking, AAL can be referred to as the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply the persons in need with smart assistance, by responding to their necessities of autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairment. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, to be able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them in their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages, in terms of unobtrusiveness and information richness.
Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one’s activities. In addition, a single camera placed in a room can record most of the activities performed in the room, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of the assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they can have a large range of sensing, do not require physical presence at a particular location and are physically intangible. Moreover, relevant information about individuals’ activities and health status can derive from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate setting where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL ensuring ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL.
It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethical-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to cope with acceptability, usability and trust in the AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potentials coming from the silver economy are overviewed.

    Understanding public speakers’ performance: first contributions to support a computational approach

    Communication is part of our everyday life and our ability to communicate can have a significant role in a variety of contexts in our personal, academic, and professional lives. For a long time, the characterization of what makes a good communicator has been subject to research and debate in several areas, particularly in Education, with a focus on improving the performance of teachers. In this context, the literature suggests that the ability to communicate is not only defined by the verbal component, but also by a plethora of non-verbal contributions providing redundant or complementary information, and, sometimes, being the message itself. However, even though we can recognize a good or bad communicator, objectively, little is known about what aspects, and to what extent, define the quality of a presentation. The goal of this work is to create the grounds to support the study of the defining characteristics of a good communicator in a more systematic and objective form. To this end, we conceptualize and provide a first prototype for a computational approach to characterize the different elements that are involved in communication, from audiovisual data, illustrating the outcomes and applicability of the proposed methods on a video database of public speakers.

    Artificial Vision Algorithms for Socially Assistive Robot Applications: A Review of the Literature

    Today, computer vision algorithms are very important for different fields and applications, such as closed-circuit television security, health status monitoring, recognizing a specific person or object, and robotics. Regarding this topic, the present paper reviews the recent literature on computer vision algorithms (recognition and tracking of faces, bodies, and objects) oriented towards socially assistive robot applications. The performance, frames per second (FPS) processing speed, and hardware implemented to run the algorithms are highlighted by comparing the available solutions. Moreover, this paper provides general information for researchers interested in knowing which vision algorithms are available, enabling them to select the one that is most suitable to include in their robotic system applications. Conacyt doctoral scholarship, CVU No.: 64683