
    A Survey of Applications and Human Motion Recognition with Microsoft Kinect

    Microsoft Kinect, a low-cost motion sensing device, enables users to interact with computers or game consoles naturally through gestures and spoken commands, without any other peripheral equipment. As such, it has attracted intense interest in research and development on the Kinect technology. In this paper, we present a comprehensive survey of Kinect applications and the latest research and development on motion recognition using data captured by the Kinect sensor. On the applications front, we review the use of the Kinect technology in a variety of areas, including healthcare, education and performing arts, robotics, sign language recognition, retail services, workplace safety training, and 3D reconstruction. On the technology front, we provide an overview of the main features of both versions of the Kinect sensor together with the depth sensing technologies used, and review the literature on human motion recognition techniques used in Kinect applications. We provide a classification of motion recognition techniques to highlight the different approaches used in human motion recognition. Furthermore, we compile a list of publicly available Kinect datasets. These datasets are valuable resources for researchers investigating better methods for human motion recognition and lower-level computer vision tasks such as segmentation, object detection and human pose estimation.
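
    Among the skeleton-based motion recognition approaches such a survey classifies, a simple baseline is to summarize each Kinect skeleton sequence as a fixed-length feature vector and classify it with a nearest-neighbour model. A minimal sketch of that idea, using synthetic stand-in data rather than any of the datasets listed in the paper:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def sequence_features(seq):
    """Summarize a skeleton sequence of shape (frames, joints, 3) as one vector."""
    centered = seq - seq[:, :1, :]          # centre each frame on joint 0 (hip)
    return np.concatenate([centered.mean(axis=0).ravel(),
                           centered.std(axis=0).ravel()])

# Synthetic stand-in for a Kinect dataset: 20 sequences, 2 motion classes that
# differ in range of movement (25 joints, as in the Kinect v2 skeleton).
rng = np.random.default_rng(0)
seqs = [rng.normal(scale=0.1, size=(30, 25, 3)) * (1.0 + label)
        for label in (0, 1) for _ in range(10)]
labels = [0] * 10 + [1] * 10

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit([sequence_features(s) for s in seqs], labels)
print("predicted class:", clf.predict([sequence_features(seqs[0])])[0])
```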

    A review of computer vision-based approaches for physical rehabilitation and assessment

    The computer vision community has extensively researched the area of human motion analysis, which primarily focuses on pose estimation, activity recognition, pose or gesture recognition, and so on. However, for many applications, such as monitoring the functional rehabilitation of patients with musculoskeletal or physical impairments, the requirement is to comparatively evaluate human motion. In this survey, we capture important literature on vision-based monitoring and physical rehabilitation from the past two decades that focuses on comparative evaluation of human motion, and discuss the state of current research in this area. Unlike other reviews in this area, which are written from a clinical objective, this article presents research from a computer vision application perspective. We propose our own taxonomy of computer vision-based rehabilitation and assessment research, further divided into sub-categories to capture the novelties of each research strand. The review discusses the challenges of this domain due to the wide-ranging human motion abnormalities and the difficulty of automatically assessing those abnormalities. Finally, suggestions on future directions of research are offered.
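
    Comparative evaluation of motion, as opposed to recognition, often reduces to measuring how far a patient's trajectory deviates from a reference performance, and dynamic time warping is one common choice in this literature. A minimal sketch, assuming both movements are one-dimensional joint-angle series (the signals below are illustrative, not from any reviewed study):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D motion signals."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible alignments.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

reference = np.sin(np.linspace(0, np.pi, 100))        # idealized exercise
patient = np.sin(np.linspace(0, np.pi, 80)) * 0.7     # slower, reduced range
print("deviation score:", dtw_distance(reference, patient))
```

    The warping step makes the score tolerant to differences in execution speed, so it penalizes reduced range of motion rather than slow performance, which is often the clinically relevant distinction.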

    Vision-based methods for state estimation and control of robotic systems with application to mobile and surgical robots

    For autonomous systems that need to perceive the surrounding environment to accomplish a given task, vision is a highly informative exteroceptive sensory source. When gathering information from the available sensors, the richness of visual data makes it possible to build a complete description of the environment, collecting geometrical and semantic information (e.g., object pose, distances, shapes, colors, lights). The huge amount of collected data allows for methods exploiting the totality of the data (dense approaches) as well as for methods working on a reduced set obtained from feature extraction procedures (sparse approaches). This manuscript presents dense and sparse vision-based methods for control and sensing of robotic systems. First, a safe navigation scheme for mobile robots moving in unknown environments populated by obstacles is presented. For this task, dense visual information is used to perceive the environment (i.e., detect the ground plane and obstacles) and, in combination with other sensory sources, to estimate the robot motion with a linear observer. On the other hand, sparse visual data are extracted in terms of geometric primitives in order to implement a visual servoing control scheme satisfying proper navigation behaviours. This controller relies on visually estimated information and is designed to guarantee safety during navigation. In addition, redundant structures are taken into account to re-arrange the internal configuration of the robot and reduce its encumbrance when the workspace is highly cluttered. Vision-based estimation methods are relevant in other contexts as well. In the field of surgical robotics, having reliable data about unmeasurable quantities is of great importance and critical at the same time. In this manuscript, we present a Kalman-based observer to estimate the 3D pose of a suturing needle held by a surgical manipulator for robot-assisted suturing. The method exploits images acquired by the endoscope of the robot platform to extract relevant geometrical information and obtain projected measurements of the tool pose. This method has also been validated with a novel simulator designed for the da Vinci robotic platform, built to ease interfacing and employment in ideal conditions for testing and validation. The Kalman-based observers mentioned above are classical passive estimators, whose system inputs are theoretically arbitrary; this leaves no way to actively adapt input trajectories in order to optimize specific requirements on estimation performance. For this purpose, the active estimation paradigm is introduced and some related strategies are presented. More specifically, a novel active sensing algorithm employing dense visual information is described for a typical Structure-from-Motion (SfM) problem. The algorithm generates an optimal estimate of a scene observed by a moving camera while minimizing the maximum uncertainty of the estimation. This approach can be applied to any robotic platform and has been validated with a manipulator arm equipped with a monocular camera.
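
    The Kalman-based observer described here is, in its linear form, the classical predict/correct recursion. A minimal sketch of that recursion with placeholder matrices, not the thesis's actual needle-pose model:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/correct cycle of a linear Kalman filter."""
    # Predict: propagate state and covariance through the motion model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Correct: fuse the projected measurement (e.g., image-derived pose cues).
    S = H @ P_pred @ H.T + R                      # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)           # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy 1-D constant-velocity example: state = [position, velocity].
F = np.array([[1.0, 0.1], [0.0, 1.0]])   # motion model (dt = 0.1 s)
H = np.array([[1.0, 0.0]])               # only position is measured
Q, R = np.eye(2) * 1e-4, np.eye(1) * 1e-2
x, P = np.zeros(2), np.eye(2)
for z in [0.1, 0.22, 0.29, 0.41]:        # synthetic measurements
    x, P = kalman_step(x, P, np.array([z]), F, H, Q, R)
print("estimated position, velocity:", x)
```

    The passive nature criticized in the abstract is visible here: the filter consumes whatever measurements arrive, which is exactly what the active sensing paradigm replaces by choosing camera trajectories that shrink the worst-case uncertainty.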

    Hand tracking for clinical applications: validation of the Google MediaPipe Hand (GMH) and the depth-enhanced GMH-D frameworks

    Accurate 3D tracking of hand and finger movements poses significant challenges in computer vision. The potential applications span multiple domains, including human-computer interaction, virtual reality, industry, and medicine. While gesture recognition has achieved remarkable accuracy, quantifying fine movements remains a hurdle, particularly in clinical applications where the assessment of hand dysfunctions and of rehabilitation training outcomes necessitates precise measurements. Several novel and lightweight frameworks based on deep learning have emerged to address this issue; however, their performance in accurately and reliably measuring finger movements requires validation against well-established gold-standard systems. In this paper, the aim is to validate the hand-tracking framework implemented by Google MediaPipe Hand (GMH) and an innovative enhanced version, GMH-D, that exploits the depth estimation of an RGB-Depth camera to achieve more accurate tracking of 3D movements. Three dynamic exercises commonly administered by clinicians to assess hand dysfunctions, namely Hand Opening-Closing, Single Finger Tapping and Multiple Finger Tapping, are considered. Results demonstrate high temporal and spectral consistency of both frameworks with the gold standard. However, the enhanced GMH-D framework exhibits superior accuracy in spatial measurements compared to the baseline GMH, for both slow and fast movements. Overall, our study contributes to the advancement of hand-tracking technology, establishes a validation procedure as good practice for proving the efficacy of deep-learning-based hand tracking, and proves the effectiveness of GMH-D as a reliable framework for assessing 3D hand movements in clinical applications.
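
    The GMH-D idea, as the abstract describes it, is to keep MediaPipe's 2-D landmark tracking but take depth from the RGB-D sensor instead of MediaPipe's relative z estimate. A minimal sketch of that combination, assuming an aligned depth frame in millimetres accompanies each color frame; the per-pixel depth lookup below is an assumption for illustration, not the paper's exact procedure:

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)

def hand_landmarks_3d(color_bgr, depth_mm):
    """Return (x_px, y_px, z_mm) per landmark: MediaPipe 2-D + sensor depth."""
    h, w = depth_mm.shape
    result = hands.process(cv2.cvtColor(color_bgr, cv2.COLOR_BGR2RGB))
    if not result.multi_hand_landmarks:
        return None
    points = []
    for lm in result.multi_hand_landmarks[0].landmark:
        u = min(max(int(lm.x * w), 0), w - 1)   # normalized -> pixel coords
        v = min(max(int(lm.y * h), 0), h - 1)
        # Assumption: substitute the aligned depth value for MediaPipe's
        # relative z, in the spirit of the GMH-D enhancement described above.
        points.append((u, v, float(depth_mm[v, u])))
    return points
```

    A production version would also need temporal smoothing and handling of invalid (zero) depth pixels near the finger silhouette, where depth sensors are typically noisy.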

    Analysis of the hands in egocentric vision: A survey

    Egocentric vision (a.k.a. first-person vision - FPV) applications have thrived over the past few years, thanks to the availability of affordable wearable cameras and large annotated datasets. The position of the wearable camera (usually mounted on the head) allows recording exactly what the camera wearers have in front of them, in particular hands and manipulated objects. This intrinsic advantage enables the study of the hands from multiple perspectives: localizing hands and their parts within the images; understanding what actions and activities the hands are involved in; and developing human-computer interfaces that rely on hand gestures. In this survey, we review the literature that focuses on the hands using egocentric vision, categorizing the existing approaches into: localization (where are the hands or parts of them?); interpretation (what are the hands doing?); and application (e.g., systems that use egocentric hand cues to solve a specific problem). Moreover, a list of the most prominent datasets with hand-based annotations is provided.

    Considerations for the future development of virtual technology as a rehabilitation tool

    BACKGROUND: Virtual environments (VE) are a powerful tool for various forms of rehabilitation. Coupling VE with high-speed networking (Tele-Immersion) that approaches speeds of 100 Gb/sec can greatly expand its influence in rehabilitation. These new networks will permit various peripherals attached to computers on the network to be connected and to act as fast as if connected to a local PC. This innovation may soon allow the development of previously unheard-of networked rehabilitation systems. Rapid advances in this technology need to be coupled with an understanding of how human behavior is affected when immersed in the VE. METHODS: This paper discusses various forms of VE currently available for rehabilitation. We describe the characteristics of these new networks and examine how they might be used to extend the rehabilitation clinic to remote areas. In addition, we present data from an immersive dynamic virtual environment united with the motion of a posture platform to record biomechanical and physiological responses to combined visual, vestibular, and proprioceptive inputs. A 6 degree-of-freedom force plate provides measurements of moments exerted on the base of support. Kinematic data from the head, trunk, and lower limb were collected using 3-D video motion analysis. RESULTS: Our data suggest that when there is a confluence of meaningful inputs, none of the visual, vestibular, or proprioceptive inputs are suppressed in healthy adults; the postural response is modulated by all existing sensory signals in a non-additive fashion. Individual perception of the sensory structure appears to be a significant component of the response to these protocols and underlies much of the observed response variability. CONCLUSION: The ability to provide new technology for rehabilitation services is emerging as an important option for clinicians and patients. Data mining software would help analyze the incoming data to provide both the patient and the therapist with an evaluation of the current treatment and the modifications needed for future therapies. Quantification of individual perceptual styles in the VE will support the development of individualized treatment programs. The virtual environment can be a valuable tool for therapeutic interventions that require adaptation to complex, multimodal environments.
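
    Postural responses on a force platform of the kind described above are typically summarized by the center of pressure, which follows directly from the six measured force and moment components. A minimal sketch of that standard computation, assuming the plate surface coincides with the sensor origin so the vertical-offset term vanishes:

```python
def center_of_pressure(fx, fy, fz, mx, my, mz):
    """Center of pressure (m) from 6-DOF force-plate signals (N, N*m).

    Assumes the plate surface is at the sensor origin; a real plate needs
    the manufacturer's vertical offset added to the formula. The free
    moment mz is unused here but is part of the 6-DOF measurement.
    """
    cop_x = -my / fz
    cop_y = mx / fz
    return cop_x, cop_y

# Toy sample: quiet stance with body weight slightly forward and to one side.
print(center_of_pressure(fx=2.0, fy=-1.5, fz=700.0, mx=-14.0, my=-21.0, mz=0.5))
```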

    A Low-Cost System Using a Big-Data Deep-Learning Framework for Assessing Physical Telerehabilitation: A Proof-of-Concept

    The consolidation of telerehabilitation for the treatment of many diseases over the last decades is a consequence of its cost-effective results and its ability to offer access to rehabilitation in remote areas. Telerehabilitation operates over a distance, so vulnerable patients are never exposed to unnecessary risks. Despite its low cost, the need for a professional to assess therapeutic exercises and proper corporal movements online should also be mentioned. The focus of this paper is on a telerehabilitation system for patients suffering from Parkinson's disease in remote villages and other less accessible locations. A full-stack system is presented, built on big data frameworks, that facilitates communication between the patient and the occupational therapist, the recording of each session, and real-time skeleton identification using artificial intelligence techniques. Big data technologies are used to process the numerous videos generated while treating simultaneous patients. Moreover, the skeleton of each patient can be estimated using deep neural networks for automated evaluation of corporal exercises, which is of immense help to the therapists in charge of the treatment programs.

    This work was supported by project PI19/00670 of the Ministerio de Ciencia, Innovación y Universidades, Instituto de Salud Carlos III, Spain. The authors gratefully acknowledge the support of the NVIDIA Corporation and its donation of the TITAN Xp GPU used in this research. In addition, this work was partially supported by the European Social Fund, as the authors José Miguel Ramírez-Sanz, José Luis Garrido-Labrador, and Alicia Olivares-Gil are recipients of a pre-doctoral grant (EDU/875/2021) from the Consejería de Educación de la Junta de Castilla y León.
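
    Automated evaluation of corporal exercises from an estimated skeleton, as described in the abstract above, usually reduces to per-frame joint geometry. A minimal sketch of one such measure, elbow flexion from three keypoints scored against a therapist-defined target range; the keypoint layout and thresholds are illustrative assumptions, not the system's actual criteria:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at b (degrees) formed by keypoints a-b-c, e.g. shoulder-elbow-wrist."""
    v1, v2 = a - b, c - b
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

def score_repetition(frames, lo=40.0, hi=160.0):
    """Fraction of the target flexion range [lo, hi] covered by the repetition."""
    angles = [joint_angle(f["shoulder"], f["elbow"], f["wrist"]) for f in frames]
    covered = min(max(angles), hi) - max(min(angles), lo)
    return max(covered, 0.0) / (hi - lo)

# Two synthetic frames of a partial elbow bend (3-D keypoints in metres).
frames = [
    {"shoulder": np.array([0, 1.4, 0]), "elbow": np.array([0, 1.1, 0]),
     "wrist": np.array([0.0, 0.80, 0.05])},
    {"shoulder": np.array([0, 1.4, 0]), "elbow": np.array([0, 1.1, 0]),
     "wrist": np.array([0.0, 0.95, 0.28])},
]
print("range-of-motion score:", score_repetition(frames))
```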

    Recent Developments and Future Challenges in Medical Mixed Reality

    As AR technology matures, we have seen many applications emerge in entertainment, education and training. However, the use of AR is not yet common in medical practice, despite the great potential of this technology to help not only learning and training in medicine, but also diagnosis and surgical guidance. In this paper, we present recent trends in the use of AR across all medical specialties and identify challenges that must be overcome to narrow the gap between academic research and practical use of AR in medicine. A database of 1403 relevant research papers published over the last two decades has been reviewed using a novel research trend analysis method based on text mining. We semantically identified 10 topics, covering a variety of technologies and applications, from the unbiased and impersonal clustering results of a Latent Dirichlet Allocation (LDA) model, and analysed the trend of each topic from 1995 to 2015. The statistical results reveal a taxonomy that best describes the development of medical AR research during these two decades, and the trend analysis provides a higher-level view of how the taxonomy has changed and where the focus is heading. Finally, based on these results, we provide an insightful discussion of the current limitations, challenges and future directions in the field. Our objective is to aid researchers in focusing on the application areas in medical AR that are most needed, as well as to provide medical practitioners with the latest technology advancements.
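
    The trend analysis rests on LDA topic modeling over the paper corpus. A minimal sketch of the same ingredients using scikit-learn; the four-document corpus below is a stand-in for the 1403 abstracts, and only the topic count of 10 is taken from the abstract:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Stand-in corpus; the paper mines 1403 medical AR abstracts from 1995-2015.
docs = [
    "augmented reality surgical navigation and guidance",
    "head mounted display for medical training and education",
    "image registration for augmented reality guided surgery",
    "virtual reality simulation in medical education",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)

# 10 topics, matching the number reported in the abstract (a real corpus
# this small would of course support far fewer meaningful topics).
lda = LatentDirichletAllocation(n_components=10, random_state=0).fit(X)

terms = vectorizer.get_feature_names_out()
for k, weights in enumerate(lda.components_):
    top = [terms[i] for i in weights.argsort()[-3:][::-1]]
    print(f"topic {k}: {' '.join(top)}")
```

    Fitting per-year document-topic distributions and plotting their averages over time is then enough to reproduce the kind of topic-trend curves the paper analyses.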