
    Basic daily activity recognition with a data glove

    Many people worldwide are affected by Alzheimer's disease, which can lead to dysfunction of the hand. On the one hand, this symptom is not the most prominent of the disease and receives little attention. On the other hand, the literature offers two main solutions, computer vision and data gloves, for recognizing hand gestures in virtual reality or robotic applications. Starting from this finding and this need, we developed our own data glove prototype to monitor the evolution of hand dysfunction by recognizing the objects involved in basic daily activities. Our approach is simple, cheap (~$220), and effective (~100% correct predictions), since it abstracts away the theory of gesture recognition. The raw data can also be accessed directly and easily. Finally, the proposed prototype is described in enough detail for researchers to reproduce it.
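
    The abstract gives no implementation details, but the setup it describes (raw glove sensor readings mapped to the object being handled) can be illustrated with a small classification sketch. The sensor count, object labels, classifier, and data below are assumptions for illustration, not the authors' design.

```python
# Minimal sketch (not the authors' code): classifying the object handled in a
# daily activity from raw data-glove sensor readings. Sensor count, object
# labels, and the synthetic data are assumptions for illustration.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
N_SENSORS = 10                      # e.g. flex + pressure sensors on the glove
OBJECTS = ["cup", "fork", "toothbrush", "book"]

# Synthetic stand-in for recorded grasps: each object yields a characteristic
# sensor pattern plus noise.
X, y = [], []
for label, _ in enumerate(OBJECTS):
    prototype = rng.uniform(0.0, 1.0, N_SENSORS)
    for _ in range(50):
        X.append(prototype + rng.normal(0.0, 0.05, N_SENSORS))
        y.append(label)
X, y = np.array(X), np.array(y)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```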

    Precision and power grip detection in egocentric hand-object interaction using machine learning

    This project was carried out in Yverdon-les-Bains, Switzerland, as a collaboration between the University of Applied Sciences and Arts Western Switzerland (HEIG-VD / HES-SO) and the Centre Hospitalier Universitaire Vaudois (CHUV) in Lausanne. It focuses on the detection of grasp types from an egocentric point of view. The objective is to accurately determine the kind of grasp (power, precision, or none) performed by a user based on images captured from their perspective. A successful grasp detection system would greatly benefit the evaluation of patients undergoing upper limb rehabilitation. Various computer vision frameworks were used to detect hands, interacting objects, and depth information in the images; the extracted features were then fed into deep learning models for grasp prediction. Both custom recorded datasets and open-source datasets, such as EpicKitchen and the Yale dataset, were employed for training and evaluation. In conclusion, the project achieved satisfactory results in the detection of grasp types from an egocentric viewpoint, with a 0.76 F1-macro score on the final test set. The use of diverse videos, including custom recordings and publicly available datasets, enabled comprehensive training and evaluation. A robust pipeline was developed through iterative refinement, extracting the key features from each frame to predict grasp types accurately. Furthermore, data mixtures were proposed to enlarge the dataset and improve the generalization performance of the models, which played a crucial role in the project's final stages.
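
    As a rough illustration of the per-frame pipeline described above (hand, object, and depth features fed into a learned model that outputs power / precision / none), here is a hedged sketch of the classification stage only. The feature layout, model choice, and data are hypothetical placeholders, not the project's actual implementation.

```python
# Minimal sketch (not the project's pipeline): per-frame grasp-type prediction
# from pre-extracted hand/object/depth features. Feature extraction is assumed
# to have happened upstream; only the classification stage is shown.
from dataclasses import dataclass
from typing import List
import numpy as np
from sklearn.neural_network import MLPClassifier

GRASP_CLASSES = ["none", "precision", "power"]

@dataclass
class FrameFeatures:
    hand_landmarks: np.ndarray   # e.g. flattened 21x2 hand keypoints
    object_bbox: np.ndarray      # e.g. [x, y, w, h] of the interacting object
    mean_hand_depth: float       # e.g. from a monocular depth estimate

    def vector(self) -> np.ndarray:
        return np.concatenate(
            [self.hand_landmarks, self.object_bbox, [self.mean_hand_depth]])

def train(frames: List[FrameFeatures], labels: List[int]) -> MLPClassifier:
    X = np.stack([f.vector() for f in frames])
    return MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                         random_state=0).fit(X, labels)

# Synthetic usage with random features, for illustration only.
rng = np.random.default_rng(0)
frames = [FrameFeatures(rng.normal(size=42), rng.uniform(size=4),
                        float(rng.uniform(0.3, 1.5))) for _ in range(60)]
labels = rng.integers(0, len(GRASP_CLASSES), size=60).tolist()
model = train(frames, labels)
print(GRASP_CLASSES[model.predict([frames[0].vector()])[0]])
```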

    Analysis of the hands in egocentric vision: A survey

    Egocentric vision (a.k.a. first-person vision - FPV) applications have thrived over the past few years, thanks to the availability of affordable wearable cameras and large annotated datasets. The position of the wearable camera (usually mounted on the head) allows recording exactly what the camera wearers have in front of them, in particular hands and manipulated objects. This intrinsic advantage enables the study of the hands from multiple perspectives: localizing hands and their parts within the images; understanding what actions and activities the hands are involved in; and developing human-computer interfaces that rely on hand gestures. In this survey, we review the literature that focuses on the hands using egocentric vision, categorizing the existing approaches into: localization (where are the hands or parts of them?); interpretation (what are the hands doing?); and application (e.g., systems that use egocentric hand cues to solve a specific problem). Moreover, a list of the most prominent datasets with hand-based annotations is provided.

    Computational Learning for Hand Pose Estimation

    Rapid advances in human–computer interaction interfaces have promised increasingly realistic environments for gaming and entertainment over the last few years. However, traditional input devices such as trackballs, keyboards, or joysticks remain a bottleneck for natural interaction between a human and a computer, as their two degrees of freedom cannot suitably emulate interactions in three-dimensional space. Consequently, comprehensive hand tracking is expected to become a smart and intuitive alternative to these input tools, enhancing virtual and augmented reality experiences. In addition, the recent emergence of low-cost depth-sensing cameras has led to the broad use of RGB-D data in computer vision, raising expectations of a full 3D interpretation of hand movements for human–computer interaction interfaces. Although the use of hand gestures or hand postures has become essential for a wide range of applications in computer games and augmented/virtual reality, 3D hand pose estimation is still an open and challenging problem for the following reasons: (i) the hand pose exists in a high-dimensional space because each finger and the palm are associated with several degrees of freedom; (ii) the fingers exhibit self-similarity and often occlude each other; (iii) global 3D rotations make pose estimation more difficult; and (iv) hands occupy only a few pixels in images, and the noise in the acquired data, coupled with fast finger movement, confounds continuous hand tracking. The success of hand tracking naturally depends on synthesizing our knowledge of the hand (i.e., geometric shape, constraints on pose configurations) with latent features about hand poses extracted from the RGB-D data stream (i.e., region of interest, key feature points such as fingertips and joints, and temporal continuity).
    In this thesis, we propose novel methods that leverage the paradigm of analysis by synthesis and create a prediction model using a population of realistic 3D hand poses. The overall goal of this work is to design a concrete framework so that computers can learn and understand perceptual attributes of human hands (i.e., self-occlusions or self-similarities of the fingers), and to develop a pragmatic solution to the real-time hand pose estimation problem implementable on a standard computer. This thesis can be broadly divided into four parts: learning the hand (i) from recommendations of similar hand poses, (ii) from low-dimensional visual representations, (iii) by hallucinating geometric representations, and (iv) from a manipulated object. Each part covers our algorithmic contributions to the 3D hand pose estimation problem. Additionally, the research work in the appendix proposes a pragmatic technique for applying our ideas to mobile devices with low computational power. Following this structure, we first review the most relevant works on depth sensor-based 3D hand pose estimation, both with and without a manipulated object. The two prevalent categories of approaches, model-based methods and appearance-based methods, are discussed in detail. In this chapter, we also introduce some works relevant to deep learning and attempts to achieve efficient compression of the network structure. Next, we describe a synthetic 3D hand model and its motion constraints for simulating realistic human hand movements. The primary research work starts in the following chapter.
    We discuss our attempts to produce a better estimation model for 3D hand pose estimation by learning hand articulations from recommendations of similar poses. Specifically, the unknown pose parameters for input depth data are estimated by collaboratively learning from the known parameters of all neighborhood poses. Subsequently, we discuss deep-learned, discriminative, low-dimensional features and a hierarchical solution of the stated problem based on the matrix completion framework. This work is further extended by incorporating a function of geometric properties on the surface of the hand described by heat diffusion, which robustly captures both the local geometry of the hand and global structural representations. The problem of the hand's interaction with a physical object is considered in the following chapter. The main insight is that the interacting object can be a source of constraints on hand poses. In this view, we employ the dependency of the pose on the shape of the object to learn discriminative features of the hand–object interaction, rather than losing hand information to partial or full object occlusions. Subsequently, we present a compressive learning technique in the appendix. Our approach is flexible, enabling us to add more layers and go deeper in the deep learning architecture while keeping the number of parameters the same. Finally, we conclude this thesis by summarizing the presented approaches for hand pose estimation and propose future directions to further improve performance through (i) realistically rendered synthetic hand images, (ii) incorporating RGB images as an input, (iii) hand personalization, (iv) the use of unstructured point clouds, and (v) embedding sensing techniques.
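
    As a loose illustration of the idea of estimating unknown pose parameters from the known parameters of neighbouring poses, here is a minimal nearest-neighbour sketch. It is a simplification for intuition only, not the matrix-completion formulation developed in the thesis; the feature and pose dimensions are assumptions.

```python
# Minimal sketch: estimate hand pose parameters by blending the poses of the
# nearest neighbours in feature space. This simplifies the thesis's
# collaborative / matrix-completion formulation; dimensions are assumptions.
import numpy as np

def knn_pose_estimate(query_feat: np.ndarray,
                      db_feats: np.ndarray,   # (N, F) depth-image descriptors
                      db_poses: np.ndarray,   # (N, P) known joint parameters
                      k: int = 5) -> np.ndarray:
    """Blend the joint parameters of the k most similar database poses,
    weighted by inverse feature distance."""
    d = np.linalg.norm(db_feats - query_feat, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] + 1e-8)
    w /= w.sum()
    return (w[:, None] * db_poses[idx]).sum(axis=0)

# Illustration with random data: 1000 synthetic poses, 64-D features, 26 DoF.
rng = np.random.default_rng(0)
feats, poses = rng.normal(size=(1000, 64)), rng.uniform(size=(1000, 26))
print(knn_pose_estimate(feats[0] + 0.01 * rng.normal(size=64), feats, poses))
```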

    Design and recognition of microgestures for always-available input

    Gestural user interfaces for computing devices most commonly require the user to have at least one hand free to interact with the device, for example by moving a mouse, touching a screen, or performing mid-air gestures. Consequently, users find it difficult to operate computing devices while holding or manipulating everyday objects. This prevents users from interacting with the digital world during a significant portion of their everyday activities, such as using tools in the kitchen or workshop, carrying items, or working out with sports equipment. This thesis pushes the boundaries towards the bigger goal of enabling always-available input. Microgestures have been recognized for their potential to facilitate direct and subtle interactions. However, it remains an open question how to interact with computing devices using gestures when both of the user's hands are occupied holding everyday objects.
    We take a holistic approach and focus on three core contributions: i) To understand end-user preferences, we present an empirical analysis of users' choice of microgestures when holding objects of diverse geometries. Instead of designing a gesture set for a specific object or geometry, and in order to identify gestures that generalize, this thesis leverages the taxonomy of grasp types established in prior research. ii) We tackle the critical problem of avoiding false activation by introducing a novel gestural input concept that leverages a single-finger movement that stands out from everyday finger motions while holding and manipulating objects. Through a data-driven approach, we also systematically validate the concept's robustness against different everyday actions. iii) While full sensor coverage of the user's hand would allow detailed sensing of hand-object interaction, minimal instrumentation is desirable for real-world use. This thesis addresses the problem of identifying sparse sensor layouts. We present the first rapid computational method, along with a GUI-based design tool, that enables iterative design based on the designer's high-level requirements. Furthermore, we demonstrate that minimal form-factor devices, such as smart rings, can be used to effectively detect microgestures in hands-free and busy scenarios. Overall, the presented findings serve as both conceptual and technical foundations for enabling interaction with computing devices wherever and whenever users need them.
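
    The sparse sensor layout problem described in contribution iii) can be illustrated with a simple greedy selection sketch: repeatedly add the sensor channel that most improves cross-validated gesture classification accuracy. This is an assumption-laden stand-in for intuition, not the rapid computational method presented in the thesis; the data, channel count, and budget are made up.

```python
# Minimal sketch (not the thesis's method): greedy forward selection of a
# sparse sensor layout that best preserves microgesture classification
# accuracy. Data, sensor count, and budget are assumptions.
from typing import List
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def select_sensors(X: np.ndarray, y: np.ndarray, budget: int) -> List[int]:
    """Greedily pick the sensor channels that maximize cross-validated accuracy."""
    chosen: List[int] = []
    remaining = list(range(X.shape[1]))
    while len(chosen) < budget and remaining:
        def score(s: int) -> float:
            clf = LogisticRegression(max_iter=1000)
            return cross_val_score(clf, X[:, chosen + [s]], y, cv=3).mean()
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Illustration: 16 candidate sensor channels, 4 gesture classes, random data.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 16))
y = rng.integers(0, 4, size=120)
print("selected channels:", select_sensors(X, y, budget=4))
```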

    Grasping for the Task: Human Principles for Robot Hands

    The significant advances made in the design and construction of anthropomorphic robot hands endow them with prehensile abilities approaching those of humans. However, using these powerful hands with the same level of expertise that humans display is a big challenge for robots. Traditional approaches use fingertip (precision) or enveloping (power) methods to generate the best force-closure grasps. However, this ignores the variety of prehensile postures available to the hand, as well as the larger context of arm action. This thesis explores a paradigm for grasp formation based on generating oppositional pressure within the hand, which has been proposed as a functional basis for grasping in humans (MacKenzie and Iberall, 1994). A set of opposition primitives encapsulates the hand's ability to generate oppositional forces. The oppositional intention encoded in a primitive serves as a guide to match the hand to the object, to quantify its functional ability, and to relate this to the arm. In this thesis we leverage the properties of opposition primitives both to interpret grasps formed by humans and to construct grasps for a robot in the larger context of arm action.
    In the first part of the thesis we examine the hypothesis that hand representation schemes based on opposition are correlated with hand function. We propose hand parameters describing oppositional intention and compare them with commonly used representations such as joint angles, joint synergies, and shape features. We expect that opposition-based parameterizations, which take an interaction-based perspective of a grasp, are able to discriminate between grasps that are similar in shape but different in functional intent. We test this hypothesis using the qualitative assessments of precision and power capabilities found in existing grasp taxonomies.
    The next part of the thesis presents a general method to recognize the oppositional intention manifested in human grasp demonstrations. A data glove instrumented with tactile sensors provides the raw information about hand configuration and interaction force. In a grasp combining several cooperating oppositional intentions, hand surfaces can be simultaneously involved in multiple oppositional roles. We characterize the low-level interactions between different surfaces of the hand based on the captured interaction force and the reconstructed hand surface geometry. This is subsequently used to separate out and prioritize the multiple, possibly overlapping, oppositional intentions present in the demonstrated grasp. We evaluate our method on several human subjects across a wide range of hand functions.
    The last part of the thesis applies the properties encoded in opposition primitives to optimize task performance of the arm for tasks where the arm assumes the dominant role. For these tasks, choosing the strongest power grasp available (in a force-closure sense) may constrain the arm to a sub-optimal configuration. Weaker grasp components impose fewer constraints on the hand and can therefore explore a wider region of the object-relative pose space. We take advantage of this to find good arm configurations from a task perspective. The final hand-arm configuration is obtained by trading off overall grasp robustness against the ability of the arm to perform the task. We validate our approach on the tasks of cutting, hammering, screw-driving, and opening a bottle cap, for both human and robotic hand-arm systems.
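
    The final trade-off described above, between grasp robustness and the arm's ability to perform the task, can be illustrated with a toy scoring sketch. The candidate scores and the weighting scheme below are hypothetical placeholders, not the thesis's formulation.

```python
# Minimal sketch (not the thesis's formulation): pick a hand-arm configuration
# by trading off grasp robustness against arm task performance. Scores and
# weighting are hypothetical.
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    grasp_robustness: float   # e.g. a force-closure quality measure in [0, 1]
    task_score: float         # e.g. arm manipulability along the task direction

def best_candidate(candidates: List[Candidate], alpha: float = 0.4) -> Candidate:
    """Higher alpha favours grasp robustness; lower alpha favours the task."""
    return max(candidates,
               key=lambda c: alpha * c.grasp_robustness + (1 - alpha) * c.task_score)

candidates = [
    Candidate("strong power grasp, awkward arm pose", 0.95, 0.30),
    Candidate("weaker precision grasp, well-posed arm", 0.60, 0.85),
]
print(best_candidate(candidates).name)
```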

    A Taxonomy of Everyday Grasps in Action

    Grasping has been well studied in the robotics and human subjects literature, and numerous taxonomies have been developed to capture the range of grasps employed in work settings or everyday life. But how completely do these taxonomies capture the grasping actions that we see every day? We asked two subjects to monitor every action that they performed with their hands during a typical day, as well as to role-play actions important for self-care, rehabilitation, and various careers, and then to classify all grasping actions using existing taxonomies. While our subjects were able to classify many grasps, they also found a collection of grasps that could not be classified. In addition, our subjects observed that single entries in the taxonomy captured not one grasp but many. When we investigated, we found that these grasps were distinguished by features related to the grasping action, such as intended motion, force, and stiffness, properties also needed for robot control. We suggest a format for augmenting grasp taxonomies that includes features of motion, force, and stiffness, using a language that can be understood and expressed by subjects with light training, as would be needed, for example, for annotating examples or coaching a robot. This paper describes our study and its results, and documents our annotated database.
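
    The suggested augmentation of taxonomy entries with motion, force, and stiffness features could be represented by a simple annotation record, sketched below. The field names and value scales are assumptions, not the paper's actual schema.

```python
# Minimal sketch of a record format for augmenting a grasp taxonomy entry with
# action-related features (motion, force, stiffness), as the paper proposes.
# Field names and value scales are assumptions, not the paper's schema.
from dataclasses import dataclass

@dataclass
class AnnotatedGrasp:
    taxonomy_entry: str   # e.g. "medium wrap" from an existing taxonomy
    intended_motion: str  # e.g. "twist", "translate", "hold still"
    force_level: str      # e.g. "light" / "medium" / "heavy"
    stiffness: str        # e.g. "compliant" / "stiff"
    example_action: str

record = AnnotatedGrasp(
    taxonomy_entry="medium wrap",
    intended_motion="twist",
    force_level="medium",
    stiffness="stiff",
    example_action="opening a jar lid",
)
print(record)
```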