Hand Shape Recognition Using a ToF Camera: An Application to Sign Language

Abstract

This master's thesis investigates the benefit of using depth information acquired by a time-of-flight (ToF) camera for hand shape recognition from unrestricted viewpoints. Specifically, we assess the hypothesis that classical 3D content descriptors may be inappropriate for ToF depth images because of the 2.5D nature and noisiness of the data and the potentially expensive computations in 3D space. Instead, we extend 2D descriptors to make use of the additional semantics of depth images. Our system follows the appearance-based retrieval paradigm and uses a synthetic 3D hand model to generate its database. The system runs at interactive frame rates. For increased robustness, no color, intensity, or time coherence information is used. We introduce a novel, domain-specific algorithm for segmenting the forearm from the upper body, based on reprojecting the acquired geometry into the lateral view. Moreover, three kinds of descriptors exploiting depth data are proposed, and the design choices made are supported experimentally. The whole system is then evaluated on an American Sign Language fingerspelling dataset. The retrieval performance, however, still leaves room for improvement. Several insights and possible reasons are discussed.
