
    GART: The Gesture and Activity Recognition Toolkit

    Presented at the 12th International Conference on Human-Computer Interaction, Beijing, China, July 2007. The original publication is available at www.springerlink.com. The Gesture and Activity Recognition Toolkit (GART) is a user interface toolkit designed to enable the development of gesture-based applications. GART provides an abstraction to machine learning algorithms suitable for modeling and recognizing different types of gestures. The toolkit also provides support for data collection and the training process. In this paper, we present GART and its machine learning abstractions. Furthermore, we detail the components of the toolkit and present two example gesture recognition applications.
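    The GART abstract describes an abstraction over machine learning algorithms for modeling and recognizing gestures, plus support for data collection and training. The following is a minimal Python sketch of that general pattern, not GART's actual API: one model per gesture class (here hidden Markov models via the hmmlearn library, a common choice for gesture recognition), with recognition by maximum log-likelihood. All names and parameters are illustrative assumptions.

```python
# Illustrative sketch of a gesture-recognition abstraction (not GART's API):
# train one Gaussian HMM per gesture, classify by maximum log-likelihood.
import numpy as np
from hmmlearn import hmm

def train_gesture_models(training_data, n_states=5):
    """training_data: dict mapping gesture name -> list of (frames x features) arrays."""
    models = {}
    for name, sequences in training_data.items():
        X = np.concatenate(sequences)              # stack all example sequences
        lengths = [len(seq) for seq in sequences]  # per-sequence frame counts
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[name] = m
    return models

def recognize(models, sequence):
    """Return the gesture whose model assigns the observed sequence the highest log-likelihood."""
    return max(models, key=lambda name: models[name].score(sequence))
```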

    Visual recognition of American sign language using hidden Markov models

    Thesis (M.S.) -- Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1995. Includes bibliographical references (leaves 48-52). By Thad Eugene Starner, M.S.

    The use of deep learning solutions to develop a practice tool to support Lámh language for communication partners

    This study proposes an alternative for promoting the learning and practice of Lámh language by communication partners who support current users: a real-time detection tool that recognises 20 chosen Lámh signs, based on existing studies in the field. The implementation was carried out by generating primary data composed of MediaPipe landmark NumPy arrays of 40 frames, with 45 repetitions per sign. The neural networks were built using the Python library Keras, and the SVM models were built with the library sklearn. Real-time detection was implemented by integrating these elements with the library OpenCV. Neural networks with different architectures combining Long Short-Term Memory (LSTM) and 1D Convolutional Neural Network (CNN) layers were compared with SVM classifiers tuned by cross-validation to find the optimal hyperparameters and determine the most appropriate model. The final model, chosen after assessing training and testing accuracy and loss, consists of two 1D-CNN layers with 32 and 64 nodes respectively, a dropout of 0.2, followed by two LSTM layers with 32 and 64 nodes respectively and a dense layer of 32 nodes. The training accuracy was 99.86%, the testing accuracy was 93.33%, the training loss was 0.0035 and the testing loss was 0.1791. This was the model that performed best in a real-time detection environment, easily detecting 8 of the Lámh signs and detecting another 6 with reservations. For future work, some skeletal-motion signs should be captured again and additional data augmentation strategies should be adopted, such as capturing hip and leg landmarks alongside the signs and augmenting the data by applying offset measures to the landmark coordinates of the skeletons captured by MediaPipe. Once these corrections to the methodology achieve better real-time results, work on tool accessibility and user experience should be investigated so that a Lámh language real-time detection tool could promote Lámh and become a learning alternative for communication partners.
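    The chosen architecture is described concretely enough to sketch. Below is a hedged Keras reconstruction: two 1D-CNN layers with 32 and 64 filters, dropout 0.2, two LSTM layers with 32 and 64 units, a 32-unit dense layer and a 20-way softmax over 40-frame landmark sequences. The convolution kernel size and the flattened MediaPipe feature dimension (1662) are assumptions not stated in the abstract.

```python
# Hedged Keras sketch of the architecture described in the abstract above.
from tensorflow.keras import layers, models

NUM_FRAMES = 40      # frames per sign, as described
NUM_FEATURES = 1662  # assumed size of a flattened MediaPipe holistic landmark vector
NUM_SIGNS = 20       # Lámh signs to recognise

model = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, NUM_FEATURES)),
    layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
    layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
    layers.Dropout(0.2),
    layers.LSTM(32, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(32, activation="relu"),
    layers.Dense(NUM_SIGNS, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()
```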

    Online annotation tools for micro-level human behavior labeling on videos

    Abstract. Successful machine learning and computer vision approaches generally require significant amounts of annotated data for learning. These methods include identification, retrieval, and classification of events, and analysis of human behavior from video. Micro-level human behavior analysis usually requires laborious effort to obtain precise labels. As the quantity of online video grows, crowdsourcing provides a way for workers without a professional background to complete annotation tasks, but these workers require training to understand the implicit knowledge of human behavior. The motivation of this study was to enhance the interaction between annotation workers for training purposes. From observing experienced local researchers in Oulu, the key problem with annotation is the precision of the results. The goal of this study was therefore to provide training tools that help people improve label quality, which illustrates the importance of training. In this study, a new annotation tool was developed to test workers' performance in reviewing other annotations; the tool filters very noisy input through comment and vote features. The results indicated that users were more likely to annotate micro-behaviors and timings when they could refer to other workers' opinions, and that this was a more effective and reliable way to train. In addition, this study reports the development process with React and Firebase and emphasizes the use of Web resources and tools to develop annotation tools.
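    The abstract mentions that the tool filters noisy input through comment and vote features. The thesis implements this in React and Firebase; the short Python stand-in below only illustrates the aggregation idea (keep annotations with enough approving votes, pick the majority label per segment) and is not the tool's actual code.

```python
# Illustrative vote-based filtering of crowd annotations (not the thesis's React/Firebase code).
from collections import Counter

def filter_by_votes(annotations, min_votes=2):
    """Keep annotations that received at least `min_votes` approving votes.
    annotations: list of dicts like {"segment": (start, end), "label": str, "votes": int}."""
    return [a for a in annotations if a["votes"] >= min_votes]

def majority_label(annotations):
    """Return the most frequently proposed label among the annotations of one segment."""
    counts = Counter(a["label"] for a in annotations)
    return counts.most_common(1)[0][0]
```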

    Data-driven machine translation for sign languages

    This thesis explores the application of data-driven machine translation (MT) to sign languages (SLs). The provision of an SL MT system can facilitate communication between Deaf and hearing people by translating information into the native and preferred language of the individual. We begin with an introduction to SLs, focussing on Irish Sign Language, the native language of the Deaf in Ireland. We describe their linguistics and mechanics, including similarities to and differences from spoken languages. Given the lack of a formalised written form of these languages, an outline of annotation formats is discussed, as well as the issue of data collection. We summarise previous approaches to SL MT, highlighting the pros and cons of each approach. Initial experiments in the novel area of example-based MT for SLs are discussed, and an overview of the problems that arise when automatically translating these manual-visual languages is given. Following this, we detail our data-driven approach, examining the MT system used and the modifications made for the treatment of SLs and their annotation. Through sets of automatically evaluated experiments in both language directions, we consider the merits of data-driven MT for SLs and outline the mainstream evaluation metrics used. To complete the translation into SLs, we discuss the addition and manual evaluation of a signing avatar for real SL output.
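    The thesis relies on mainstream automatic MT evaluation metrics. As a hedged illustration only (the gloss strings below are invented placeholders, not data from the thesis), a metric such as BLEU can be computed over sign-language gloss output with the sacrebleu library:

```python
# Illustrative BLEU scoring of gloss output with sacrebleu; strings are placeholders.
import sacrebleu

hypotheses = ["TRAIN GO-TO DUBLIN WHEN"]    # system output in gloss notation (invented example)
references = [["TRAIN GO-TO DUBLIN WHEN"]]  # one reference stream, aligned with the hypotheses
bleu = sacrebleu.corpus_bleu(hypotheses, references)
print(f"BLEU = {bleu.score:.2f}")
```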

    Mobile Pen and Paper Interaction

    Although smartphones, tablets and other mobile devices have become increasingly popular, pen and paper continue to play an important role in mobile settings such as note taking or creative discussions. However, information on paper documents remains static, and usage practices involving sharing, researching, linking or otherwise digitally processing information on paper are hindered by the gap between the digital and physical worlds. A considerable body of research has leveraged digital pen technology to overcome this problem in static settings, while systematically neglecting the mobile domain. Only recently have several approaches begun exploring the mobile domain and developing initial insights into mobile pen-and-paper interaction (mPPI), e.g., to publish digital sketches [Cowan et al., 2011], link paper and digital artifacts [Pietrzak et al., 2012] or compose music [Tsandilas, 2012]. However, applications designed to integrate the most common mobile tools (pen, paper and mobile devices), thereby combining the benefits of both worlds in a hybrid mPPI ensemble, are hindered by the lack of supporting infrastructures and a limited theoretical understanding of interaction design in the domain. This thesis advances the field by contributing a novel infrastructural approach to supporting mPPI. It allows applications to employ digital pen technology to control interactive functionality while preserving the mobile characteristics of pen and paper. In addition, it contributes a conceptual framework of user interaction in the domain, suited to serve as the basis for novel mPPI toolkits. Such toolkits ease the development of mPPI solutions by focusing on expressing interaction rather than designing user interfaces by means of rigid widget sets; as such, they provide the link between infrastructure and interaction in the domain. Lastly, this thesis presents a novel, empirically substantiated theory of interaction in hybrid mPPI ensembles. This theory informs interaction design for mPPI, ultimately enabling the development of compelling and engaging interactive systems that employ this modality.