400 research outputs found

    Multichannel Attention Network for Analyzing Visual Behavior in Public Speaking

    Public speaking is an important aspect of human communication and interaction. The majority of computational work on public speaking concentrates on analyzing the spoken content and the verbal behavior of the speakers. While the success of public speaking largely depends on the content of the talk and on verbal behavior, non-verbal (visual) cues, such as gestures and physical appearance, also play a significant role. This paper investigates the importance of visual cues by estimating their contribution towards predicting the popularity of a public lecture. For this purpose, we constructed a large database of more than 1800 TED talk videos. As a measure of the popularity of the TED talks, we leverage the corresponding (online) viewers' ratings from YouTube. Visual cues related to facial and physical appearance, facial expressions, and pose variations are extracted from the video frames using convolutional neural network (CNN) models. Thereafter, an attention-based long short-term memory (LSTM) network is proposed to predict the video popularity from the sequence of visual features. The proposed network achieves state-of-the-art prediction accuracy, indicating that visual cues alone contain highly predictive information about the popularity of a talk. Furthermore, our network learns a human-like attention mechanism, which is particularly useful for interpretability, i.e., how attention varies with time and across different visual cues, indicating their relative importance.
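
    As a rough illustration of the kind of model the abstract describes, the sketch below scores a video from a sequence of per-frame CNN features using a soft attention layer over LSTM hidden states. The feature dimension, hidden size, and regression head are assumptions for illustration, not the authors' configuration.

    # Minimal sketch of an attention-based LSTM over per-frame CNN features.
    # Feature dimension, hidden size, and the regression head are illustrative
    # assumptions, not the configuration reported in the paper.
    import torch
    import torch.nn as nn

    class AttentionLSTM(nn.Module):
        def __init__(self, feat_dim=512, hidden_dim=128):
            super().__init__()
            self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
            self.attn = nn.Linear(hidden_dim, 1)   # one attention score per time step
            self.head = nn.Linear(hidden_dim, 1)   # predicts a popularity score

        def forward(self, frames):                 # frames: (batch, time, feat_dim)
            h, _ = self.lstm(frames)               # (batch, time, hidden_dim)
            weights = torch.softmax(self.attn(h), dim=1)   # (batch, time, 1)
            context = (weights * h).sum(dim=1)     # attention-weighted summary of the talk
            return self.head(context), weights     # weights expose where the model attends

    # Usage: features for a 300-frame clip from some CNN backbone (512-d per frame).
    model = AttentionLSTM()
    score, attn = model(torch.randn(2, 300, 512))

    The returned attention weights are what make such a model interpretable: they show which frames, and hence which visual cues, drive the predicted popularity.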

    From Unimodal to Multimodal: improving the sEMG-Based Pattern Recognition via deep generative models

    Multimodal hand gesture recognition (HGR) systems can achieve higher recognition accuracy. However, acquiring multimodal gesture data typically requires users to wear additional sensors, thereby increasing hardware costs. This paper proposes a novel generative approach to improve surface electromyography (sEMG)-based HGR accuracy via virtual inertial measurement unit (IMU) signals. Specifically, we first trained a deep generative model, based on the intrinsic correlation between forearm sEMG and forearm IMU signals, to generate virtual forearm IMU signals from the input forearm sEMG signals. Subsequently, the sEMG signals and virtual IMU signals were fed into a multimodal convolutional neural network (CNN) model for gesture recognition. To evaluate the performance of the proposed approach, we conducted experiments on six databases, including five publicly available databases and our own database of 28 subjects performing 38 gestures, containing both sEMG and IMU data. The results show that our proposed approach outperforms the sEMG-based unimodal HGR method (with increases of 2.15%-13.10%). This demonstrates that incorporating virtual IMU signals, generated by deep generative models, can significantly enhance the accuracy of sEMG-based HGR. The proposed approach represents a successful attempt to move from unimodal to multimodal HGR without additional sensor hardware.
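
    A simplified sketch of the idea follows: a small translation network stands in for the paper's deep generative model, mapping an sEMG window to a virtual IMU window, and a two-branch CNN classifies the pair. Channel counts, window length, layer sizes, and the gesture count are illustrative assumptions, not the architecture reported in the paper.

    # Sketch of the virtual-IMU pipeline: sEMG -> virtual IMU -> multimodal CNN.
    # All shapes and layer sizes below are assumptions for illustration.
    import torch
    import torch.nn as nn

    class EMGtoIMU(nn.Module):
        """Stand-in for the deep generative model: maps sEMG to virtual IMU signals."""
        def __init__(self, emg_ch=8, imu_ch=6):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv1d(emg_ch, 32, kernel_size=5, padding=2), nn.ReLU(),
                nn.Conv1d(32, imu_ch, kernel_size=5, padding=2),
            )

        def forward(self, emg):                  # emg: (batch, emg_ch, time)
            return self.net(emg)                 # virtual IMU: (batch, imu_ch, time)

    class MultimodalCNN(nn.Module):
        def __init__(self, emg_ch=8, imu_ch=6, n_gestures=38):
            super().__init__()
            self.emg_branch = nn.Sequential(nn.Conv1d(emg_ch, 32, 5, padding=2), nn.ReLU(),
                                            nn.AdaptiveAvgPool1d(1))
            self.imu_branch = nn.Sequential(nn.Conv1d(imu_ch, 32, 5, padding=2), nn.ReLU(),
                                            nn.AdaptiveAvgPool1d(1))
            self.classifier = nn.Linear(64, n_gestures)

        def forward(self, emg, imu):
            e = self.emg_branch(emg).flatten(1)  # (batch, 32)
            i = self.imu_branch(imu).flatten(1)  # (batch, 32)
            return self.classifier(torch.cat([e, i], dim=1))

    # Usage: only real sEMG is required at inference; the IMU branch runs on generated signals.
    emg = torch.randn(4, 8, 200)
    virtual_imu = EMGtoIMU()(emg)
    logits = MultimodalCNN()(emg, virtual_imu)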

    Recognizing Handshapes using Small Datasets

    Advances in convolutional neural networks have made possible significant improvements in the state of the art in image classification. However, their success in a particular field rests on the possibility of obtaining labeled data to train the networks. Handshape recognition from images, an important subtask of both gesture and sign language recognition, suffers from such a lack of data. Furthermore, hands are highly deformable objects, and therefore handshape classification models require larger datasets. We analyze state-of-the-art models for image classification, as well as data augmentation schemes and models designed specifically to tackle problems with small datasets. In particular, we perform experiments with Wide-DenseNet, a state-of-the-art convolutional architecture, and Prototypical Networks, a state-of-the-art few-shot learning meta-model. In both cases, we also quantify the impact of data augmentation on accuracy. Our results show that on small and simple datasets such as CIARP, all models and variations achieve perfect accuracy, and therefore the utility of the dataset is highly doubtful, despite its having 6000 samples. On the other hand, on small but complex datasets such as LSA16 (800 samples), specialized methods such as Prototypical Networks do have an advantage over other methods. On RWTH, another complex and small dataset with close to 4000 samples, a traditional state-of-the-art method such as Wide-DenseNet surpasses all other models. Also, data augmentation consistently increases accuracy for Wide-DenseNet, but not for Prototypical Networks. XX Workshop de Agentes y Sistemas Inteligentes. Red de Universidades con Carreras en Informática.
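
    For the few-shot setting the abstract refers to, the sketch below shows the core Prototypical Networks step: class prototypes are the mean embeddings of the support examples, and queries are classified by their distance to each prototype. The toy embedding network and image size are assumptions, not the backbone used in the paper.

    # Core Prototypical Networks classification step, with a toy embedding network.
    import torch
    import torch.nn as nn

    embed = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 64))   # placeholder embedding

    def prototypical_logits(support, support_labels, query, n_classes):
        """support: (n_support, C, H, W); query: (n_query, C, H, W)."""
        z_support = embed(support)                                 # (n_support, 64)
        z_query = embed(query)                                     # (n_query, 64)
        prototypes = torch.stack([z_support[support_labels == c].mean(dim=0)
                                  for c in range(n_classes)])      # (n_classes, 64)
        # Negative squared Euclidean distance serves as the classification logit.
        return -torch.cdist(z_query, prototypes) ** 2

    # Usage: a 4-way, 5-shot episode on 32x32 single-channel handshape crops.
    support = torch.randn(20, 1, 32, 32)
    labels = torch.arange(4).repeat_interleave(5)
    query = torch.randn(8, 1, 32, 32)
    logits = prototypical_logits(support, labels, query, n_classes=4)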


    Myoelectric Control Systems for Hand Rehabilitation Device: A Review

    One of the challenges of hand rehabilitation devices is to create a smooth interaction between the device and the user. Smooth interaction can be achieved by considering the myoelectric signal generated by the human muscles. To this end, the so-called myoelectric control system (MCS) has been developed since the 1940s. Various MCSs have been proposed, developed, tested, and implemented in hand rehabilitation devices for different purposes. This article presents a review of MCSs in existing hand rehabilitation devices. MCSs can be grouped into two main groups: non-pattern-recognition and pattern-recognition ones. In terms of implementation, they can be classified as MCSs for prosthetic hands and for exoskeleton hands. The main challenge for MCSs today is robustness, which hampers their implementation in clinical applications.
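
    To make the pattern-recognition group concrete, the sketch below shows a textbook sEMG pipeline of the kind surveyed here: windowed signals, simple time-domain features, and an off-the-shelf classifier mapping features to the intended motion. The window length, feature set, and classifier choice are assumptions for illustration, not a specific system from the review.

    # Illustrative pattern-recognition MCS pipeline: windowed sEMG -> time-domain
    # features -> classifier. Feature set and classifier are assumptions.
    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    def td_features(window):
        """Classic per-channel time-domain features: mean absolute value and zero crossings."""
        mav = np.mean(np.abs(window), axis=0)
        zc = np.sum(np.diff(np.sign(window), axis=0) != 0, axis=0)
        return np.concatenate([mav, zc])

    # Toy data: 100 windows of 200 samples x 4 sEMG channels, 3 hand motions.
    rng = np.random.default_rng(0)
    X = np.array([td_features(rng.standard_normal((200, 4))) for _ in range(100)])
    y = rng.integers(0, 3, size=100)

    clf = LinearDiscriminantAnalysis().fit(X, y)   # maps features to the intended motion
    print(clf.predict(X[:5]))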

    A Review on Human-Computer Interaction and Intelligent Robots

    In the field of artificial intelligence, human–computer interaction (HCI) technology and the related intelligent robot technologies are essential and active research topics. From the perspectives of software algorithms and hardware systems, research on these technologies aims to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. This research highlights existing technologies for listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these technologies, this research introduces several intelligent robot systems and platforms. This paper also forecasts some vital challenges in researching HCI and intelligent robots. The authors hope that this work will help researchers in the field acquire the necessary information and technologies to conduct more advanced research.