    Gesture imitation and recognition using Kinect sensor and extreme learning machines

    This study presents a framework that recognizes and imitates human upper-body motions in real time. The framework consists of two parts. In the first part, a transformation algorithm is applied to 3D human motion data captured by a Kinect sensor, converting the data into the robot’s joint angles. The NAO humanoid robot then imitates the human upper-body motions in real time. In the second part, a human action recognition algorithm is implemented for upper-body gestures. A human action dataset is also created for the upper-body movements: each action is performed 10 times by twenty-four users, and the collected joint angles are divided into six action classes. Extreme Learning Machines (ELMs) are used to classify the human actions. Additionally, Feed-Forward Neural Network (FNN) and K-Nearest Neighbor (K-NN) classifiers are used for comparison. According to the comparative results, ELMs achieve good human action recognition performance.
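    The abstract does not spell out the transformation algorithm. As a minimal sketch of the general idea, assuming joint angles are derived from three tracked 3D skeleton points with a dot product (the function name `joint_angle` and the sample coordinates are hypothetical, not taken from the paper), an elbow angle could be computed as follows:

```python
import math

def joint_angle(a, b, c):
    """Return the angle in radians at point b formed by the 3D points a-b-c.

    Illustrative assumption only: the paper's own transformation
    algorithm is not described in the abstract.
    """
    # Vectors pointing from the joint (b) to its two neighbours.
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("degenerate limb segment")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)

# Hypothetical skeleton positions in metres (not real Kinect data).
shoulder = (0.0, 0.0, 0.0)
elbow = (0.3, 0.0, 0.0)
wrist = (0.3, 0.25, 0.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
```

    A real mapping to a robot such as NAO would also need to respect the robot's joint limits and axis conventions, which this sketch ignores.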

    Human activity learning for assistive robotics using a classifier ensemble

    Assistive robots in ambient assisted living environments can be equipped with learning capabilities to effectively learn and execute human activities. This paper proposes a human activity learning (HAL) system for application in assistive robotics. An RGB-depth sensor is used to acquire information on human activities, and a set of statistical, spatial and temporal features encoding key aspects of those activities is extracted from the acquired data. Redundant features are removed and the relevant features are used in the HAL model. An ensemble of three individual classifiers, namely support vector machines (SVMs), K-nearest neighbour and random forest, is employed to learn the activities. The proposed system improves on the performance of other methods that use a single classifier. The approach is evaluated on an experimental dataset created for this work and also on a benchmark dataset, the Cornell Activity Dataset (CAD-60). Experimental results show that the overall performance achieved by the proposed system is comparable to the state of the art and has the potential to benefit applications in assistive robots by reducing the time spent in learning activities.
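    The abstract does not state how the three classifiers' outputs are combined. One common option is majority voting; the sketch below assumes that scheme and uses hypothetical stand-in functions (`svm_stub`, `knn_stub`, `forest_stub`) and made-up labels in place of trained models:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label chosen by most classifiers.

    Ties are broken in favour of the earliest classifier in the list.
    """
    if not predictions:
        raise ValueError("no predictions to combine")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label

def ensemble_predict(classifiers, sample):
    """Collect one prediction per classifier and combine them by vote."""
    return majority_vote([clf(sample) for clf in classifiers])

# Hypothetical stand-ins for trained SVM, K-NN and random forest models.
def svm_stub(x):
    return "drinking" if x[0] > 0.5 else "idle"

def knn_stub(x):
    return "drinking" if x[1] > 0.5 else "idle"

def forest_stub(x):
    return "drinking" if x[0] + x[1] > 1.0 else "idle"

label = ensemble_predict([svm_stub, knn_stub, forest_stub], (0.9, 0.2))
```

    Here two of the three stand-ins vote for the same label, so the ensemble returns it even though one classifier disagrees.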

    Human robot interaction by understanding upper body gestures

    In this paper, a human–robot interaction (HRI) system based on a novel combination of sensors is proposed. It allows one person to interact with a humanoid social robot using natural body language. The robot understands the meaning of human upper body gestures and expresses itself by using a combination of body movements, facial expressions, and verbal language. A set of 12 upper body gestures is used for communication. This set also includes gestures with human–object interactions. The gestures are characterized by head, arm, and hand posture information. The wearable Immersion CyberGlove II is employed to capture the hand posture. This information is combined with the head and arm posture captured from Microsoft Kinect. This is a new sensor solution for human-gesture capture. Based on the posture data from the CyberGlove II and Kinect, an effective and real-time human gesture recognition method is proposed. The gesture understanding approach based on an innovative combination of sensors is the main contribution of this paper. To verify the effectiveness of the proposed gesture recognition method, a human body gesture data set is built. The experimental results demonstrate that our approach can recognize the upper body gestures with high accuracy in real time. In addition, for robot motion generation and control, a novel online motion planning method is proposed. In order to generate appropriate dynamic motion, a quadratic programming (QP)-based dual-arm kinematic motion generation scheme is proposed, and a simplified recurrent neural network is employed to solve the QP problem. The integration of a handshake within the HRI system illustrates the effectiveness of the proposed online generation method.

    SiAM-dp : an open development platform for massively multimodal dialogue systems in cyber-physical environments

    Cyber-physical environments enhance natural environments of daily life such as homes, factories, offices, and cars by connecting the cybernetic world of computers and communication with the real physical world. The possible areas of application are wide-ranging. While cyber-physical environments will play a significant role in the next industrial revolution under the keyword Industrie 4.0, they will also appear in homes, offices, workshops, and numerous other areas. In this new world, classical interaction concepts where users exclusively interact with a single stationary device, PC or smartphone become less dominant and make room for new forms of interaction between humans and the environment itself. Furthermore, new technologies and a growing spectrum of applicable modalities broaden the possibilities for interaction designers to include more natural and intuitive non-verbal and verbal communication. The dynamic character of a cyber-physical environment and the mobility of users confront developers with the challenge of developing systems that are flexible with respect to the connected and used devices and modalities. This implies new opportunities for cross-modal interaction that go beyond the dual-modality interaction that is well known today. This thesis addresses the support of application developers with a platform for the declarative and model-based development of multimodal dialogue applications, with a focus on distributed input and output devices in cyber-physical environments. The main contributions can be divided into three parts:
    - The design of models and strategies for the specification of dialogue applications in a declarative development approach. This includes models for the definition of project resources, dialogue behaviour, speech recognition grammars, and graphical user interfaces, as well as mapping rules that convert the device-specific representation of input and output descriptions into a common representation language.
    - The implementation of a runtime platform that provides a flexible and extendable architecture for the easy integration of new devices and components. The platform realises concepts and strategies of multimodal human-computer interaction and is the basis for full-fledged multimodal dialogue applications for arbitrary device setups, domains, and scenarios.
    - A software development toolkit that is integrated in the Eclipse rich client platform and provides wizards and editors for creating and editing new multimodal dialogue applications.