239 research outputs found

    MULTI-MODAL TASK INSTRUCTIONS TO ROBOTS BY NAIVE USERS

    This thesis presents a theoretical framework for the design of user-programmable robots. The objective of the work is to investigate multi-modal, unconstrained natural instructions given to robots in order to design a learning robot. A corpus-centred approach is used to design an agent that can reason, learn and interact with a human in a natural, unconstrained way. The corpus-centred design approach is formalised and developed in detail. It requires the developer to record a human during interaction and analyse the recordings to find instruction primitives, which are then implemented in a robot. The focus of this work has been on how to combine speech and gesture using rules extracted from the analysis of a corpus. A multi-modal integration algorithm is presented that uses timing and semantics to group, match and unify gesture and language. The algorithm always achieves correct pairings on a corpus and poses questions to the user when input is ambiguous or information is missing. The domain of card games has been investigated because its games are varied, rich in rules and contain sequences. A further focus of the work is on the translation of rule-based instructions; most multi-modal interfaces to date have only considered sequential instructions. A combination of frame-based reasoning, a knowledge base organised as an ontology and a problem-solver engine is used to store these rules. Understanding rule instructions, which contain conditional and imaginary situations, requires an agent with complex reasoning capabilities. A test system for the agent implementation is also described. Tests that confirm the implementation by playing back the corpus are presented. Furthermore, deployment test results with the implemented agent and human subjects are presented and discussed.
The tests showed that the rate of errors caused by sentences not being defined in the grammar does not decrease at an acceptable rate when new grammar is introduced. This was particularly the case for complex verbal rule instructions, which can be expressed in a large variety of ways.

    SiAM-dp : an open development platform for massively multimodal dialogue systems in cyber-physical environments

    Cyber-physical environments enhance natural environments of daily life such as homes, factories, offices, and cars by connecting the cybernetic world of computers and communication with the real physical world. The range of possible applications is broad. While cyber-physical environments will play an important role in the next industrial revolution under the keyword Industrie 4.0, they will also appear in homes, offices, workshops, and numerous other areas. In this new world, classical interaction concepts in which users interact exclusively with a single stationary device, PC or smartphone become less dominant and make room for new forms of interaction between humans and the environment itself. Furthermore, new technologies and a growing spectrum of applicable modalities broaden the possibilities for interaction designers to include more natural and intuitive verbal and non-verbal communication. The dynamic character of a cyber-physical environment and the mobility of its users confront developers with the challenge of building systems that are flexible with respect to the connected and used devices and modalities. This also implies new opportunities for cross-modal interaction that go beyond the dual-modality interaction that is already common today. This thesis addresses the support of application developers with a platform for the declarative and model-based development of multimodal dialogue applications, with a focus on distributed input and output devices in cyber-physical environments. The main contributions can be divided into three parts:
    - The design of models and strategies for the specification of dialogue applications in a declarative development approach. This includes models for the definition of project resources, dialogue behaviour, speech recognition grammars, and graphical user interfaces, as well as mapping rules that convert the device-specific representation of input and output descriptions to a common representation language.
    - The implementation of a runtime platform that provides a flexible and extensible architecture for the easy integration of new devices and components. The platform realises concepts and strategies of multimodal human-computer interaction and is the basis for full-fledged multimodal dialogue applications for arbitrary device setups, domains, and scenarios.
    - A software development toolkit that is integrated into the Eclipse Rich Client Platform and provides wizards and editors for creating and editing new multimodal dialogue applications.

    Desenvolvimento de um sistema de diálogo para interação com robôs

    Master's in Computer and Telematics Engineering. Service robots operate in the same environment as humans and perform actions that a human usually performs. These robots must be able to operate autonomously in unknown and dynamic environments, as well as to manoeuvre among several people and know how to deal with them. By complying with these requirements, they are able to successfully address humans and fulfil their requests whenever they need assistance in a certain task. Natural language communication, in particular speech, which is the most natural way of communication between humans, becomes relevant in the field of Human-Robot Interaction (HRI). Endowing service robots with intuitive spoken interfaces makes it easier for humans to specify the tasks they require. However, this is difficult to achieve because of the resources involved in creating a sufficiently intuitive spoken interface and because of the difficulty of deploying it on different robots. The main objective of this thesis is the definition, implementation and evaluation of a dialogue system that can be easily integrated into any robotic platform and that functions as a flexible base for the creation of any conversational scenario in the Portuguese language. The system must meet the basic requirements for intuitive and natural communication, namely the characteristics of human-human conversations. A system was developed that functions as a base for future work on spoken dialogue systems. The system uses a client-server architecture in which the client runs on the robot and captures what the user says. The client relies on dialogue management services external to the robot; these are executed by the server, which processes the captured audio and returns a response appropriate to the context of the dialogue. The development was based on a critical analysis of the state of the art, both to stay as faithful as possible to existing work and to inform the main decisions during implementation. In the evaluation phase, which considered both the interaction and the programmer's point of view, feedback from a few volunteers indicated that the main objective was accomplished: a base system was created that is flexible enough to explore different contexts of conversation, such as interacting with children or providing information in a university environment.

    "A model-driven approach for designing multi-platform user interface dialogues": dialogues specification

    Human-computer interaction is becoming sophisticated, multimodal and multi-device, and needs to be well designed in order to facilitate application correction (i.e. correcting errors/bugs in the application) or extension (i.e. adding new functionalities or modifying existing tasks). This thesis focuses on building a methodology for designing and specifying User Interface (UI) behaviour. The Unified Modelling Language (UML) is used to describe the conceptual model in detail and to define all its objects. The flow diagram of the methodology is provided, together with the specification of the consistency and completeness properties of the transformation model. To support the methodology, we implement a graphical Dialog Editor in which models are organized in three levels (abstract, concrete and final) according to the Cameleon Reference Framework (CRF), and whose process follows the Model Driven Engineering (MDE) approach. Furthermore, the use of the Dialog Editor is illustrated through a simple exam... Human-machine interfaces are becoming increasingly complex. Their design requires new tools and/or methods. Using the model-driven approach, this thesis responds to this need by proposing a methodology for designing multi-platform dialogues

    The Future of Humanoid Robots

    This book provides state of the art scientific and engineering research findings and developments in the field of humanoid robotics and its applications. It is expected that humanoids will change the way we interact with machines, and will have the ability to blend perfectly into an environment already designed for humans. The book contains chapters that aim to discover the future abilities of humanoid robots by presenting a variety of integrated research in various scientific and engineering fields, such as locomotion, perception, adaptive behavior, human-robot interaction, neuroscience and machine learning. The book is designed to be accessible and practical, with an emphasis on useful information to those working in the fields of robotics, cognitive science, artificial intelligence, computational methods and other fields of science directly or indirectly related to the development and usage of future humanoid robots. The editor of the book has extensive R&D experience, patents, and publications in the area of humanoid robotics, and his experience is reflected in editing the content of the book

    Fusion multimodale pour les systèmes d'interaction

    Researchers in computer science and computer engineering devote a significant part of their efforts to communication and interaction between humans and machines. Indeed, with the advent of real-time multimodal and multimedia processing, the computer is no longer regarded only as a calculation tool, but as a machine for processing, communication, collection and control, one that accompanies, assists and supports many activities of daily life. A multimodal interface enables more flexible and natural interaction between human and machine, increasing the capacity of multimodal systems to match human needs. In this type of interaction, a fusion engine is a fundamental component that interprets several sources of communication, such as voice commands, gestures, stylus input, etc., which makes human-machine interaction richer and more efficient. Our research project will enable a better understanding of multimodal fusion and interaction through the construction of a fusion engine using Semantic Web technologies. The objective is to develop an expert system for multimodal human-machine interaction that will lead to the design of a monitoring tool for elderly people, in order to provide them with assistance and self-confidence, at home as well as outdoors

    Programming Robots for Activities of Everyday Life

    Text-based programming remains a challenge to novice programmers in all programming domains, including robotics. The use of robots is gaining considerable traction in several domains since robots are capable of assisting humans in repetitive and hazardous tasks. In the near future, robots will be used in tasks of everyday life in homes, hotels, airports, museums, etc. However, robotic missions have been either predefined or programmed using low-level APIs, making mission specification task-specific and error-prone. To harness the full potential of robots, it must be possible to define missions for specific application domains as needed. The specification of missions for robotic applications should be performed in easy-to-use, accessible ways and, at the same time, be accurate and unambiguous. Simplicity and flexibility in programming such robots are important, since end-users come from diverse domains, not necessarily with sufficient programming knowledge. The main objective of this licentiate thesis is to empirically understand the state of the art in languages and tools used for specifying robot missions by end-users. The findings will form the basis for interventions in developing future languages for end-user robot programming. During the empirical study, DSLs for robot mission specification were analyzed through published literature, their websites, user manuals and sample missions, and by using the languages to specify missions for supported robots. After extracting data from 30 environments, 133 features were identified. A feature matrix mapping the features to the environments was developed, along with a feature model for robotic mission specification DSLs. Our results show that most end-user facing environments exist in the education domain for teaching novice programmers and STEM subjects.
Most of the visual languages are developed using the Blockly and Scratch libraries. The end-user domain abstraction needs more work, since most of the visual environments abstract robotic and programming language concepts but not end-user concepts. In future work, it is important to focus on the development of reusable libraries for end-user concepts and, further, to explore how end-user facing environments can be adapted for novice programmers to learn general programming skills and robot programming in low-resource settings in developing countries such as Uganda