Search CORE

31 research outputs found

Head movement in conversation

Author: Gurion T
Publication date: 22/03/2023

This work explores the function and form of head movement and specifically head nods in free conversation. It opens with a comparison of three theories that are often considered as triggers for head nods: mimicry, backchannel responses, and responses to speakers' trouble. Early in this work it is assumed that head nods are well defined in terms of movement, and that they can be directly attributed to, or at least better explained by, one theory compared to the others. To test that, comparisons between the theories are conducted following two different approaches. In one set of experiments a novel virtual reality method enables the analysis of perceived plausibility of head nods generated by models inspired by these theories. The results suggest that participants could not consciously assess differences between the predictions of the different theories. In part, this is due to a mixture of gamification and study design challenges. In addition, these experiments raise the question of whether or not it is reasonable to expect people to consciously process and report issues with the non-verbal behaviour of their conversational partners. In a second set of experiments the predictions of the theories are compared directly to head nods that are automatically detected from motion capture data. Matching the predictions with automatically detected head nods showed that not only are most predictions wrong, but also that most of the detected head nods are not accounted for by any of the theories under question. Whereas these experiments do not adequately answer which theory best describes head nods in conversation, they suggest new avenues to explore: are head nods well defined in the sense that multiple people will agree that a specific motion is a head nod? And if so, what are their movement characteristics and what is their reliance on conversational context? Exploring these questions revealed a complex picture of what people consider to be head nods and their reliance on context. 
First, the agreement on what is a head nod is moderate, even when annotators are presented with video snippets that include only automatically detected nods. Second, head nods share movement characteristics with other behaviours, specifically laughter. Lastly, head nods are more accurately defined by their semantic characteristics than by their movement properties, suggesting that future detectors should incorporate more contextual features than movement alone. Overall, this thesis questions the coherence of our intuitive notion of a head nod and the adequacy of current approaches to describe the movements involved. It shows how some of the common theories that describe head movement and nods fail to explain most head movement in free conversation. In addition, it highlights subtleties in head movement and nods that are often overlooked. The findings from this work can inform the development of future head nod detection approaches, and provide a better understanding of non-verbal communication in general.
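The comparison of theory predictions against automatically detected nods, as described in the abstract above, can be illustrated with a minimal sketch. It assumes that both predictions and detections are reduced to event times in seconds and that a prediction counts as correct when an unused detected nod lies within a fixed tolerance; the function name `match_events` and the 0.5 s tolerance are hypothetical choices, not taken from the thesis.

```python
from bisect import bisect_left


def match_events(predicted, detected, tolerance=0.5):
    """Greedy one-to-one matching of predicted to detected nod times (seconds).

    Returns (hits, share of predictions matched, share of detections matched).
    The tolerance window is an illustrative assumption.
    """
    detected = sorted(detected)
    used = [False] * len(detected)
    hits = 0
    for t in sorted(predicted):
        # Scan only the detections that fall inside [t - tolerance, t + tolerance].
        i = bisect_left(detected, t - tolerance)
        best = None
        while i < len(detected) and detected[i] <= t + tolerance:
            if not used[i] and (
                best is None or abs(detected[i] - t) < abs(detected[best] - t)
            ):
                best = i
            i += 1
        if best is not None:
            used[best] = True
            hits += 1
    matched_predictions = hits / len(predicted) if predicted else 0.0
    matched_detections = hits / len(detected) if detected else 0.0
    return hits, matched_predictions, matched_detections


if __name__ == "__main__":
    predicted = [1.0, 5.0, 9.0]
    detected = [1.2, 3.0, 4.0, 7.0, 9.9]
    print(match_events(predicted, detected))
```

A low share of matched predictions corresponds to "most predictions are wrong", and a low share of matched detections corresponds to "most detected nods are not accounted for"; both quantities are needed to reproduce the kind of conclusion the abstract reports.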

Queen Mary Research Online

Multi-Robot Systems: Challenges, Trends and Applications

Publication venue: 'MDPI AG'
Publication date: 06/05/2022

This book is a printed edition of the Special Issue entitled “Multi-Robot Systems: Challenges, Trends, and Applications” that was published in Applied Sciences. This Special Issue collected seventeen high-quality papers that discuss the main challenges of multi-robot systems, present the trends to address these issues, and report various relevant applications. Some of the topics addressed by these papers are robot swarms, mission planning, robot teaming, machine learning, immersive technologies, search and rescue, and social robotics.

Directory of Open Access Books (DOAB)

A Study of Accommodation of Prosodic and Temporal Features in Spoken Dialogues in View of Speech Technology Applications

Author: Kousidis Spyridon, [Thesis]
Publication venue: Dublin Institute of Technology
Publication date: 01/01/2010

Inter-speaker accommodation is a well-known property of human speech and human interaction in general. Broadly it refers to the behavioural patterns of two (or more) interactants and the effect of the (verbal and non-verbal) behaviour of each on that of the other(s). Implementation of this behaviour in spoken dialogue systems is desirable as an improvement on the naturalness of human-machine interaction. However, traditional qualitative descriptions of accommodation phenomena do not provide sufficient information for such an implementation. Therefore, a quantitative description of inter-speaker accommodation is required. This thesis proposes a methodology of monitoring accommodation during a human or human-computer dialogue, which utilizes a moving average filter over sequential frames for each speaker. These frames are time-aligned across the speakers, hence the name Time Aligned Moving Average (TAMA). Analysis of spontaneous human dialogue recordings by means of the TAMA methodology reveals ubiquitous accommodation of prosodic features (pitch, intensity and speech rate) across interlocutors, and allows for statistical (time series) modeling of the behaviour, in a way which is meaningful for implementation in spoken dialogue system (SDS) environments. In addition, a novel dialogue representation is proposed that provides an additional point of view to that of TAMA in monitoring accommodation of temporal features (inter-speaker pause length and overlap frequency). This representation is a percentage turn distribution of individual speaker contributions in a dialogue frame which circumvents strict attribution of speaker-turns, by considering both interlocutors as synchronously active. Both TAMA and turn distribution metrics indicate that correlation of average pause length and overlap frequency between speakers can be attributed to accommodation (a debated issue), and point to possible improvements in SDS “turn-taking” behaviour. 
Although the findings of the prosodic and temporal analyses can directly inform SDS implementations, further work is required in order to describe inter-speaker accommodation sufficiently, as well as to develop an adequate testing platform for evaluating the magnitude of perceived improvement in human-machine interaction. Therefore, this thesis constitutes a first step towards a convincingly useful implementation of accommodation in spoken dialogue systems.
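The TAMA idea summarized in the abstract above (a moving average over frames that are shared by both speakers) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the frame length, step size, input layout `(time_s, value)` and the function names `tama` and `pearson` are all hypothetical.

```python
from math import sqrt
from statistics import mean


def tama(samples, frame_len, step, duration):
    """Average a speaker's feature values over overlapping frames.

    samples: list of (time_s, value) pairs, e.g. per-utterance mean pitch.
    The frame grid depends only on frame_len, step and duration, so calling
    this for each speaker with the same arguments yields time-aligned series.
    Frames without any sample are reported as None.
    """
    series = []
    start = 0.0
    while start + frame_len <= duration:
        values = [v for t, v in samples if start <= t < start + frame_len]
        series.append(mean(values) if values else None)
        start += step
    return series


def pearson(xs, ys):
    """Pearson correlation over frames where both speakers have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return None
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up pitch values (Hz) for two speakers in a 30-second dialogue.
    a = [(1, 100), (6, 110), (11, 120), (16, 130), (21, 140), (26, 150)]
    b = [(2, 200), (7, 205), (12, 210), (17, 215), (22, 220), (27, 225)]
    series_a = tama(a, frame_len=10, step=5, duration=30)
    series_b = tama(b, frame_len=10, step=5, duration=30)
    print(series_a, series_b, pearson(series_a, series_b))
```

Because both series share one frame grid, they can be compared frame by frame; a correlation between them is one simple indicator of accommodation, whereas the thesis itself refers to fuller time series modeling.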

Arrow@TUDublin

Human-Robot Interaction architecture for interactive and lively social robots

Author: Fernández Rodicio Enrique
Publication venue: 'MDPI AG'
Publication date: 09/09/2021

International Mention in the doctoral degree.
Society is undergoing an ageing process that may cause an imbalance between the working-age population and those outside the labour market. One of the solutions to this problem being considered today is the introduction of robots into multiple sectors, including the service sector. However, for this to be a viable solution, these robots need to be able to interact with people satisfactorily, among other abilities. In the context of applying social robots to elderly care, this thesis seeks to provide a social robot with the abilities needed to create natural human-robot interactions. Specifically, this thesis focuses on three problems that must be solved: (i) modelling interactions between humans and robots; (ii) equipping a social robot with the expressive capabilities needed for satisfactory communication; and (iii) giving the robot a lively appearance. The solution to the dialogue modelling problem presented in this thesis proposes designing these dialogues as a sequence of atomic elements called Communicative Acts (CAs). They can be parametrized at runtime to fulfil different communicative goals, and are equipped with mechanisms for handling some of the inaccuracies that can arise during interactions. These CAs have been identified by combining two dimensions: initiative (whether the robot or the user has it) and intention (whether the aim is to obtain or to provide information). These CAs can be combined following a hierarchical structure to create more complex structures that are reusable. 
This simplifies the process of creating new interactions, allowing developers to focus exclusively on designing the dialogue flow, without having to worry about re-implementing other functionality that must be present in every interaction (such as error handling, for example). The robot's expressiveness is based on the use of a library of predefined multimodal gestures, or expressions, modelled as structures similar to state machines. The module that controls expressiveness receives requests to perform these expressions, schedules their execution to avoid any conflict that may arise, loads them, and checks that their execution completes without problems. The system is also able to generate these expressions at runtime from a list of unimodal actions (such as saying a sentence or moving a joint). One of the most important features of the proposed expressiveness architecture is the integration of a series of modulation methods that can be used to modify the robot's gestures at runtime. This allows the robot to adapt these expressions to particular circumstances (while also increasing the variability of the robot's expressiveness), and to use a limited number of gestures to show different internal states (such as the emotional state). Given that being recognized as a living being is a requirement for taking part in social interactions, a social robot displaying an appearance of liveliness is a key factor in human-robot interactions. To this end, this thesis proposes two solutions. The first method generates actions through the robot's different interfaces at intervals. The frequency and intensity of these actions are defined by a signal that represents the robot's pulse. This signal can be adapted to the context of the interaction or to the robot's internal state. 
The second method enriches verbal interactions between the robot and the user by predicting the most appropriate non-verbal gestures based on the content of the dialogue and the robot's communicative intention. A machine learning model receives the transcription of the robot's verbal message, predicts the gestures that should accompany it, and synchronizes them so that each gesture starts at the right moment. This model was developed using a combination of an encoder built with a Long Short-Term Memory neural network and a Conditional Random Field to predict the sequence of gestures that should accompany the robot's sentence. All the elements presented form the core of a modular human-robot interaction architecture that has been integrated into multiple platforms and tested under different conditions. The central aim of this thesis is to contribute to the field of human-robot interaction with a new solution that is modular and independent of the robotic platform, and that focuses on providing developers with the tools needed to develop applications that require interactions with people.
Society is experiencing a series of demographic changes that can result in an imbalance between the active working and non-working age populations. One of the solutions considered to mitigate this problem is the inclusion of robots in multiple sectors, including the service sector. But for this to be a viable solution, among other features, robots need to be able to interact with humans successfully. This thesis seeks to endow a social robot with the abilities required for natural human-robot interactions. The main objective is to contribute to the body of knowledge in the area of Human-Robot Interaction with a new, platform-independent, modular approach that focuses on giving roboticists the tools required to develop applications that involve interactions with humans. 
In particular, this thesis focuses on three problems that need to be addressed: (i) modelling interactions between a robot and a user; (ii) endowing the robot with the expressive capabilities required for successful communication; and (iii) endowing the robot with a lively appearance. The approach to dialogue modelling presented in this thesis proposes to model dialogues as a sequence of atomic interaction units, called Communicative Acts, or CAs. They can be parametrized at runtime to achieve different communicative goals, and are endowed with mechanisms oriented to solve some of the uncertainties related to interaction. Two dimensions have been used to identify the required CAs: initiative (the robot or the user), and intention (either to retrieve information or to convey it). These basic CAs can be combined in a hierarchical manner to create more complex, reusable structures. This approach simplifies the creation of new interactions, by allowing developers to focus exclusively on designing the flow of the dialogue, without having to re-implement functionalities that are common to all dialogues (like error handling, for example). The expressiveness of the robot is based on the use of a library of predefined multimodal gestures, or expressions, modelled as state machines. The module managing the expressiveness receives requests for performing gestures, schedules their execution in order to avoid any possible conflict that might arise, loads them, and ensures that their execution goes without problems. The proposed approach is also able to generate expressions at runtime based on a list of unimodal actions (an utterance, the motion of a limb, etc.). One of the key features of the proposed expressiveness management approach is the integration of a series of modulation techniques that can be used to modify the robot’s expressions at runtime. 
This would allow the robot to adapt them to the particularities of a given situation (which would also increase the variability of the robot’s expressiveness), and to display different internal states with the same expressions. Considering that being recognized as a living being is a requirement for engaging in social encounters, the perception of a social robot as a living entity is a key requirement to foster human-robot interactions. In this dissertation, two approaches have been proposed. The first method generates actions for the different interfaces of the robot at certain intervals. The frequency and intensity of these actions are defined by a signal that represents the pulse of the robot, which can be adapted to the context of the interaction or the internal state of the robot. The second method enhances the robot’s utterances by predicting the appropriate non-verbal expressions that should accompany them, according to the content of the robot’s message, as well as its communicative intention. A deep learning model receives the transcription of the robot’s utterances, predicts which expressions should accompany them, and synchronizes them, so each gesture selected starts at the appropriate time. The model has been developed using a combination of a Long Short-Term Memory network-based encoder and a Conditional Random Field for generating a sequence of gestures that are combined with the robot’s utterance. All the elements presented above form the core of a modular Human-Robot Interaction architecture that has been integrated in multiple platforms, and tested under different conditions.
Doctoral Programme in Electrical, Electronic and Automation Engineering at the Universidad Carlos III de Madrid.
President: Fernando Torres Medina. Secretary: Concepción Alicia Monje Micharet. Member: Amirabdollahian Farshi
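The Communicative Act model described in the abstract above (basic CAs identified by initiative and intention, combined hierarchically) might be represented along these lines. This is only a sketch of the idea: the class names `Initiative`, `Intention`, `BasicCA` and `CompositeCA`, the `run` method and the trace format are hypothetical and do not come from the thesis.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import product


class Initiative(Enum):
    ROBOT = "robot"
    USER = "user"


class Intention(Enum):
    OBTAIN = "obtain"    # retrieve information
    PROVIDE = "provide"  # convey information


@dataclass
class BasicCA:
    """One atomic interaction unit; params stand in for runtime parametrization."""
    initiative: Initiative
    intention: Intention
    params: dict = field(default_factory=dict)

    def run(self):
        # A real CA would drive speech, perception and error handling;
        # this placeholder only records which unit was executed.
        return [f"{self.initiative.value}:{self.intention.value}"]


@dataclass
class CompositeCA:
    """Hierarchical combination: children may be basic or composite CAs."""
    children: list

    def run(self):
        trace = []
        for child in self.children:
            trace.extend(child.run())
        return trace


# The two dimensions yield four basic CAs.
BASIC_CAS = [BasicCA(i, n) for i, n in product(Initiative, Intention)]

if __name__ == "__main__":
    ask_then_listen = CompositeCA([
        BasicCA(Initiative.ROBOT, Intention.OBTAIN, {"question": "How are you?"}),
        CompositeCA([BasicCA(Initiative.USER, Intention.PROVIDE)]),
    ])
    print(len(BASIC_CAS), ask_then_listen.run())
```

Giving basic and composite acts the same `run` interface is what lets a composite be nested inside another composite, which is one plausible reading of the hierarchical reuse the abstract mentions.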

Universidad Carlos III de Madrid e-Archivo

Brokered bargaining: nuclear crises between middle powers

Author: Yusuf Moeed Wasim
Publication date: 22/01/2016

This dissertation studies nuclear crisis behavior. Specifically, it theorizes behavior between middle powers with nuclear weapons that are nested within a world with larger hegemonic states. The situation represents a paradigm shift from the bipolar context of the Cold War where all nuclear crises involved one or both superpowers, thereby implying an absence of stronger third parties that could fundamentally alter their crisis behavior. We have focused on the India-Pakistan rivalry, and specifically on their three nuclear crises since South Asia's overt nuclearization: the 1999 Kargil crisis; the 2001-02 standoff; and the 2008 Mumbai crisis. These three case studies form the universe of crises between two middle power nuclear states with stronger third parties present to influence their behavior. Using the structured focused comparison method and relying on existing empirical analyses of these crises, interviews with relevant officials and experts, and newspaper archival research, we have process-traced the key developments in each crisis to identify the processes and mechanisms underpinning behavior. The dissertation argues that middle power nuclear crises ought to be seen as trilateral engagements that accord a key crisis management role to stronger third parties. Crisis behavior can be best understood through "brokered bargaining" - defined as a three-cornered bargaining exercise between the two principal antagonists and a third party which is primarily seeking crisis de-escalation. Brokered bargaining theory predicts that this three-cornered engagement will play out in the expected manner each time a middle power nuclear crisis occurs as long as the outside actors do not intervene as competitor third parties. We reject theories that posit the dynamics of bilateral nuclear deterrence as the principal drivers of de-escalation, and equally, analyses that see third parties as standalone explanations for peaceful outcomes. 
We contend that it is the process of trilateral interaction encompassed by the brokered bargaining model and marked by a recursive interplay of perceptions, expectations, incentives, and strategies of the three actors that shapes crisis behavior, and in turn, trajectories and outcomes. The research is generalizable to potential nuclear rivalries in the Middle East and remains relevant to the Sino-Indian dyad and rivalries on the Korean peninsula.

Boston University Institutional Repository (OpenBU)