
    Evaluating the replicability and specificity of evidence for natural pedagogy theory

    Do infants understand that they are being communicated to? This thesis first outlines issues facing the field of infancy research that affect confidence in the literature on this (and any) topic to date. Following this, an introductory chapter evaluates evidence for the three core claims of Natural Pedagogy (NP), and the compatibility of this evidence with alternative theories. This is followed by three experimental chapters. In Study 1, we attempted two replications of Yoon et al. (2008), the study with the highest theoretical value for NP: it is the only study providing evidence for the most specific claim of NP, a claim that is difficult to explain by low-level mechanisms. A replication that reduced possible confounds and added a more sophisticated measure of attention throughout the task was therefore of great theoretical value. We were unable to replicate the original findings. In Study 2 we went beyond the evidence for the claims made in the outline of NP and instead generated a new, specific prediction that we believe NP would make. This is important, as theories are only useful if they can make clear, testable predictions. In this study, we pitted pedagogically demonstrated actions and simple actions against each other and evaluated infants’ transmission of these actions to someone else. We found no evidence for NP; instead, we found evidence for preferential transmission of simple actions. In Study 3 we went beyond NP and tested a clear prediction stemming from an alternative low-level theory of how infants develop gaze-following ability. We found evidence that infants learn to gaze-follow through reinforcement. Overall, this thesis contributes to the vast literature on infants as recipients of communication and highlights methods for conducting open and reproducible infancy research.

    Understanding and Supporting Trade-offs in the Design of Visualizations for Communication.

    A shift in the availability of usable tools and public data has prompted mass manufacturing of information visualizations that communicate data insights to broad audiences. Despite the available software, both professional and novice creators of such visualizations may struggle to balance conflicting considerations in design. Studying professional practice suggests that expert visualization designers and analysts negotiate difficult design trade-offs in creating customized visualizations, many of which involve deciding how, and how much, data to present given a priori design goals. This dissertation presents three studies that demonstrate how studying expert visual design and data modeling practice can advance visualization design tools. Insights from these formative studies inform the development of specific frameworks and algorithms. The first study addresses the often ignored persuasive dimension of narrative visualizations. The framework I propose characterizes the persuasive dimension of visualization design by providing empirical evidence of several classes of rhetorical design strategies that trade off comprehensive, unbiased data presentation against intentions to persuade users toward intended interpretations. The rhetorical visualization framework highlights a second trade-off: the act of dividing and sequencing information from a multivariate data set into separate visualizations for ordered presentation. I contribute initial evidence of ordering principles that designers apply to ease comprehension and support storytelling goals with a visualization presentation. These principles are used in developing a novel algorithmic approach to supporting designers of visualizations in making decisions about presentation order and structure, highlighting the importance of optimizing local, “single visualization” design in tandem with global “sequence” design. The final design trade-off concerns how to convey uncertainty to end users so as to support accurate conclusions despite diverse educational backgrounds. I demonstrate how non-statistician end users can produce more cautious, and at times more accurate, estimates of the reliability of data patterns through a comparative sample plots method motivated by statistical resampling approaches to modeling uncertainty. Taken together, my results deepen understanding of the act of designing visualizations for potentially diverse online audiences, and provide tools to support more effective design.
    PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/107170/1/jhullman_1.pd
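    The uncertainty-communication study above describes showing end users several resampled views of the data rather than a single aggregate chart. As an illustration only, and not the dissertation's actual implementation, the Python sketch below draws bootstrap resamples of a hypothetical sample and lays them out as a grid of comparative sample plots; the data, panel count, and plotting choices are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Hypothetical observed sample (e.g., task completion times in seconds).
observed = rng.normal(loc=30, scale=8, size=50)

fig, axes = plt.subplots(3, 3, figsize=(9, 9), sharex=True, sharey=True)

for ax in axes.flat:
    # Each panel shows one bootstrap resample of the observed data,
    # so viewers can see how much the apparent pattern varies by chance.
    resample = rng.choice(observed, size=observed.size, replace=True)
    ax.hist(resample, bins=10, color="steelblue", edgecolor="white")
    ax.set_xlabel("value")
    ax.set_ylabel("count")

fig.suptitle("Comparative sample plots: bootstrap resamples of one dataset")
fig.tight_layout()
plt.show()
```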

    Human-Robot Interaction architecture for interactive and lively social robots

    Society is experiencing a series of demographic changes that can result in an imbalance between the working-age and non-working-age populations. One of the solutions considered to mitigate this problem is the inclusion of robots in multiple sectors, including the service sector. For this to be a viable solution, however, robots need, among other abilities, to be able to interact with humans successfully. In the context of applying social robots to the care of older adults, this thesis seeks to endow a social robot with the abilities required for natural human-robot interaction. The main objective is to contribute to the body of knowledge in the area of Human-Robot Interaction with a new, platform-independent, modular approach that focuses on giving roboticists the tools required to develop applications that involve interactions with humans. In particular, this thesis focuses on three problems that need to be addressed: (i) modelling interactions between a robot and a user; (ii) endowing the robot with the expressive capabilities required for successful communication; and (iii) giving the robot a lively appearance. The approach to dialogue modelling presented in this thesis models dialogues as sequences of atomic interaction units called Communicative Acts (CAs). CAs can be parametrized at runtime to achieve different communicative goals, and are endowed with mechanisms for handling some of the uncertainties that arise during interaction. Two dimensions have been used to identify the required CAs: initiative (robot or user) and intention (to retrieve information or to convey it). These basic CAs can be combined hierarchically to create more complex, reusable structures. This approach simplifies the creation of new interactions by allowing developers to focus exclusively on designing the flow of the dialogue, without having to re-implement functionalities that are common to all dialogues (such as error handling). The expressiveness of the robot is based on a library of predefined multimodal gestures, or expressions, modelled as state machines. The module managing expressiveness receives requests to perform gestures, schedules their execution to avoid conflicts, loads them, and ensures that their execution completes without problems. The proposed approach can also generate expressions at runtime from a list of unimodal actions (an utterance, the motion of a limb, etc.). One of the key features of the proposed expressiveness management approach is the integration of a series of modulation techniques that can be used to modify the robot's expressions at runtime. This allows the robot to adapt its expressions to the particularities of a given situation (which also increases the variability of the robot's expressiveness) and to display different internal states, such as its emotional state, with the same limited set of gestures. Considering that being recognized as a living being is a prerequisite for engaging in social encounters, the perception of a social robot as a living entity is key to fostering human-robot interaction. To this end, two methods are proposed. The first generates actions across the robot's different interfaces at intervals whose frequency and intensity are defined by a signal representing the robot's pulse, which can be adapted to the context of the interaction or to the robot's internal state. The second enriches the robot's utterances by predicting the non-verbal expressions that should accompany them, according to the content of the robot's message and its communicative intention. A deep learning model receives the transcription of the robot's utterance, predicts which gestures should accompany it, and synchronizes them so that each selected gesture starts at the appropriate time. The model combines a Long Short-Term Memory network-based encoder with a Conditional Random Field to generate the sequence of gestures that accompanies the robot's utterance. All the elements presented above form the core of a modular Human-Robot Interaction architecture that has been integrated into multiple platforms and tested under different conditions.
    International Doctorate mention. Doctoral Programme in Electrical Engineering, Electronics and Automation, Universidad Carlos III de Madrid. Committee: Fernando Torres Medina (Chair), Concepción Alicia Monje Micharet (Secretary), Amirabdollahian Farshi (Member).
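    To make the Communicative Act idea concrete, here is a minimal, hypothetical Python sketch (not the thesis code) of atomic CAs identified by initiative and intention, combined hierarchically into a reusable structure; all class and method names are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CommunicativeAct:
    """Atomic interaction unit, characterized by who takes the initiative
    and whether the goal is to retrieve or to convey information."""
    initiative: str          # "robot" or "user"
    intention: str           # "retrieve" or "convey"
    content: str             # runtime parameter, e.g. the question or message

    def run(self) -> None:
        # Placeholder for speech synthesis / recognition plus the
        # error-handling mechanisms shared by every CA.
        print(f"[{self.initiative}/{self.intention}] {self.content}")

@dataclass
class CompositeAct:
    """Hierarchical combination of CAs (or other composites),
    reusable across different dialogues."""
    name: str
    children: List[object] = field(default_factory=list)

    def run(self) -> None:
        for child in self.children:
            child.run()

# Example: a reusable structure built from atomic CAs and run as one unit.
greet_and_ask = CompositeAct("greet_and_ask", [
    CommunicativeAct("robot", "convey", "Hello, nice to meet you."),
    CommunicativeAct("robot", "retrieve", "What is your name?"),
])
greet_and_ask.run()
```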
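    The gesture-selection component is described as an LSTM-based encoder combined with a Conditional Random Field. Purely as an illustrative sketch, and not the thesis implementation, the PyTorch fragment below encodes an utterance with a bidirectional LSTM and assigns one gesture label per word using a greedy argmax in place of the CRF decoder; the gesture inventory, vocabulary, and all names are invented for the example.

```python
import torch
import torch.nn as nn

GESTURES = ["none", "nod", "beat", "point", "wave"]   # hypothetical gesture inventory

class GesturePredictor(nn.Module):
    """Toy stand-in for the described model: an LSTM encoder over word
    embeddings with a per-token gesture classifier (greedy decode here,
    where the thesis uses a CRF output layer)."""
    def __init__(self, vocab_size: int, embed_dim: int = 32, hidden: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.LSTM(embed_dim, hidden, batch_first=True, bidirectional=True)
        self.to_gesture = nn.Linear(2 * hidden, len(GESTURES))

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        embedded = self.embed(token_ids)              # (batch, words, embed_dim)
        encoded, _ = self.encoder(embedded)           # (batch, words, 2 * hidden)
        return self.to_gesture(encoded)               # per-word gesture scores

# Toy usage: map each word of an utterance to a gesture (untrained, so random).
vocab = {"<unk>": 0, "hello": 1, "look": 2, "over": 3, "there": 4}
utterance = "hello look over there".split()
ids = torch.tensor([[vocab.get(w, 0) for w in utterance]])

model = GesturePredictor(vocab_size=len(vocab))
predicted = model(ids).argmax(dim=-1)[0]
for word, g in zip(utterance, predicted.tolist()):
    print(f"{word:>6} -> {GESTURES[g]}")
```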

    Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles

    Embodied interactive software agents are complex, autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task of promoting human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners. This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of the issues, methods, theories, and technologies involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).

    An Approach for Contextual Control in Dialogue Management with Belief State Trend Analysis and Prediction

    This thesis applies the theory of naturalistic decision making (NDM), a model of human psychology, to the study of dialogue management systems, covering the major approaches from the classical approach based on finite-state machines to the most recent approach using partially observable Markov decision processes (POMDPs). While most approaches use various techniques to estimate the system state, a POMDP-based system uses the belief state to make decisions. In addition to state estimation, a POMDP provides a mechanism for modeling uncertainty and allows error recovery. However, applying the Markov assumption over the belief-state space in current POMDP models causes a significant loss of valuable information from the dialogue history, leading to unfaithful management of the user's intention. There is also a need for adequate interaction with users according to their level of knowledge. To improve the performance of POMDP-based dialogue management, this thesis proposes a method that enables dynamic control of dialogue management. Three contributions are made to achieve this dynamism: introducing historical belief information into the POMDP model; analyzing its trend and predicting the user's belief states from this history; and using the derived information to control the system according to the user's intention by switching between contextual control modes. Theoretical derivations of the proposed work and simulation experiments provide evidence that the proposed algorithm enables dynamic dialogue control by the agent and improves human-computer interaction.
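    As background for the belief-state machinery the abstract refers to, the numpy sketch below shows a standard POMDP belief update (a Bayes filter over hidden dialogue states) together with a naive linear extrapolation over the stored belief history; the dialogue states, transition and observation models, and the trend heuristic are invented for illustration and are not the thesis's algorithm.

```python
import numpy as np

# Hypothetical dialogue states and models (illustrative only).
STATES = ["want_info", "confirming", "done"]
T = np.array([[0.7, 0.2, 0.1],      # P(s' | s): transition model, rows sum to 1
              [0.1, 0.7, 0.2],
              [0.0, 0.1, 0.9]])
OBS = {"ask":     np.array([0.80, 0.15, 0.05]),   # P(o | s'): observation model
       "yes":     np.array([0.10, 0.70, 0.20]),
       "goodbye": np.array([0.05, 0.15, 0.80])}

def belief_update(belief, obs):
    """Standard POMDP belief update: predict with T, correct with P(o | s')."""
    predicted = T.T @ belief
    updated = OBS[obs] * predicted
    return updated / updated.sum()

def predict_trend(history):
    """Naive trend extrapolation: project the latest belief one step ahead
    using the average change across the stored belief history."""
    if len(history) < 2:
        return history[-1]
    delta = np.diff(np.stack(history), axis=0).mean(axis=0)
    projected = np.clip(history[-1] + delta, 0, None)
    return projected / projected.sum()

belief = np.array([1/3, 1/3, 1/3])          # start fully uncertain
history = [belief]
for o in ["ask", "yes", "yes"]:
    belief = belief_update(belief, o)
    history.append(belief)
    print(o, np.round(belief, 2))

print("trend-predicted next belief:", np.round(predict_trend(history), 2))
```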

    Listeners and readers generalise their experience with word meanings across modalities

    Research has shown that adults’ lexical-semantic representations are surprisingly malleable. For instance, the interpretation of ambiguous words (e.g. bark) is influenced by experience such that recently encountered meanings become more readily available (Rodd et al., 2016, 2013). However, the mechanism underlying this word-meaning priming effect remains unclear, and competing accounts make different predictions about the extent to which information about word meanings that is gained within one modality (e.g. speech) is transferred to the other modality (e.g. reading) to aid comprehension. In two web-based experiments, ambiguous target words were primed with either written or spoken sentences that biased their interpretation toward a subordinate meaning, or were unprimed. About 20 minutes after the prime exposure, interpretation of these target words was tested by presenting them in either written or spoken form, using word association (Experiment 1, N=78) and speeded semantic relatedness decisions (Experiment 2, N=181). Both experiments replicated the auditory unimodal priming effect shown previously (Rodd et al., 2016, 2013) and revealed significant cross-modal priming: primed meanings were retrieved more frequently and swiftly across all primed conditions compared to the unprimed baseline. Furthermore, there were no reliable differences in priming levels between unimodal and cross-modal prime-test conditions. These results indicate that recent experience with ambiguous word meanings can bias the reader’s or listener’s later interpretation of these words in a modality-general way. We identify possible loci of this effect within the context of models of long-term priming and ambiguity resolution.
