
    AsapRealizer 2.0: The Next Steps in Fluent Behavior Realization for ECAs

    van Welbergen H, Yaghoubzadeh R, Kopp S. AsapRealizer 2.0: The Next Steps in Fluent Behavior Realization for ECAs. In: Bickmore T, Marsella S, Sidner C, eds. Intelligent Virtual Agents. Lecture Notes in Computer Science. Vol 8637. Cham: Springer International Publishing; 2014: 449-462.
    Natural human interaction is highly dynamic and responsive: interlocutors produce utterances incrementally, smoothly switch speaking turns with virtually no delay, make use of on-the-fly adaptation and (self) interruptions, execute movement in tight synchrony, etc. We present the conglomeration of our research efforts in enabling the realization of such fluent interactions for Embodied Conversational Agents in the behavior realizer ‘AsapRealizer 2.0’ and show how it provides fluent realization capabilities that go beyond the state-of-the-art.

    Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles

    Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task to promote human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners. This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of issues, methods, theories, and technologies that are involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. 
For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).

    Developmental Bootstrapping of AIs

    Although some current AIs surpass human abilities in closed artificial worlds such as board games, their abilities in the real world are limited. They make strange mistakes and do not notice them. They cannot be instructed easily, fail to use common sense, and lack curiosity. They do not make good collaborators. Mainstream approaches for creating AIs are the traditional manually-constructed symbolic AI approach and generative and deep learning AI approaches including large language models (LLMs). These systems are not well suited for creating robust and trustworthy AIs. Although it is outside of the mainstream, the developmental bootstrapping approach has more potential. In developmental bootstrapping, AIs develop competences like human children do. They start with innate competences. They interact with the environment and learn from their interactions. They incrementally extend their innate competences with self-developed competences. They interact and learn from people and establish perceptual, cognitive, and common grounding. They acquire the competences they need through bootstrapping. However, developmental robotics has not yet produced AIs with robust adult-level competences. Projects have typically stopped at the Toddler Barrier corresponding to human infant development at about two years of age, before their speech is fluent. They also do not bridge the Reading Barrier, to skillfully and skeptically draw on the socially developed information resources that power current LLMs. The next competences in human cognitive development involve intrinsic motivation, imitation learning, imagination, coordination, and communication. This position paper lays out the logic, prospects, gaps, and challenges for extending the practice of developmental bootstrapping to acquire further competences and create robust, resilient, and human-compatible AIs. Comment: 102 pages, 29 figures.

    The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences

    This doctoral thesis describes the journey of ideation, prototyping and empirical testing of the Multimodal Tutor, a system designed for providing digital feedback that supports psychomotor skills acquisition using learning and multimodal data capturing. The feedback is given in real time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human-annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool) and a case study in Cardiopulmonary Resuscitation training (CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks.

    Learner agency in online task-based language learning for spoken interaction

    The present study aims to explore the relationship between learner agency, screen-based resources (such as navigational buttons and textual task instructions) and meaning making in synchronous computer-mediated communication (SCMC) tasks developed to promote spoken interaction. Using a case-study approach, three tasks designed for language learning (opinion sharing, role play and information gap) across two data sets (12 cases in total) are analysed. Tasks are carried out in an online university in Barcelona, where spoken interaction is made possible through an audioconferencing tool. Data was collected over the course of one semester in 2015 and analysed alongside data from a prior study that took place in 2012. The objectives are threefold: to understand how learners' choices and intentional actions pertaining to screen-based resources shape oral turn-taking; to understand how meaning making can be understood from a multimodal perspective, beyond speech; and to contribute to theories on agency in language learning in order to help foster agency in current and future SCMC tasks for optimal language learning gains. A range of data sources and methods are used.
Sources include audio recordings of peer-to-peer oral interaction, transcripts, screenshots, course documentation and supplementary information about the technological tool employed. These sources are explored through data analysis, including content and discourse analysis as well as computer-mediated discourse analysis (Herring, 2004). In addition, a specific approach is developed that combines emic (learner) and etic (researcher) analytical perspectives that draw on notions from conversational analysis (Sacks, Schegloff and Jefferson, 1974) and multimodal (inter)actional analysis (Norris, 2004). Results suggest that different types of agency emerge during tasks. In addition, learners' mediation with screen-based resources is found to shape their oral turn-taking, both qualitatively and quantitatively. Meaning making involving multiple modes beyond speech (e.g. somatic, text and image) is identified, leading to the implication that agency can be understood as being carried out through human systems (motor, sensory and language) and resources pertaining to the digital system. Agency in SCMC tasks can therefore be described as 'systems with tool(s)-mediated, goal(s)-directed action(s)', which builds on the sociocultural notion of 'tool-mediated, goal-directed action' (Zinchenko, 1985). Implications for task design are discussed, and recommendations for future research on task-based SCMC from a multimodal perspective are outlined.

    Gender stereotypes in virtual agents

    Visual, behavioural and verbal cues for gender are often used in designing virtual agents to take advantage of their cultural and stereotypical effects on the users. However, recent studies point towards a more gender-balanced view of stereotypical traits and roles in our society. This thesis is intended as an effort towards a progressive and inclusive approach for gender representations in virtual agents. The contributions are two-fold. First, in an iterative design process, representative male, female and androgynous embodied AI agents were created with few differences in their visual attributes. Second, these agents were then used to evaluate the stereotypical assumptions of gendered traits and roles in AI virtual agents. The results showed that, indeed, gender stereotypes are not as effective as previously assumed, and androgynous agents could represent a middle ground between gendered stereotypes. The thesis findings are presented in the hope of fostering discussion in virtual agent research about the frequent stereotypical use of gender representations.