AsapRealizer 2.0: The Next Steps in Fluent Behavior Realization for ECAs
van Welbergen H, Yaghoubzadeh R, Kopp S. AsapRealizer 2.0: The Next Steps in Fluent Behavior Realization for ECAs. In: Bickmore T, Marsella S, Sidner C, eds. Intelligent Virtual Agents. Lecture Notes in Computer Science. Vol 8637. Cham: Springer International Publishing; 2014: 449-462.
Natural human interaction is highly dynamic and responsive: interlocutors produce utterances incrementally, smoothly switch speaking turns with virtually no delay, make use of on-the-fly adaptation and (self) interruptions, execute movement in tight synchrony, etc. We present the conglomeration of our research efforts in enabling the realization of such fluent interactions for Embodied Conversational Agents in the behavior realizer ‘AsapRealizer 2.0’ and show how it provides fluent realization capabilities that go beyond the state of the art.
Designing Embodied Interactive Software Agents for E-Learning: Principles, Components, and Roles
Embodied interactive software agents are complex autonomous, adaptive, and social software systems with a digital embodiment that enables them to act on and react to other entities (users, objects, and other agents) in their environment through bodily actions, which include the use of verbal and non-verbal communicative behaviors in face-to-face interactions with the user. These agents have been developed for various roles in different application domains, in which they perform tasks that have been assigned to them by their developers or delegated to them by their users or by other agents. In computer-assisted learning, embodied interactive pedagogical software agents have the general task of promoting human learning by working with students (and other agents) in computer-based learning environments, among them e-learning platforms based on Internet technologies, such as the Virtual Linguistics Campus (www.linguistics-online.com). In these environments, pedagogical agents provide contextualized, qualified, personalized, and timely assistance, cooperation, instruction, motivation, and services for both individual learners and groups of learners.
This thesis develops a comprehensive, multidisciplinary, and user-oriented view of the design of embodied interactive pedagogical software agents, which integrates theoretical and practical insights from various academic and other fields. The research intends to contribute to the scientific understanding of issues, methods, theories, and technologies that are involved in the design, implementation, and evaluation of embodied interactive software agents for different roles in e-learning and other areas. For developers, the thesis provides sixteen basic principles (Added Value, Perceptible Qualities, Balanced Design, Coherence, Consistency, Completeness, Comprehensibility, Individuality, Variability, Communicative Ability, Modularity, Teamwork, Participatory Design, Role Awareness, Cultural Awareness, and Relationship Building) plus a large number of specific guidelines for the design of embodied interactive software agents and their components. Furthermore, it offers critical reviews of theories, concepts, approaches, and technologies from different areas and disciplines that are relevant to agent design. Finally, it discusses three pedagogical agent roles (virtual native speaker, coach, and peer) in the scenario of the linguistic fieldwork classes on the Virtual Linguistics Campus and presents detailed considerations for the design of an agent for one of these roles (the virtual native speaker).
Developmental Bootstrapping of AIs
Although some current AIs surpass human abilities in closed artificial worlds
such as board games, their abilities in the real world are limited. They make
strange mistakes and do not notice them. They cannot be instructed easily, fail
to use common sense, and lack curiosity. They do not make good collaborators.
Mainstream approaches for creating AIs are the traditional manually-constructed
symbolic AI approach and generative and deep learning AI approaches including
large language models (LLMs). These systems are not well suited for creating
robust and trustworthy AIs. Although it is outside of the mainstream, the
developmental bootstrapping approach has more potential. In developmental
bootstrapping, AIs develop competences like human children do. They start with
innate competences. They interact with the environment and learn from their
interactions. They incrementally extend their innate competences with
self-developed competences. They interact and learn from people and establish
perceptual, cognitive, and common grounding. They acquire the competences they
need through bootstrapping. However, developmental robotics has not yet
produced AIs with robust adult-level competences. Projects have typically
stopped at the Toddler Barrier corresponding to human infant development at
about two years of age, before their speech is fluent. They also do not bridge
the Reading Barrier, to skillfully and skeptically draw on the socially
developed information resources that power current LLMs. The next competences
in human cognitive development involve intrinsic motivation, imitation
learning, imagination, coordination, and communication. This position paper
lays out the logic, prospects, gaps, and challenges for extending the practice
of developmental bootstrapping to acquire further competences and create
robust, resilient, and human-compatible AIs.
Comment: 102 pages, 29 figures
The Multimodal Tutor: Adaptive Feedback from Multimodal Experiences
This doctoral thesis describes the journey of ideation, prototyping and empirical testing of the Multimodal Tutor, a system designed for providing digital feedback that supports psychomotor skills acquisition using learning and multimodal data capturing. The feedback is given in real-time with machine-driven assessment of the learner's task execution. The predictions are tailored by supervised machine learning models trained with human-annotated samples. The main contributions of this thesis are: a literature survey on multimodal data for learning, a conceptual model (the Multimodal Learning Analytics Model), a technological framework (the Multimodal Pipeline), a data annotation tool (the Visual Inspection Tool) and a case study in Cardiopulmonary Resuscitation training (CPR Tutor). The CPR Tutor generates real-time, adaptive feedback using kinematic and myographic data and neural networks.
Learner agency in online task-based language learning for spoken interaction
The present study aims to explore the relationship between learner agency, screen-based resources (such as navigational buttons and textual task instructions) and meaning making in synchronous computer-mediated communication (SCMC) tasks developed to promote spoken interaction. Using a case-study approach, three tasks designed for language learning (opinion sharing, role play and information gap) across two data sets (12 cases in total) are analysed. Tasks are carried out in an online university in Barcelona, where spoken interaction is made possible through an audioconferencing tool. Data was collected over the course of one semester in 2015 and analysed alongside data from a prior study that took place in 2012. The objectives are threefold: to understand how learners' choices and intentional actions pertaining to screen-based resources shape oral turn-taking; to understand how meaning making can be understood from a multimodal perspective, beyond speech; and to contribute to theories on agency in language learning in order to help foster agency in current and future SCMC tasks for optimal language learning gains. A range of data sources and methods are used. Sources include audio recordings of peer-to-peer oral interaction, transcripts, screenshots, course documentation and supplementary information about the technological tool employed. These sources are explored through data analysis, including content and discourse analysis as well as computer-mediated discourse analysis (Herring, 2004). In addition, a specific approach is developed that combines emic (learner) and etic (researcher) analytical perspectives that draw on notions from conversational analysis (Sacks, Schegloff and Jefferson, 1974) and multimodal (inter)actional analysis (Norris, 2004). Results suggest that different types of agency emerge during tasks. In addition, learners' mediation with screen-based resources is found to shape their oral turn-taking, both qualitatively and quantitatively. Meaning making involving multiple modes beyond speech (e.g. somatic, text and image) is identified, leading to the implication that agency can be understood as being carried out through human systems (motor, sensory and language) and resources pertaining to the digital system. Agency in SCMC tasks can therefore be described as 'systems with tool(s)-mediated, goal(s)-directed action(s)', which builds on the sociocultural notion of 'tool-mediated, goal-directed action' (Zinchenko, 1985). Implications for task design are discussed, and recommendations for future research on task-based SCMC from a multimodal perspective are outlined.
Gender stereotypes in virtual agents
Visual, behavioural and verbal cues for gender are often used in designing virtual agents to take advantage of their cultural and stereotypical effects on the users. However, recent studies point towards a more gender-balanced view of stereotypical traits and roles in our society. This thesis is intended as an effort towards a progressive and inclusive approach for gender representations in virtual agents. The contributions are two-fold. First, in an iterative design process, representative male, female and androgynous embodied AI agents were created with few differences in their visual attributes. Second, these agents were then used to evaluate the stereotypical assumptions of gendered traits and roles in AI virtual agents. The results showed that, indeed, gender stereotypes are not as effective as previously assumed, and androgynous agents could represent a middle ground between gendered stereotypes. The thesis findings are presented in the hope of fostering discussion in virtual agent research about the frequent stereotypical use of gender representations.