Search CORE

39,676 research outputs found

Reference resolution in multi-modal interaction: Preliminary observations

Author: Nijholt A.
Publication venue: Universidad de Pinar del Rio "Hermanos Saiz Montes de Oca"
Publication date: 01/01/2002
Field of study

In this paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the necessity to spend more research on reference resolution in multimodal contexts. In multi-modal interaction the human conversational partner can apply more than one modality in conveying his or her message to the environment in which a computer detects and interprets signals from different modalities. We show some naturally arising problems but do not give general solutions. Rather we decide to perform more detailed research on reference resolution in uni-modal contexts to obtain methods generalizable to multi-modal contexts. Since we try to build applications for a Dutch audience and since hardly any research has been done on reference resolution for Dutch, we give results on the resolution of anaphoric and deictic references in Dutch texts. We hope to be able to extend these results to our multimodal contexts later

University of Twente Research Information

Reference Resolution in Multi-modal Interaction: Position paper

Author: Nijholt A.
Publication venue: EU IST RTD Roadmap
Publication date: 01/01/2002
Field of study

In this position paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the necessity to spend more research on reference resolution in multimodal contexts. In multi-modal interaction the human conversational partner can apply more than one modality in conveying his or her message to the environment in which a computer detects and interprets signals from different modalities. We show some naturally arising problems and how they are treated for different contexts. No generally applicable solutions are given

University of Twente Research Information

Recommended from our members

A multimodal restaurant finder for semantic web

Author: He Yulan
Hui Siu Cheung
Quan Thanh Tho
Publication venue
Publication date: 01/01/2007
Field of study

Multimodal dialogue systems provide multiple modalities in the form of speech, mouse clicking, drawing or touch that can enhance human-computer interaction. However, one of the drawbacks of the existing multimodal systems is that they are highly domain-speciﬁc and they do not allow information to be shared across different providers. In this paper, we propose a semantic multimodal system, called Semantic Restaurant Finder, for the Semantic Web in which the restaurant information in different city/country/language are constructed as ontologies to allow the information to be sharable. From the Semantic Restaurant Finder, users can make use of the semantic restaurant knowledge distributed from different locations on the Internet to ﬁnd the desired restaurants

Open Research Online (The Open University)

Crossmodal Attentive Skill Learner

Author: How Jonathan P.
Kim Dong-Ki
Omidshafiei Shayegan
Pazis Jason
Publication venue
Publication date: 22/05/2018
Field of study

This paper presents the Crossmodal Attentive Skill Learner (CASL), integrated with the recently-introduced Asynchronous Advantage Option-Critic (A2OC) architecture [Harb et al., 2017] to enable hierarchical reinforcement learning across multiple sensory inputs. We provide concrete examples where the approach not only improves performance in a single task, but accelerates transfer to new tasks. We demonstrate the attention mechanism anticipates and identifies useful latent features, while filtering irrelevant sensor modalities during execution. We modify the Arcade Learning Environment [Bellemare et al., 2013] to support audio queries, and conduct evaluations of crossmodal learning in the Atari 2600 game Amidar. Finally, building on the recent work of Babaeizadeh et al. [2017], we open-source a fast hybrid CPU-GPU implementation of CASL.Comment: International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2018, NIPS 2017 Deep Reinforcement Learning Symposiu

arXiv.org e-Print Archive

DSpace@MIT

Force-imitated particle swarm optimization using the near-neighbor effect for locating multiple optima

Author: Beasley
Clerc
Dingwei Wang
Kennedy
Li
Lili Liu
Mendes
Poli
Rajac
Rashedi
Shengxiang Yang
Trelea
Tripathi
Van Dam
van den Bergh
Wang
Publication venue: 'Elsevier BV'
Publication date: 01/01/2012
Field of study

Copyright @ Elsevier Inc. All rights reserved.Multimodal optimization problems pose a great challenge of locating multiple optima simultaneously in the search space to the particle swarm optimization (PSO) community. In this paper, the motion principle of particles in PSO is extended by using the near-neighbor effect in mechanical theory, which is a universal phenomenon in nature and society. In the proposed near-neighbor effect based force-imitated PSO (NN-FPSO) algorithm, each particle explores the promising regions where it resides under the composite forces produced by the “near-neighbor attractor” and “near-neighbor repeller”, which are selected from the set of memorized personal best positions and the current swarm based on the principles of “superior-and-nearer” and “inferior-and-nearer”, respectively. These two forces pull and push a particle to search for the nearby optimum. Hence, particles can simultaneously locate multiple optima quickly and precisely. Experiments are carried out to investigate the performance of NN-FPSO in comparison with a number of state-of-the-art PSO algorithms for locating multiple optima over a series of multimodal benchmark test functions. The experimental results indicate that the proposed NN-FPSO algorithm can efficiently locate multiple optima in multimodal fitness landscapes.This work was supported in part by the Key Program of National Natural Science Foundation (NNSF) of China under Grant 70931001, Grant 70771021, and Grant 70721001, the National Natural Science Foundation (NNSF) of China for Youth under Grant 61004121, Grant 70771021, the Science Fund for Creative Research Group of NNSF of China under Grant 60821063, the PhD Programs Foundation of Ministry of Education of China under Grant 200801450008, and in part by the Engineering and Physical Sciences Research Council (EPSRC) of UK under Grant EP/E060722/1 and Grant EP/E060722/2

CiteSeerX

Crossref

De Montfort University Open Research Archive

Brunel University Research Archive

Early Turn-taking Prediction with Spiking Neural Networks for Human Robot Collaboration

Author: Wachs Juan P.
Zhou Tian
Publication venue
Publication date: 26/09/2017
Field of study

Turn-taking is essential to the structure of human teamwork. Humans are typically aware of team members' intention to keep or relinquish their turn before a turn switch, where the responsibility of working on a shared task is shifted. Future co-robots are also expected to provide such competence. To that end, this paper proposes the Cognitive Turn-taking Model (CTTM), which leverages cognitive models (i.e., Spiking Neural Network) to achieve early turn-taking prediction. The CTTM framework can process multimodal human communication cues (both implicit and explicit) and predict human turn-taking intentions in an early stage. The proposed framework is tested on a simulated surgical procedure, where a robotic scrub nurse predicts the surgeon's turn-taking intention. It was found that the proposed CTTM framework outperforms the state-of-the-art turn-taking prediction algorithms by a large margin. It also outperforms humans when presented with partial observations of communication cues (i.e., less than 40% of full actions). This early prediction capability enables robots to initiate turn-taking actions at an early stage, which facilitates collaboration and increases overall efficiency.Comment: Submitted to IEEE International Conference on Robotics and Automation (ICRA) 201

arXiv.org e-Print Archive

Crossref