From images via symbols to contexts: using augmented reality for interactive model acquisition
Systems that perform in real environments need to bind their internal state to externally perceived objects, events, or complete scenes. How to learn this correspondence has been a long-standing problem in computer vision as well as artificial intelligence. Augmented Reality provides an interesting perspective on this problem because a human user can directly relate displayed system results to real environments. In the following we present a system that is able to bootstrap internal models from user-system interactions. Starting from pictorial representations, it learns symbolic object labels that provide the basis for storing observed episodes. In a second step, more complex relational information is extracted from the stored episodes, which enables the system to react to specific scene contexts.
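The two-step idea described above (symbolic labels as keys for stored episodes, then relational information mined from those episodes) can be illustrated with a toy sketch. This is not the paper's actual pipeline; the class and method names are hypothetical, and real object labels would come from a vision front-end rather than being given directly:

```python
from collections import defaultdict

class EpisodicModel:
    """Toy sketch: store episodes as sets of symbolic object labels,
    then mine label co-occurrences as a simple form of scene context."""

    def __init__(self):
        self.episodes = []                      # each episode: a set of labels
        self.co_occurrence = defaultdict(int)   # (label_a, label_b) -> count

    def store_episode(self, labels):
        episode = frozenset(labels)
        self.episodes.append(episode)
        # count each unordered label pair once per episode
        for a in episode:
            for b in episode:
                if a < b:
                    self.co_occurrence[(a, b)] += 1

    def related(self, label, min_count=2):
        """Labels that co-occurred with `label` at least `min_count` times."""
        out = set()
        for (a, b), n in self.co_occurrence.items():
            if n >= min_count:
                if a == label:
                    out.add(b)
                elif b == label:
                    out.add(a)
        return out

m = EpisodicModel()
m.store_episode({"cup", "saucer", "spoon"})
m.store_episode({"cup", "saucer"})
m.store_episode({"cup", "keyboard"})
print(m.related("cup"))  # {'saucer'}
```

Once such relations are extracted, a system can react to a context ("cup present, saucer expected") rather than to isolated labels.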
Who am I talking with? A face memory for social robots
In order to provide personalized services and to develop human-like interaction capabilities, robots need to recognize their human partners. Face recognition has been studied exhaustively in the past decade in the context of security systems, with significant progress on huge datasets. However, these capabilities are not in focus when it comes to social interaction situations. Humans are able to remember people seen for a short moment in time and apply this knowledge directly in their engagement in conversation. In order to equip a robot with the capability to recall human interlocutors and to provide user-aware services, we adopt human-human interaction schemes to propose a face memory on the basis of active appearance models integrated with the active memory architecture. This paper presents the concept of the interactive face memory, the applied recognition algorithms, and their embedding into the robot’s system architecture. Performance measures are discussed for general face databases as well as scenario-specific datasets.
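The core recall behaviour of such a face memory can be sketched in a few lines. This is a deliberately minimal stand-in, not the paper's active-appearance-model pipeline: embeddings are given as plain vectors, the names and the distance threshold are invented for illustration, and a real system would enroll multiple samples per person:

```python
import math

class FaceMemory:
    """Toy sketch (not the paper's AAM-based system): store one
    embedding per person and recall the closest match within a
    distance threshold; unknown faces return None."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold
        self.known = {}  # name -> reference embedding

    def enroll(self, name, embedding):
        self.known[name] = embedding

    def recall(self, embedding):
        best_name, best_dist = None, float("inf")
        for name, ref in self.known.items():
            dist = math.dist(ref, embedding)
            if dist < best_dist:
                best_name, best_dist = name, dist
        if best_dist <= self.threshold:
            return best_name
        return None  # unknown interlocutor: a robot could start an enrollment dialogue

mem = FaceMemory(threshold=0.5)
mem.enroll("alice", [0.1, 0.9])
mem.enroll("bob", [0.8, 0.2])
print(mem.recall([0.15, 0.85]))  # alice
print(mem.recall([0.0, 0.0]))    # None
```

The `None` branch is what distinguishes an interactive face memory from a closed-set classifier: it is the hook for asking a new interlocutor who they are.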
Training for Speech Recognition on Coprocessors
Automatic Speech Recognition (ASR) has increased in popularity in recent
years. The evolution of processor and storage technologies has enabled more
advanced ASR mechanisms, fueling the development of virtual assistants such as
Amazon Alexa, Apple Siri, Microsoft Cortana, and Google Home. The interest in
such assistants, in turn, has amplified the novel developments in ASR research.
However, despite this popularity, there has not been a detailed training
efficiency analysis of modern ASR systems. This mainly stems from: the
proprietary nature of many modern applications that depend on ASR, like the
ones listed above; the relatively expensive co-processor hardware that is used
to accelerate ASR by big vendors to enable such applications; and the absence
of well-established benchmarks. The goal of this paper is to address the latter
two of these challenges. The paper first describes an ASR model, based on a
deep neural network inspired by recent work in this domain, and our experiences
building it. Then we evaluate this model on three CPU-GPU co-processor
platforms that represent different budget categories. Our results demonstrate
that utilizing hardware acceleration yields good results even without high-end
equipment. While the most expensive platform (10x the price of the least expensive one) converges to the initial accuracy target 10-30% and 60-70% faster than the other two, the differences among the platforms almost disappear at slightly higher accuracy targets. In addition, our results further highlight both the difficulty of evaluating ASR systems, due to the complex, long, and resource-intensive nature of model training in this domain, and the importance of establishing benchmarks for ASR.
A museum guide robot: Dealing with multiple participants in the real-world
Pitsch K, Gehle R, Wrede S. A museum guide robot: Dealing with multiple participants in the real-world. Presented at the Workshop "Robots in public spaces. Towards multi-party, short-term, dynamic human-robot interaction" at ICSR 2013.
Using video recordings from a real-world field trial of a museum guide robot, we show how a robot's gaze influences the visitors' state of participation in group constellations. Then, we compare the robot's conduct to a human tour guide's gaze strategies. We argue that a robot system, to deal with real-world everyday situations, needs to be equipped with knowledge about interactional coordination, incremental processing, and strategies for pro-actively shaping the users' conduct.
Vision systems with the human in the loop
The emerging cognitive vision paradigm deals with vision systems that apply machine learning and automatic reasoning in order to learn from what they perceive. Cognitive vision systems can rate the relevance and consistency of newly acquired knowledge, they can adapt to their environment, and thus will exhibit high robustness. This contribution presents vision systems that aim at flexibility and robustness. One is tailored for content-based image retrieval; the others are cognitive vision systems that constitute prototypes of visual active memories which evaluate, gather, and integrate contextual knowledge for visual analysis. All three systems are designed to interact with human users. After discussing adaptive content-based image retrieval and object and action recognition in an office environment, the issue of assessing cognitive systems is raised. Experiences from psychologically evaluated human-machine interactions are reported, and the promising potential of psychologically-based usability experiments is stressed.
Addressing Multiple Participants: A Museum Robot's Gaze Shapes Visitor Participation
Pitsch K, Gehle R, Wrede S. Addressing Multiple Participants: A Museum Robot's Gaze Shapes Visitor Participation. Presented at ICSR 2013.
Using video recordings from a real-world field study with a museum guide robot, we show procedures by which the robot manages (i) to include and (ii) to disengage users in a multi-party situation.
Software Abstractions for Simulation and Control of a Continuum Robot
Nordmann A, Rolf M, Wrede S. Software Abstractions for Simulation and Control of a Continuum Robot. In: SIMPAR2012 - SIMULATION, MODELING, and PROGRAMMING for AUTONOMOUS ROBOTS. 2012
A Middleware for Collaborative Research in Experimental Robotics
Wienke J, Wrede S. A Middleware for Collaborative Research in Experimental Robotics. In: IEEE/SICE International Symposium on System Integration (SII2011). IEEE; 2011: 1183-1190.
This paper presents the Robotics Service Bus (RSB), a new message-oriented, event-driven middleware based on a logically unified bus with hierarchical structure. Major goals for the development of RSB were openness and scalability, in order to integrate diverse components in the context of robotics and intelligent systems. This includes the ability to operate on embedded platforms as well as desktop computers, reduction of framework lock-in, and integration with other middlewares. We describe the design of the RSB middleware and explain how it meets the requirements that lead to a scalable and open middleware concept. These requirements are based on several application scenarios, which are used to verify the applicability of RSB. Furthermore, we relate RSB to other middlewares in the robotics domain.
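The central idea of a logically unified bus with hierarchical structure can be sketched independently of RSB's actual API: events are published on slash-separated scopes, and a subscriber on a scope receives events from that scope and all of its subscopes. The class and scope names below are illustrative only, not RSB's interface:

```python
class EventBus:
    """Minimal sketch of a hierarchical, event-driven bus in the
    spirit of RSB (not its actual API): a subscription on a scope
    also receives events published on any of its subscopes."""

    def __init__(self):
        self.handlers = []  # list of (scope, callback) pairs

    def subscribe(self, scope, callback):
        self.handlers.append((scope, callback))

    def publish(self, scope, event):
        for sub_scope, callback in self.handlers:
            # '/a/b' matches subscriptions on '/', '/a', and '/a/b'
            if scope == sub_scope or scope.startswith(sub_scope.rstrip("/") + "/"):
                callback(scope, event)

bus = EventBus()
seen = []
bus.subscribe("/robot/vision", lambda s, e: seen.append((s, e)))
bus.publish("/robot/vision/faces", "face detected")
bus.publish("/robot/audio", "speech")  # different subtree: not delivered here
print(seen)  # [('/robot/vision/faces', 'face detected')]
```

Scope-prefix matching is what makes the bus "logically unified": components agree only on a scope hierarchy, not on point-to-point connections, which keeps integration loose and reduces framework lock-in.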
When a robot orients visitors to an exhibit. Referential practices and interactional dynamics in the real world
Pitsch K, Wrede S. When a robot orients visitors to an exhibit. Referential practices and interactional dynamics in the real world. In: Ro-Man 2014. Edinburgh; 2014: 36-42.
A basic task for robots interacting with humans consists in guiding their focus of attention. Existing guidelines for a robot’s multimodal deixis are primarily focused on the speaker (talk-gesture coordination, handshape). Conducting a field trial with a museum guide robot, we tested these individualistic referential strategies in the dynamic conditions of real-world HRI and found that their success ranges between 27% and 95%. Qualitative video-based micro-analysis revealed that the users experienced problems when they were not facing the robot at the moment of the deictic gesture. The importance of the robot’s head orientation also became evident. Implications are drawn as design guidelines for an interactional account of modeling referential strategies for HRI.