
    Online Learning and Planning for Crowd-aware Service Robot Navigation

    Mobile service robots are increasingly used in indoor environments (e.g., shopping malls or museums) among large crowds of people. To navigate efficiently in these environments, such a robot should be able to exhibit a variety of behaviors. It should avoid crowded areas and not oppose the flow of the crowd. It should be able to identify and avoid specific crowds that cause additional delays (e.g., children in a particular area might slow down the robot), and to seek out a crowd if its task requires it to interact with as many people as possible. These behaviors require the ability to learn and model crowd behavior in an environment. Earlier work used a dataset of paths navigated by people to solve this problem. That approach is expensive, risks privacy violations, and can become outdated as the environment evolves. To overcome these drawbacks, this thesis proposes a new approach in which the robot learns models of crowd behavior online and relies only on local onboard sensors. This work develops multiple planners that leverage these models and tests them in simulated environments, demonstrating statistically significant improvements in performance. The work reported here is applicable not only to navigation to target locations, but also to a variety of other services.
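    The abstract names two ingredients, an online crowd model learned from onboard sensing and planners that consume it, without giving implementation details. The sketch below illustrates one plausible reading: an exponential-moving-average density estimate per grid cell, and an A* planner whose step cost grows with estimated density. All names, the grid abstraction, and the weights are hypothetical assumptions for illustration, not the thesis's actual method.

```python
import heapq
from collections import defaultdict

class OnlineCrowdModel:
    """Running estimate of crowd density per grid cell, learned online
    from local onboard detections only (no external path dataset)."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha                 # EMA learning rate (assumed value)
        self.density = defaultdict(float)  # (x, y) -> estimated people in cell

    def update(self, cell, observed_count):
        # Exponential moving average: recent observations dominate, so the
        # model tracks an evolving environment without storing past paths.
        self.density[cell] += self.alpha * (observed_count - self.density[cell])

def plan(start, goal, model, size=10, crowd_weight=5.0):
    """A* on a 4-connected grid where crowded cells cost extra, so the
    planner routes around dense areas; a crowd-seeking task would instead
    reward density in the step cost."""
    def h(c):  # Manhattan distance; admissible since every step costs >= 1
        return abs(c[0] - goal[0]) + abs(c[1] - goal[1])

    frontier = [(h(start), 0.0, start, [start])]
    visited = set()
    while frontier:
        _, cost, cell, path = heapq.heappop(frontier)
        if cell == goal:
            return path
        if cell in visited:
            continue
        visited.add(cell)
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < size and 0 <= nxt[1] < size):
                continue
            step = 1.0 + crowd_weight * model.density[nxt]
            heapq.heappush(frontier,
                           (cost + step + h(nxt), cost + step, nxt, path + [nxt]))
    return None  # goal unreachable

# Example: the robot observed a small crowd near (1, 1); the planned path
# will detour around that cell when the added cost exceeds the detour cost.
model = OnlineCrowdModel()
model.update((1, 1), 6)
print(plan((0, 0), (3, 3), model))
```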

    Augmenting Situated Spoken Language Interaction with Listener Gaze

    Collaborative task solving in a shared environment requires referential success. Human speakers follow the listener’s behavior in order to monitor language comprehension (Clark, 1996). Furthermore, a natural language generation (NLG) system can exploit listener gaze to realize an effective interaction strategy by responding to it with verbal feedback in virtual environments (Garoufi, Staudte, Koller, & Crocker, 2016). We augment situated spoken language interaction with listener gaze and investigate its role in human-human and human-machine interactions. Firstly, we evaluate its impact on the prediction of reference resolution using a multimodal corpus collected in virtual environments. Secondly, we explore whether and how a human speaker uses listener gaze in an indoor guidance task while spontaneously referring to real-world objects in a real environment. Thirdly, we consider an object identification task for assembly under system instruction. We developed a multimodal interactive system and two NLG systems that integrate listener gaze into their generation mechanisms. The NLG system “Feedback” reacts to gaze with verbal feedback, either underspecified or contrastive. The NLG system “Installments” uses gaze to refer to an object incrementally, in the form of installments. Our results showed that gaze features improved the accuracy of automatic prediction of reference resolution. Further, we found that human speakers are very good at producing referring expressions: showing them listener gaze did not improve performance but elicited more negative feedback. In contrast, we showed that an NLG system that exploits listener gaze benefits the listener’s understanding. Specifically, combining a short, ambiguous instruction with contrastive feedback resulted in faster interactions than underspecified feedback, and even outperformed following long, unambiguous instructions. Moreover, alternating the underspecified and contrastive responses in an interleaved manner led to better engagement with the system and efficient information uptake, and resulted in equally good performance. Somewhat surprisingly, when gaze was incorporated more indirectly into the generation procedure and used to trigger installments, the non-interactive approach that outputs an instruction all at once was more effective. However, if the spatial expression was mentioned first, referring in gaze-driven installments was as efficient as following an exhaustive instruction. In sum, we provide a proof of concept that listener gaze can be used effectively in situated human-machine interaction. An assistance system using gaze cues is more attentive and adapts to listener behavior to ensure communicative success.
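    The “Feedback” system’s core decision, reacting to where the listener looks with either confirming, contrastive, or underspecified feedback, can be sketched roughly as follows. This is a minimal illustration only: the fixation threshold, the data types, and the feedback wording are all assumptions, not the system described in the thesis.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    obj: str          # identifier of the object the listener is looking at
    duration_ms: int  # how long the fixation lasted

def gaze_feedback(target, fixation, contrastive=True, min_ms=300):
    """Choose verbal feedback from listener gaze: confirm fixations on the
    target, correct fixations on distractors. The 300 ms threshold is an
    assumed cutoff below which a glance is ignored."""
    if fixation.duration_ms < min_ms:
        return None                           # glance too brief to react to
    if fixation.obj == target:
        return "Yes, exactly."                # confirm correct resolution
    if contrastive:                           # name wrong and right object
        return f"No, not the {fixation.obj}, the {target}."
    return "No, not that one."                # underspecified variant

# Example: the listener fixates a distractor, so the system produces the
# contrastive correction the abstract reports as the faster strategy.
print(gaze_feedback("blue mug", Fixation("red mug", 450)))
# No, not the red mug, the blue mug.
```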