    Self-Supervised Multi-Object Tracking From Consistency Across Timescales

    Self-supervised multi-object trackers have the potential to leverage the vast amounts of raw data recorded worldwide. However, they still fall short of their supervised counterparts in re-identification accuracy. We hypothesize that this deficiency results from restricting self-supervised objectives to single frames or frame pairs. Such designs lack sufficient visual appearance variation during training to learn consistent re-identification features. Therefore, we propose a training objective that learns re-identification features over a sequence of frames by enforcing consistent association scores across short and long timescales. Extensive evaluations on the BDD100K and MOT17 benchmarks demonstrate that our learned ReID features significantly reduce ID switches compared to other self-supervised methods, setting a new state of the art for self-supervised multi-object tracking and even performing on par with supervised methods on the BDD100K benchmark. (Comment: 8 pages, 3 figures, 5 tables)
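
    As a rough illustration of the idea in this abstract, the sketch below shows one way a cross-timescale consistency objective could be written. It is a minimal sketch, not the paper's released code: the function names, the cosine-similarity softmax with temperature, and the KL-divergence matching of chained versus direct association matrices are all assumptions made for illustration.

    import torch
    import torch.nn.functional as F

    def association_matrix(feats_a, feats_b, temperature=0.1):
        """Row-stochastic soft association scores between the detections
        of two frames, from cosine similarity of their ReID embeddings."""
        feats_a = F.normalize(feats_a, dim=-1)
        feats_b = F.normalize(feats_b, dim=-1)
        return torch.softmax(feats_a @ feats_b.T / temperature, dim=-1)

    def cross_timescale_consistency(frame_feats):
        """Match the chained short-timescale association
        (frame 0 -> 1 -> ... -> T) against the direct long-timescale
        association (frame 0 -> T)."""
        chained = association_matrix(frame_feats[0], frame_feats[1])
        for t in range(1, len(frame_feats) - 1):
            chained = chained @ association_matrix(frame_feats[t], frame_feats[t + 1])
        direct = association_matrix(frame_feats[0], frame_feats[-1])
        # Compare the two association distributions per detection via KL divergence.
        return F.kl_div(chained.clamp_min(1e-8).log(), direct, reduction="batchmean")

    # Toy usage: 4 frames with 5 detections each and 32-dim embeddings.
    feats = [torch.randn(5, 32, requires_grad=True) for _ in range(4)]
    loss = cross_timescale_consistency(feats)
    loss.backward()

    Because a product of row-stochastic matrices stays row-stochastic, the chained short-timescale associations remain comparable, as distributions, to the direct long-timescale association.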

    Yet another gaze detector: An embodied calibration free system for the iCub robot

    Schillingmann L, Nagai Y. Yet another gaze detector: An embodied calibration free system for the iCub robot. In: 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids). Seoul: IEEE; 2015

    Online nod detection in human–robot interaction

    Wall E, Kummert F, Schillingmann L. Online nod detection in human–robot interaction. In: 26th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN). 2017: 811-817

    Confirmation detection in human-agent interaction using non-lexical speech cues

    Brandt M, Wrede B, Kummert F, Schillingmann L. Confirmation detection in human-agent interaction using non-lexical speech cues. Presented at the AAAI Symposium on Natural Communication for Human-Robot Collaboration, Arlington, VA.
    Even if only the acoustic channel is considered, human communication is highly multi-modal. Non-lexical cues provide a variety of information such as emotion or agreement. The ability to process such cues is highly relevant for spoken dialog systems, especially in assistance systems. In this paper we focus on the recognition of non-lexical confirmations such as "mhm", as they enhance the system's ability to accurately interpret human intent in natural communication. The architecture uses a Support Vector Machine to detect confirmations based on acoustic features. In a systematic comparison, several feature sets were evaluated for their performance on a corpus of human-agent interaction with naive users, including elderly and cognitively impaired people. Our results show that using stacked formants as features yields an accuracy of 84%, outperforming regular formants as well as MFCC- and pitch-based features for online classification.
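
    To make the pipeline described above concrete, here is a minimal sketch of an SVM-based confirmation detector, assuming scikit-learn and NumPy. The stack_formants and utterance_features helpers, the context-window size, and the mean pooling are hypothetical choices for this sketch; the paper's exact feature extraction is not reproduced here.

    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def stack_formants(formant_frames, context=5):
        """Concatenate each frame's formant estimates with its neighbours
        so the classifier sees short-term temporal structure (one possible
        reading of 'stacked formants')."""
        padded = np.pad(formant_frames, ((context, context), (0, 0)), mode="edge")
        return np.asarray([padded[i:i + 2 * context + 1].ravel()
                           for i in range(len(formant_frames))])

    def utterance_features(formant_frames, context=5):
        """Pool stacked-formant frames into one fixed-length vector per
        utterance; mean pooling is an assumption made for this sketch."""
        return stack_formants(formant_frames, context).mean(axis=0)

    # Toy data: 20 utterances, each an (n_frames, 3) array of F1-F3 estimates.
    rng = np.random.default_rng(0)
    utterances = [rng.normal(size=(rng.integers(20, 40), 3)) for _ in range(20)]
    labels = rng.integers(0, 2, size=20)  # 1 = non-lexical confirmation ("mhm")

    X = np.vstack([utterance_features(u) for u in utterances])
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    clf.fit(X, labels)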

    Gaze is not Enough: Computational Analysis of Infant’s Head Movement Measures the Developing Response to Social Interaction

    Schillingmann L, Burling JM, Yoshida H, Nagai Y. Gaze is not Enough: Computational Analysis of Infant’s Head Movement Measures the Developing Response to Social Interaction. Presented at the 37th Annual Meeting of the Cognitive Science Society

    How do Infants Coordinate Head and Gaze?: Computational Analysis of Infant’s First Person View in Social Interactions

    Schillingmann L, Burling JM, Yoshida H, Nagai Y. How do Infants Coordinate Head and Gaze?: Computational Analysis of Infant’s First Person View in Social Interactions. Presented at the Biennial Meeting of the SRCD, Philadelphia

    Gaze Contingency in Turn-Taking for Human Robot Interaction: Advantages and Drawbacks

    Palinko O, Sciutti A, Schillingmann L, Rea F, Nagai Y, Sandini G. Gaze Contingency in Turn-Taking for Human Robot Interaction: Advantages and Drawbacks. Presented at the 24th IEEE International Symposium on Robot and Human Interactive Communication

    Conversational Assistants for Elderly Users – The Importance of Socially Cooperative Dialogue

    Kopp S, Brandt M, Buschmeier H, et al. Conversational Assistants for Elderly Users – The Importance of Socially Cooperative Dialogue. In: André E, Bickmore T, Vrochidis S, Wanner L, eds. Proceedings of the AAMAS Workshop on Intelligent Conversation Agents in Home and Geriatric Care Applications co-located with the Federated AI Meeting. CEUR Workshop Proceedings. Vol 2338. Aachen: RWTH; 2018: 10–17.
    Conversational agents can provide valuable cognitive and/or emotional assistance to elderly users and people with cognitive impairments, who often have difficulty organizing and following a structured daily schedule. Previous research showed that a virtual assistant that can interact in spoken language would be a desirable aid for these users. However, these user groups pose specific requirements for spoken dialogue interaction that existing systems hardly meet. This paper presents work on a virtual conversational assistant that was designed for, and together with, elderly as well as cognitively impaired users. It has been specifically developed to enable 'socially cooperative dialogue': adaptive and aware conversational interaction in which mutual understanding is co-constructed and ensured collaboratively. The technical approach is described and results of evaluation studies are reported.