5 research outputs found

    InproTKs: A Toolkit for Incremental Situated Processing

    Kennington C, Kousidis S, Schlangen D. InproTKs: A Toolkit for Incremental Situated Processing. In: Proceedings of SIGdial 2014: Short Papers. 2014: 84-88

    Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information

    Kennington C, Iida R, Tokunaga T, Schlangen D. Incrementally Tracking Reference in Human/Human Dialogue Using Linguistic and Extra-Linguistic Information. In: Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015). Denver, U.S.A.: Association for Computational Linguistics; 2015: 272-282

    Are we all disfluent in our own special way and should dialogue systems also be?

    Betz S, Lopez Gambino MS. Are we all disfluent in our own special way and should dialogue systems also be? In: Jokisch O, ed. Elektronische Sprachsignalverarbeitung (ESSV) 2016. Studientexte zur Sprachkommunikation. Vol 81. Dresden: TUD Press; 2016: 168-174

    Investigating speaker gaze and pointing behaviour in human-computer interaction with the `mint.tools` collection

    Kousidis S, Kennington C, Schlangen D. Investigating speaker gaze and pointing behaviour in human-computer interaction with the `mint.tools` collection. In: Proceedings of Short Papers at SIGdial 2013. 2013

    Incrementally resolving references in order to identify visually present objects in a situated dialogue setting

    Kennington C. Incrementally resolving references in order to identify visually present objects in a situated dialogue setting. Bielefeld: Universität Bielefeld; 2016.

    The primary concern of this thesis is to model the resolution of spoken referring expressions made in order to identify objects; in particular, everyday objects that can be perceived visually and distinctly from other objects. The practical goal of such a model is for it to be implemented as a component for use in a live, interactive, autonomous spoken dialogue system. The requirement of interaction imposes an added complication, one that has been ignored in previous models and approaches to automatic reference resolution: the model must attempt to resolve the reference incrementally as it unfolds, not wait until the end of the referring expression to begin the resolution process. Beyond components in dialogue systems, reference has been a major player in the philosophy of meaning for more than a century. For example, Gottlob Frege (1892) distinguished between Sinn (sense) and Bedeutung (reference), and discussed how they are related and how they relate to the meaning of words and expressions. It has furthermore been argued (e.g., Dahlgren (1976)) that reference to entities in the actual world is not just a fundamental notion of semantic theory, but the fundamental notion; for an individual acquiring a language, the meanings of many words and concepts are learned via the task of reference, beginning in early childhood. In this thesis, we pursue an account of word meaning that is based on perception of objects; for example, the meaning of the word red is based on visual features that are selected as distinguishing red objects from non-red ones.

    This thesis proposes two statistical models of incremental reference resolution. Given examples of referring expressions and visual aspects of the objects to which those expressions referred, both models learn a functional mapping between the words of the referring expressions and the visual aspects. A generative model, the simple incremental update model, presented in Chapter 5, uses a mediating variable to learn the mapping, whereas a discriminative model, the words-as-classifiers model, presented in Chapter 6, learns the mapping directly and improves over the generative model. Both models have been evaluated in various reference resolution tasks to objects in virtual scenes as well as real, tangible objects. This thesis shows that both models work robustly and are able to resolve referring expressions made in reference to visually present objects despite realistic, noisy conditions of speech and object recognition. A theoretical and practical comparison is also provided.

    Special emphasis is given to the discriminative model in this thesis because of its simplicity and ability to represent word meanings. The learning and application of this model lend credence to the above claim that reference is the fundamental notion for semantic theory, and that the meanings of (visual) words are acquired through experiencing referring expressions made to visually perceivable objects.
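
    To make the words-as-classifiers idea described in the abstract concrete, the following is a minimal, hypothetical sketch: each word gets its own binary classifier over an object's visual features, and a referring expression is scored word by word so that a distribution over candidate objects can be updated incrementally. The feature representation, the choice of logistic regression, the multiplicative score combination, and all names (`WordsAsClassifiers`, `resolve_incrementally`) are illustrative assumptions, not the thesis's exact formulation.

    ```python
    # Hypothetical sketch of a words-as-classifiers style resolver,
    # not the thesis's actual implementation.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    class WordsAsClassifiers:
        def __init__(self):
            self.classifiers = {}  # word -> fitted binary classifier

        def train(self, examples):
            """examples: iterable of (word, visual_features, is_referent) triples."""
            by_word = {}
            for word, feats, label in examples:
                X, y = by_word.setdefault(word, ([], []))
                X.append(feats)
                y.append(label)
            for word, (X, y) in by_word.items():
                if len(set(y)) < 2:      # need positive and negative examples
                    continue
                clf = LogisticRegression()
                clf.fit(np.array(X), np.array(y))
                self.classifiers[word] = clf

        def resolve_incrementally(self, words, candidate_features):
            """Yield a distribution over candidate objects after each word."""
            scores = np.ones(len(candidate_features))
            for word in words:
                clf = self.classifiers.get(word)
                if clf is not None:
                    # Probability that each candidate object fits this word.
                    p = clf.predict_proba(np.array(candidate_features))[:, 1]
                    scores = scores * p   # combine evidence across words
                yield scores / scores.sum()
    ```

    Under these assumptions, the incremental aspect is simply that a normalized distribution over candidates is available after every word, so a dialogue system could commit to (or act on) a referent before the expression is complete.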