Reference resolution in multi-modal interaction: Preliminary observations
In this paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the need to devote more research to reference resolution in multimodal contexts. In multi-modal interaction the human conversational partner can apply more than one modality in conveying his or her message to the environment in which a computer detects and interprets signals from different modalities. We show some naturally arising problems but do not give general solutions. Rather we decide to perform more detailed research on reference resolution in uni-modal contexts to obtain methods generalizable to multi-modal contexts. Since we try to build applications for a Dutch audience and since hardly any research has been done on reference resolution for Dutch, we give results on the resolution of anaphoric and deictic references in Dutch texts. We hope to be able to extend these results to our multimodal contexts later.
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
Understanding language goes hand in hand with the ability to integrate
complex contextual information obtained via perception. In this work, we
present a novel task for grounded language understanding: disambiguating a
sentence given a visual scene which depicts one of the possible interpretations
of that sentence. To this end, we introduce a new multimodal corpus containing
ambiguous sentences, representing a wide range of syntactic, semantic and
discourse ambiguities, coupled with videos that visualize the different
interpretations for each sentence. We address this task by extending a vision
model which determines if a sentence is depicted by a video. We demonstrate how
such a model can be adjusted to recognize different interpretations of the same
underlying sentence, allowing sentences to be disambiguated in a unified fashion
across the different ambiguity types.
Comment: EMNLP 201
Focusing for Pronoun Resolution in English Discourse: An Implementation
Anaphora resolution is one of the most active research areas in natural
language processing. This study examines focusing as a tool for the resolution
of pronouns which are a kind of anaphora. Focusing is a discourse phenomenon
like anaphora. Candy Sidner formalized focusing in her 1979 MIT PhD thesis and
devised several algorithms to resolve definite anaphora including pronouns. She
presented her theory in a computational framework but did not generally
implement the algorithms. Her algorithms related to focusing and pronoun
resolution are implemented in this thesis. This implementation provides a
better comprehension of the theory both from a conceptual and a computational
point of view. The resulting program is tested on different discourse segments,
and evaluation and analysis of the experiments are presented together with the
statistical results.
Comment: iii + 49 pages, compressed, uuencoded PostScript file; revised
version of the first author's Bilkent M.S. thesis, written under the
supervision of the second author; notify Akman via e-mail
([email protected]) or fax (+90-312-266-4126) if you are unable to
obtain hardcopy, he'll work out somethin
Reference Resolution in Multi-modal Interaction: Position paper
In this position paper we present our research on multimodal interaction in and with virtual environments. The aim of this presentation is to emphasize the need to devote more research to reference resolution in multimodal contexts. In multi-modal interaction the human conversational partner can apply more than one modality in conveying his or her message to the environment in which a computer detects and interprets signals from different modalities. We show some naturally arising problems and how they are treated for different contexts. No generally applicable solutions are given.
Modelling Discourse-related terminology in OntoLingAnnot's ontologies
Recently, computational linguists have shown great interest in discourse annotation in an attempt to capture the internal relations in texts. With this aim, we have formalized the linguistic knowledge associated with discourse into different linguistic ontologies. In this paper, we present the most prominent discourse-related terms and concepts included in the ontologies of the OntoLingAnnot annotation model. They show the different units, values, attributes, relations, layers and strata included in the discourse annotation level of the OntoLingAnnot model, within which these ontologies are included, used and evaluated.
Conversational Agents, Humorous Act Construction, and Social Intelligence
Humans use humour to ease communication problems in human-human interaction and
in a similar way humour can be used to solve communication problems that arise
with human-computer interaction. We discuss the role of embodied conversational
agents in human-computer interaction and we have observations on the generation
of humorous acts and on the appropriateness of displaying them by embodied
conversational agents in order to smoothen, when necessary, their interactions
with a human partner. The humorous acts we consider are generated spontaneously.
They are the product of an appraisal of the conversational situation and the
possibility to generate a humorous act from the elements that make up this
conversational situation, in particular the interaction history of the
conversational partners.
Follow-up question handling in the IMIX and Ritel systems: A comparative study
One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we shall discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed, and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we will show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We shall look at how the systems are composed, and how real, non-expert users interact with the systems. We shall also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it.