Gesture in Automatic Discourse Processing
Computers cannot fully understand spoken language without access to the wide range of modalities that accompany speech. This thesis addresses the particularly expressive modality of hand gesture, and focuses on building structured statistical models at the intersection of speech, vision, and meaning. My approach is distinguished in two key respects. First, gestural patterns are leveraged to discover parallel structures in the meaning of the associated speech. This differs from prior work that attempted to interpret individual gestures directly, an approach that was prone to a lack of generality across speakers. Second, I present novel, structured statistical models for multimodal language processing, which enable learning about gesture in its linguistic context, rather than in the abstract. These ideas find successful application in a variety of language processing tasks: resolving ambiguous noun phrases, segmenting speech into topics, and producing keyframe summaries of spoken language. In all three cases, the addition of gestural features -- extracted automatically from video -- yields significantly improved performance over a state-of-the-art text-only alternative. This marks the first demonstration that hand gesture improves automatic discourse processing.
Follow-up question handling in the IMIX and Ritel systems: A comparative study
One of the basic topics of question answering (QA) dialogue systems is how follow-up questions should be interpreted by a QA system. In this paper, we shall discuss our experience with the IMIX and Ritel systems, for both of which a follow-up question handling scheme has been developed, and corpora have been collected. These two systems are each other's opposites in many respects: IMIX is multimodal, non-factoid, black-box QA, while Ritel is speech, factoid, keyword-based QA. Nevertheless, we will show that they are quite comparable, and that it is fruitful to examine the similarities and differences. We shall look at how the systems are composed, and how real, non-expert, users interact with the systems. We shall also provide comparisons with systems from the literature where possible, and indicate where open issues lie and in what areas existing systems may be improved. We conclude that most systems have a common architecture with a set of common subtasks, in particular detecting follow-up questions and finding referents for them. We characterise these tasks using the typical techniques used for performing them, and data from our corpora. We also identify a special type of follow-up question, the discourse question, which is asked when the user is trying to understand an answer, and propose some basic methods for handling it
Directional adposition use in English, Swedish and Finnish
Directional adpositions such as to the left of describe where a Figure is in relation to a Ground. English and Swedish directional adpositions refer to the location of a Figure in relation to a Ground, whether both are static or in motion. In contrast, the Finnish directional adpositions edellä (in front of) and jäljessä (behind) solely describe the location of a moving Figure in relation to a moving Ground (Nikanne, 2003).
When directional adpositions are used, a frame of reference must be assumed for interpreting their meaning. For example, the meaning of to the left of in English can be based on a relative (speaker or listener based) reference frame or an intrinsic (object based) reference frame (Levinson, 1996). When a Figure and a Ground are both in motion, it is possible for a Figure to be described as being behind or in front of the Ground, even if neither has intrinsic features. As shown by Walker (in preparation), there are good reasons to assume that in the latter case a motion based reference frame is involved. This means that if Finnish speakers were to use edellä (in front of) and jäljessä (behind) more frequently in situations where both the Figure and Ground are in motion, a difference in reference frame use between Finnish on one hand and English and Swedish on the other could be expected.
We asked native English, Swedish and Finnish speakers to select adpositions from a language-specific list to describe the location of a Figure relative to a Ground when both were shown to be moving on a computer screen. We were interested in any differences between Finnish, English and Swedish speakers.
All languages showed a predominant use of directional spatial adpositions referring to the lexical concepts TO THE LEFT OF, TO THE RIGHT OF, ABOVE and BELOW. There were no differences between the languages in directional adposition use or reference frame use, including reference frame use based on motion.
We conclude that despite differences in the grammars of the languages involved, and potential differences in reference frame system use, the three languages investigated encode Figure location in relation to Ground location in a similar way when both are in motion.
Levinson, S. C. (1996). Frames of reference and Molyneux's question: Crosslinguistic evidence. In P. Bloom, M. A. Peterson, L. Nadel & M. F. Garrett (Eds.), Language and Space (pp. 109-170). Massachusetts: MIT Press.
Nikanne, U. (2003). How Finnish postpositions see the axis system. In E. van der Zee & J. Slack (Eds.), Representing direction in language and space. Oxford, UK: Oxford University Press.
Walker, C. (in preparation). Motion encoding in language, the use of spatial locatives in a motion context. Unpublished doctoral dissertation, University of Lincoln, Lincoln, United Kingdom.
P-model Alternative to the T-model
Standard linguistic analysis of syntax uses the T-model. This model
requires the ordering: D-structure → S-structure → LF,
where D-structure is the deep structure,
S-structure is the surface structure, and LF is logical form.
Between each of these representations there is movement which alters
the order of the constituent words; movement is achieved using the principles
and parameters of syntactic theory. Psychological analysis of sentence
production is usually either serial or connectionist. Psychological serial
models do not accommodate the T-model immediately so that here a new model
called the P-model is introduced. The P-model is different from previous
linguistic and psychological models. Here it is argued that the LF
representation should be replaced by a variant
of Frege's three qualities (sense, reference, and force),
called the Frege representation or F-representation.
In the F-representation the order of elements is not necessarily the same as
that in LF and it is suggested that the correct ordering is:
F-representation → D-structure → S-structure.
This ordering appears to lead to a more natural
view of sentence production and processing. Within this framework movement
originates as the outcome of emphasis applied to the sentence. The
requirement that the F-representation precedes the D-structure needs a picture
of the particular principles and parameters which pertain to movement of words
between representations. In general this would imply that there is a
preferred or optimal ordering of the symbolic string in the F-representation.
The standard ordering is retained because the general way of producing
such an optimal ordering is unclear. In this case it is possible to produce
an analysis of movement between LF and D-structure similar to the usual
analysis of movement between S-structure and LF.
It is suggested that a maximal amount of information about
a language's grammar and lexicon is stored,
because of the necessity of analyzing corrupted data.
Agents for educational games and simulations
This book consists mainly of revised papers that were presented at the Agents for Educational Games and Simulation (AEGS) workshop held on May 2, 2011, as part of the Autonomous Agents and MultiAgent Systems (AAMAS) conference in Taipei, Taiwan. The 12 full papers presented were carefully reviewed and selected from various submissions. The papers are organized in topical sections on middleware applications, dialogues and learning, adaption and convergence, and agent applications.