3,345 research outputs found
The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings
We motivate and describe a new freely available human-human dialogue dataset
for interactive learning of visually grounded word meanings through ostensive
definition by a tutor to a learner. The data has been collected using a novel,
character-by-character variant of the DiET chat tool (Healey et al., 2003;
Mills and Healey, submitted) with a novel task, where a Learner needs to learn
invented visual attribute words (such as " burchak " for square) from a tutor.
As such, the text-based interactions closely resemble face-to-face conversation
and thus contain many of the linguistic phenomena encountered in natural,
spontaneous dialogue. These include self-and other-correction, mid-sentence
continuations, interruptions, overlaps, fillers, and hedges. We also present a
generic n-gram framework for building user (i.e. tutor) simulations from this
type of incremental data, which is freely available to researchers. We show
that the simulations produce outputs that are similar to the original data
(e.g. 78% turn match similarity). Finally, we train and evaluate a
Reinforcement Learning dialogue control agent for learning visually grounded
word meanings, trained from the BURCHAK corpus. The learned policy shows
comparable performance to a rule-based system built previously.Comment: 10 pages, THE 6TH WORKSHOP ON VISION AND LANGUAGE (VL'17
GEMINI: A Natural Language System for Spoken-Language Understanding
Gemini is a natural language understanding system developed for spoken
language applications. The paper describes the architecture of Gemini, paying
particular attention to resolving the tension between robustness and
overgeneration. Gemini features a broad-coverage unification-based grammar of
English, fully interleaved syntactic and semantic processing in an all-paths,
bottom-up parser, and an utterance-level parser to find interpretations of
sentences that might not be analyzable as complete sentences. Gemini also
includes novel components for recognizing and correcting grammatical
disfluencies, and for doing parse preferences. This paper presents a
component-by-component view of Gemini, providing detailed relevant measurements
of size, efficiency, and performance.Comment: 8 pages, postscrip
Verbmobil : translation of face-to-face dialogs
Verbmobil is a long-term project on the translation of spontaneous language in negotiation dialogs. We describe the goals of the project, the chosen discourse domains and the initial project schedule. We discuss some of the distinguishing features of Verbmobil and introduce the notion of translation on demand and variable depth of processing in speech translation. Finally, the role of anytime modules for efficient dialog translation in close to real time is described
VICA, a visual counseling agent for emotional distress
We present VICA, a Visual Counseling Agent designed to create an engaging multimedia face-to-face interaction. VICA is a human-friendly agent equipped with high-performance voice conversation designed to help psychologically stressed users, to offload their emotional burden. Such users specifically include non-computer-savvy elderly persons or clients. Our agent builds replies exploiting interlocutor\u2019s utterances expressing such as wishes, obstacles, emotions, etc. Statements asking for confirmation, details, emotional summary, or relations among such expressions are added to the utterances. We claim that VICA is suitable for positive counseling scenarios where multimedia specifically high-performance voice communication is instrumental for even the old or digital divided users to continue dialogue towards their self-awareness. To prove this claim, VICA\u2019s effect is evaluated with respect to a previous text-based counseling agent CRECA and ELIZA including its successors. An experiment involving 14 subjects shows VICA effects as follows: (i) the dialogue continuation (CPS: Conversation-turns Per Session) of VICA for the older half (age > 40) substantially improved 53% to CRECA and 71% to ELIZA. (ii) VICA\u2019s capability to foster peace of mind and other positive feelings was assessed with a very high score of 5 or 6 mostly, out of 7 stages of the Likert scale, again by the older. Compared on average, such capability of VICA for the older is 5.14 while CRECA (all subjects are young students, age < 25) is 4.50, ELIZA is 3.50, and the best of ELIZA\u2019s successors for the older (> 25) is 4.41
Distributional effects and individual differences in L2 morphology learning
Second language (L2) learning outcomes may depend on the structure of the input and learners’ cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and verbal working memory. Over three sessions, 54 adults were exposed to a Russian case-marking paradigm with a balanced or skewed item distribution in the input. Whereas statistical learning ability and nonverbal intelligence predicted learning of trained items, only nonverbal intelligence also predicted generalization of case-marking inflections to new vocabulary. Neither measure of temporary storage capacity predicted learning. Balanced, less predictable input was associated with higher accuracy in generalization but only in the initial test session. These results suggest that individual differences in pattern extraction play a more sustained role in L2 acquisition than instructional manipulations that vary the predictability of lexical items in the input
Dialogs Re-enacted Across Languages
To support machine learning of cross-language prosodic mappings and other
ways to improve speech-to-speech translation, we present a protocol for
collecting closely matched pairs of utterances across languages, a description
of the resulting data collection, and some observations and musings. This
report is intended for 1) people using the corpus, 2) people extending the
corpus, and 3) people designing similar collections of bilingual dialog data
- …