3,345 research outputs found

    The BURCHAK corpus: a Challenge Data Set for Interactive Learning of Visually Grounded Word Meanings

    Full text link
    We motivate and describe a new freely available human-human dialogue dataset for interactive learning of visually grounded word meanings through ostensive definition by a tutor to a learner. The data has been collected using a novel, character-by-character variant of the DiET chat tool (Healey et al., 2003; Mills and Healey, submitted) with a novel task, where a Learner needs to learn invented visual attribute words (such as " burchak " for square) from a tutor. As such, the text-based interactions closely resemble face-to-face conversation and thus contain many of the linguistic phenomena encountered in natural, spontaneous dialogue. These include self-and other-correction, mid-sentence continuations, interruptions, overlaps, fillers, and hedges. We also present a generic n-gram framework for building user (i.e. tutor) simulations from this type of incremental data, which is freely available to researchers. We show that the simulations produce outputs that are similar to the original data (e.g. 78% turn match similarity). Finally, we train and evaluate a Reinforcement Learning dialogue control agent for learning visually grounded word meanings, trained from the BURCHAK corpus. The learned policy shows comparable performance to a rule-based system built previously.Comment: 10 pages, THE 6TH WORKSHOP ON VISION AND LANGUAGE (VL'17

    GEMINI: A Natural Language System for Spoken-Language Understanding

    Full text link
    Gemini is a natural language understanding system developed for spoken language applications. The paper describes the architecture of Gemini, paying particular attention to resolving the tension between robustness and overgeneration. Gemini features a broad-coverage unification-based grammar of English, fully interleaved syntactic and semantic processing in an all-paths, bottom-up parser, and an utterance-level parser to find interpretations of sentences that might not be analyzable as complete sentences. Gemini also includes novel components for recognizing and correcting grammatical disfluencies, and for doing parse preferences. This paper presents a component-by-component view of Gemini, providing detailed relevant measurements of size, efficiency, and performance.Comment: 8 pages, postscrip

    Verbmobil : translation of face-to-face dialogs

    Get PDF
    Verbmobil is a long-term project on the translation of spontaneous language in negotiation dialogs. We describe the goals of the project, the chosen discourse domains and the initial project schedule. We discuss some of the distinguishing features of Verbmobil and introduce the notion of translation on demand and variable depth of processing in speech translation. Finally, the role of anytime modules for efficient dialog translation in close to real time is described

    VICA, a visual counseling agent for emotional distress

    Get PDF
    We present VICA, a Visual Counseling Agent designed to create an engaging multimedia face-to-face interaction. VICA is a human-friendly agent equipped with high-performance voice conversation designed to help psychologically stressed users, to offload their emotional burden. Such users specifically include non-computer-savvy elderly persons or clients. Our agent builds replies exploiting interlocutor\u2019s utterances expressing such as wishes, obstacles, emotions, etc. Statements asking for confirmation, details, emotional summary, or relations among such expressions are added to the utterances. We claim that VICA is suitable for positive counseling scenarios where multimedia specifically high-performance voice communication is instrumental for even the old or digital divided users to continue dialogue towards their self-awareness. To prove this claim, VICA\u2019s effect is evaluated with respect to a previous text-based counseling agent CRECA and ELIZA including its successors. An experiment involving 14 subjects shows VICA effects as follows: (i) the dialogue continuation (CPS: Conversation-turns Per Session) of VICA for the older half (age > 40) substantially improved 53% to CRECA and 71% to ELIZA. (ii) VICA\u2019s capability to foster peace of mind and other positive feelings was assessed with a very high score of 5 or 6 mostly, out of 7 stages of the Likert scale, again by the older. Compared on average, such capability of VICA for the older is 5.14 while CRECA (all subjects are young students, age < 25) is 4.50, ELIZA is 3.50, and the best of ELIZA\u2019s successors for the older (> 25) is 4.41

    Distributional effects and individual differences in L2 morphology learning

    Get PDF
    Second language (L2) learning outcomes may depend on the structure of the input and learners’ cognitive abilities. This study tested whether less predictable input might facilitate learning and generalization of L2 morphology while evaluating contributions of statistical learning ability, nonverbal intelligence, phonological short-term memory, and verbal working memory. Over three sessions, 54 adults were exposed to a Russian case-marking paradigm with a balanced or skewed item distribution in the input. Whereas statistical learning ability and nonverbal intelligence predicted learning of trained items, only nonverbal intelligence also predicted generalization of case-marking inflections to new vocabulary. Neither measure of temporary storage capacity predicted learning. Balanced, less predictable input was associated with higher accuracy in generalization but only in the initial test session. These results suggest that individual differences in pattern extraction play a more sustained role in L2 acquisition than instructional manipulations that vary the predictability of lexical items in the input

    Dialogs Re-enacted Across Languages

    Full text link
    To support machine learning of cross-language prosodic mappings and other ways to improve speech-to-speech translation, we present a protocol for collecting closely matched pairs of utterances across languages, a description of the resulting data collection, and some observations and musings. This report is intended for 1) people using the corpus, 2) people extending the corpus, and 3) people designing similar collections of bilingual dialog data

    Before they can teach they must talk : on some aspects of human-computer interaction

    Get PDF
    corecore