3 research outputs found

    LatteGAN: Visually Guided Language Attention for Multi-Turn Text-Conditioned Image Manipulation

    Text-guided image manipulation tasks have recently gained attention in the vision-and-language community. While most prior studies focused on single-turn manipulation, our goal in this paper is to address the more challenging multi-turn image manipulation (MTIM) task. Previous models for this task successfully generate images iteratively, given a sequence of instructions and a previously generated image. However, this approach suffers from under-generation and from the low quality of the generated objects described in the instructions, which degrades the overall performance. To overcome these problems, we present a novel architecture called Visually Guided Language Attention GAN (LatteGAN). Here, we address the limitations of previous approaches by introducing a Visually Guided Language Attention (Latte) module, which extracts fine-grained text representations for the generator, and a Text-Conditioned U-Net discriminator architecture, which discriminates both the global and local representations of fake or real images. Extensive experiments on two distinct MTIM datasets, CoDraw and i-CLEVR, demonstrate the state-of-the-art performance of the proposed model.
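    The core idea the abstract attributes to the Latte module, extracting a fine-grained, per-location text representation by letting visual features attend over the instruction words, can be sketched as a standard scaled dot-product cross-attention. This is a minimal illustration under that reading, not the paper's implementation; the function name and shapes are assumptions.

    ```python
    import numpy as np

    def visually_guided_language_attention(visual_feats, word_embs):
        """Hypothetical sketch: each visual location queries the instruction
        words via scaled dot-product attention, yielding a per-location
        (fine-grained) text representation.
        visual_feats: (N, d) flattened image-feature grid
        word_embs:    (T, d) word embeddings of the instruction
        returns:      (N, d) text representation per visual location
        """
        d = visual_feats.shape[1]
        scores = visual_feats @ word_embs.T / np.sqrt(d)   # (N, T) similarities
        scores -= scores.max(axis=1, keepdims=True)        # numerical stability
        attn = np.exp(scores)
        attn /= attn.sum(axis=1, keepdims=True)            # softmax over words
        return attn @ word_embs                            # weighted word mix
    ```

    With a single-word instruction the attention weight is 1, so every visual location simply receives that word's embedding; with longer instructions, each location mixes the words most similar to its visual content.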

    Mask and Cloze: Automatic Open Cloze Question Generation Using a Masked Language Model

    This paper conducts the first trial to apply a masked language model and the “Gini coefficient” to the field of English education. We propose an algorithm named CLOZER that generates open cloze questions (OCQs) that assess the knowledge of English learners. OCQs have been attracting attention both for measuring the ability and for facilitating the learning of English learners. However, since an OCQ is free-form, teachers have to ensure that only the ground-truth answer, and no additional word, will be accepted in the blank. A remarkable benefit of CLOZER is that it relieves teachers of the burden of producing OCQs. Moreover, CLOZER provides a self-study environment for English learners by automatically generating OCQs. We evaluated CLOZER through quantitative experiments on 1,600 answers and show its effectiveness statistically. Comparing with human-generated questions, we also found that CLOZER generates OCQs better than the average non-native English teacher. Additionally, we conducted a field study at a high school to clarify the benefits and hurdles of introducing CLOZER. Then, on the basis of our findings, we propose several design improvements.
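    One plausible reading of how a masked language model and the Gini coefficient could combine for this task: mask each token, ask the model for a probability distribution over fillers, and keep a blank only when the original token is the top prediction and the distribution is highly concentrated (high Gini), so that no alternative word is likely to be accepted. The sketch below illustrates that idea only; the `predict` callback, threshold, and selection rule are assumptions, not CLOZER's actual algorithm.

    ```python
    def gini(probs):
        """Gini coefficient of a probability distribution:
        0 for a uniform distribution, approaching 1 as mass concentrates."""
        xs = sorted(probs)
        n = len(xs)
        cum = sum(i * x for i, x in enumerate(xs, 1))
        return (2 * cum) / (n * sum(xs)) - (n + 1) / n

    def select_blanks(tokens, predict, gini_threshold=0.5):
        """Keep position i as an open-cloze blank only if the (hypothetical)
        masked-LM callback `predict(tokens, i)` -> {candidate: probability}
        ranks the original token first AND the distribution is concentrated,
        i.e. no competing filler is plausible."""
        blanks = []
        for i, tok in enumerate(tokens):
            dist = predict(tokens, i)
            top = max(dist, key=dist.get)
            if top == tok and gini(list(dist.values())) >= gini_threshold:
                blanks.append(i)
        return blanks
    ```

    A position where the model hesitates between "the", "a", and "my" yields a flat distribution (low Gini) and is rejected, matching the requirement that only the ground-truth answer fits the blank.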