LatteGAN: Visually Guided Language Attention for Multi-Turn Text-Conditioned Image Manipulation
Text-guided image manipulation tasks have recently gained attention in the
vision-and-language community. While most of the prior studies focused on
single-turn manipulation, our goal in this paper is to address the more
challenging multi-turn image manipulation (MTIM) task. Previous models for this
task successfully generate images iteratively, given a sequence of instructions
and a previously generated image. However, this approach suffers from
under-generation and from the poor quality of the objects that are
described in the instructions, which consequently degrades the overall
performance. To overcome these problems, we present a novel architecture called
a Visually Guided Language Attention GAN (LatteGAN). Here, we address the
limitations of the previous approaches by introducing a Visually Guided
Language Attention (Latte) module, which extracts fine-grained text
representations for the generator, and a Text-Conditioned U-Net discriminator
architecture, which discriminates both the global and local representations of
fake or real images. Extensive experiments on two distinct MTIM datasets,
CoDraw and i-CLEVR, demonstrate the state-of-the-art performance of the
proposed model.
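The abstract does not spell out how the Latte module works internally. As a rough illustration only, visually guided language attention can be sketched as scaled dot-product attention in which image features act as queries over the instruction's word embeddings; every shape and name below is an assumption for the sketch, not the paper's actual implementation:

```python
import numpy as np

def visually_guided_language_attention(img_feat, word_embs):
    """Sketch: image features (queries) attend over word embeddings
    (keys/values) to build a fine-grained text representation
    for each spatial location of the image feature map.

    img_feat:  (H*W, d) flattened image feature map
    word_embs: (T, d) instruction word embeddings
    returns:   (H*W, d) attended text features
    """
    d = img_feat.shape[-1]
    scores = img_feat @ word_embs.T / np.sqrt(d)    # (H*W, T) similarity
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over words
    return weights @ word_embs                      # (H*W, d)

rng = np.random.default_rng(0)
attended = visually_guided_language_attention(
    rng.standard_normal((16, 8)), rng.standard_normal((5, 8)))
print(attended.shape)  # (16, 8)
```

The intuition is that each image region selects the instruction words most relevant to it, so the generator receives text features aligned with the objects it must draw.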
Mask and Cloze: Automatic Open Cloze Question Generation Using a Masked Language Model
This paper conducts the first trial of applying a masked language model (MLM) and the Gini coefficient to the field of English education. We propose an algorithm named CLOZER that generates open cloze questions assessing the knowledge of English learners. Open cloze questions (OCQs) have been attracting attention both for measuring the ability and for facilitating the learning of English learners. However, since an OCQ is answered in free form, teachers have to ensure that only the ground-truth answer, and no other word, will be accepted in the blank. A remarkable benefit of CLOZER is that it relieves teachers of the burden of producing OCQs. Moreover, CLOZER provides a self-study environment for English learners by automatically generating OCQs. We evaluated CLOZER through quantitative experiments on 1,600 answers and show its effectiveness statistically. Compared with human-generated questions, we also reveal that CLOZER generates OCQs better than the average non-native English teacher. Additionally, we conducted a field study at a high school to clarify the benefits and hurdles of introducing CLOZER. On the basis of our findings, we then propose several design improvements.
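CLOZER's exact algorithm is not given in this abstract. A hypothetical sketch of the core validity check it describes, ensuring that only the ground-truth answer is acceptable in a blank, might compare MLM fill probabilities for candidate words; the scoring dictionary, function name, and margin threshold below are all illustrative assumptions:

```python
def is_valid_open_cloze(blank_scores, answer, margin=0.5):
    """Hypothetical sketch: accept a blank only if the ground-truth
    answer clearly dominates every alternative word an MLM would
    also rate as plausible for the blanked position.

    blank_scores: dict mapping candidate words to MLM fill
                  probabilities for the blank
    answer:       the intended ground-truth word
    margin:       minimum probability gap over the best distractor
    """
    if answer not in blank_scores:
        return False
    best_other = max(
        (p for word, p in blank_scores.items() if word != answer),
        default=0.0,
    )
    return blank_scores[answer] - best_other >= margin

# "She ____ to school every day." with intended answer "walks"
scores = {"walks": 0.81, "goes": 0.12, "ran": 0.04}
print(is_valid_open_cloze(scores, "walks"))  # True
print(is_valid_open_cloze(scores, "goes"))   # False
```

A blank where several words score similarly would be rejected, which mirrors the teachers' burden the paper aims to automate: ruling out blanks that admit unintended answers.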