    Learning Vision-and-Language Navigation from YouTube Videos

    Vision-and-language navigation (VLN) requires an embodied agent to navigate in realistic 3D environments using natural language instructions. Existing VLN methods suffer from training on small-scale environments or unreasonable path-instruction datasets, which limits their generalization to unseen environments. There are massive numbers of house tour videos on YouTube, providing abundant real navigation experiences and layout information. However, these videos have not been explored for VLN before. In this paper, we propose to learn an agent from these videos by creating a large-scale dataset that comprises reasonable path-instruction pairs from house tour videos and pre-training the agent on it. To achieve this, we have to tackle the challenges of automatically constructing path-instruction pairs and exploiting real layout knowledge from raw, unlabeled videos. To address these, we first leverage an entropy-based method to construct the nodes of a path trajectory. Then, we propose an action-aware generator for generating instructions from unlabeled trajectories. Last, we devise a trajectory judgment pretext task to encourage the agent to mine the layout knowledge. Experimental results show that our method achieves state-of-the-art performance on two popular benchmarks (R2R and REVERIE). Code is available at https://github.com/JeremyLinky/YouTube-VLN
    Comment: Accepted by ICCV 202

    EXAMINING THE CONTENT ALIGNMENT BETWEEN LANGUAGE CURRICULUM AND A LANGUAGE TEST IN CHINA

    This study examined the content alignment between an English as a foreign language skills curriculum and a provincial language test in China. When the content of a curriculum's standards and a test are misaligned, conclusions about student abilities and teaching effectiveness can be questioned. To examine this, three categories of alignment were investigated using document analysis and expert judgment: categorical concurrence, range of knowledge correspondence, and balance of representation. Eight reviewers coded the curriculum and test items. Results showed that the test and curriculum aligned across the three criteria for the listening and reading skills. For the writing skills, the range of knowledge correspondence and balance of representation criteria were met, but categorical concurrence was not. The test did not include speaking items, so there was complete misalignment with that part of the curriculum. The findings showed that the test partially aligned with the curriculum, suggesting that performance may not fully represent students’ ability to meet the curricular standards. We recommend that future tests comprehensively cover all of the content in the curriculum and include a sufficient number of items measuring each objective. This would improve the accuracy of interpretations of student performance.

    The Narratological Style and the Reader of Evelyn Waugh’s Early Satires

    The article deals with the role of the implied reader in Evelyn Waugh’s novels. An attempt will be made to define who the implied reader is and what his position and role are in the fictional realm. By analysing narrative strategies and techniques within a selection of Waugh’s works, I discuss the changing relationship between the author/narrator and the implied reader. I try to point out that within Evelyn Waugh’s writing one can observe a simultaneous evolution of the narrator’s voice and the role ascribed to the implied reader. What is more, I discuss the limits of interpretation. I present and analyse how both the author/narrator and the text can impose limitations on the implied reader, allowing him to move freely, but only within a set frame. Intertextuality is one of the focal points of the article, as I propose that the use of specific intertextual references in several different novels enhances the reader’s understanding of Waugh’s fictional world. An attempt is made to prove that, through analysis of different levels of understanding intertextual relations, the reader takes on the role of creator. Furthermore, I draw attention to the places of indeterminacy. In this discussion I include both structures of indeterminacy proposed by Roman Ingarden, i.e. blanks and negations, as both are needed not only to establish the interaction that takes place between the text and the implied reader, but also to regulate that relation. An attempt will be made to explain how important filling the gaps within the text is and how completing the blanks affects the reader and the process of reading as such.

    Statistical and explicit learning of graphotactic patterns with no phonological counterpart: Evidence from artificial lexicon studies with 6- to 7-year-olds and adults

    Children are powerful statistical spellers: They can learn novel written patterns with phonological counterparts under experimental conditions, via implicit learning processes, akin to “statistical learning” processes established for spoken language acquisition. Can these mechanisms fully account for children’s knowledge of written patterns? How does this ability relate to literacy measures? How does it compare to explicit learning? This thesis addresses these questions in a series of artificial lexicon experiments, inducing graphotactic learning under incidental and explicit conditions, and comparing it with measures of literacy. The first experiment adapted an existing design (Samara & Caravolas, 2014), with the goal of searching for stronger effects. Subsequent experiments address a further limitation: Previous studies assessed learning of spelling rules which have counterparts in spoken language; however, while this is also the case for some naturalistic spelling rules (e.g., English phonotactics prohibit word-initial /ŋ/ and accordingly, written words cannot begin with ng), there are also purely visual constraints (graphotactics) (e.g., gz is an illegal spelling of a frequent word-final sound combination in English: *bagz). Can children learn patterns unconfounded from correlated phonotactics? In further experiments, developing and skilled spellers were exposed to patterns devoid of phonotactic cues. In post-tests, participants generalized over both positional constraints embedded in semiartificial strings, and contextual constraints created using homophonic non-word stimuli. This was demonstrated following passive exposure and even under meaningful (word learning) conditions, and success in learning graphotactics was not hindered by learning word meanings. However, the effect sizes across this thesis remained small, and the hypothesized positive associations between learning performance under incidental conditions and literacy measures were never observed. This relationship was only found under explicit conditions, when pattern generalization benefited. Investigation of age effects revealed that adults and children show similar patterns of learning but adults learn faster from matched text.

    The role of working memory in attentional allocation and grammatical development under textually-enhanced, unenhanced and no captioning conditions

    This study investigated the extent to which individual differences in working memory (WM) mediate the effects of captions with or without textual enhancement on attentional allocation and L2 grammatical development, and whether L2 development is influenced by WM in the absence of captions. We employed a pretest-posttest-delayed posttest design, with 72 Korean learners of English randomly assigned to three groups. The groups differed as to whether they were exposed to news clips without captions, with textually-enhanced captions, or with unenhanced captions during the treatment. We measured attentional allocation with eye-tracking methodology, and assessed development with an oral production test, a written production test, and a fill-in-the-blank test. To assess various aspects of WM, we employed measures of phonological and visual short-term memory (PSTM, VSTM) and the executive functions of updating, task-switching, and inhibitory control. We found that, in both captions groups, higher PSTM was associated with higher oral production gains. For the enhanced captions group, PSTM was also positively related to gains on the written production test. Participants in the no-captions group, however, showed a positive link between VSTM and oral production gains. Attentional allocation only correlated positively with updating ability and PSTM under the enhanced captions condition. These results, overall, indicate that WM can moderate the effects of captions on attention and L2 development, and various WM components may play a differential role under various captioning conditions.