39 research outputs found

    Large Language Model Augmented Exercise Retrieval for Personalized Language Learning

    We study the problem of zero-shot exercise retrieval in the context of online language learning, to give learners the ability to explicitly request personalized exercises via natural language. Using real-world data collected from language learners, we observe that vector similarity approaches poorly capture the relationship between exercise content and the language that learners use to express what they want to learn. This semantic gap between queries and content dramatically reduces the effectiveness of general-purpose retrieval models pretrained on large-scale information retrieval datasets like MS MARCO. We leverage the generative capabilities of large language models to bridge the gap by synthesizing hypothetical exercises based on the learner's input, which are then used to search for relevant exercises. Our approach, which we call mHyER, overcomes three challenges: (1) lack of relevance labels for training, (2) unrestricted learner input content, and (3) low semantic similarity between input and retrieval candidates. mHyER outperforms several strong baselines on two novel benchmarks created from crowdsourced data and publicly available data. Comment: Presented at Learning Analytics and Knowledge 2024. 11 pages, 4 figures, 5 tables.

    Verb phrase ellipsis: The view from information structure.

    Findings from three experimental studies are presented in support of the hypothesis that the reduced acceptability associated with antecedent mismatch under ellipsis reflects violation of an information structural constraint governing contrastive topic structures, and not an ellipsis-specific licensing constraint as previously assumed. Magnitude estimation data show that the penalty associated with a mismatched antecedent is larger for contrastive topic ellipses as compared to ellipses which exhibit simple (non-contrastive topic) focus. The same pattern of acceptability is also observed for non-ellipsis controls, however. Online reading times indicate increased processing costs associated with antecedent mismatch, and the cost is greater in contrastive topic as compared to simple focus ellipses. Elevated reading times for mismatched contrastive topics are observed throughout the target clause, however, including regions prior to the ellipsis site.

    Do successor effects in reading reflect lexical parafoveal processing? Evidence from corpus-based and experimental eye movement data

    In the past, most research on eye movements during reading involved a limited number of subjects reading sentences with specific experimental manipulations on target words. Such experiments usually only analyzed eye-movement measures on and around the target word. Recently, some researchers have started collecting larger data sets involving large and diverse groups of subjects reading large numbers of sentences, enabling them to consider a larger number of influences and study larger and more representative subject groups. In such corpus studies, most of the words in a sentence are analyzed. The complexity of the design of corpus studies and the many potentially uncontrolled influences in such studies pose new issues concerning the analysis methods and interpretability of the data. In particular, several corpus studies of reading have found an effect of successor word (n + 1) frequency on current word (n) fixation times, while studies employing experimental manipulations tend not to. The general interpretation of corpus studies suggests that readers obtain parafoveal lexical information from the upcoming word before they have finished identifying the current word, while the experimental manipulations shed doubt on this claim. In the present study, we combined a corpus analysis approach with an experimental manipulation (i.e., a parafoveal modification of the moving mask technique, Rayner & Bertera, 1979), so that either (a) word n+1, (b) word n+2, (c) both words, or (d) neither word was masked. We found that denying preview for either or both parafoveal words increased average fixation times. Furthermore, we found successor effects similar to those reported in the corpus studies. Importantly, these successor effects were found even when the parafoveal word was masked, suggesting that apparent successor frequency effects may be due to causes that are unrelated to lexical parafoveal preprocessing. We discuss the implications of this finding both for parallel and serial accounts of word identification and for the interpretability of large correlational studies of word identification in reading in general.

    Comprehension questions change beliefs under uncertainty


    Bicknell, Levy, & Rayner

    This contains data, analysis, and materials for the Bicknell, Levy, and Rayner paper in Psychological Science.

    Online expectations for verbal arguments conditional on event knowledge

    This paper provides support for the hypothesis that comprehenders form online expectations for upcoming verbal arguments using their knowledge of typical events. We test this hypothesis in a self-paced reading experiment and an experiment measuring event-related brain potentials. In both experiments, we use materials in which the likelihood of the verbal patient depends on event knowledge about the particular combination of agent and verb earlier in the sentence. By manipulating the agent for a given verb, we show that comprehenders experience more processing difficulty in sentences where the patient is less likely. Norming studies and a priming experiment provide evidence that this result is unlikely to have arisen from direct linguistic associations between patient and agent, suggesting that comprehenders use their event knowledge to form expectations.

    Eye movements in reading as rational behavior

    Moving one's eyes while reading is one of the most complex everyday tasks humans face. To perform efficiently, readers must make decisions about when and where to move their eyes every 200-300 ms. Over the past decades, it has been demonstrated that these fine-grained decisions are influenced by a range of linguistic properties of the text, and measuring eye movements during reading has become one of the primary methods of studying online sentence comprehension. However, it is still largely unclear why linguistic variables affect the eye movement record in the ways they do. The present work begins to answer this question by presenting a rational framework for understanding eye movement control in reading, in which probabilistic language knowledge plays a crucial role. Specifically, the task of reading is taken to be one of sentence identification: readers move their eyes to efficiently obtain visual input, which they combine with probabilistic language knowledge through Bayesian inference to yield posterior beliefs about sentence form and structure. Simulations with implemented models within this framework demonstrate that it can provide a principled account of many aspects of reading behavior, including the influence of a number of linguistic variables. In addition, the framework suggests a novel explanation for one of the least understood aspects of eye movements in reading, regressive eye movements, and we present evidence from an eye tracking corpus to support this proposal.