20 research outputs found

    Automatic Question Generation and Student Answer Assessment in Dialogue-based Intelligent Tutoring Systems

    Dialogue-based Intelligent Tutoring Systems (ITSs) have already proven to be very effective at inducing learning gains in students. These systems are guided by dialog scripts, the heart of many dialog systems, for the interactions with students. The scripts typically consist of a list of questions and corresponding ideal answers. In most ITSs, such scripts are manually crafted from instructional task descriptions. Such manual efforts not only cost more in terms of time and effort but also set a bottleneck in the scalability of the systems. Another major challenge they face is to automatically assess student answers with respect to the ideal answers. To address these challenges, this research proposes novel approaches to automatically generate questions. Furthermore, it focuses on finding appropriate approaches to assess and understand student responses in the form of natural text inputs. The question generation process generates cloze and open-cloze questions. Cloze questions are automatically generated by mining recorded tutorial dialogues between actual students and a state-of-the-art ITS. It complements the existing systems that rely only on the contents of instructional texts. Open-cloze questions are generated by minimizing human efforts. Specifically, active learning is used to train classifiers for judging the quality of automatically generated open-cloze questions, the most expensive step in generating open-cloze questions. Experiments show that a reasonably good classifier can be built with 300-500 examples labeled by using active learning which can provide about 5-10% more in accuracy and about 3-5% more in F1-measure than random sampling. Towards assessing and understanding student responses, this research addresses pronoun resolution and semantic textual similarity (STS) problems in the context of tutorial dialog. 
For pronoun resolution, a supervised machine learning approach is proposed that achieves an F-measure of 88.93%, showing its robustness in resolving pronouns. For assessing student responses, STS methods are used because they provide numeric scores indicating the degree of equivalence in meaning between student answers and the corresponding ideal answers. Since student responses in tutorial dialog are typically short, this research seeks the best methods for short text-to-text STS. To this end, Latent Dirichlet Allocation-based and regression-based methods are proposed. The methods are found to be very promising for computing short text-to-text semantic similarities. Although approaches to the STS problem provide numeric scores, they fail to explain the reasons behind them. To address this, an interpretable STS system has been proposed, which has been ranked in the top tier of systems of this kind in the literature.

    Towards detecting intra- and inter-sentential negation scope and focus in dialogue

    We present in this paper a study on negation in dialogues. In particular, we analyze the peculiarities of negation in dialogues and propose a new method to detect intra-sentential and inter-sentential negation scope and focus in dialogue context. A key element of the solution is to use dialogue context in the form of previous utterances, which is needed for proper interpretation of negation more often in dialogue than in literary, non-dialogue texts. We have modeled the negation scope and focus detection tasks as sequence labeling tasks and used Conditional Random Field models to label each token in an utterance as being within the scope/focus of negation or not. The proposed negation scope and focus detection method is evaluated on a newly created corpus (called the DeepTutor Negation corpus; DT-Neg). This dataset was created from actual tutorial dialogue interactions between high school students and a state-of-the-art intelligent tutoring system.
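
    The sequence-labeling framing described above can be sketched as follows. This is only an illustration: the feature names, the cue list `NEGATION_CUES`, and the toy scope annotation are assumptions, not the features or labels used for DT-Neg. The sketch shows how each token could be given features that include the previous utterance, and how a scope annotation becomes one label per token; a CRF would then be trained on such feature/label sequences.

```python
from typing import Dict, List, Set

# Hypothetical cue list, for illustration only.
NEGATION_CUES: Set[str] = {"no", "not", "never", "n't", "none", "nothing"}


def token_features(tokens: List[str], i: int, prev_utterance: List[str]) -> Dict[str, object]:
    """Build a feature dict for token i, including dialogue-context information."""
    word = tokens[i].lower()
    prev_vocab = {t.lower() for t in prev_utterance}
    return {
        "word": word,
        "is_negation_cue": word in NEGATION_CUES,
        "prev_word": tokens[i - 1].lower() if i > 0 else "<s>",
        "next_word": tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>",
        # Context feature: does this token also occur in the previous utterance?
        "in_prev_utterance": word in prev_vocab,
    }


def scope_labels(tokens: List[str], scope_indices: Set[int]) -> List[str]:
    """Turn a set of in-scope token positions into one IN/OUT label per token."""
    return ["IN" if i in scope_indices else "OUT" for i in range(len(tokens))]


# Toy tutor/student exchange with a hand-picked scope annotation.
tutor = ["Is", "the", "velocity", "constant", "?"]
student = ["The", "velocity", "is", "not", "constant"]

features = [token_features(student, i, tutor) for i in range(len(student))]
labels = scope_labels(student, {0, 1, 2, 4})
```

    Whether the negation cue itself counts as in scope depends on the annotation guidelines; the toy annotation here leaves it out.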

    Judging the quality of automatically generated gap-fill question using active learning

    In this paper, we propose to use active learning for training classifiers to judge the quality of gap-fill questions. Gap-fill questions are widely used for assessments in educational contexts because they can be graded automatically while offering reliable assessment of learners' knowledge level if appropriately calibrated. Active learning is a machine learning framework which is typically used when unlabeled data is abundant but manual annotation is slow and expensive. This is the case in many Natural Language Processing tasks, including automated question generation, which is our focus. A key task in automated question generation is judging the quality of the generated questions. Classifiers built for this task are typically trained on human-labeled data. Our evaluation results suggest that the use of active learning leads to accurate classifiers for judging the quality of gap-fill questions while keeping the annotation costs in check. We are not aware of any previous effort that uses active learning for question evaluation.
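
    One common way to realize the active learning loop described above is pool-based uncertainty sampling, sketched below. The abstract does not say which query strategy was used, so the strategy, the function names, and the example scores are all assumptions made for illustration. The idea is that, in each round, the annotator labels only the questions the current classifier is least sure about.

```python
from typing import List, Sequence


def uncertainty(p_good: float) -> float:
    """Return 1.0 when the classifier is maximally unsure (p = 0.5), 0.0 when certain."""
    return 1.0 - abs(p_good - 0.5) * 2.0


def select_queries(p_good: Sequence[float], batch_size: int) -> List[int]:
    """Pick the indices of the unlabeled questions with the highest uncertainty."""
    ranked = sorted(range(len(p_good)), key=lambda i: uncertainty(p_good[i]), reverse=True)
    return ranked[:batch_size]


# Hypothetical classifier outputs: estimated probability that each
# unlabeled gap-fill question is of good quality.
pool_scores = [0.95, 0.52, 0.10, 0.45, 0.80]
to_annotate = select_queries(pool_scores, batch_size=2)
```

    The selected questions would be labeled by a human, added to the training set, and the classifier retrained before the next round. Random sampling, the baseline mentioned in the first abstract, would instead draw indices uniformly from the pool.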

    DARE: Deep Anaphora Resolution in dialogue based intelligent tutoring systems

    Anaphora resolution is a central topic in dialogue and discourse processing that deals with finding the referents of pronouns. There are no studies, to the best of our knowledge, that focus on anaphora resolution in the context of tutorial dialogues. In this paper, we present the first version of DARE (Deep Anaphora Resolution Engine), an anaphora resolution engine for dialogue-based Intelligent Tutoring Systems. The development of DARE was guided by dialogues obtained from two dialogue-based computer tutors: DeepTutor and AutoTutor

    A study of probabilistic and algebraic methods for semantic similarity

    We study and propose in this article several novel solutions to the task of semantic similarity between two short texts. The proposed solutions are based on the probabilistic method of Latent Dirichlet Allocation (LDA) and on the algebraic method of Latent Semantic Analysis (LSA). Both methods, LDA and LSA, are completely automated methods used to discover latent topics or concepts from large collections of documents. We propose a novel word-to-word similarity measure based on LDA as well as several text-to-text similarity measures. We compare these measures with similar, known measures based on LSA. Experiments and results are presented on two data sets: the Microsoft Research Paraphrase corpus and the User Language Paraphrase corpus. We found that the novel word-to-word similarity measure based on LDA is extremely promising. Copyright © 2013, Association for the Advancement of Artificial Intelligence. All rights reserved.
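
    To make the idea of a topic-based word-to-word measure concrete, the sketch below compares two words by the cosine of their per-topic weight vectors. This is one plausible formulation chosen for illustration, not necessarily the measure proposed in the article, and the three-topic vectors in `word_topics` are made-up toy values rather than output from a trained LDA model.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy per-topic weights for each word (assumed values, three topics).
word_topics: Dict[str, List[float]] = {
    "force": [0.70, 0.20, 0.10],
    "push": [0.60, 0.30, 0.10],
    "bank": [0.05, 0.05, 0.90],
}


def word_similarity(w1: str, w2: str) -> float:
    """Word-to-word similarity as the cosine of the words' topic vectors."""
    return cosine(word_topics[w1], word_topics[w2])


related = word_similarity("force", "push")
unrelated = word_similarity("force", "bank")
```

    With these toy vectors, `related` comes out much higher than `unrelated`, because "force" and "push" put most of their weight on the same topic.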

    Using an implicit method for coreference resolution and ellipsis handling in automatic student answer assessment

    The automatic student answer assessment problem is challenging because it requires natural language understanding. This problem is even more challenging in conversational Intelligent Tutoring Systems (ITS) because in such conversations the speakers develop common ground as the dialogue proceeds, which means contextual information from previous utterances in the dialogue is heavily relied upon to understand a speaker's utterances. Different linguistic phenomena should be addressed in order to improve the performance of automatic answer assessment systems in conversational ITS. Two such important phenomena are: references to entities mentioned earlier in the dialogue and ellipsis (i.e., answers with contextually implied parts). In this paper, we present an implicit approach to resolving coreferences and handling elliptical responses in the context of automatic student answer evaluation in dialogue based intelligent tutoring systems.

    Deeptutor: An effective, online intelligent tutoring system that promotes Deep learning

    We present in this paper an innovative solution to the challenge of building effective educational technologies that offer tailored instruction to each individual learner. The proposed solution, a conversational intelligent tutoring system called DeepTutor, has been developed as a web application that is accessible 24/7 through a browser from any device connected to the Internet. The success of several large-scale experiments with high-school students using DeepTutor is solid proof that conversational intelligent tutoring at scale over the web is possible.

    Combining knowledge and corpus-based measures for word-to-word similarity

    This paper shows that the combination of knowledge and corpus-based word-to-word similarity measures can produce higher agreement with human judgment than any of the individual measures. While this might be a predictable result, the paper provides insights about the circumstances under which a combination is productive and about the improvement levels that are to be expected. The experiments presented here were conducted using the word-to-word similarity measures included in SEMILAR, a freely available semantic similarity toolkit
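
    A minimal sketch of how such a combination could be formed and scored against human judgment is given below. The abstract does not state how the measures were combined, so the weighted average in `combine` is an assumption, and the code does not use the SEMILAR API. Agreement with human ratings is measured here with Pearson correlation, implemented by hand to keep the example self-contained.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def combine(knowledge: Sequence[float], corpus: Sequence[float], weight: float = 0.5) -> List[float]:
    """Weighted average of a knowledge-based and a corpus-based score for each word pair."""
    return [weight * k + (1.0 - weight) * c for k, c in zip(knowledge, corpus)]
```

    Given per-pair scores from one knowledge-based and one corpus-based measure plus human ratings for the same word pairs, `pearson(combine(k, c), human)` can be compared with `pearson(k, human)` and `pearson(c, human)` to see whether the combination helps on a given data set.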