205 research outputs found

    Effect of Tuned Parameters on a LSA MCQ Answering Model

    This paper presents the current state of work in progress whose objective is to better understand the factors that significantly influence the performance of Latent Semantic Analysis (LSA). A difficult task, answering (French) biology Multiple Choice Questions, is used to test the semantic properties of the truncated singular space and to study the relative influence of the main parameters. Dedicated software was designed to fine-tune the LSA semantic space for the Multiple Choice Question task. With optimal parameters, the performance of our simple model is, quite surprisingly, equal or superior to that of 7th and 8th grade students. This indicates that the semantic spaces were quite good despite their low dimensionality and the small size of the training data sets. In addition, we present an original entropy-based global weighting of the terms of each question's answers, which was necessary for the model's success. Comment: 9 pages
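The LSA pipeline the abstract alludes to — a global entropy-style weighting, a truncated SVD, and cosine comparison of a question against each candidate answer — can be sketched as follows. This is a minimal illustration with a toy term-document matrix and a standard log-entropy weighting; the paper's corpus, parameters, and exact weighting of answer terms may differ.

```python
import numpy as np

# Toy term-document count matrix (4 terms x 4 documents). In the paper the
# corpus is a French biology textbook; this matrix is only a stand-in.
X = np.array([
    [2., 0., 1., 0.],
    [0., 3., 0., 1.],
    [1., 0., 2., 0.],
    [0., 1., 0., 2.],
])

# Log-entropy weighting, a common LSA preprocessing step.
p = X / X.sum(axis=1, keepdims=True)
plogp = p * np.log(np.where(p > 0, p, 1.0))
entropy_weight = 1.0 + plogp.sum(axis=1) / np.log(X.shape[1])
W = np.log1p(X) * entropy_weight[:, None]

# Truncated SVD: keep k dimensions of the semantic space.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
k = 2
Uk, sk = U[:, :k], s[:k]

def fold_in(term_vector):
    """Project a bag-of-terms vector into the k-dimensional space."""
    return (term_vector @ Uk) / sk

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Answer an MCQ: pick the option whose folded-in vector is closest to the
# question's vector in the semantic space.
question = np.array([1., 0., 1., 0.])     # terms occurring in the question
options = [np.array([1., 0., 0., 0.]),    # terms of option A
           np.array([0., 1., 0., 1.])]    # terms of option B
q_vec = fold_in(question)
scores = [cosine(q_vec, fold_in(o)) for o in options]
best = int(np.argmax(scores))
```

Here option A shares its terms with the question's documents, so it wins; the same fold-in-and-compare loop scales to a real weighted corpus.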

    Generating Distractors for Reading Comprehension Questions from Real Examinations

    We investigate the task of distractor generation for multiple choice reading comprehension questions from examinations. In contrast to all previous work, we do not aim to produce single-word or short-phrase distractors; instead, we endeavor to generate longer, semantically rich distractors that are closer to those found in real examination reading comprehension. Taking a reading comprehension article, a question, and its correct option as input, our goal is to generate several distractors that are related to the answer, consistent with the semantic context of the question, and traceable to the article. We propose a hierarchical encoder-decoder framework with static and dynamic attention mechanisms to tackle this task. Specifically, the dynamic attention combines sentence-level and word-level attention, varying at each recurrent time step, to generate a more readable sequence. The static attention modulates the dynamic attention so that it does not focus on question-irrelevant sentences or on sentences that support the correct option. Our proposed framework outperforms several strong baselines on the first distractor generation dataset prepared from real reading comprehension questions. In human evaluation, our generated distractors were more effective at confusing the annotators than those generated by the baselines. Comment: AAAI201
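A minimal numeric sketch of the hierarchical attention combination described above: word-level attention inside each sentence, sentence-level attention across sentences, and a static per-sentence distribution modulating the dynamic weights. The vectors and the fixed static weights are toy stand-ins for learned quantities, not the paper's model.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
words = rng.normal(size=(2, 3, 4))   # toy article: 2 sentences x 3 words x 4 dims
dec_state = rng.normal(size=4)       # decoder hidden state at one time step

# Static attention: a fixed per-sentence distribution. In the paper it is
# learned to down-weight question-irrelevant sentences and sentences that
# support the correct option; here it is simply given.
static_sent = np.array([0.8, 0.2])

# Dynamic attention: sentence-level scores at this decoding step ...
sent_repr = words.mean(axis=1)                # (2, 4) sentence representations
beta = softmax(sent_repr @ dec_state)         # sentence-level weights
# ... modulated by the static distribution, then renormalized.
beta = beta * static_sent
beta = beta / beta.sum()

# Word-level attention inside each sentence, combined hierarchically.
alpha = np.stack([softmax(words[i] @ dec_state) for i in range(2)])  # (2, 3)
combined = beta[:, None] * alpha              # joint word weights, sums to 1
context = (combined[..., None] * words).sum(axis=(0, 1))   # (4,) context vector
```

The context vector would then feed the decoder at this time step; recomputing `beta` and `alpha` at each step is what makes the attention "dynamic".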

    Co-Attention Hierarchical Network: Generating Coherent Long Distractors for Reading Comprehension

    In reading comprehension, generating sentence-level distractors is a significant task that requires a deep understanding of the article and the question. Traditional entity-centered methods can only generate word-level or phrase-level distractors. Although recently proposed neural methods such as the sequence-to-sequence (Seq2Seq) model show great potential for generating creative text, previous neural approaches to distractor generation ignore two important aspects. First, they do not model the interactions between the article and the question, so the generated distractors tend to be too general or irrelevant to the question context. Second, they do not emphasize the relationship between the distractor and the article, so the generated distractors are not semantically relevant to the article and thus fail to form a set of meaningful options. To address the first problem, we propose a co-attention enhanced hierarchical architecture that better captures the interactions between the article and the question, thereby guiding the decoder to generate more coherent distractors. To alleviate the second problem, we add a semantic similarity loss that pushes the generated distractors to be more relevant to the article. Experimental results show that our model outperforms several strong baselines on automatic metrics, achieving state-of-the-art performance. Further human evaluation indicates that our generated distractors are more coherent and more educational than those generated by the baselines. Comment: 8 pages, 3 figures. Accepted by AAAI202
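The two ingredients named above — co-attention between question and article, and an auxiliary similarity loss — can be sketched numerically. This is a toy illustration with random vectors and mean-pooled representations; the paper's encoders and loss weighting are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax_rows(M):
    e = np.exp(M - M.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Toy word vectors for a 3-word question and a 6-word article.
Q = rng.normal(size=(3, 8))
P = rng.normal(size=(6, 8))

# Co-attention: one word-pair affinity matrix, normalized in both directions.
affinity = Q @ P.T                   # (3, 6)
q2p = softmax_rows(affinity)         # per question word, over article words
p2q = softmax_rows(affinity.T)       # per article word, over question words
question_aware_article = p2q @ Q     # (6, 8): article enriched with question info

# Semantic similarity loss: 1 - cos(article repr, distractor repr).
# Minimizing it pushes the generated distractor's representation toward the
# article's, which is the role of the auxiliary loss described above.
distractor_words = rng.normal(size=(4, 8))
article_repr = P.mean(axis=0)
distractor_repr = distractor_words.mean(axis=0)
sim_loss = 1.0 - cosine(article_repr, distractor_repr)
```

In training, `sim_loss` would be added to the generation loss with some weight, so the decoder is rewarded for staying semantically close to the article.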

    Encyclopedic Memory: Long-Term Memory Capacity for Knowledge Vocabulary in Middle School

    This article synthesizes unpublished and published experiments showing that elementary memory scores (immediate recall of words and pictures, delayed recall, recognition), which are very sensitive to aging and in pharmacological protocols, have little or no correlation with school achievement. The alternative hypothesis developed here is that school achievement strongly depends on long-term memory for scholastic knowledge (history, literature, science, mathematics, etc.), called encyclopedic memory. A longitudinal study from grade 6 to grade 9 of a cohort of eight classes in a French college was undertaken to observe the role of encyclopedic vocabulary (e.g., Julius Caesar, Manhattan, Shanghai, Uranus, vector) in school performance. An inventory of the school textbooks yields approximately 6,000 encyclopedic words in grade 6, growing to 24,000 in grade 9. Encyclopedic storage capacity was estimated at the end of each year by a multiple-choice questionnaire over random samples of words (800 items; 8 subjects). The results give an estimate of 2,500 words acquired by the end of grade 6, growing to 17,000 by the end of grade 9. The correlations between the encyclopedic memory score and average school grades range from .61 to .72.
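The capacity figures above come from scaling a sampled MCQ score up to the full inventoried vocabulary. The study's exact estimation procedure is not given in the abstract; the sketch below uses a standard correction for guessing (chance level 1/k for k options) with illustrative numbers, not the study's data.

```python
# Turn a sample MCQ score into a vocabulary-size estimate by correcting
# for guessing before scaling up to the inventoried total.
def estimate_known(total_words, sample_correct, sample_size, n_options=4):
    guess = 1.0 / n_options
    p_known = (sample_correct / sample_size - guess) / (1.0 - guess)
    p_known = min(1.0, max(0.0, p_known))   # clamp chance-level scores to 0
    return round(total_words * p_known)

# e.g. grade 9: ~24,000 inventoried words, 580 of 800 sampled items correct,
# 4 options per item (all numbers illustrative):
est = estimate_known(24000, 580, 800, n_options=4)   # -> 15200
```

A student scoring exactly at chance (200/800 with 4 options) is estimated to know 0 words, which is the point of the correction.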

    Learning to Reuse Distractors to Support Multiple Choice Question Generation in Education

    Multiple choice questions (MCQs) are widely used in digital learning systems, as they allow for automating the assessment process. However, due to the increased digital literacy of students and the advent of social media platforms, MCQ tests are widely shared online, and teachers are continuously challenged to create new questions, which is an expensive and time-consuming task. A particularly sensitive aspect of MCQ creation is devising relevant distractors, i.e., wrong answers that are not easily identifiable as wrong. This paper studies how a large existing set of manually created answers and distractors for questions over a variety of domains, subjects, and languages can be leveraged to help teachers create new MCQs through the smart reuse of existing distractors. We built several data-driven models based on context-aware question and distractor representations and compared them with static feature-based models. The proposed models are evaluated with automated metrics and in a realistic user test with teachers. Both automatic and human evaluations indicate that context-aware models consistently outperform the static feature-based approach. For our best-performing context-aware model, on average 3 of the 10 distractors shown to teachers were rated as high quality. We create a performance benchmark and make it public, to enable comparison between approaches and to introduce a more standardized evaluation of the task. The benchmark contains a test set of 298 educational questions covering multiple subjects and languages, and a 77k multilingual pool of distractor vocabulary for future research. Comment: 24 pages and 4 figures. Accepted for publication in IEEE Transactions on Learning Technologies
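Distractor reuse reduces to ranking a pooled vocabulary of existing distractors against a new question's context. A minimal sketch, with a hypothetical toy embedding table standing in for the paper's context-aware encoder:

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Hypothetical embeddings for a tiny distractor pool; in the paper these
# would come from a trained context-aware representation model.
emb = {
    "paris":  np.array([1.0, 0.1, 0.0]),
    "lyon":   np.array([0.9, 0.2, 0.1]),
    "banana": np.array([0.0, 0.1, 1.0]),
    "madrid": np.array([0.8, 0.3, 0.2]),
}

def rank_distractors(question_vec, answer, pool, top_k=2):
    """Score pooled distractors by closeness to the question context,
    excluding the correct answer itself."""
    cands = [w for w in pool if w != answer]
    cands.sort(key=lambda w: cosine(question_vec, emb[w]), reverse=True)
    return cands[:top_k]

q_vec = np.array([1.0, 0.2, 0.0])    # e.g. a "capital city" question context
best = rank_distractors(q_vec, "paris", list(emb), top_k=2)
```

Implausible pool entries ("banana") score low and are filtered out, which is exactly the behavior the teacher-facing evaluation measures.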

    Automatic Distractor Generation for Multiple Choice Questions in Standard Tests

    To assess a learner's knowledge proficiency, the multiple choice question is an efficient and widespread format in standard tests. However, composing multiple choice questions, and especially constructing distractors, is quite challenging: distractors are required to be both incorrect and plausible enough to confuse learners who have not mastered the knowledge. Currently, distractors are written by domain experts, which is both expensive and time-consuming. This motivates automatic distractor generation, which can benefit standard tests across a wide range of domains. In this paper, we propose a question- and answer-guided distractor generation (EDGE) framework to automate distractor generation. EDGE consists of three major modules: (1) the Reforming Question Module and (2) the Reforming Passage Module apply gate layers to guarantee the inherent incorrectness of the generated distractors, and (3) the Distractor Generator Module applies an attention mechanism to control the level of plausibility. Experimental results on a large-scale public dataset demonstrate that our model significantly outperforms existing models and achieves a new state of the art. Comment: accepted by COLING202
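The gate layers in the reforming modules can be sketched as element-wise sigmoid gates conditioned on the correct answer. This is a speculative toy rendering of the mechanism named in the abstract, with random arrays standing in for learned parameters and encoder outputs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(2)
d = 4
passage = rng.normal(size=(5, d))        # toy passage token representations
answer = rng.normal(size=d)              # toy summary vector of the correct answer
Wg = rng.normal(size=(2 * d, d)) * 0.1   # random stand-in for learned gate weights

# Gate layer: for each passage token, compute an element-wise sigmoid gate
# from the token and the answer, then rescale the token. Gating out
# answer-aligned content is one way to push generated distractors away
# from the correct answer, i.e., toward inherent incorrectness.
gate_in = np.concatenate([passage, np.tile(answer, (5, 1))], axis=1)  # (5, 2d)
gates = sigmoid(gate_in @ Wg)            # (5, d), entries in (0, 1)
reformed = passage * gates               # down-weighted passage representation
```

Because every gate lies strictly in (0, 1), the reformed representation can only attenuate, never amplify, the original token features.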

    Comparative Study of Different Techniques for Automatic Evaluation of English Text Essays

    Automated essay evaluation continues to attract considerable interest because of its educational and commercial importance, as well as the related research challenges in natural language processing. Compared with human evaluators, who require more time and whose judgments can vary with mood, automated evaluation lowers human-resource costs and delivers results and feedback immediately. This paper focuses on the automated evaluation of English text essays, comparing various algorithms and techniques applied to datasets of different sizes and essays of different lengths, with the performance of the algorithms assessed using different metrics. The results show that the performance of each technique is affected by the size of the dataset and the length of the essays. Finally, a direction for future research is to build a standard dataset containing different types of question-answer pairs so that the performance of different techniques can be compared fairly.
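One metric commonly used in this literature to compare automated scores with human scores is quadratic weighted kappa, which penalizes disagreements by the square of their distance. The abstract does not name its metrics, so this is a representative example rather than the paper's own choice; the sample ratings are invented.

```python
import numpy as np

def quadratic_weighted_kappa(a, b, min_rating, max_rating):
    """Agreement between two raters on an ordinal scale, weighting each
    disagreement by the squared distance between the two ratings."""
    a, b = np.asarray(a), np.asarray(b)
    n = max_rating - min_rating + 1
    O = np.zeros((n, n))                      # observed rating co-occurrences
    for x, y in zip(a, b):
        O[x - min_rating, y - min_rating] += 1
    w = np.array([[((i - j) ** 2) / ((n - 1) ** 2) for j in range(n)]
                  for i in range(n)])          # quadratic disagreement weights
    E = np.outer(O.sum(axis=1), O.sum(axis=0)) / O.sum()   # chance agreement
    return 1.0 - (w * O).sum() / (w * E).sum()

human = [1, 2, 3, 4, 3]   # invented human scores on a 1-4 scale
model = [1, 2, 3, 4, 2]   # invented automated scores
qwk = quadratic_weighted_kappa(human, model, 1, 4)
```

Perfect agreement gives 1.0, chance-level agreement gives 0, and one adjacent disagreement out of five, as here, still scores high.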

    Biomedical knowledge graph-enhanced prompt generation for large language models

    Large Language Models (LLMs) have been driving progress in AI at an unprecedented rate, yet they still face challenges in knowledge-intensive domains like biomedicine. Solutions such as pre-training and domain-specific fine-tuning add substantial computational overhead, and the latter requires domain expertise. External knowledge infusion is task-specific and requires model training. Here, we introduce a task-agnostic Knowledge Graph-based Retrieval Augmented Generation (KG-RAG) framework that leverages the massive biomedical KG SPOKE together with LLMs such as Llama-2-13b, GPT-3.5-Turbo, and GPT-4 to generate meaningful biomedical text rooted in established knowledge. KG-RAG consistently enhanced the performance of LLMs across various prompt types, including one-hop and two-hop prompts, drug repurposing queries, biomedical true/false questions, and multiple-choice questions (MCQs). Notably, KG-RAG provides a remarkable 71% boost in the performance of the Llama-2 model on the challenging MCQ dataset, demonstrating the framework's capacity to empower open-source models with fewer parameters on domain-specific questions. Furthermore, KG-RAG enhanced the performance of proprietary GPT models: GPT-3.5 with KG-RAG exhibited improvement over GPT-4 in context utilization on MCQ data. Our approach was also able to address drug repurposing questions, returning meaningful repurposing suggestions. In summary, the proposed framework combines the explicit knowledge of the KG and the implicit knowledge of the LLM in an optimized fashion, enhancing the adaptability of general-purpose LLMs to domain-specific questions in a unified framework. Comment: 28 pages, 5 figures, 2 tables, 1 supplementary file
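The KG-RAG idea reduces to retrieving graph facts relevant to the question and prepending them to the LLM prompt. The tiny triple store, the substring-based retrieval, and the prompt format below are illustrative stand-ins for SPOKE and the paper's actual retrieval and prompting pipeline:

```python
# Toy biomedical triples; SPOKE would supply millions of these.
TRIPLES = [
    ("metformin", "treats", "type 2 diabetes"),
    ("metformin", "is_a", "biguanide"),
    ("aspirin", "treats", "pain"),
]

def retrieve(question, triples, max_facts=5):
    """Naive retrieval: keep triples whose subject or object appears in
    the question text."""
    q = question.lower()
    hits = [t for t in triples if t[0] in q or t[2] in q]
    return hits[:max_facts]

def build_prompt(question, triples):
    """Assemble a retrieval-augmented prompt: facts first, then the question."""
    facts = retrieve(question, triples)
    context = "\n".join(f"- {s} {p.replace('_', ' ')} {o}"
                        for s, p, o in facts)
    return (f"Context from the knowledge graph:\n{context}\n\n"
            f"Question: {question}\nAnswer:")

prompt = build_prompt("Which class of drug is metformin?", TRIPLES)
```

The assembled prompt contains only metformin facts, so the model answers from grounded context rather than parametric memory, which is the framework's task-agnostic core.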

    Experiments in neural question answering

    In this thesis, we apply deep learning methods to three tasks: finding duplicate questions, learning to rank the answers of Multiple Choice Questions (MCQs), and classifying the answers to a question given the context of a paragraph. We focus on problems related to sentence-sentence similarity and use a Siamese architecture for better representation of questions and answers. The basic approach of all the methods proposed in this thesis is to build word embeddings of the question and answers and feed them to a deep neural architecture, such as a Long Short-Term Memory (LSTM) network or a Convolutional Neural Network (CNN). We also implement an attention mechanism to put more focus on the sentence-word relationship. Our goal is to extract a refined representation of the question and answers through different combinations of these deep learning techniques, generating a representation of a sentence conditioned on the context of another sentence. We provide simple but efficient deep learning models for our tasks. As neural models are data-driven, we train our models extensively on pairs, such as question-question and question-answer, drawn from large-scale real-life datasets. We use three datasets for our three tasks: the Quora dataset of question pairs for finding duplicate questions, the OpenTriviaQA question answering dataset for ranking multiple answers, and the SQuAD dataset for answer classification in reading comprehension. We evaluate our models with metrics such as accuracy, precision, recall, and F1 score. Our methods and experiments demonstrate significant improvements over state-of-the-art methods.
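The Siamese setup for duplicate-question detection can be sketched in a few lines: the same encoder maps both questions to vectors, and a similarity on those vectors decides duplication. Here the "encoder" is a mean of made-up toy word embeddings; in the thesis it is an LSTM/CNN with attention.

```python
import numpy as np

# Toy 2-d word embeddings (invented values for illustration only).
EMB = {
    "how": [1.0, 0.0], "do": [0.8, 0.1], "i": [0.2, 0.2],
    "learn": [0.0, 1.0], "python": [0.1, 0.9], "study": [0.05, 0.95],
}

def encode(sentence):
    """Shared encoder: mean of known word vectors (the Siamese 'twin')."""
    vecs = [EMB[w] for w in sentence.lower().split() if w in EMB]
    return np.mean(vecs, axis=0)

def similarity(q1, q2):
    """Cosine similarity between the two encoded questions."""
    a, b = encode(q1), encode(q2)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

dup = similarity("how do i learn python", "how do i study python")
diff = similarity("how do i learn python", "how do i")
```

Because both branches share weights, near-paraphrases land close together; a threshold on (or a classifier over) the similarity then yields the duplicate/non-duplicate decision.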