13 research outputs found

    Measuring text readability with machine comprehension: a pilot study

    This article studies the relationship between text readability indices and automatic machine understanding systems. Our hypothesis is that the simpler a text is, the better it should be understood by a machine. We thus expect a strong correlation between readability levels on the one hand, and the performance of automatic reading systems on the other. We test this hypothesis with several understanding systems based on language models of varying strengths, measuring this correlation on two corpora of journalistic texts. Our results suggest that this correlation is rather small and that existing comprehension systems are far from reproducing a gradual improvement in performance on texts of decreasing complexity.

    DEVELOPING ISLAMIC-BASED MANDARIN CONSTRUCTIVIST TEACHING MATERIALS

    This research is motivated by the absence of Mandarin language teaching materials that integrate Islamic values, and by the use of Mandarin teaching materials, especially textbooks from the country of origin (People's Republic of China), that remain steeped in eastern culture and are less relevant to Islamic values. The research stages are identification analysis, formulation of teaching materials, readability test, trial, validation, and revision, as well as making a final prototype. This research concludes that (1) Islamic-based Mandarin constructivist teaching materials are needed, especially in Islamic educational units that include Mandarin in their curriculum. (2) The design of teaching materials, with prayer, fasting and Eid as the main topics of discussion, has gone through the trial, validation and revision stages to produce an Islamic-based constructivist Mandarin textbook for beginners consisting of a teacher's manual and a student's book. (3) Based on the results of the readability test and validation by material experts, the Islamic-based Mandarin teaching materials are feasible. (4) The Islamic-based Mandarin language teaching materials have received good acceptability from teachers and students as users. Keywords: Teaching Materials, Constructivist, Islamic Base

    A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics, and Benchmark Datasets

    Machine Reading Comprehension (MRC) is a challenging NLP research field with wide real-world applications. The great progress of this field in recent years is mainly due to the emergence of large-scale datasets and deep learning. At present, many MRC models have already surpassed human performance on many datasets, despite the obvious giant gap between existing MRC models and genuine human-level reading comprehension. This shows the need to improve existing datasets, evaluation metrics and models to move MRC models toward 'real' understanding. To address the lack of a comprehensive survey of existing MRC tasks, evaluation metrics and datasets, herein (1) we analyzed 57 MRC tasks and datasets and proposed a more precise classification method of MRC tasks with 4 different attributes; (2) we summarized 9 evaluation metrics of MRC tasks and (3) 7 attributes and 10 characteristics of MRC datasets; (4) we also discussed some open issues in MRC research and highlighted some future research directions. In addition, to help the community, we have collected, organized, and published our data on a companion website (https://mrc-datasets.github.io/) where MRC researchers can directly access each MRC dataset, papers, and baseline projects, and browse the leaderboard. Comment: 59 pages

    English Machine Reading Comprehension Datasets: A Survey

    This paper surveys 60 English Machine Reading Comprehension datasets, with a view to providing a convenient resource for other researchers interested in this problem. We categorize the datasets according to their question and answer form and compare them across various dimensions including size, vocabulary, data source, method of creation, human performance level, and first question word. Our analysis reveals that Wikipedia is by far the most common data source and that there is a relative lack of why, when, and where questions across datasets. Comment: Will appear at EMNLP 2021. Dataset survey paper: 9 pages, 5 figures, 2 tables + attachment