What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning
Recent years have witnessed the rising popularity of Natural Language
Processing (NLP) and related fields such as Artificial Intelligence (AI) and
Machine Learning (ML). Many online courses and resources are available even for
those without a strong background in the field. Often the student is curious
about a specific topic but does not quite know where to begin studying. To
answer the question of "what should one learn first," we apply an
embedding-based method to learn prerequisite relations for course concepts in
the domain of NLP. We introduce LectureBank, a publicly available dataset
containing 1,352 English lecture files collected from university courses, each
classified according to an existing taxonomy, along with 208 manually-labeled
prerequisite relations between topics. The dataset will be useful for
educational purposes such as lecture preparation and organization, as well as
for applications such as reading list generation. Additionally, we experiment
with neural graph-based networks and non-neural classifiers to learn these
prerequisite relations from our dataset.
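The core setup described above (representing each concept as an embedding and classifying ordered concept pairs as prerequisite or not) can be sketched roughly as follows. This is a minimal illustration, not the paper's actual model: the concept names, random toy embeddings, and the simple logistic-regression classifier are all assumptions made here for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy concept embeddings; in practice these would be learned from the
# lecture corpus (hypothetical concepts, random vectors for illustration).
concepts = {
    "tokenization": rng.normal(size=8),
    "language modeling": rng.normal(size=8),
    "transformers": rng.normal(size=8),
}

def pair_features(a, b):
    """Feature vector for the ordered pair (a, b): the two embeddings
    concatenated with their element-wise difference, a common choice
    for pairwise relation classification."""
    va, vb = concepts[a], concepts[b]
    return np.concatenate([va, vb, va - vb])

# Labeled pairs: 1 means the first concept is a prerequisite of the second.
pairs = [("tokenization", "language modeling", 1),
         ("language modeling", "transformers", 1),
         ("transformers", "tokenization", 0)]

X = np.stack([pair_features(a, b) for a, b, _ in pairs])
y = np.array([lbl for _, _, lbl in pairs])

# Tiny logistic regression fit by gradient descent on the toy data.
w = np.zeros(X.shape[1])
for _ in range(500):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - y)

preds = (1 / (1 + np.exp(-X @ w)) > 0.5).astype(int)
```

The same pairwise features could instead feed the neural graph-based networks or non-neural classifiers the abstract mentions; the feature construction is the part this sketch is meant to show.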
Annotation Protocol for Textbook Enrichment with Prerequisite Knowledge Graph
Extracting and formally representing the knowledge embedded in textbooks, such as the concepts explained and the relations between them, can support the provision of advanced knowledge-based services for learning environments and digital libraries. In this paper, we consider a specific type of relation in textbooks referred to as prerequisite relations (PR). PRs represent precedence relations between concepts, intended to give the reader the knowledge needed to understand subsequent concepts. Their annotation in educational texts produces datasets that can be represented as a graph of concepts connected by PRs. However, building good-quality and reliable datasets of PRs from a textbook is still an open issue, not just for automated annotation methods but even for manual annotation. In turn, the lack of good-quality datasets and well-defined criteria to identify PRs affects the development and validation of automated methods for prerequisite identification. As a contribution to this issue, in this paper, we propose PREAP, a protocol for the annotation of prerequisite relations in textbooks aimed at obtaining reliable annotated data that can be shared, compared, and reused in the research community. PREAP defines a novel textbook-driven annotation method designed to capture the structure of prerequisites underlying the text. The protocol has been evaluated against baseline methods for manual and automatic annotation. The findings show that PREAP enables the creation of prerequisite knowledge graphs that have higher inter-annotator agreement, accuracy, and alignment with text than the baseline methods. This suggests that the protocol is able to accurately capture the PRs expressed in the text. Furthermore, the findings show that the time required to complete the annotation using PREAP is significantly shorter than with the other manual baseline methods. The paper also includes guidelines, experimentally tested, for using PREAP in three annotation scenarios.
We also provide example datasets and a user interface that we developed to support prerequisite annotation.
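A prerequisite knowledge graph of the kind the protocol produces is a directed graph of concepts connected by PR edges; a topological sort of that graph then yields a valid study order. The sketch below is an assumption-laden illustration (the concept names and edges are invented, not PREAP data), using Python's standard-library `graphlib`:

```python
from graphlib import TopologicalSorter

# PR edges: each key is a prerequisite of every concept in its value set
# (hypothetical toy graph, not an annotated PREAP dataset).
prereq_of = {
    "probability": {"language modeling"},
    "tokenization": {"language modeling"},
    "language modeling": {"transformers"},
}

# TopologicalSorter expects node -> predecessors, so invert the edge map.
needs = {}
for pre, dependents in prereq_of.items():
    needs.setdefault(pre, set())
    for concept in dependents:
        needs.setdefault(concept, set()).add(pre)

# A reading order in which every prerequisite precedes its dependents.
order = list(TopologicalSorter(needs).static_order())
```

Because PRs are precedence relations, a well-formed annotation graph must be acyclic; `TopologicalSorter` raises `CycleError` otherwise, which makes it a cheap sanity check on annotated data as well.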