333 research outputs found

    From Texts to Prerequisites. Identifying and Annotating Propaedeutic Relations in Educational Textual Resources

    Get PDF
    openPrerequisite Relations (PRs) are dependency relations established between two distinct concepts expressing which piece(s) of information a student has to learn first in order to understand a certain target concept. Such relations are one of the most fundamental in Education, playing a crucial role not only for what concerns new knowledge acquisition, but also in the novel applications of Artificial Intelligence to distant and e-learning. Indeed, resources annotated with such information could be used to develop automatic systems able to acquire and organize the knowledge embodied in educational resources, possibly fostering educational applications personalized, e.g., on students' needs and prior knowledge. The present thesis discusses the issues and challenges of identifying PRs in educational textual materials with the purpose of building a shared understanding of the relation among the research community. To this aim, we present a methodology for dealing with prerequisite relations as established in educational textual resources which aims at providing a systematic approach for uncovering PRs in textual materials, both when manually annotating and automatically extracting the PRs. The fundamental principles of our methodology guided the development of a novel framework for PR identification which comprises three components, each tackling a different task: (i) an annotation protocol (PREAP), reporting the set of guidelines and recommendations for building PR-annotated resources; (ii) an annotation tool (PRET), supporting the creation of manually annotated datasets reflecting the principles of PREAP; (iii) an automatic PR learning method based on machine learning (PREL). The main novelty of our methodology and framework lies in the fact that we propose to uncover PRs from textual resources relying solely on the content of the instructional material: differently from other works, rather than creating de-contextualised PRs, we acknowledge the presence of a PR between two concepts only if emerging from the way they are presented in the text. By doing so, we anchor relations to the text while modelling the knowledge structure entailed in the resource. As an original contribution of this work, we explore whether linguistic complexity of the text influences the task of manual identification of PRs. To this aim, we investigate the interplay between text and content in educational texts through a crowd-sourcing experiment on concept sequencing. Our methodology values the content of educational materials as it incorporates the evidence acquired from such investigation which suggests that PR recognition is highly influenced by the way in which concepts are introduced in the resource and by the complexity of the texts. The thesis reports a case study dealing with every component of the PR framework which produced a novel manually-labelled PR-annotated dataset.openXXXIII CICLO - DIGITAL HUMANITIES. TECNOLOGIE DIGITALI, ARTI, LINGUE, CULTURE E COMUNICAZIONE - Lingue, culture e tecnologie digitaliAlzetta, Chiar

    Grounding event references in news

    Get PDF
    Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better consider through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation

    Leveraging graph-based semantic annotation for the identification of cause-effect relations

    Get PDF
    This research is related to language article in Indonesia that discuss about causality relationship research used as public health surveillance information monitoring system. Utilization of this research is suitability of feature selection, phrase annotation, paragraph annotation, medical element annotation and graph-based semantic annotation. Evaluation of system performance is done by intrinsic approach using the Naive Bayes Multinomial method. The results obtained sequentially for recall, precision and f-measure are 0.924, 0.905, and 0.910

    Computational Sociolinguistics: A Survey

    Get PDF
    Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language. In this article we present a survey of the emerging field of "Computational Sociolinguistics" that reflects this increased interest. We aim to provide a comprehensive overview of CL research on sociolinguistic themes, featuring topics such as the relation between language and social identity, language use in social interaction and multilingual communication. Moreover, we demonstrate the potential for synergy between the research communities involved, by showing how the large-scale data-driven methods that are widely used in CL can complement existing sociolinguistic studies, and how sociolinguistics can inform and challenge the methods and assumptions employed in CL studies. We hope to convey the possible benefits of a closer collaboration between the two communities and conclude with a discussion of open challenges.Comment: To appear in Computational Linguistics. Accepted for publication: 18th February, 201

    Grounding event references in news

    Get PDF
    Events are frequently discussed in natural language, and their accurate identification is central to language understanding. Yet they are diverse and complex in ontology and reference; computational processing hence proves challenging. News provides a shared basis for communication by reporting events. We perform several studies into news event reference. One annotation study characterises each news report in terms of its update and topic events, but finds that topic is better consider through explicit references to background events. In this context, we propose the event linking task which—analogous to named entity linking or disambiguation—models the grounding of references to notable events. It defines the disambiguation of an event reference as a link to the archival article that first reports it. When two references are linked to the same article, they need not be references to the same event. Event linking hopes to provide an intuitive approximation to coreference, erring on the side of over-generation in contrast with the literature. The task is also distinguished in considering event references from multiple perspectives over time. We diagnostically evaluate the task by first linking references to past, newsworthy events in news and opinion pieces to an archive of the Sydney Morning Herald. The intensive annotation results in only a small corpus of 229 distinct links. However, we observe that a number of hyperlinks targeting online news correspond to event links. We thus acquire two large corpora of hyperlinks at very low cost. From these we learn weights for temporal and term overlap features in a retrieval system. These noisy data lead to significant performance gains over a bag-of-words baseline. While our initial system can accurately predict many event links, most will require deep linguistic processing for their disambiguation
    corecore