
    Thinking outside the graph: scholarly knowledge graph construction leveraging natural language processing

    Despite improved digital access to scholarly knowledge in recent decades, scholarly communication remains exclusively document-based. The document-oriented workflows in science publication have reached the limits of adequacy, as highlighted by recent discussions on the increasing proliferation of scientific literature, the deficiency of peer review, and the reproducibility crisis. In this form, scientific knowledge remains locked in representations that are inadequate for machine processing. As long as scholarly communication remains in this form, we cannot take advantage of the advancements taking place in machine learning and natural language processing techniques. Such techniques would facilitate the transformation from purely text-based into (semi-)structured semantic descriptions that are interlinked in a collection of big federated graphs. We are in dire need of a new age of semantically enabled infrastructure adept at storing, manipulating, and querying scholarly knowledge. Equally important is a suite of machine assistance tools designed to populate, curate, and explore the resulting scholarly knowledge graph. In this thesis, we address the issue of constructing a scholarly knowledge graph using natural language processing techniques. First, we tackle the issue of developing a scholarly knowledge graph for structured scholarly communication that can be populated and constructed automatically. We co-design and co-implement the Open Research Knowledge Graph (ORKG), an infrastructure capable of modeling, storing, and automatically curating scholarly communications. Then, we propose a method to automatically extract information into knowledge graphs. With Plumber, we create a framework to dynamically compose open information extraction pipelines based on the input text. Such pipelines are composed from community-created information extraction components in an effort to consolidate individual research contributions under one umbrella. We further present MORTY as a more targeted approach that leverages automatic text summarization to create structured summaries, containing all required information, from the scholarly article's text. In contrast to the pipeline approach, MORTY only extracts the information it is instructed to, making it a more valuable tool for various curation and contribution use cases. Moreover, we study the problem of knowledge graph completion. exBERT is able to perform knowledge graph completion tasks, such as relation and entity prediction, on scholarly knowledge graphs by means of textual triple classification. Lastly, we use the structured descriptions collected from manual and automated sources alike with a question answering approach that builds on the machine-actionable descriptions in the ORKG. We propose JarvisQA, a question answering interface operating on tabular views of scholarly knowledge graphs, i.e., ORKG comparisons. JarvisQA is able to answer a variety of natural language questions and retrieve complex answers on pre-selected sub-graphs. These contributions are key in the broader agenda of studying the feasibility of natural language processing methods on scholarly knowledge graphs, and lay the foundation for determining which methods can be used in which cases. Our work indicates the challenges and issues involved in automatically constructing scholarly knowledge graphs, and opens up future research directions.

    Music Encoding Conference Proceedings

    UIDB/00693/2020 UIDP/00693/2020

    Visualising the intellectual and social structures of digital humanities using an invisible college model

    This thesis explores the intellectual and social structures of an emerging field, Digital Humanities (DH). After around 70 years of development, DH claims to differentiate itself from the traditional Humanities through its inclusiveness, diversity, and collaboration. However, the ‘big tent’ concept not only limits our understanding of its research structure, but also results in a lack of empirical review and sustainable support. Whether this umbrella covers merely fragmented topics or a consolidated knowledge system is still unknown. This study seeks to answer three research questions: a) Subject: What research topics is the DH subject composed of? b) Scholar: Who has contributed to the development of DH? c) Environment: How diverse are the backgrounds of DH scholars? The Invisible College research model is refined and applied as the methodological framework that produces four visualised networks. As the results show, DH currently contributes more towards general historical literacy and information science, while longitudinally, it was heavily involved in computational linguistics. Humanistic topics are more popular and central, while technical topics are relatively peripheral and have stronger connections with non-Anglophone communities. DH social networks are at the early stages of development, and their formation is heavily influenced by non-academic and non-intellectual factors, e.g., language, working country, and informal relationships. Although male scholars have dominated the field, female scholars have encouraged more communication and built more collaborations. Despite the growing appeals for more diversity, the level of international collaboration in DH is more extensive than in many other disciplines. These findings can help us gain a new understanding of the central and critical questions about DH. To the best of the candidate’s knowledge, this study is the first to investigate the formal and informal structures in DH with a well-grounded research model.