Building Scholarly Knowledge Bases with Crowdsourcing and Text Mining
For centuries, scholarly knowledge has been buried in documents. While articles are great for conveying the story of scientific work to peers, they make it hard for machines to process scholarly knowledge. The recent proliferation of the scholarly literature and the increasing inability of researchers to digest, reproduce, and reuse its content are constant reminders that we urgently need a transformative digitalization of the scholarly literature. Building on the Open Research Knowledge Graph (http://orkg.org) as a concrete research infrastructure, in this talk we present how humans and machines can use crowdsourcing and text mining to collaboratively build scholarly knowledge bases, i.e. systems that acquire, curate, and publish the data, information, and knowledge contained in the scholarly literature in structured and semantic form. We discuss some key challenges that human and technical infrastructures face, as well as the possibilities scholarly knowledge bases enable.
TinyGenius: Intertwining natural language processing with microtask crowdsourcing for scholarly knowledge graph creation
As the number of published scholarly articles grows steadily each year, new methods are needed to organize scholarly knowledge so that it can be discovered and used more efficiently. Natural Language Processing (NLP) techniques are able to autonomously process scholarly articles at scale and to create machine-readable representations of article content. However, autonomous NLP methods are far from sufficiently accurate to create a high-quality knowledge graph, yet quality is crucial for the graph to be useful in practice. We present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. The scholarly context in which the crowd workers operate poses multiple challenges. The explainability of the employed NLP methods is crucial for providing the context that supports the decision process of crowd workers. We employed TinyGenius to populate a paper-centric knowledge graph, using five distinct NLP methods. In the end, the resulting knowledge graph serves as a digital library for scholarly articles.
Operational Research Literature as a Use Case for the Open Research Knowledge Graph
The Open Research Knowledge Graph (ORKG) provides machine-actionable access
to scholarly literature that is habitually written in prose. Following the FAIR
principles, the ORKG makes traditional, human-coded knowledge findable,
accessible, interoperable, and reusable in a structured manner in accordance
with the Linked Open Data paradigm. At the moment, papers in the ORKG are
described manually, but in the long run, capturing the semantic depth of the
literature at scale needs automation. Operational Research is a suitable test
case for this vision
because the mathematical field and, hence, its publication habits are highly
structured: A mundane problem is formulated as a mathematical model, solved or
approximated numerically, and evaluated systematically. We study the existing
literature with respect to the Assembly Line Balancing Problem and derive a
semantic description in accordance with the ORKG. Eventually, selected papers
are ingested to test the semantic description and refine it further.
Comment: International Congress on Mathematical Software (ICMS) 202
Persistent Identification and Interlinking of FAIR Scholarly Knowledge
We leverage the Open Research Knowledge Graph - a scholarly infrastructure
that supports the creation, curation, and reuse of structured, semantic
scholarly knowledge - and present an approach for persistent identification of
FAIR scholarly knowledge. We propose a DOI-based persistent identification of
ORKG Papers, which are machine-actionable descriptions of the essential
information published in scholarly articles. This enables the citability of
FAIR scholarly knowledge and its discovery in global scholarly communication
infrastructures (e.g., DataCite, OpenAIRE, and ORCID). Upon publishing, the
state of the ORKG Paper is saved and cannot be further edited. To allow for
updating published versions, ORKG supports creating new versions, which are
linked in provenance chains. We demonstrate the linking of FAIR scholarly
knowledge with digital artefacts (articles), agents (researchers) and other
objects (organizations). We persistently identify FAIR scholarly knowledge
(namely, ORKG Papers and ORKG Comparisons as collections of ORKG Papers) by
leveraging DataCite services. Given the existing interoperability between
DataCite, Crossref, OpenAIRE and ORCID, sharing metadata with DataCite ensures
global findability of FAIR scholarly knowledge in scholarly communication
infrastructures.
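The registration step described above can be sketched with a small example. The following is an illustrative construction of the JSON:API payload one would send to the DataCite REST API (POST /dois) to mint a DOI for a published ORKG Comparison; the DOI, title, creator, and URL are hypothetical, and this is our sketch, not the ORKG implementation.

```python
# Sketch (assumption, not ORKG code) of a DataCite DOI registration payload.
def build_datacite_payload(doi, title, creators, url):
    """Assemble a DataCite-style JSON:API payload for DOI registration."""
    return {
        "data": {
            "type": "dois",
            "attributes": {
                "doi": doi,
                "titles": [{"title": title}],
                "creators": [{"name": c} for c in creators],
                "types": {"resourceTypeGeneral": "Dataset"},
                "url": url,  # landing page of the published, frozen version
                "event": "publish",  # registers the DOI as findable
            },
        }
    }

payload = build_datacite_payload(
    "10.12345/orkg.comparison.R1234",  # hypothetical DOI
    "Comparison of scholarly knowledge graph systems",  # hypothetical title
    ["Doe, Jane"],
    "https://orkg.org/comparison/R1234",  # hypothetical landing page
)
```

Sharing such metadata with DataCite is what makes the record discoverable in the interoperable infrastructures named above.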
Crowdsourcing Scholarly Discourse Annotations
The number of scholarly publications grows steadily every year, and it becomes harder to find, assess, and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated into paper submission processes, for which we define three main task requirements. We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface's usability and the participants' attitude towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.
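The class-recommendation assist mentioned above can be illustrated with a deliberately simple sketch: suggesting discourse classes for a selected sentence from cue phrases. The classes and cue lists here are hypothetical placeholders, not the machine learning models the interface actually integrates.

```python
# Illustrative cue-phrase classifier (assumption, not the ORKG interface code).
CUES = {
    "Contribution": ["we present", "we propose", "we introduce"],
    "Result": ["results show", "we found", "improves"],
    "Limitation": ["limitation", "does not", "fails to"],
}

def recommend_classes(sentence):
    """Return discourse classes whose cue phrases occur in the sentence."""
    text = sentence.lower()
    return [cls for cls, cues in CUES.items()
            if any(cue in text for cue in cues)]

print(recommend_classes("We propose a method; results show it improves recall."))
# → ['Contribution', 'Result']
```

A real assistant would use trained classifiers rather than fixed cue lists, but the input/output contract (sentence in, ranked class suggestions out) is the same.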
Creating a Scholarly Knowledge Graph from Survey Article Tables
Due to the lack of structure, scholarly knowledge remains hardly accessible for machines. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated, high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, the tables are formatted, and the data is finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles containing 160 survey tables have been imported into the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.
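The final ingestion step can be sketched as follows: an already-extracted survey table, with papers in the first column and properties in the remaining columns, is turned into subject-predicate-object statements. The column names and values below are illustrative, not taken from the evaluated articles.

```python
# Sketch (assumption, not the published pipeline) of table-to-statement
# conversion for knowledge graph ingestion.
def table_to_triples(header, rows):
    """Yield (paper, property, value) triples from a survey table."""
    triples = []
    for row in rows:
        paper, *values = row
        for prop, value in zip(header[1:], values):
            if value:  # skip empty cells rather than asserting them
                triples.append((paper, prop, value))
    return triples

header = ["Paper", "Method", "F1 score"]  # hypothetical survey table
rows = [
    ["Smith 2020", "CRF", "0.81"],
    ["Lee 2021", "BiLSTM", "0.84"],
]
print(table_to_triples(header, rows))
```

The human-in-the-loop effort reported above sits mostly before this step: extracting clean tables from PDFs and normalizing headers so that column names map to meaningful graph properties.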
Creating and validating a scholarly knowledge graph using natural language processing and microtask crowdsourcing
Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent it in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and thus fails to generate high-quality data at the required granularity. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work on the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation in which participants validate the generated NLP statements. The results indicate that employing microtasks for statement validation is a promising approach, despite the varying participant agreement for different microtasks.
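The validation idea can be made concrete with a minimal sketch (our assumption, not the published TinyGenius code): each NLP-extracted statement is accepted into the knowledge graph only once enough crowd votes are in and a majority judged it valid.

```python
from collections import Counter

# Majority-vote aggregation of microtask judgments (illustrative sketch).
def aggregate_votes(votes, min_votes=3):
    """votes: dict statement_id -> list of booleans (True = valid)."""
    accepted = []
    for statement, ballots in votes.items():
        if len(ballots) < min_votes:
            continue  # not enough judgments yet; statement stays pending
        tally = Counter(ballots)
        if tally[True] > tally[False]:
            accepted.append(statement)
    return accepted

votes = {
    "s1": [True, True, False],   # majority valid -> accepted
    "s2": [False, False, True],  # majority invalid -> rejected
    "s3": [True],                # too few votes -> pending
}
print(aggregate_votes(votes))  # → ['s1']
```

Varying agreement across microtask types, as reported above, would in practice argue for per-task vote thresholds rather than a single global `min_votes`.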
Open Research Knowledge Graph
The talk introduces the Open Research Knowledge Graph (ORKG) as a research infrastructure for the FAIR management of scholarly knowledge. The initiative, hosted at TIB since 2018, aims to apply the FAIR data principles to scholarly knowledge in publications in order to improve its reuse. The infrastructure supports the production, curation, and reuse of structured scholarly knowledge. The talk covers background and motivation, core functionality, and integrations.