
    Extracting Causal Claims from Information Systems Papers with Natural Language Processing for Theory Ontology Learning

    The number of scientific papers published each year is growing exponentially. How can computational tools support scientists in better understanding and processing this data? This paper presents a software prototype that automatically extracts causes, effects, signs, moderators, mediators, conditions, and interaction signs from the propositions and hypotheses of full-text scientific papers. The prototype uses natural language processing methods and a set of linguistic rules for causal information extraction. It is evaluated on a manually annotated corpus of 270 Information Systems papers containing 723 hypotheses and propositions from the AIS basket of eight. F1 results for the detection and extraction of the different causal variables range between 0.71 and 0.90. The presented automatic causal theory extraction allows scientific papers to be analysed on the basis of a theory ontology and therefore contributes to the creation and comparison of inter-nomological networks.
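As a rough illustration of the rule-based approach the abstract describes, the sketch below matches a hypothesis against a couple of hand-written cause/effect patterns. The patterns and example hypotheses are hypothetical, not the prototype's actual rule set.

```python
import re

# Illustrative linguistic rules for causal extraction; the paper's real
# rule set is richer (signs, moderators, mediators, conditions).
PATTERNS = [
    # "X leads to Y", "X results in Y", "X increases Y"
    re.compile(r"(?P<cause>.+?)\s+(?:leads to|results in|increases)\s+(?P<effect>.+)", re.I),
    # "The higher X, the higher Y"
    re.compile(r"the higher\s+(?P<cause>.+?),\s*the higher\s+(?P<effect>.+)", re.I),
]

def extract_causal(hypothesis: str):
    """Return a (cause, effect) pair if a rule matches, else None."""
    for pattern in PATTERNS:
        m = pattern.search(hypothesis.rstrip("."))
        if m:
            return m.group("cause").strip(), m.group("effect").strip()
    return None
```

A real system would run such rules over parsed sentences rather than raw strings, but the principle, surface patterns anchored on causal cue phrases, is the same.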

    DeepCause: Hypothesis Extraction from Information Systems Papers with Deep Learning for Theory Ontology Learning

    This paper applies different deep learning architectures for sequence labelling to extract causes, effects, moderators, and mediators from hypotheses of information systems papers for theory ontology learning. We compared a variety of recurrent neural network (RNN) architectures, such as long short-term memory (LSTM), bidirectional LSTM (BiLSTM), simple RNNs, and gated recurrent units (GRU). We analyzed GloVe word embeddings, character-level vector representations of words, and part-of-speech (POS) tags. Furthermore, we evaluated various hyperparameters and architectures to achieve the highest performance scores. The prototype was evaluated on hypotheses from the AIS basket of eight. The F1 result for the sequence labelling task of causal variables at the chunk level was 80%, with a precision of 80% and a recall of 80%.
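The chunk-level scoring mentioned in the abstract can be illustrated with a small generic routine that computes precision, recall, and F1 over BIO-tagged spans. This assumes standard BIO tags such as B-CAUSE/I-CAUSE and exact span matching; it is not the authors' evaluation code.

```python
def bio_chunks(tags):
    """Extract (label, start, end_exclusive) spans from a BIO tag sequence.
    An I- tag that does not continue a chunk is treated as starting one."""
    spans, i = [], 0
    while i < len(tags):
        if tags[i] == "O":
            i += 1
            continue
        label, start = tags[i][2:], i
        i += 1
        while i < len(tags) and tags[i] == "I-" + label:
            i += 1
        spans.append((label, start, i))
    return spans

def chunk_prf(gold, pred):
    """Chunk-level precision, recall, and F1: a chunk counts as correct
    only if label and boundaries match exactly."""
    g, p = set(bio_chunks(gold)), set(bio_chunks(pred))
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, f1
```

Exact-match chunk scoring is stricter than token-level accuracy, which is why it is the customary metric for sequence labelling of variables like these.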

    How Best to Hunt a Mammoth - Toward Automated Knowledge Extraction From Graphical Research Models

    In the Information Systems (IS) discipline, the central contributions of research projects are often represented in graphical research models that clearly illustrate constructs and their relationships. Although thousands of such representations exist, methods for extracting this source of knowledge are still at an early stage. We present a method for (1) extracting graphical research models from articles, (2) generating synthetic training data, (3) performing object detection with a neural network, (4) reconstructing the graph, and (5) storing the results in a designated research model format. We trained YOLOv7 on 20,000 generated diagrams and evaluated its performance on 100 manually reconstructed diagrams from the Senior Scholars' Basket. The results for extracting graphical research models show an F1-score of 0.82 for nodes, 0.72 for links, and an accuracy of 0.72 for labels, indicating the method's applicability for populating knowledge repositories and thereby contributing to knowledge synthesis.
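The graph-reconstruction step can be sketched as attaching each detected link's endpoints to the nearest detected node box. This is a minimal illustrative heuristic under assumed (x1, y1, x2, y2) box coordinates, not the paper's actual algorithm.

```python
def center(box):
    """Center point of an (x1, y1, x2, y2) bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def nearest_node(point, nodes):
    """Index of the node box whose center is closest to the given point."""
    def sq_dist(a, b):
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return min(range(len(nodes)), key=lambda i: sq_dist(point, center(nodes[i])))

def reconstruct(nodes, links):
    """Turn detected link endpoints into (source_idx, target_idx) edges
    between detected node boxes."""
    return [(nearest_node(p, nodes), nearest_node(q, nodes)) for p, q in links]
```

In practice one would also handle arrowheads (to orient the edge) and labels, but nearest-box matching is the core of turning detections into a graph.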

    Modelling discourse in contested domains: A semiotic and cognitive framework

    This paper examines the representational requirements for interactive, collaborative systems intended to support sensemaking and argumentation over contested issues. We argue that a perspective supported by semiotic and cognitively oriented discourse analyses both offers theoretical insights and motivates representational requirements for the semantics of tools for contesting meaning. We introduce our semiotic approach, highlighting its implications for discourse representation, before describing a research system (ClaiMaker) designed to support the construction of scholarly argumentation by allowing analysts to publish and contest 'claims' about scientific contributions. We show how ClaiMaker's representational scheme is grounded in specific assumptions concerning the nature of explicit modelling and the evolution of meaning within a discourse community. These characteristics allow the system to represent scholarly discourse as a dynamic process, in the form of continuously evolving structures. A cognitively oriented discourse analysis then shows how the use of a small set of cognitive relational primitives in the underlying ontology opens possibilities for offering users advanced forms of computational service for analysing collectively constructed argumentation networks.

    Knowledge-based Biomedical Data Science 2019

    Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Traditional Chinese Medicine and biodiversity.

    A Narratology-Based Framework for Storyline Extraction

    Stories are a pervasive phenomenon of human life. They also represent a cognitive tool for understanding and making sense of the world and of its happenings. In this contribution we describe a narratology-based framework for modeling stories as a combination of different data structures and for automatically extracting them from news articles. We introduce a distinction among three data structures (timelines, causelines, and storylines) that capture different narratological dimensions: respectively, chronological ordering, causal connections, and plot structure. We developed the Circumstantial Event Ontology (CEO) for modeling (implicit) circumstantial relations as well as explicit causal relations, and created two benchmark corpora: ECB+/CEO, for causelines, and the Event Storyline Corpus (ESC), for storylines. To test our framework and the difficulty of automatically extracting causelines and storylines, we developed a series of reasonable baseline systems.
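The three-way distinction between timelines, causelines, and storylines can be made concrete with a minimal sketch of the data structures involved. The class and field names here are illustrative only and do not reproduce the CEO ontology.

```python
from dataclasses import dataclass, field

@dataclass
class Event:
    eid: str
    time: int          # position in chronological order

@dataclass
class Story:
    events: list = field(default_factory=list)
    causal: list = field(default_factory=list)   # (cause_eid, effect_eid) pairs

    def timeline(self):
        """Chronological ordering of the story's events."""
        return sorted(self.events, key=lambda e: e.time)

    def causeline(self, eid):
        """Events directly caused by the event with id eid."""
        return [effect for cause, effect in self.causal if cause == eid]
```

A storyline, in the paper's terms, would layer plot structure on top of these two views; the sketch shows only the chronological and causal dimensions.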

    Semantic learning webs

    By 2020, microprocessors will likely be as cheap and plentiful as scrap paper, scattered by the millions into the environment, allowing us to place intelligent systems everywhere. This will change everything around us, including the nature of commerce, the wealth of nations, and the way we communicate, work, play, and live. This will give us smart homes, cars, TVs, jewellery, and money. We will speak to our appliances, and they will speak back. Scientists also expect the Internet will wire up the entire planet and evolve into a membrane consisting of millions of computer networks, creating an "intelligent planet." The Internet will eventually become a "Magic Mirror" that appears in fairy tales, able to speak with the wisdom of the human race. (Michio Kaku, Visions: How Science Will Revolutionize the Twenty-First Century, 1998)

    If the semantic web needed a symbol, a good one to use would be a Navaho dream-catcher: a small web, lovingly hand-crafted, [easy] to look at, and rumored to catch dreams; but really more of a symbol than a reality. (Pat Hayes, Catching the Dreams, 2002)

    Though it is almost impossible to envisage what the Web will be like by the end of the next decade, we can say with some certainty that it will have continued its seemingly unstoppable growth. Given the investment of time and money in the Semantic Web (Berners-Lee et al., 2001), we can also be sure that some form of semanticization will have taken place. This might be superficial, accomplished simply through the addition of loose forms of meta-data mark-up, or more principled, grounded in ontologies and formalised by means of emerging semantic web standards such as RDF (Lassila and Swick, 1999) or OWL (McGuinness and van Harmelen, 2003). Whatever the case, the addition of semantic mark-up will make at least part of the Web more readily accessible to humans and their software agents and will facilitate agent interoperability.
If current research is successful, there will also be a plethora of e-learning platforms making use of a varied menu of reusable educational material or learning objects. For the learner, the semanticized Web will, in addition, offer rich seams of diverse learning resources over and above the course materials (or learning objects) specified by course designers. For instance, the annotation registries, which provide access to marked-up resources, will enable more focussed, ontologically guided (or semantic) search. This much is already in development. But we can go much further. Semantic technologies make it possible not only to reason about the Web as if it were one extended knowledge base but also to provide a range of additional educational semantic web services such as summarization, interpretation or sense-making, structure visualization, and support for argumentation.
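The ontologically guided search described above can be illustrated with a toy in-memory triple store queried by pattern matching. The triples, predicate names, and class names below are invented examples in an RDF-like style, not a real registry or vocabulary.

```python
# Toy RDF-style triples: (subject, predicate, object).
TRIPLES = [
    ("lesson1", "rdf:type", "LearningObject"),
    ("lesson1", "dc:subject", "SemanticWeb"),
    ("paper7", "rdf:type", "Article"),
    ("paper7", "dc:subject", "SemanticWeb"),
]

def match(s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

def resources_of_type(rdf_class, topic):
    """Resources of a given class that are annotated with a given subject:
    the kind of query a semantic annotation registry could answer."""
    typed = {s for s, _, _ in match(p="rdf:type", o=rdf_class)}
    tagged = {s for s, _, _ in match(p="dc:subject", o=topic)}
    return typed & tagged
```

A real registry would answer such queries with SPARQL over RDF data, but the principle, filtering resources by ontology class and annotation rather than by keyword, is what distinguishes semantic from plain-text search.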