1,157 research outputs found

    KnowText: Auto-generated Knowledge Graphs for custom domain applications

    While industrial Knowledge Graphs enable information extraction from massive data volumes, creating the backbone of the Semantic Web, specialised, custom-designed knowledge graphs focused on enterprise-specific information are an emerging trend. We present “KnowText”, an application that automatically generates custom Knowledge Graphs from unstructured text and enables fast information extraction based on graph visualisation and free-text query methods designed for non-specialist users. An OWL ontology automatically extracted from the text is linked to the knowledge graph and used as a knowledge base. A basic ontological schema is provided, including 16 Classes and Data type Properties. The extracted facts and the OWL ontology can be downloaded and further refined. KnowText is designed for applications in business (CRM, HR, banking). A custom KG can serve for locally managing existing data, often stored as “sensitive” information or proprietary accounts, which are not openly accessible on the web. KnowText deploys a custom KG from a collection of text documents and enables fast information extraction based on its graph-based visualisation and text-based query methods.

    Mining semantics for culturomics: towards a knowledge-based approach

    The massive amounts of text data made available through the Google Books digitization project have inspired a new field of big-data textual research. Named culturomics, this field has attracted the attention of a growing number of scholars over recent years. However, initial studies based on these data have been criticized for not referring to relevant work in linguistics and language technology. This paper provides some ideas, thoughts and first steps towards a new culturomics initiative, based this time on Swedish data, which pursues a more knowledge-based approach than previous work in this emerging field. The amount of new Swedish text produced daily, together with older texts being digitized in cultural heritage projects, is growing at an accelerating rate. The volume of text available in digital form has grown far beyond the capacity of human readers, leaving automated semantic processing of the texts as the only realistic option for accessing and using the information contained in them. The aim of our recently initiated research program is to advance the state of the art in language technology resources and methods for semantic processing of Big Swedish text, with a focus on theoretical and methodological advances in extracting and correlating information from large volumes of Swedish text using a combination of knowledge-based and statistical methods.