6 research outputs found

A General-Purpose Approach to Temporal Event Ontology Creation

Author: Badgett Allison Ruth
Publication date: 24/07/2019

One of the major challenges for modern data scientists is providing structure to data. Textual data is especially difficult to interpret and categorize. Much of the meaning found in natural language data, such as news articles or tweets, is contextual and potentially nonstandard. Attempts have been made to manually create organizational ontologies, but these are usually limited to specialized sub-domains, as the task of providing a complete structure "by hand" across larger domains is unmanageable. We propose a general-purpose approach to event ontology creation, building upon a subevent classifier already developed in the initial stage of our research. In this work, we extract events from textual data and create a graph structure showing temporal relationships, using semantic and syntactic methods. This event ontology facilitates faster and more accurate automated data interpretation by providing a structure to textual data. The next stage in the "big data" phenomenon is not accumulating more data, but fully utilizing the vast amount of data already available. Event ontologies are a necessary step in this direction.
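
The "graph structure showing temporal relationships" mentioned in this abstract can be pictured with a small sketch. It is only an illustration, not the author's method: the event names, the `before_relations` list, and the `build_temporal_graph` helper are hypothetical, and the extraction step that would produce such relations from text is assumed to have already run.

```python
from graphlib import TopologicalSorter

# Hypothetical "happens before" relations between already-extracted events.
before_relations = [
    ("earthquake", "tsunami warning"),
    ("tsunami warning", "evacuation"),
    ("evacuation", "rescue operations"),
]


def build_temporal_graph(relations):
    """Map each event to the set of events that must precede it."""
    graph = {}
    for earlier, later in relations:
        graph.setdefault(earlier, set())
        graph.setdefault(later, set()).add(earlier)
    return graph


graph = build_temporal_graph(before_relations)

# A topological order of the graph is one timeline consistent with all relations.
timeline = list(TopologicalSorter(graph).static_order())
print(timeline)
```

Storing predecessors per event keeps the structure a plain directed graph, so any standard graph routine (here the standard-library topological sorter) can be applied to it.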

Texas A&M Repository

A Semantic Question Answering Framework for Large Data Sets

Author: Dan Moldovan
Marta Tatu
Mithun Balakrishna
Steven Werner
Tatiana Erekhinskaya
Publication venue: RonPub
Publication date: 01/01/2016

Traditionally, the task of answering natural language questions has involved a keyword-based document retrieval step, followed by in-depth processing of candidate answer documents and paragraphs. This post-processing uses semantics to various degrees. In this article, we describe a purely semantic question answering (QA) framework for large document collections. Our high-precision approach transforms the semantic knowledge extracted from natural language texts into a language-agnostic RDF representation and indexes it into a scalable triplestore. In order to facilitate easy access to the information stored in the RDF semantic index, a user's natural language questions are translated into SPARQL queries that return precise answers to the user. The robustness of this framework is ensured by the natural language reasoning performed on the RDF store, by the query relaxation procedures, and by the answer ranking techniques. The improvements in performance over a regular free text search index-based question answering engine prove that QA systems can benefit greatly from the addition and consumption of deep semantic information.
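
To make the idea of indexing knowledge as subject-predicate-object triples and answering a question with a structured pattern more concrete, here is a minimal sketch in plain Python. It is not the framework described in this abstract: the `TripleIndex` class, its `match` method, and the example facts are hypothetical, and a real system would use an actual triplestore queried with SPARQL rather than an in-memory set.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]


class TripleIndex:
    """Hypothetical in-memory stand-in for a triplestore."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


index = TripleIndex()
index.add("Shakespeare", "wrote", "Hamlet")
index.add("Shakespeare", "wrote", "Macbeth")
index.add("Austen", "wrote", "Emma")

# "Who wrote Hamlet?" becomes the pattern (?who, wrote, Hamlet).
answers = [s for s, _, _ in index.match(predicate="wrote", obj="Hamlet")]
print(answers)
```

The wildcard pattern plays the role that a variable plays in a SPARQL basic graph pattern; translating the natural language question into that pattern is the hard part and is assumed here.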

RonPub -- Research Online Publishing

Ten Ways of Leveraging Ontologies for Rapid Natural Language Processing Customization for Multiple Use Cases in Disjoint Domains

Author: Dan Moldovan
Dmitry Strebkov
Marta Tatu
Mithun Balakrishna
Sujal Patel
Tatiana Erekhinskaya
Publication venue: RonPub
Publication date: 01/01/2020

With the ever-growing adoption of AI technologies by large enterprises, purely data-driven approaches have dominated the field in recent years. For a single use case, the development process looks simple: agreeing on an annotation schema, labeling the data, and training the models. As the number of use cases and their complexity increase, development teams face issues with collective governance of the models and with the scalability and reusability of data and models. These issues are widely addressed on the engineering side, but not so much on the knowledge side. Ontologies have been a well-researched approach for capturing knowledge and can be used to augment a data-driven methodology. In this paper, we discuss 10 ways of leveraging ontologies for Natural Language Processing (NLP) and its applications. We use ontologies for rapid customization of an NLP pipeline, and ontology-related standards to power a rule engine and provide a standard output format. We also discuss various use cases for medical, enterprise, financial, legal, and security domains, centered around three NLP-based applications: semantic search, question answering, and natural language querying.

RonPub -- Research Online Publishing

Ontological Approach for Semantic Modelling of Malay Translated Qur’an

Author: Ahmad Nor Diana Binti
Publication date: 01/01/2022

This thesis contributes to the areas of ontology development and analysis, natural language processing (NLP), Information Retrieval (IR), and Language Resource and Corpus Development. Research in Natural Language Processing and semantic search for English has shown successful results for more than a decade. However, it is difficult to adapt those techniques to the Malay language, because its complex morphology and orthographic forms are very different from English. Moreover, limited resources and tools for computational linguistic analysis are available for Malay. In this thesis, we address those issues and challenges by proposing MyQOS, the Malay Qur’an Ontology System, a prototype ontology-based IR with semantics for representing and accessing a Malay translation of the Qur’an. This supports the development of a semantic search engine and a question answering system and provides a framework for storing and accessing a Malay language corpus and providing computational linguistics resources. The primary use of MyQOS in the current research is for creating and improving the quality and accuracy of the query mechanism to retrieve information embedded in the Malay text of the Qur’an translation. To demonstrate the feasibility of this approach, we describe a new architecture of morphological analysis for MyQOS and query algorithms based on MyQOS. Data analysis consisted of two measures, precision and recall, with data obtained from the MyQOS Corpus using three search engines. The precision and recall for semantic search are 0.8409 (84%) and 0.8043 (80%), considerably higher than the results of the question-answer search, which are 0.4971 (50%) for precision and 0.6027 (60%) for recall. Semantic search gives high precision and high recall compared with the other two methods. This indicates that semantic search returns more relevant results than irrelevant ones.
To conclude, this research is among the work on retrieval of Qur’an texts in the Malay language that has outlined state-of-the-art information retrieval system models. Thus, the use of MyQOS will help Malay readers to understand the Qur’an in better ways. Furthermore, the creation of a Malay language corpus and computational linguistics resources will benefit other researchers, especially in religious texts, morphological analysis, and semantic modelling.
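
The precision and recall measures reported in this abstract follow the standard definitions: precision is the share of retrieved items that are relevant, and recall is the share of relevant items that were retrieved. The sketch below shows the arithmetic on a made-up example; the `precision_recall` helper and the verse identifiers are hypothetical and are not taken from the thesis or its corpus.

```python
def precision_recall(retrieved: set[str], relevant: set[str]) -> tuple[float, float]:
    """Return (precision, recall) for one query's result set."""
    true_positives = len(retrieved & relevant)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four results returned, three items judged relevant.
retrieved = {"verse-1", "verse-2", "verse-3", "verse-4"}
relevant = {"verse-2", "verse-3", "verse-5"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.4f} recall={recall:.4f}")
```

Here two of the four retrieved items are relevant, and two of the three relevant items were found, so precision is 2/4 and recall is 2/3.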

White Rose E-theses Online