Resource description framework triples entity formations using statistical language model

Abstract

A method is needed for converting unstructured sentences from a source corpus into a specific knowledge representation such as RDF. We introduce a method for forming RDF entities from a paragraph of text using a statistical language model based on N-grams. The RDF entity formation is applied to natural language queries for information retrieval over Islamic knowledge. A knowledge base of 300 concepts from the English translation of the Holy Quran, with 350 relationships, is used. We evaluate our approach on a collection of 82 queries from the Islamic Research Foundation website and compare its performance against the method previously used in FREyA. The results show that the proposed method improves the accuracy of natural language formulation analysis by 17.07%, as tested on the search strategy. Recall and precision also increase, by 7% and 3% respectively.

Keywords: semantic web; N-gram; ontology; statistical model
