Using Linguistic Analysis to Translate Arabic Natural Language Queries to SPARQL
The logic-based machine-understandable framework of the Semantic Web often
challenges naive users when they try to query ontology-based knowledge bases.
Existing research efforts have approached this problem by introducing Natural
Language (NL) interfaces to ontologies. These NL interfaces have the ability to
construct SPARQL queries based on NL user queries. However, most efforts were
restricted to queries expressed in English, and they often benefited from the
advancement of English NLP tools. In contrast, little research has been done to
support querying Arabic content on the Semantic Web with NL queries.
This paper presents a domain-independent approach to translate Arabic NL
queries to SPARQL by leveraging linguistic analysis. With a special
focus on Noun Phrases (NPs), our approach uses a language parser to
extract NPs and the relations from Arabic parse trees and match them to the
underlying ontology. It then utilizes knowledge in the ontology to group NPs
into triple-based representations. A SPARQL query is finally generated by
extracting targets and modifiers, and interpreting them into SPARQL. The
interpretation of advanced semantic features including negation, conjunctive
and disjunctive modifiers is also supported. The approach was evaluated by
using two datasets consisting of OWL test data and queries, and the obtained
results confirmed the feasibility of translating Arabic NL queries to
SPARQL.
Comment: Journal Paper
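The final generation step described in this abstract (turning targets and modifiers, including negated, conjunctive and disjunctive ones, into SPARQL) can be illustrated with a minimal sketch. This is not the paper's implementation: the `Triple` class, the `build_sparql` function, and the `:Book`, `:language` and `:author` terms are hypothetical names chosen for illustration. It assumes negation maps to `FILTER NOT EXISTS` and disjunction to `UNION`, both standard SPARQL 1.1 constructs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class Triple:
    """A hypothetical triple-based representation of one modifier."""
    subject: str
    predicate: str
    obj: str
    negated: bool = False

    def pattern(self) -> str:
        # Render the triple as a SPARQL triple pattern.
        return f"{self.subject} {self.predicate} {self.obj} ."


def build_sparql(
    target: str,
    required: Sequence[Triple],
    alternatives: Sequence[Triple] = (),
    prefixes: Optional[Dict[str, str]] = None,
) -> str:
    """Assemble a SELECT query from a target variable and its modifiers.

    required     -- conjunctive modifiers; negated ones become FILTER NOT EXISTS
    alternatives -- disjunctive modifiers, joined with UNION
    prefixes     -- optional prefix-to-IRI map emitted as PREFIX declarations
    """
    lines: List[str] = []
    for name, iri in (prefixes or {}).items():
        lines.append(f"PREFIX {name}: <{iri}>")
    lines.append(f"SELECT DISTINCT {target} WHERE {{")
    # Positive conjunctive patterns first, so every variable is bound
    # before the negated patterns are read.
    for t in required:
        if not t.negated:
            lines.append(f"  {t.pattern()}")
    if alternatives:
        branches = " UNION ".join(f"{{ {t.pattern()} }}" for t in alternatives)
        lines.append(f"  {branches}")
    for t in required:
        if t.negated:
            lines.append(f"  FILTER NOT EXISTS {{ {t.pattern()} }}")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    query = build_sparql(
        "?book",
        required=[
            Triple("?book", "rdf:type", ":Book"),
            Triple("?book", ":language", ":English", negated=True),
        ],
        alternatives=[
            Triple("?book", ":author", ":A"),
            Triple("?book", ":author", ":B"),
        ],
    )
    print(query)
```

The sketch only covers query assembly. The earlier steps (parsing, NP extraction and ontology matching) are assumed to have already produced the triples.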
AR2SPARQL: An Arabic Natural Language Interface for the Semantic Web
With the growing interest in supporting the Arabic language on the Semantic Web (SW), there is an emerging need to enable Arab users to query ontologies and RDF stores without being challenged by the formal logic of the SW. For the English language, several efforts have provided Natural Language (NL) interfaces that enable ordinary users to query ontologies using NL queries. However, none of these efforts were designed to support the Arabic language, which has different morphological and semantic structures.
As a step towards supporting Arabic Question Answering (QA) on the SW, this work presents AR2SPARQL, an NL interface that takes questions expressed in Arabic and returns answers drawn from an ontology-based knowledge base. The core of AR2SPARQL is the approach we propose to translate Arabic questions into triples, which are matched against RDF data to retrieve an answer. The system uses both linguistic and semantic features to resolve ambiguity when matching words to the ontology content. To overcome the limited support for Arabic Natural Language Processing (NLP), the system does not make intensive use of sophisticated linguistic methods. Instead, it relies more on the knowledge defined in the ontology and on the grammar rules we define to capture the structures of Arabic questions and to construct adequate RDF representations. AR2SPARQL has been tested with two different datasets, and the results show that it achieves good retrieval performance in terms of precision and recall.
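The idea of matching question triples against RDF data to retrieve an answer can be illustrated with a small, self-contained sketch. It is not AR2SPARQL's code: the `answer` and `match_pattern` functions and the `:capitalOf` and `:Country` terms are hypothetical, and the RDF data is assumed to be a plain in-memory list of string triples instead of a real RDF store.

```python
from typing import Dict, List, Optional, Sequence, Tuple

TripleT = Tuple[str, str, str]
Binding = Dict[str, str]


def is_var(term: str) -> bool:
    # Variables follow the SPARQL convention of a leading question mark.
    return term.startswith("?")


def match_pattern(pattern: TripleT, triple: TripleT, binding: Binding) -> Optional[Binding]:
    """Try to extend `binding` so that `pattern` matches `triple`.

    Returns the extended binding, or None if the two are incompatible.
    """
    extended = dict(binding)
    for p_term, value in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(p_term, value) != value:
                return None
        elif p_term != value:
            return None
    return extended


def answer(patterns: Sequence[TripleT], data: Sequence[TripleT]) -> List[Binding]:
    """Return every variable binding that satisfies all patterns (a conjunction)."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        next_bindings: List[Binding] = []
        for binding in bindings:
            for triple in data:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


if __name__ == "__main__":
    data = [
        (":Cairo", ":capitalOf", ":Egypt"),
        (":Egypt", "rdf:type", ":Country"),
        (":Paris", ":capitalOf", ":France"),
    ]
    question = [("?city", ":capitalOf", "?x"), ("?x", "rdf:type", ":Country")]
    print(answer(question, data))
```

A production system would delegate this matching to a SPARQL endpoint or an RDF library. The nested loops here only make the matching semantics explicit.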
Linking Discourse Marker Inventories
The paper describes the first comprehensive edition of machine-readable discourse marker lexicons. Discourse markers such as and, because, but, though or thereafter are essential communicative signals in human conversation, as they indicate how an utterance relates to its communicative context. As much of this information is implicit or expressed differently in different languages, discourse parsing, context-adequate natural language generation and machine translation are considered particularly challenging aspects of Natural Language Processing. Providing this data in machine-readable, standard-compliant form will thus facilitate such technical tasks and, moreover, make it possible to explore translation inference techniques for this particular group of lexical resources, which was previously largely neglected in the context of Linguistic Linked (Open) Data.