635 research outputs found

    Quranic Arabic Semantic Search Model Based on Ontology of Concepts

    Get PDF
    The Holy Quran is the essential resource for Islamic sciences and Arabic language. Therefore, numerous Quranic search applications have been built to facilitate the retrieval of knowledge from the Quran. This thesis presents a novel Arabic Quran semantic search model. First, this thesis evaluated existing search tools constructed for the Holy Quran, against 13 criteria depending on: search features, output features, the precision of the retrieved verses, recall database size, and types of database contents. Then, the study reviewed the existing Quran ontologies and compared them against 11 criteria. Some deficits have been found in all these ontologies. Additionally, a single Quranic ontology does not cover most of the knowledge in the Quran. Therefore, I developed a new Arabic-English Quran ontology from ten datasets related to the Quran such as: Quran chapter and verse names, Quran word meanings, and Quran topics. The main aim of developing a Quranic ontology is to facilitate the retrieval of knowledge from the Quran. Additionally, the Quran ontology will enrich the raw Arabic and English Quran text with Islamic semantic tags. Furthermore, I developed the first Annotated Corpus of Quran Questions and Answers in Arabic. This corpus has 2200 pairs of question and answer collected from trusted Islamic sources. Each pair of question and answer is labelled with 5 tags. Examples of tags are: question type: either factoid or descriptive, topic of question-based on the Quran ontology, and question class. Finally, the thesis explains a new semantic search model for the Arabic Quran based on my Quran ontology. This model aims at overcoming limitations in the existing Quran search applications. This search tool employs both Information Retrieval techniques and semantic search technologies. The performance of this search model is evaluated by using The Annotated Corpus of Arabic Quran Questions and Answers

    Semantic annotation of digital music

    Get PDF
    AbstractIn recent times, digital music items on the internet have been evolving in a vast information space where consumers try to find/locate the piece of music of their choice by means of search engines. The current trend of searching for music by means of music consumersʼ keywords/tags is unable to provide satisfactory search results. It is argued that search and retrieval of music can be significantly improved provided end-usersʼ tags are associated with semantic information in terms of acoustic metadata – the latter being easy to extract automatically from digital music items. This paper presents a lightweight ontology that will enable music producers to annotate music against MPEG-7 description (with its acoustic metadata) and the generated annotation may in turn be used to deliver meaningful search results. Several potential multimedia ontologies have been explored and a music annotation ontology, named mpeg-7Music, has been designed so that it can be used as a backbone for annotating music items

    Role of Semantic web in the changing context of Enterprise Collaboration

    Get PDF
    In order to compete with the global giants, enterprises are concentrating on their core competencies and collaborating with organizations that compliment their skills and core activities. The current trend is to develop temporary alliances of independent enterprises, in which companies can come together to share skills, core competencies and resources. However, knowledge sharing and communication among multidiscipline companies is a complex and challenging problem. In a collaborative environment, the meaning of knowledge is drastically affected by the context in which it is viewed and interpreted; thus necessitating the treatment of structure as well as semantics of the data stored in enterprise repositories. Keeping the present market and technological scenario in mind, this research aims to propose tools and techniques that can enable companies to assimilate distributed information resources and achieve their business goals

    Ontology-based knowledge representation and semantic search information retrieval: case study of the underutilized crops domain

    Get PDF
    The aim of using semantic technologies in domain knowledge modeling is to introduce the semantic meaning of concepts in knowledge bases, such that they are both human-readable as well as machine-understandable. Due to their powerful knowledge representation formalism and associated inference mechanisms, ontology-based approaches have been increasingly adopted to formally represent domain knowledge. The primary objective of this thesis work has been to use semantic technologies in advancing knowledge-sharing of Underutilized crops as a domain and investigate the integration of underlying ontologies developed in OWL (Web Ontology Language) with augmented SWRL (Semantic Web Rule Language) rules for added expressiveness. The work further investigated generating ontologies from existing data sources and proposed the reverse-engineering approach of generating domain specific conceptualization through competency questions posed from possible ontology users and domain experts. For utilization, a semantic search engine (the Onto-CropBase) has been developed to serve as a Web-based access point for the Underutilized crops ontology model. Relevant linked-data in Resource Description Framework Schema (RDFS) were added for comprehensiveness in generating federated queries. While the OWL/SWRL combination offers a highly expressive ontology language for modeling knowledge domains, the combination is found to be lacking supplementary descriptive constructs to model complex real-life scenarios, a necessary requirement for a successful Semantic Web application. To this end, the common logic programming formalisms for extending Description Logic (DL)-based ontologies were explored and the state of the art in SWRL expressiveness extensions determined with a view to extending the SWRL formalism. Subsequently, a novel fuzzy temporal extension to the Semantic Web Rule Language (FT-SWRL), which combines SWRL with fuzzy logic theories based on the valid-time temporal model, has been proposed to allow modeling imprecise temporal expressions in domain ontologies

    A model for information retrieval driven by conceptual spaces

    Get PDF
    A retrieval model describes the transformation of a query into a set of documents. The question is: what drives this transformation? For semantic information retrieval type of models this transformation is driven by the content and structure of the semantic models. In this case, Knowledge Organization Systems (KOSs) are the semantic models that encode the meaning employed for monolingual and cross-language retrieval. The focus of this research is the relationship between these meanings’ representations and their role and potential in augmenting existing retrieval models effectiveness. The proposed approach is unique in explicitly interpreting a semantic reference as a pointer to a concept in the semantic model that activates all its linked neighboring concepts. It is in fact the formalization of the information retrieval model and the integration of knowledge resources from the Linguistic Linked Open Data cloud that is distinctive from other approaches. The preprocessing of the semantic model using Formal Concept Analysis enables the extraction of conceptual spaces (formal contexts)that are based on sub-graphs from the original structure of the semantic model. The types of conceptual spaces built in this case are limited by the KOSs structural relations relevant to retrieval: exact match, broader, narrower, and related. They capture the definitional and relational aspects of the concepts in the semantic model. Also, each formal context is assigned an operational role in the flow of processes of the retrieval system enabling a clear path towards the implementations of monolingual and cross-lingual systems. By following this model’s theoretical description in constructing a retrieval system, evaluation results have shown statistically significant results in both monolingual and bilingual settings when no methods for query expansion were used. The test suite was run on the Cross-Language Evaluation Forum Domain Specific 2004-2006 collection with additional extensions to match the specifics of this model

    A light-weight concept ontology for annotating digital music.

    Get PDF
    In the recent time, the digital music items on the internet have been evolving to an enormous information space where we try to find/locate the piece of information of our choice by means of search engine. The current trend of searching for music by means of music consumers' keywords/tags is unable to provide satisfactory search results; and search and retrieval of music may be potentially improved if music metadata is created from semantic information provided by association of end-users' tags with acoustic metadata which is easy to extract automatically from digital music items. Based on this observation, our research objective was to investigate how music producers may be able to annotate music against MPEG-7 description (with its acoustic metadata) to deliver meaningful search results. In addressing this question, we investigated the potential of multimedia ontologies to serve as backbone for annotating music items and prospective application scenarios of semantic technologies in the digital music industry. We achieved with our main contribution under this thesis is the first prototype of mpeg-7Music annotation ontology that establishes a mapping of end-users tags with MPEG-7 acoustic metadata as well as extends upper level multimedia ontologies with end-user tags. Additionally, we have developed a semi-automatic annotation tool to demonstrate the potential of the mpeg-7Music ontology to serve as light weight concept ontology for annotating digital music by music producers. The proposed ontology has been encoded in dominant semantic web ontology standard OWL1.0 and provides a standard interoperable representation of the generated semantic metadata. Our innovations in designing the semantic annotation tool were focussed on supporting the music annotation vocabulary (i.e. the mpeg-7Music) in an attempt to turn the music metadata information space to a knowledgebase

    Semantic multimedia modelling & interpretation for annotation

    Get PDF
    The emergence of multimedia enabled devices, particularly the incorporation of cameras in mobile phones, and the accelerated revolutions in the low cost storage devices, boosts the multimedia data production rate drastically. Witnessing such an iniquitousness of digital images and videos, the research community has been projecting the issue of its significant utilization and management. Stored in monumental multimedia corpora, digital data need to be retrieved and organized in an intelligent way, leaning on the rich semantics involved. The utilization of these image and video collections demands proficient image and video annotation and retrieval techniques. Recently, the multimedia research community is progressively veering its emphasis to the personalization of these media. The main impediment in the image and video analysis is the semantic gap, which is the discrepancy among a user’s high-level interpretation of an image and the video and the low level computational interpretation of it. Content-based image and video annotation systems are remarkably susceptible to the semantic gap due to their reliance on low-level visual features for delineating semantically rich image and video contents. However, the fact is that the visual similarity is not semantic similarity, so there is a demand to break through this dilemma through an alternative way. The semantic gap can be narrowed by counting high-level and user-generated information in the annotation. High-level descriptions of images and or videos are more proficient of capturing the semantic meaning of multimedia content, but it is not always applicable to collect this information. It is commonly agreed that the problem of high level semantic annotation of multimedia is still far from being answered. This dissertation puts forward approaches for intelligent multimedia semantic extraction for high level annotation. This dissertation intends to bridge the gap between the visual features and semantics. It proposes a framework for annotation enhancement and refinement for the object/concept annotated images and videos datasets. The entire theme is to first purify the datasets from noisy keyword and then expand the concepts lexically and commonsensical to fill the vocabulary and lexical gap to achieve high level semantics for the corpus. This dissertation also explored a novel approach for high level semantic (HLS) propagation through the images corpora. The HLS propagation takes the advantages of the semantic intensity (SI), which is the concept dominancy factor in the image and annotation based semantic similarity of the images. As we are aware of the fact that the image is the combination of various concepts and among the list of concepts some of them are more dominant then the other, while semantic similarity of the images are based on the SI and concept semantic similarity among the pair of images. Moreover, the HLS exploits the clustering techniques to group similar images, where a single effort of the human experts to assign high level semantic to a randomly selected image and propagate to other images through clustering. The investigation has been made on the LabelMe image and LabelMe video dataset. Experiments exhibit that the proposed approaches perform a noticeable improvement towards bridging the semantic gap and reveal that our proposed system outperforms the traditional systems

    Data-poor categorization and passage retrieval for Gene Ontology Annotation in Swiss-Prot

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>In the context of the BioCreative competition, where training data were very sparse, we investigated two complementary tasks: 1) given a Swiss-Prot triplet, containing a protein, a GO (Gene Ontology) term and a relevant article, extraction of a short passage that justifies the GO category assignement; 2) given a Swiss-Prot pair, containing a protein and a relevant article, automatic assignement of a set of categories.</p> <p>Methods</p> <p>Sentence is the basic retrieval unit. Our classifier computes a distance between each sentence and the GO category provided with the Swiss-Prot entry. The Text Categorizer computes a distance between each GO term and the text of the article. Evaluations are reported both based on annotator judgements as established by the competition and based on mean average precision measures computed using a curated sample of Swiss-Prot.</p> <p>Results</p> <p>Our system achieved the best recall and precision combination both for passage retrieval and text categorization as evaluated by official evaluators. However, text categorization results were far below those in other data-poor text categorization experiments The top proposed term is relevant in less that 20% of cases, while categorization with other biomedical controlled vocabulary, such as the Medical Subject Headings, we achieved more than 90% precision. We also observe that the scoring methods used in our experiments, based on the retrieval status value of our engines, exhibits effective confidence estimation capabilities.</p> <p>Conclusion</p> <p>From a comparative perspective, the combination of retrieval and natural language processing methods we designed, achieved very competitive performances. Largely data-independent, our systems were no less effective that data-intensive approaches. These results suggests that the overall strategy could benefit a large class of information extraction tasks, especially when training data are missing. However, from a user perspective, results were disappointing. Further investigations are needed to design applicable end-user text mining tools for biologists.</p

    Ontology-based knowledge representation and semantic search information retrieval: case study of the underutilized crops domain

    Get PDF
    The aim of using semantic technologies in domain knowledge modeling is to introduce the semantic meaning of concepts in knowledge bases, such that they are both human-readable as well as machine-understandable. Due to their powerful knowledge representation formalism and associated inference mechanisms, ontology-based approaches have been increasingly adopted to formally represent domain knowledge. The primary objective of this thesis work has been to use semantic technologies in advancing knowledge-sharing of Underutilized crops as a domain and investigate the integration of underlying ontologies developed in OWL (Web Ontology Language) with augmented SWRL (Semantic Web Rule Language) rules for added expressiveness. The work further investigated generating ontologies from existing data sources and proposed the reverse-engineering approach of generating domain specific conceptualization through competency questions posed from possible ontology users and domain experts. For utilization, a semantic search engine (the Onto-CropBase) has been developed to serve as a Web-based access point for the Underutilized crops ontology model. Relevant linked-data in Resource Description Framework Schema (RDFS) were added for comprehensiveness in generating federated queries. While the OWL/SWRL combination offers a highly expressive ontology language for modeling knowledge domains, the combination is found to be lacking supplementary descriptive constructs to model complex real-life scenarios, a necessary requirement for a successful Semantic Web application. To this end, the common logic programming formalisms for extending Description Logic (DL)-based ontologies were explored and the state of the art in SWRL expressiveness extensions determined with a view to extending the SWRL formalism. Subsequently, a novel fuzzy temporal extension to the Semantic Web Rule Language (FT-SWRL), which combines SWRL with fuzzy logic theories based on the valid-time temporal model, has been proposed to allow modeling imprecise temporal expressions in domain ontologies
    • …
    corecore