
    Document Clustering Method Based on Frequent Co-occurring Words

    PACLIC 20 / Wuhan, China / 1-3 November, 200

    Automatic semantic annotation of Web documents

    Ontologies are the most important construct of the Semantic Web. From the first attempts at using simplified RDF syntax to the advanced features of the OWL languages, ontologies have emerged as the most viable technology for integrating various Web resources into a more intelligent Web. The work presented in this thesis is a contribution to the new generation of the Web, which should be readable and interpretable not only by humans but also by machines, such as software agents. For ontologies to fulfil their role of "animating" the traditional Web into this next-generation Web, it is essential to find an efficient way to map all existing Web resources onto their corresponding ontology classes. In this thesis, we propose an approach for automatic semantic annotation of Web documents, which is an effective way to make the Semantic Web a reality. Such an integrated Web would greatly improve the accuracy of search engines, bring a new generation of intelligent Web services, push the limits of multi-agent technologies and improve many other areas of human activity that we cannot even imagine today. Considering the size and growth rate of the Web, it is clear that this task cannot be achieved manually. Semi-automatic and automatic annotation of Web documents using statistical text classification methods seems to be the most promising solution. This work focuses on an approach based on Naive Bayes text classification adapted to some characteristics that are particular to Web documents. A complete software solution is developed to test the feasibility of such an approach. Furthermore, different variations of the text classification algorithms are tested and analysed in order to identify the best approach to semantically annotating Web documents. Notably, the use of the Web document hierarchy is explored as an option to improve the accuracy of semi-automatic and automatic annotation of Web documents. The results of each tested method are presented and discussed. Finally, some aspects that could be improved or approached differently are identified for future work.

    A Novel Approach to Ontology Management

    The term ontology is defined as the explicit specification of a conceptualization. While much of the prior research has focused on technical aspects of ontology management, little attention has been paid to the issues that limit the widespread use of ontologies or to evaluating how effectively ontologies improve task performance. This dissertation addresses this gap by developing approaches to ontology creation, refinement, and evaluation. It follows a multi-paper model comprising three studies. The first study develops and evaluates a method for ontology creation using knowledge available on the Web. The second study develops a methodology for ontology refinement through pruning and empirically evaluates its effectiveness. The third study investigates the impact of an ontology on use case modeling, a complex, knowledge-intensive organizational task in the context of IS development. The three studies follow the design science research approach, and each builds and evaluates IT artifacts. These studies contribute to knowledge by developing solutions to three important issues in the effective development and use of ontologies.