33,693 research outputs found

    Comparison of ontology alignment systems across single matching task via the McNemar's test

    Ontology alignment is widely used to find the correspondences between different ontologies in diverse fields. After discovering the alignments, several performance scores are available to evaluate them. The scores typically require the identified alignment and a reference containing the underlying actual correspondences of the given ontologies. The current trend in alignment evaluation is to put forward a new score (e.g., precision, weighted precision, etc.) and to compare various alignments by juxtaposing the obtained scores. However, selecting one measure among others for comparison is highly contentious. On top of that, a claim that one system performs better than another cannot be substantiated solely by comparing two scalars. In this paper, we propose statistical procedures which enable us to theoretically favor one system over another. McNemar's test is the statistical means by which two ontology alignment systems are compared over one matching task. The test applies to a 2x2 contingency table which can be constructed in two different ways based on the alignments, each of which has its own merits and pitfalls. The ways of constructing the contingency table and various apposite statistics from McNemar's test are elaborated in detail. When more than two alignment systems are compared, the family-wise error rate becomes a concern; thus, ways of controlling it are also discussed. A directed graph visualizes the outcome of McNemar's test in the presence of multiple alignment systems. From this graph, it is readily understood whether one system is better than another or whether their differences are imperceptible. The proposed statistical methodologies are applied to the systems that participated in the OAEI 2016 anatomy track, and are also used to compare several well-known similarity metrics for the same matching problem.
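As a rough illustration of the kind of comparison this abstract describes, the sketch below runs McNemar's test on the two discordant cells of a 2x2 contingency table. It is not the authors' procedure: the abstract does not say how the table is built, so the meaning given to `b` and `c` is an assumption, as are the function names `mcnemar` and `bonferroni_alpha`. Bonferroni is used only as one common way to control the family-wise error rate; the paper may use a different correction.

```python
import math


def mcnemar(b: int, c: int) -> tuple[float, float]:
    """McNemar's test on the discordant cells of a 2x2 table.

    Assumed meaning of the cells (not taken from the paper):
      b = correspondences system A got right and system B got wrong
      c = correspondences system B got right and system A got wrong

    Returns (chi-square statistic with continuity correction,
             exact two-sided binomial p-value).
    """
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return 0.0, 1.0
    # Continuity-corrected chi-square statistic (1 degree of freedom).
    statistic = (abs(b - c) - 1) ** 2 / n
    # Exact test: under H0 each discordant pair is a fair coin flip,
    # so min(b, c) follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * tail)
    return statistic, p_value


def bonferroni_alpha(alpha: float, num_systems: int) -> float:
    """Per-comparison significance level when every pair of systems is tested."""
    comparisons = num_systems * (num_systems - 1) // 2
    return alpha / comparisons


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stat, p = mcnemar(b=10, c=0)
    threshold = bonferroni_alpha(0.05, num_systems=3)
    print(f"statistic={stat:.3f} p={p:.6f} reject H0: {p < threshold}")
```

With several systems, each pairwise p-value would be compared against the corrected threshold rather than against the nominal level.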

    Scientific Knowledge Object Patterns

    Web technology is revolutionizing the way diverse scientific knowledge is produced and disseminated. In the past few years, a handful of discourse representation models have been proposed for the externalization of the rhetoric and argumentation captured within scientific publications. However, there is not yet a unified interoperable pattern that is commonly used in practice by publishers and individual users. In this paper, we introduce the Scientific Knowledge Object Patterns (SKO Patterns) as a step towards a general scientific discourse representation model, especially for managing knowledge in the emerging social web and semantic web. © ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version is going to be published in "Proceedings of 15th European Conference on Pattern Languages of Programs", (2011) http://portal.acm.org/event.cfm?id=RE197&CFID=8795862&CFTOKEN=1476113

    DC Proposal: Evaluating trustworthiness of web content using semantic web technologies

    Trust plays an important part in people's decision processes for using information. This is especially true on the Web, which has less quality control for publishing information. Untrustworthy data may lead users to make wrong decisions or to misunderstand concepts. Therefore, it is important for users to have a mechanism for assessing the trustworthiness of the information they consume. Prior research focuses on policy-based and reputation-based trust and does not take the information itself into account. In this PhD research, we focus on evaluating the trustworthiness of Web content based on available and inferred metadata that can be obtained using Semantic Web technologies. This paper discusses the vision of our PhD work and presents an approach to solve that problem.

    Organisational challenges of the semantic web in digital libraries: A Norwegian case study

    This is the post-print version of the Article. The official published version can be accessed from the link below - Copyright @ 2009 Emerald Group Publishing Limited. Purpose – The purpose of this paper is to examine from a socio-technical point of view the impact of semantic web technology on the strategic, organisational and technological levels. The semantic web initiative holds great promise for the future for digital libraries. There is, however, a considerable gap in semantic web research between the contributions in the technological field and research in the organisational field. Design/methodology/approach – A comprehensive case study of the National Library of Norway (NL) is conducted, building on two major sources of information: the documentation of the digitising project of the NL; and interviews with nine different stakeholders at three levels of NL's organisation during June to August 2007. Top managers are interviewed on strategy, middle managers and librarians are interviewed regarding organisational issues and ICT professionals are interviewed on technology issues. Findings – The findings indicate that the highest impact will be at the organisational level. This is mainly because inter-organisational and cross-organisational structures have to be established to address the problems of ontology engineering, and a development framework for ontology engineering in digital libraries must be examined. Originality/value – ICT professionals and library practitioners should be more mindful of organisational issues when planning and executing semantic web projects in digital libraries. In particular, practitioners should be aware that the ontology engineering process and the semantic meta-data production will affect the entire organisation. For public digital libraries this probably will also call for a more open policy towards user groups to properly manage the process of ontology engineering.

    Terminology server for improved resource discovery: analysis of model and functions

    This paper considers the potential to improve distributed information retrieval via a terminologies server. The restriction upon effective resource discovery caused by the use of disparate terminologies across services and collections is outlined, before considering a DDC spine-based approach involving inter-scheme mapping as a possible solution. The developing HILT model is discussed alongside other existing models and alternative approaches to solving the terminologies problem. Results from the current HILT pilot are presented to illustrate functionality, and suggestions are made for further research and development.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Towards an automated query modification assistant

    Users who need several queries before finding what they need can benefit from an automatic search assistant that provides feedback on their query modification strategies. We present a method to learn from a search log which types of query modifications have and have not been effective in the past. The method analyses query modifications along two dimensions: a traditional term-based dimension and a semantic dimension, for which queries are enriched with linked data entities. Applying the method to the search logs of two search engines, we identify six opportunities for a query modification assistant to improve search: modification strategies that are commonly used, but that often do not lead to satisfactory results. Comment: 1st International Workshop on Usage Analysis and the Web of Data (USEWOD2011) in the 20th International World Wide Web Conference (WWW2011), Hyderabad, India, March 28th, 201

    Past, present and future of information and knowledge sharing in the construction industry: Towards semantic service-based e-construction

    The paper reviews product data technology initiatives in the construction sector and provides a synthesis of related ICT industry needs. A comparison between (a) the data-centric characteristics of Product Data Technology (PDT) and (b) ontology with a focus on semantics is given, highlighting the pros and cons of each approach. The paper advocates the migration from data-centric application integration to ontology-based business process support, and proposes inter-enterprise collaboration architectures and frameworks based on semantic services, underpinned by ontology-based knowledge structures. The paper discusses the main reasons behind the low industry take-up of product data technology, and proposes a preliminary roadmap for the wide industry diffusion of the proposed approach. In this respect, the paper stresses the value of adopting alliance-based modes of operation.
