
    An evaluation of resource description quality measures

    An open problem for Distributed Information Retrieval is how to represent large document repositories (known as resources) efficiently. To facilitate resource selection, estimated descriptions of each resource are required, especially when faced with non-cooperative distributed environments. Accurate and efficient resource description estimation is required, as this can have an effect on resource selection and, as a consequence, on retrieval quality. Query-Based Sampling (QBS) has been proposed as a novel solution for resource estimation, with subsequent techniques developed thereafter. However, determining whether one QBS technique generates better resource descriptions than another is still an unresolved issue. The initial metrics tested and deployed for measuring resource description quality were the Collection Term Frequency ratio (CTF) and the Spearman Rank Correlation Coefficient (SRCC). The former provides an indication of the percentage of terms seen, whilst the latter measures the term ranking order, although neither considers term frequency, which is important for resource selection. We re-examine this problem and consider measuring the quality of a resource description in the context of resource selection, where an estimate of the probability of a term given the resource is typically required. We believe a natural measure for comparing the estimated resource against the actual resource is the Kullback-Leibler divergence (KL). KL addresses the concerns put forward previously by not over-representing low-frequency terms, while also considering term order. In this paper, we re-assess the two previous measures alongside KL. Our preliminary investigation revealed that the former metrics display contradictory results, whilst KL suggested that a QBS technique different from that previously prescribed would provide better estimates. This is a significant result, because it now remains unclear which technique will consistently provide better resource descriptions. The remainder of this paper details the three measures and the experimental analysis of our preliminary study, and outlines our points of concern along with further research directions
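
    As a rough illustration of the measure discussed above (a minimal sketch only; the function name, the simple add-one smoothing and the toy term counts are assumptions for demonstration, not the paper's implementation), the KL divergence between an actual resource and its estimated description can be computed from term counts as follows:

        import math
        from collections import Counter

        def kl_divergence(actual_counts, estimated_counts):
            """KL(actual || estimated) over the actual resource's vocabulary.

            The estimated model is smoothed (simple add-one, an assumption made
            here) so that terms unseen during sampling do not yield an infinite
            divergence.
            """
            vocab = set(actual_counts)
            actual_total = sum(actual_counts.values())
            est_total = sum(estimated_counts.values())
            kl = 0.0
            for term in vocab:
                p = actual_counts[term] / actual_total
                q = (estimated_counts.get(term, 0) + 1) / (est_total + len(vocab))
                kl += p * math.log(p / q)
            return kl

        # Toy example: actual resource vs. a description built from sampled documents.
        actual = Counter({"retrieval": 40, "resource": 30, "query": 20, "sampling": 10})
        estimate = Counter({"retrieval": 5, "resource": 2, "query": 3})
        print(kl_divergence(actual, estimate))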

    Transformation From Semantic Data Model to Rdf

    There have been several efforts to use the relational model and relational databases to store and manipulate the Resource Description Framework (RDF). They share one general disadvantage: one is forced to map the semantic model of RDF onto the relational model, which results in constraints and additional processing, such as validating each assertion against the RDF schema, which is itself also stored as a triples table. In this paper, we introduce the Semantic Data Model as a proposed data modelling language to store and manipulate the Resource Description Framework. This study also attempts to prescribe a procedure for transforming a semantic data model into an RDF data model. Keywords: Semantic Data Model, Resource Description Framework
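
    For context, the triple-based representation referred to above can be sketched with the rdflib Python library; the example namespace and assertions below are illustrative assumptions and do not reproduce the paper's transformation procedure:

        from rdflib import Graph, Literal, Namespace

        EX = Namespace("http://example.org/")  # hypothetical namespace for illustration

        g = Graph()
        # Each fact about a resource is asserted as a (subject, predicate, object)
        # triple, which is what a relational "triples table" would store row by row.
        g.add((EX.article42, EX.type, EX.Article))
        g.add((EX.article42, EX.title, Literal("Transformation From Semantic Data Model to RDF")))
        g.add((EX.article42, EX.author, EX.someAuthor))

        # Serialising shows the same assertions in an RDF syntax.
        print(g.serialize(format="turtle"))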

    Towards better measures: evaluation of estimated resource description quality for distributed IR

    An open problem for Distributed Information Retrieval (DIR) systems is how to represent large document repositories, also known as resources, both accurately and efficiently. Obtaining resource description estimates is an important phase in DIR, especially in non-cooperative environments. Measuring the quality of an estimated resource description is a contentious issue, as current measures do not provide an adequate indication of quality. In this paper, we provide an overview of these currently applied measures of resource description quality, before proposing the Kullback-Leibler (KL) divergence as an alternative. Through experimentation we illustrate the shortcomings of these past measures, whilst providing evidence that KL is a more appropriate measure of quality. When applying KL to compare different Query-Based Sampling (QBS) algorithms, our experiments provide strong evidence in favour of a previously unsupported hypothesis originally posited in the initial QBS work
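
    A hedged sketch of the Query-Based Sampling loop whose output such measures evaluate is shown below; the search_resource callable, the query-selection policy and the stopping criteria are placeholder assumptions rather than any particular QBS variant from the paper:

        import random
        from collections import Counter

        def query_based_sampling(search_resource, seed_terms,
                                 max_docs=300, docs_per_query=4, max_queries=500):
            """Build an estimated resource description by repeatedly querying a resource.

            search_resource(term, k) is assumed to return up to k documents (each a
            list of terms) from the uncooperative resource; the limits used here are
            simplifying assumptions.
            """
            description = Counter()
            seen_docs = 0
            candidate_terms = list(seed_terms)
            for _ in range(max_queries):
                if seen_docs >= max_docs or not candidate_terms:
                    break
                query = random.choice(candidate_terms)
                for doc_terms in search_resource(query, docs_per_query):
                    description.update(doc_terms)
                    seen_docs += 1
                # After the first round, queries are drawn from terms already seen.
                candidate_terms = list(description) or candidate_terms
            return description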

    Resource Description Framework


    Semantic Web and Resource Description

    Knowledge representation could be a powerful tool for search in digital collections. The Semantic Web is a commonly used knowledge representation technique in building expert systems, and it can be utilized equally well for information retrieval in digital collections as well as on the Internet. The paper discusses the tools and techniques for knowledge representation using the Semantic Web, as well as its impact on the precision of search results in digital collections

    D-RDF: Dynamic Resource Description Framework

    The Semantic Web is described as a Web of Data, as opposed to the World Wide Web, which is a Web of Documents. As research in the field of the Semantic Web gains momentum, the focus is shifting to the effective representation of the data that constitutes the Semantic Web. RDF, the Resource Description Framework, is the W3C-standardized language for describing the semantics of data and hence sharing its meaning across applications. In RDF, all entities are modeled as resources, and facts about these resources are asserted in terms of properties and their values.

    In this thesis, we propose a computation model for RDF called Dynamic RDF, or D-RDF. While RDF models are restricted to describing resources in terms of their assertive, or static, properties, D-RDF is a generalization of RDF in which the assertive properties of a resource coexist with dynamic properties that operate on the values of existing properties and infer new data. In such a model, the information carries its semantics with it in the form of computing methods: whereas RDF represents semantically enhanced data, D-RDF represents both data and the programs that operate on that data. This design ensures that when the D-RDF model is processed, the dynamic properties operate on the current values of the base data, and hence the values of the dynamic properties will always be consistent with changes that occur on that data. We therefore develop a model of context-sensitive semantics and implement an interpretation engine for this language. The power of D-RDF is demonstrated by implementing use cases of varying levels of complexity that highlight how highly customized data models can be constructed with D-RDF to represent information in a form that does not already exist
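
    The contrast between assertive and dynamic properties can be pictured with a small sketch; the class and example below are purely illustrative assumptions about the idea of properties computed on demand from current base data, and do not reflect the thesis's actual D-RDF language or interpretation engine:

        class Resource:
            """A resource with static (asserted) and dynamic (computed) properties."""

            def __init__(self, **static_properties):
                self.static = dict(static_properties)
                self.dynamic = {}  # name -> function of the static properties

            def define_dynamic(self, name, func):
                self.dynamic[name] = func

            def get(self, name):
                if name in self.static:
                    return self.static[name]
                # Dynamic properties are evaluated against the *current* static
                # values, so their results stay consistent when the base data changes.
                return self.dynamic[name](self.static)

        # Illustrative use: values asserted statically, a total computed dynamically.
        item = Resource(price=10.0, quantity=3)
        item.define_dynamic("total", lambda props: props["price"] * props["quantity"])
        print(item.get("total"))        # 30.0
        item.static["quantity"] = 5
        print(item.get("total"))        # 50.0 -- recomputed from current base data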

    Metadata elements for digital news resource description

    This paper examines and proposes a set of metadata elements for describing digital news articles, for the benefit of distributed and heterogeneous news resource discovery. Existing digital news description standards such as NITF and NewsML are analysed and compared with the Dublin Core Metadata Element Set (DCMES), leading to the conclusion that the use of Dublin Core is to be encouraged for interoperability of the resources. The suggested metadata elements are carefully selected and defined considering the characteristics of news articles. Some elements are detailed with refinement qualifiers and recommended encoding schemes. This set of metadata has been developed as part of the tasks in the IST (Information Society Technologies)-funded European project OmniPaper (Smart Access to European Newspapers, IST-2001-32174)
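
    A minimal sketch of a Dublin Core style description of a news article is shown below; the identifier, the element values and the use of rdflib's DC namespace are illustrative assumptions and do not reproduce the qualified element set proposed in the paper:

        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import DC

        g = Graph()
        article = URIRef("http://example.org/news/item-17")  # hypothetical identifier

        # Core Dublin Core elements commonly used for resource discovery.
        g.add((article, DC.title, Literal("Example headline")))
        g.add((article, DC.creator, Literal("Staff reporter")))
        g.add((article, DC.date, Literal("2002-06-01")))
        g.add((article, DC.subject, Literal("politics")))
        g.add((article, DC.language, Literal("en")))

        print(g.serialize(format="xml"))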