2,079 research outputs found

    CC-interop : COPAC/Clumps Continuing Technical Cooperation. Final Project Report

    Get PDF
    As far as is known, CC-interop was the first project of its kind anywhere in the world and still is. Its basic aim was to test the feasibility of cross-searching between physical and virtual union catalogues, using COPAC and the three functioning "clumps" or virtual union catalogues (CAIRNS, InforM25, and RIDING), all funded or part-funded by JISC in recent years. The key issues investigated were technical interoperability of catalogues, use of collection level descriptions to search union catalogues dynamically, quality of standards in cataloguing and indexing practices, and usability of union catalogues for real users. The conclusions of the project were expected to, and indeed do, contribute to the development of the JISC Information Environment and to the ongoing debate as to the feasibility and desirability of creating a national UK catalogue. They also inhabit the territory of collection level descriptions (CLDs) and the wider services of JISC's Information Environment Services Registry (IESR). The results of this project will also have applicability for the common information environment, particularly through the landscaping work done via SCONE/CAIRNS. This work is relevant not just to HE and not just to digital materials, but encompasses other sectors and domains and caters for print resources as well. Key findings are thematically grouped as follows: System performance when inter-linking COPAC and the Z39.50 clumps. The various individual Z39.50 configurations permit technical interoperability relatively easily but only limited semantic interoperability is possible. Disparate cataloguing and indexing practices are an impairment to semantic interoperability, not just for catalogues but also for CLDs and descriptions of services (like those constituting JISC's IESR). Creating dynamic landscaping through CLDs: routines can be written to allow collection description databases to be output in formats that other UK users of CLDs, including developers of the JISC information environment. Searching a distributed (virtual) catalogue or clump via Z39.50: use of Z39.50 to Z39.50 middleware permits a distributed catalogue to be searched via Z39.50 from such disparate user services as another virtual union catalogue or clump, a physical union catalogue like COPAC, an individual Z client and other IE services. The breakthrough in this Z39.50 to Z39.50 conundrum came with the discovery that the JISC-funded JAFER software (a result of the 5/99 programme) meets many of the requirements and can be used by the current clumps services. It is technically possible for the user to select all or a sub-set of available end destination Z39.50 servers (we call this "landscaping") within this middleware. Comparing results processing between COPAC and clumps. Most distributed services (clumps) do not bring back complete results sets from associated Z servers (in order to save time for users). COPAC on-the-fly routines could feasibly be applied to the clumps services. An automated search set up to repeat its query of 17 catalogues in a clump (InforM25) hourly over nearly 3 months returned surprisingly good results; for example, over 90% of responses were received in less than one second, and no servers showed slower response times in periods of traditionally heavy OPAC use (mid-morning to early evening). User behaviour when cross-searching catalogues: the importance to users of a number of on-screen features, including the ability to refine a search and clear indication that a search is processing. The importance to users of information about the availability of an item as well as the holdings data. The impact of search tools such as Google and Amazon on user behaviour and the expectations of more information than is normally available from a library catalogue. The distrust of some librarians interviewed of the data sources in virtual union catalogues, thinking that there was not true interoperability

    Analysis of equivalence mapping for terminology services

    Get PDF
    This paper assesses the range of equivalence or mapping types required to facilitate interoperability in the context of a distributed terminology server. A detailed set of mapping types were examined, with a view to determining their validity for characterizing relationships between mappings from selected terminologies (AAT, LCSH, MeSH, and UNESCO) to the Dewey Decimal Classification (DDC) scheme. It was hypothesized that the detailed set of 19 match types proposed by Chaplan in 1995 is unnecessary in this context and that they could be reduced to a less detailed conceptually-based set. Results from an extensive mapping exercise support the main hypothesis and a generic suite of match types are proposed, although doubt remains over the current adequacy of the developing Simple Knowledge Organization System (SKOS) Core Mapping Vocabulary Specification (MVS) for inter-terminology mapping

    Variation of word frequencies across genre classification tasks

    Get PDF
    This paper examines automated genre classification of text documents and its role in enabling the effective management of digital documents by digital libraries and other repositories. Genre classification, which narrows down the possible structure of a document, is a valuable step in realising the general automatic extraction of semantic metadata essential to the efficient management and use of digital objects. In the present report, we present an analysis of word frequencies in different genre classes in an effort to understand the distinction between independent classification tasks. In particular, we examine automated experiments on thirty-one genre classes to determine the relationship between the word frequency metrics and the degree of its significance in carrying out classification in varying environments

    RDF, the semantic web, Jordan, Jordan and Jordan

    Get PDF
    This collection is addressed to archivists and library professionals, and so has a slight focus on implications implications for them. This chapter is nonetheless intended to be a more-or-less generic introduction to the Semantic Web and RDF, which isn't specific to that domain

    Resource discovery in heterogeneous digital content environments

    Get PDF
    The concept of 'resource discovery' is central to our understanding of how users explore, navigate, locate and retrieve information resources. This submission for a PhD by Published Works examines a series of 11 related works which explore topics pertaining to resource discovery, each demonstrating heterogeneity in their digital discovery context. The assembled works are prefaced by nine chapters which seek to review and critically analyse the contribution of each work, as well as provide contextualization within the wider body of research literature. A series of conceptual sub-themes is used to organize and structure the works and the accompanying critical commentary. The thesis first begins by examining issues in distributed discovery contexts by studying collection level metadata (CLM), its application in 'information landscaping' techniques, and its relationship to the efficacy of federated item-level search tools. This research narrative continues but expands in the later works and commentary to consider the application of Knowledge Organization Systems (KOS), particularly within Semantic Web and machine interface contexts, with investigations of semantically aware terminology services in distributed discovery. The necessary modelling of data structures to support resource discovery - and its associated functionalities within digital libraries and repositories - is then considered within the novel context of technology-supported curriculum design repositories, where questions of human-computer interaction (HCI) are also examined. The final works studied as part of the thesis are those which investigate and evaluate the efficacy of open repositories in exposing knowledge commons to resource discovery via web search agents. Through the analysis of the collected works it is possible to identify a unifying theory of resource discovery, with the proposed concept of (meta)data alignment described and presented with a visual model. This analysis assists in the identification of a number of research topics worthy of further research; but it also highlights an incremental transition by the present author, from using research to inform the development of technologies designed to support or facilitate resource discovery, particularly at a 'meta' level, to the application of specific technologies to address resource discovery issues in a local context. Despite this variation the research narrative has remained focussed on topics surrounding resource discovery in heterogeneous digital content environments and is noted as having generated a coherent body of work. Separate chapters are used to consider the methodological approaches adopted in each work and the contribution made to research knowledge and professional practice.The concept of 'resource discovery' is central to our understanding of how users explore, navigate, locate and retrieve information resources. This submission for a PhD by Published Works examines a series of 11 related works which explore topics pertaining to resource discovery, each demonstrating heterogeneity in their digital discovery context. The assembled works are prefaced by nine chapters which seek to review and critically analyse the contribution of each work, as well as provide contextualization within the wider body of research literature. A series of conceptual sub-themes is used to organize and structure the works and the accompanying critical commentary. The thesis first begins by examining issues in distributed discovery contexts by studying collection level metadata (CLM), its application in 'information landscaping' techniques, and its relationship to the efficacy of federated item-level search tools. This research narrative continues but expands in the later works and commentary to consider the application of Knowledge Organization Systems (KOS), particularly within Semantic Web and machine interface contexts, with investigations of semantically aware terminology services in distributed discovery. The necessary modelling of data structures to support resource discovery - and its associated functionalities within digital libraries and repositories - is then considered within the novel context of technology-supported curriculum design repositories, where questions of human-computer interaction (HCI) are also examined. The final works studied as part of the thesis are those which investigate and evaluate the efficacy of open repositories in exposing knowledge commons to resource discovery via web search agents. Through the analysis of the collected works it is possible to identify a unifying theory of resource discovery, with the proposed concept of (meta)data alignment described and presented with a visual model. This analysis assists in the identification of a number of research topics worthy of further research; but it also highlights an incremental transition by the present author, from using research to inform the development of technologies designed to support or facilitate resource discovery, particularly at a 'meta' level, to the application of specific technologies to address resource discovery issues in a local context. Despite this variation the research narrative has remained focussed on topics surrounding resource discovery in heterogeneous digital content environments and is noted as having generated a coherent body of work. Separate chapters are used to consider the methodological approaches adopted in each work and the contribution made to research knowledge and professional practice

    Exploring manuscripts: sharing ancient wisdoms across the semantic web

    Get PDF
    Recent work in digital humanities has seen researchers in-creasingly producing online editions of texts and manuscripts, particularly in adoption of the TEI XML format for online publishing. The benefits of semantic web techniques are un-derexplored in such research, however, with a lack of sharing and communication of research information. The Sharing Ancient Wisdoms (SAWS) project applies linked data prac-tices to enhance and expand on what is possible with these digital text editions. Focussing on Greek and Arabic col-lections of ancient wise sayings, which are often related to each other, we use RDF to annotate and extract seman-tic information from the TEI documents as RDF triples. This allows researchers to explore the conceptual networks that arise from these interconnected sayings. The SAWS project advocates a semantic-web-based methodology, en-hancing rather than replacing current workflow processes, for digital humanities researchers to share their findings and collectively benefit from each other’s work
    • …
    corecore