23 research outputs found

    The case for metadata harvesting

    Metadata harvesting is an increasingly popular model of interaction between the mutually autonomous parties of medium to medium-large federations of digital library services. With a harvesting protocol, resource descriptions held locally by each party can be served to remote applications that implement federated services, such as resource discovery. This article offers a systematic explanation of the success of the model and its standard implementations in the context of current initiatives for national and international federations.
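
As a rough illustration of how a remote application might ask one party for its resource descriptions, the sketch below builds an OAI-PMH `ListRecords` request URL. The abstract does not name a specific protocol, so the choice of OAI-PMH is an assumption here, as are the helper name `build_list_records_url` and the base URL `https://repository.example.org/oai`; only the `verb`, `metadataPrefix`, `from`, and `set` request arguments come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper).

    from_date is an optional UTC datestamp (e.g. "2004-01-01") that restricts
    the harvest to records created or changed on or after that date.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# The base URL is a placeholder, not a real repository.
url = build_list_records_url("https://repository.example.org/oai", from_date="2004-01-01")
print(url)
```

A harvester would issue this request periodically, passing the date of its previous harvest as `from_date` so that only new or changed descriptions are transferred.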

    Geoscience after IT: Part L. Adjusting the emerging information system to new technology

    Coherent development depends on following widely used standards that respect our vast legacy of existing entries in the geoscience record. Middleware ensures that we see, from our desktops, a coherent view of diverse sources of information. Developments specific to managing the written word, map content, and structured data come together in shared metadata linking topics and information types.

    Metadata harvesting for content-based distributed information retrieval

    We propose an approach to content-based Distributed Information Retrieval based on the periodic and incremental centralisation of full-content indices of widely dispersed and autonomously managed document sources. Inspired by the success of the Open Archives Initiative’s protocol for metadata harvesting, the approach occupies middle ground between content crawling and distributed retrieval. As in crawling, some data moves towards the retrieval process, but it is statistics about the content rather than the content itself; this allows more efficient use of network resources and a wider scope of application. As in distributed retrieval, some processing is distributed along with the data, but it is indexing rather than retrieval; this reduces the costs of content provision whilst promoting the simplicity, effectiveness, and responsiveness of retrieval. Overall, we argue that the approach retains the good properties of centralised retrieval without giving up cost-effective, large-scale resource pooling. We discuss the requirements associated with the approach and identify two strategies to deploy it on top of the OAI infrastructure. In particular, we define a minimal extension of the OAI protocol which supports the coordinated harvesting of full-content indices and descriptive metadata for content resources. Finally, we report on the implementation of a proof-of-concept prototype service for multi-model content-based retrieval of distributed file collections.
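
The division of labour described in this abstract (sources index their own content, a central service periodically harvests only the index entries that changed) can be sketched in memory as follows. This is a toy model under stated assumptions, not the authors' protocol extension: the classes `Source` and `CentralIndex`, their methods, and the use of plain term frequencies as the harvested "statistics" are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical autonomous document source that indexes its own content."""
    name: str
    # doc_id -> (datestamp, term-frequency Counter)
    index: dict = field(default_factory=dict)

    def add_document(self, doc_id, datestamp, text):
        # Indexing happens at the source; the text itself never leaves it.
        self.index[doc_id] = (datestamp, Counter(text.lower().split()))

    def list_index_records(self, since=None):
        # Return only index entries stamped after the previous harvest.
        # Assumption: datestamps are ISO strings, so string order is date order.
        return [(doc_id, stamp, tf)
                for doc_id, (stamp, tf) in self.index.items()
                if since is None or stamp > since]


class CentralIndex:
    """Hypothetical central service that pools harvested term statistics."""

    def __init__(self):
        self.postings = {}      # term -> {(source_name, doc_id): frequency}
        self.last_harvest = {}  # source_name -> newest datestamp seen

    def harvest(self, source):
        since = self.last_harvest.get(source.name)
        records = source.list_index_records(since)
        for doc_id, stamp, tf in records:
            key = (source.name, doc_id)
            # Drop stale postings in case the document was re-indexed.
            for docs in self.postings.values():
                docs.pop(key, None)
            for term, count in tf.items():
                self.postings.setdefault(term, {})[key] = count
        if records:
            self.last_harvest[source.name] = max(stamp for _, stamp, _ in records)
        return len(records)

    def search(self, term):
        # Retrieval is purely central: rank documents by term frequency.
        docs = self.postings.get(term.lower(), {})
        return sorted(docs, key=lambda k: (-docs[k], k))


central = CentralIndex()
archive = Source("a")
archive.add_document("d1", "2004-01-01", "metadata harvesting harvesting")
first = central.harvest(archive)    # transfers the index entry for d1
archive.add_document("d2", "2004-02-01", "harvesting protocol")
second = central.harvest(archive)   # transfers only the entry for d2
hits = central.search("harvesting")
```

Comparing day-level datestamps with a strict `>` would miss a second change made on the same day as the last harvest; a real deployment would need finer-grained timestamps or an overlapping harvest window.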

    Geoscience after IT: Part H. Familiarization with managing the information base

    The geoscience record stores information for later reuse. The management of bibliographic, cartographic, and quantitative information has developed from different backgrounds, but all three involve deciding what to keep; structuring the record so that information can be found when needed; maintaining search tools, indexes, and abstracts; and defining the content by reference to metadata. The current approaches to managing the literature, spatial information, and quantitative data may be subsumed in a more comprehensive object-oriented model of the information base.