53 research outputs found

    Arc - An OAI Service Provider for Digital Library Federation

    The usefulness of the many on-line journals and scientific digital libraries that exist today is limited by the inability to federate these resources through a unified interface. The Open Archives Initiative (OAI) is one major effort to address technical interoperability among distributed archives. The objective of OAI is to develop a framework to facilitate the discovery of content in distributed archives. In this paper, we describe our experience and lessons learned in building Arc, the first federated searching service based on the OAI protocol. Arc harvests metadata from several OAI-compliant archives, normalizes them, and stores them in a search service based on a relational database (MySQL or Oracle). At present we have over 320,000 metadata records from 18 data providers from various subject domains. We have also implemented an OAI layer over Arc, thus making hierarchical harvesting possible. The experiences described within should be applicable to others who seek to build an OAI service provider.

    Object Persistence and Availability in Digital Libraries

    We have studied object persistence and availability of 1,000 digital library (DL) objects. Twenty World Wide Web accessible DLs were chosen and from each DL, 50 objects were chosen at random. A script checked the availability of each object three times a week for just over 1 year for a total of 161 data samples. During this time span, we found 31 objects (3% of the total) that appear to no longer be available: 24 from PubMed Central, 5 from IDEAS, 1 from CogPrints, and 1 from ETD

    Servicing the federation : the case for metadata harvesting

    The paper presents a comparative analysis of data harvesting and distributed computing as complementary models of service delivery within large-scale federated digital libraries. Informed by requirements of flexibility and scalability of federated services, the analysis focuses on the identification and assessment of model invariants. In particular, it abstracts over application domains, services, and protocol implementations. The analytical evidence produced shows that the harvesting model offers stronger guarantees of satisfying the identified requirements. In addition, it suggests a first characterisation of services based on their suitability to either model and thus indicates how they could be integrated in the context of a single federated digital library

    Eprints and the Open Archives Initiative

    The Open Archives Initiative (OAI) was created as a practical way to promote interoperability between eprint repositories. Although the scope of the OAI has been broadened, eprint repositories still represent a significant fraction of OAI data providers. In this article I present a brief survey of OAI eprint repositories, and of services using metadata harvested from eprint repositories using the OAI Protocol for Metadata Harvesting (OAI-PMH). I then discuss several situations where metadata harvesting may be used to further improve the utility of eprint archives as a component of the scholarly communication infrastructure.

    Federating Heterogeneous Digital Libraries by Metadata Harvesting

    This dissertation studies the challenges and issues faced in federating heterogeneous digital libraries (DLs) by metadata harvesting. The objective of federation is to provide high-level services (e.g. transparent search across all DLs) on the collective metadata from different digital libraries. There are two main approaches to federating DLs: the distributed searching approach and the harvesting approach. As the distributed searching approach relies on executing queries against digital libraries in real time, it has problems with scalability. The difficulty of creating a distributed searching service for a large federation is the motivation behind the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). OAI-PMH supports both data providers (repositories, archives) and service providers. Service providers develop value-added services based on the information collected from data providers. Data providers are simply collections of harvestable metadata. This dissertation examines the application of the metadata harvesting approach in DL federations. It addresses the following problems: (1) Whether or not metadata harvesting provides a realistic and scalable solution for DL federation. (2) What is the status of and problems with current data provider implementations, and how to solve these problems. (3) How to synchronize data providers and service providers. (4) How to build different types of federation services over harvested metadata. (5) How to create a scalable and reliable infrastructure to support federation services. The work done in this dissertation is based on OAI-PMH, and the results have influenced the evolution of OAI-PMH. However, the results are not limited to the scope of OAI-PMH. Our approach is to design and build key services for metadata harvesting and to deploy them on the Web. Implementing a publicly available service allows us to demonstrate how these approaches are practical. The problems posed above are evaluated by performing experiments over these services. To summarize the results of this thesis, we conclude that the metadata harvesting approach is a realistic and scalable approach to federating heterogeneous DLs. We present two models of building federation services: a centralized model and a replicated model. Our experiments also demonstrate that the repository synchronization problem can be addressed by push, pull, and hybrid push/pull models; each model has its strengths and weaknesses and fits a specific scenario. Finally, we present a scalable and reliable infrastructure to support the applications of metadata harvesting.

    Lessons Learned with Arc, an OAI-PMH Service Provider

    Web-based digital libraries have historically been built in isolation utilizing different technologies, protocols, and metadata. These differences hindered the development of digital library services that enable users to discover information from multiple libraries through a single unified interface. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a major, international effort to address technical interoperability among distributed repositories. Arc debuted in 2000 as the first end-user OAI-PMH service provider. Since that time, Arc has grown to include nearly 7,000,000 metadata records. Arc has been deployed in a number of environments and has served as the basis for many other OAI-PMH projects, including Archon, Kepler, NCSTRL, and DP9. In this article we review the history of OAI-PMH and Arc, as well as some of the lessons learned while developing Arc and related OAI-PMH services. Reprinted by permission of the publisher

    Metadata Architecture for Digital Libraries: Conceptual framework for Indian Digital Libraries

    This paper describes an approach to developing a metadata solution for digital library architecture, covering resource description and retrieval. It deals with the concept of metadata [2], the different metadata standards (Dublin Core in particular [5]), the digital library environment, computer network capabilities, etc. The paper also discusses two digital library architecture protocols for resource description and retrieval, STARTS (Stanford Protocol Proposal for Internet Retrieval and Search) [8] and SODA (Smart Objects, Dumb Archives) [13], in order to arrive at a possible protocol that would help to build Indian digital libraries [5]. In proposing the new architecture, the paper considers the existing Indian environment with respect to information sources and users' queries of those sources [5.1], and which of them are feasible for launching this protocol for information processing and retrieval. This is a pilot study that the author carried out during his Fulbright fellowship at the College of Library Information Studies, University of Maryland, College Park, MD, in 1999-2000.

    The aDORe federation architecture: digital repositories at scale
