Dept of Computer Engineering, Chalmers University of Technology
Abstract
Many independent publishers today offer digital libraries with full-text archives. In an attempt to provide a single user interface to a large set of archives, DTV's Article Database Service offers a consolidated interface to a geographically distributed set of archives. While this approach offers a tremendous functional advantage to the user, network delays and queuing delays in servers make the user-perceived interactive performance poor. In this paper, we study the prospects of caching articles at the client level as well as at intermediate points, namely the gateways that implement the interfaces to the many full-text archives. A central research question is what kind of locality exists in user accesses to such a digital library. Using access logs to drive simulations, we find that client-side caching can result in a 20% hit rate. At the gateway level, however, where multiple users may access the same article, temporal locality is poor and caching is of limited value. We have also studied whether spatial locality can be exploited by loading into the cache all articles in an issue, volume, or journal when a single article is accessed, but found that spatial locality is quite poor.