5 research outputs found

    Parallelising Harvesting

    Metadata harvesting has become a common technique for transferring a stream of data from one metadata repository or digital library system to another. As collections of metadata, and their associated digital objects, grow in size, the ingest of these items at the destination archive can take a significant amount of time, depending on the type of indexing or post-processing required. This paper discusses an approach to parallelising the post-processing of data in a small cluster of machines or a multi-processor environment, without increasing the burden on the source data provider. Performance tests carried out on varying architectures indicate that the technique is promising for some scenarios and can be extended to more computationally intensive ingest procedures. In general, the technique presents a new approach to constructing harvest-based distributed or component-based digital libraries, with better scalability than before.
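    The following is a minimal sketch, in Python, of the pattern this abstract describes: a single sequential harvest stream (so the data provider still serves only one client) feeding a pool of local worker processes that perform the post-processing. The harvest source and the indexing step are illustrative stand-ins, not the paper's implementation.

```python
# Hypothetical illustration: sequential harvest, parallel post-processing.
from multiprocessing import Pool

def harvest_records():
    """Stand-in for a sequential (e.g. OAI-PMH style) harvest from the source repository."""
    for i in range(1000):
        yield {"id": f"oai:example.org:{i}", "title": f"Record {i}"}

def post_process(record):
    """Stand-in for the expensive ingest work: parsing, transformation, indexing."""
    tokens = record["title"].lower().split()
    return record["id"], tokens

if __name__ == "__main__":
    # The pool parallelises ingest locally; the same pattern extends to a small
    # cluster by replacing Pool with a distributed task queue.
    with Pool(processes=4) as pool:
        for record_id, tokens in pool.imap_unordered(post_process, harvest_records()):
            pass  # e.g. write the index entry to the destination archive
```

    Because only the harvesting loop talks to the source repository, adding workers increases ingest throughput at the destination without changing the load seen by the data provider.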

    Utility-based high performance digital library systems

    Many practical digital library systems have had to deal with the scalability of data collections and/or service provision. Early attempts at enabling this scalability focused on data and services closely coupled with, or tightly integrated into, various high performance computing platforms, which inevitably resulted in compromises and very specific solutions. This paper presents an analysis of current high performance systems and argues that utility computing can subsume existing models and better meet the needs of generic scalable digital library systems.

    Lightweight component-based scalability

    Digital libraries and information management systems are increasingly being developed according to component models with well-defined APIs and often with Web-accessible interfaces. In parallel with metadata access and harvesting, Web 2.0 mashups have demonstrated the flexibility of developing systems as independent distributed components. It can be argued that such distributed components can also enable scalable service provision in medium to large systems. To test this premise, this article discusses how an existing component framework was modified to include support for scalability. A set of lightweight services and extensions was created to migrate and replicate services as the load changes. Experiments with the prototype system confirm that it can be an effective enabler of transparent and efficient scalability, without the need to resort to complex middleware or substantial system re-engineering. Finally, specific problem areas have been identified as future avenues for exploration at the crucial intersection of digital libraries and high-performance computing.
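    As a rough illustration of the migrate-and-replicate idea, the sketch below shows a controller that watches the request backlog for one component and starts or retires replicas as the load changes. The queue, thresholds, and worker are assumptions made for the example and are not taken from the framework described in the article.

```python
# Hypothetical illustration: load-driven replication of a lightweight component.
import queue
import threading
import time

requests = queue.Queue()  # incoming work for one logical service component

def worker(stop):
    """One replica of the service; drains requests until told to stop."""
    while not stop.is_set():
        try:
            requests.get(timeout=0.5)
        except queue.Empty:
            continue
        time.sleep(0.01)   # stand-in for actually servicing the request
        requests.task_done()

def spawn_replica():
    stop = threading.Event()
    thread = threading.Thread(target=worker, args=(stop,), daemon=True)
    thread.start()
    return thread, stop

def controller(max_replicas=8, high=50, low=5):
    """Replicate when the backlog is high; retire surplus replicas when it drops."""
    replicas = [spawn_replica()]              # always keep at least one replica
    while True:
        backlog = requests.qsize()
        if backlog > high and len(replicas) < max_replicas:
            replicas.append(spawn_replica())
        elif backlog < low and len(replicas) > 1:
            _, stop = replicas.pop()
            stop.set()                        # let the surplus replica wind down
        time.sleep(1)
```

    In the prototype the abstract refers to, replicas would be separate service instances, possibly on other machines; the threads here only illustrate the control loop that drives replication and retirement.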