6 research outputs found

    Migration on request, a practical technique for preservation

    Maintaining a digital object in a usable state over time is a crucial aspect of digital preservation. Existing preservation methods have many drawbacks. This paper describes advanced data migration techniques that can support preservation more accurately and cost-effectively. To ensure that preserved works can be rendered on current computer systems over time, “traditional migration” has been used to convert data into current formats. When the new format becomes obsolete, another conversion is performed, and so on. Traditional migration has many inherent problems, because errors introduced during one transformation propagate through all future transformations. CAMiLEON’s software longevity principles can be applied to a migration strategy, offering improvements over traditional migration. This new approach is named “Migration on Request.” Migration on Request shifts the burden of preservation onto a single tool, which is maintained over time. Always returning to the original format enables potential errors to be significantly reduced.

    Experimental evaluation of a generative probabilistic image retrieval model on 'easy' data

    We present evaluation results of a generative probabilistic image retrieval model using 'easy data'. Previous research into our model's retrieval effectiveness has used the test collection developed at TREC's Video Track, but as discussed in detail in [WeVr:SIGIR:03], its search task has been too difficult to measure the actual performance of the retrieval model. The 'easy data' experiments presented here evaluate our model under varying model parameters on the Corel set. The Corel data set is relatively easy because images are nicely grouped into coherent themes: the within-theme similarity is high and the across-theme similarity is relatively low. These properties make Corel a nice vehicle for testing, presenting or selling new content-based retrieval techniques and models. In contrast to the TREC data, the Corel collection gives statistically significant differences between varying experimental conditions, so we gain more insight into the model's behaviour. We then discuss at length the limitations of the results obtained using this data set, comparing the experiments performed here to those on the TREC data.

    Migration on Request, a Practical Technique for Preservation

    Personalized classification for keyword-based category profiles

    Personalized classification refers to allowing users to define their own categories and automating the assignment of documents to these categories. In this paper, we examine the use of keywords to define personalized categories and propose the use of Support Vector Machines (SVMs) to perform personalized classification. Two scenarios have been investigated. The first assumes that the personalized categories are defined in a flat category space. The second assumes that each personalized category is defined within a pre-defined general category that provides a more specific context for the personalized category. The training documents for personalized categories are obtained from a training document pool using a search engine and a set of keywords. Our experiments delivered better classification results in the second scenario. We also conclude that the number of keywords used can be very small, and that increasing it does not always lead to better classification performance.

    Metadata categorization for identifying search patterns in a digital library

    Purpose: For digital libraries, it is useful to understand how users search in a collection. Investigating search patterns can help them to improve the user interface, collection management and search algorithms. However, search patterns may vary widely in different parts of a collection. The purpose of this paper is to demonstrate how to identify these search patterns within a well-curated historical newspaper collection using the existing metadata. Design/methodology/approach: The authors analyzed search logs combined with metadata

    Standardisierung von der Heterogenität her denken - zum Entwicklungsstand bilateraler Transferkomponenten für digitale Fachbibliotheken

    "Solutions to the problems of building subject-specific scholarly information services on the Web go far beyond the ways of thinking that information centres and librarians have been accustomed to. The guideline under discussion, 'thinking about standardisation from the standpoint of heterogeneity', characterises this change most clearly. The change is not only technological but also conceptual and content-related. The state of development presented below reflects the results of the last six years and summarises the implemented components of such a model view." (author's abstract)