
    Diversity in image retrieval: DCU at ImageCLEFPhoto 2008

    DCU participated in the ImageCLEF 2008 photo retrieval task, submitting runs for both the English and Random language annotation conditions. We used text-based and image-based retrieval to produce baseline runs, then clustered the highest-ranked images from these runs using K-Means clustering of their text annotations. Finally, each cluster was represented by its most relevant image, and these images were ranked for the final submission. For the Random annotation language runs, we used TextCat to identify German annotation documents, which were then translated into English using the Systran Version 3.0 machine translator. We also compared results from these translated runs with untranslated runs. Our results showed that, as expected, runs that combine image and text outperform text alone and image alone. Our baseline image+text runs (i.e. without clustering) gave our best MAP score, and these runs also outperformed the mean and median ImageCLEFPhoto submissions for CR@20. Clustering approaches consistently gave a large improvement in CR@20 over the baseline, unclustered results. Pseudo-relevance feedback consistently improved MAP while consistently decreasing CR@20. We also found that the performance of untranslated Random runs was quite close to that of translated Random runs for CR@20, indicating that we could achieve similar diversity in our results without translating the documents.
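The re-ranking step and the CR@20 measure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that cluster labels have already been produced (for example by K-Means over the text annotations), that "most relevant" means the highest baseline retrieval score, and that cluster recall at K is the fraction of ground-truth subtopics represented in the top K results. The function names `diversify` and `cluster_recall_at_k` and all of the toy data are hypothetical.

```python
from typing import Dict, List, Tuple


def diversify(ranked: List[Tuple[str, float]], cluster_of: Dict[str, int]) -> List[str]:
    """Keep the highest-scoring image of each cluster, ordered by score."""
    best: Dict[int, Tuple[str, float]] = {}
    for image_id, score in ranked:
        cluster = cluster_of[image_id]
        # Remember only the best-scoring image seen so far for this cluster.
        if cluster not in best or score > best[cluster][1]:
            best[cluster] = (image_id, score)
    representatives = sorted(best.values(), key=lambda pair: pair[1], reverse=True)
    return [image_id for image_id, _ in representatives]


def cluster_recall_at_k(result: List[str], subtopic_of: Dict[str, int], k: int = 20) -> float:
    """Fraction of ground-truth subtopics covered by the top-k results."""
    all_subtopics = set(subtopic_of.values())
    if not all_subtopics:
        return 0.0
    # Images missing from subtopic_of are treated as non-relevant.
    found = {subtopic_of[i] for i in result[:k] if i in subtopic_of}
    return len(found) / len(all_subtopics)


# Hypothetical toy data: baseline scores, assumed K-Means labels, and
# ground-truth subtopics for the relevant images.
baseline = [("img1", 0.9), ("img2", 0.8), ("img3", 0.7), ("img4", 0.6), ("img5", 0.5)]
kmeans_labels = {"img1": 0, "img2": 0, "img3": 1, "img4": 2, "img5": 1}
truth = {"img1": 10, "img2": 10, "img3": 20, "img4": 30}

baseline_ids = [image_id for image_id, _ in baseline]
diverse = diversify(baseline, kmeans_labels)

print(diverse)
print(cluster_recall_at_k(baseline_ids, truth, k=3))
print(cluster_recall_at_k(diverse, truth, k=3))
```

On this toy data the diversified list covers more subtopics within the same cutoff than the baseline ranking does, which is the kind of CR@20 gain the abstract reports for clustered runs. Because only one image per cluster survives, relevant near-duplicates are dropped. MAP is not computed in this sketch.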