The effects of separate and merged indexes and word normalization in multilingual CLIR
Multilingual IR may be performed in two environments: there may be a separate index for each target language, or all the languages may be indexed in a merged index. In the first case, retrieval must be performed separately in each index, after which the result lists have to be merged. In the case of the merged index, there are two alternatives: either to perform retrieval with a merged query (all the languages in the same query), or to perform distinct retrievals in each language and merge the result lists. Further, there are several indexing approaches concerning word normalization. The present paper examines the impact of stemming compared with inflected retrieval in multilingual IR, both with separate indexes and with a merged index. Four different result list merging approaches are compared with each other. The best result was achieved when retrieval was performed in separate indexes and the result lists were merged. Stemming seems to improve the results compared with inflected retrieval.
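The separate-index setting described above can be sketched in a few lines. The abstract does not name the four merging approaches that were compared, so the two strategies below (round-robin interleaving and min-max normalized score merging) are common textbook choices used here only as an assumption, and the function names and sample scores are hypothetical.

```python
from itertools import zip_longest


def round_robin_merge(result_lists):
    """Interleave ranked lists: rank 1 of every list, then rank 2, and so on."""
    merged, seen = [], set()
    for tier in zip_longest(*result_lists):
        for hit in tier:
            if hit is None:  # this list is shorter than the others
                continue
            doc_id, _score = hit
            if doc_id not in seen:
                seen.add(doc_id)
                merged.append(doc_id)
    return merged


def normalized_score_merge(result_lists):
    """Min-max normalize scores inside each list, then rank the pooled hits."""
    pooled = {}
    for hits in result_lists:
        if not hits:
            continue
        scores = [score for _, score in hits]
        low, span = min(scores), max(scores) - min(scores)
        for doc_id, score in hits:
            norm = (score - low) / span if span else 1.0
            pooled[doc_id] = max(norm, pooled.get(doc_id, 0.0))
    # Highest normalized score first; ties broken by document id.
    return sorted(pooled, key=lambda d: (-pooled[d], d))


# Hypothetical per-language result lists as (doc_id, score) pairs, best first.
english_hits = [("en1", 12.0), ("en2", 9.0), ("en3", 3.0)]
finnish_hits = [("fi1", 0.8), ("fi2", 0.2)]

print(round_robin_merge([english_hits, finnish_hits]))
print(normalized_score_merge([english_hits, finnish_hits]))
```

Normalization matters in this setting because raw scores from different indexes are generally not comparable: collection statistics differ per language, so an unnormalized sort would tend to favor whichever index produces the larger score range.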
The development of electronic scholarly publishing and online sales of scholarly publications in Finland (Sähköisen tiedejulkaisemisen ja tiedejulkaisujen verkkokaupan kehittyminen Suomessa)
In her book Sähköisen tiedejulkaisemisen ja tiedejulkaisun verkkokaupan kehittyminen Suomessa, Eija Airio examines why and how certain Finnish scholarly publishers have moved to electronic scholarly publishing and to selling publications online. A further aim is to find out what plans and expectations they have for these new tools. The book is a case study in which the Elektra project represents the move to electronic scholarly publishing and the online bookshop Granum represents the move to selling scholarly publications online. The study found that the biggest problem in moving both scholarly publishing and the sale of publications online is a lack of funding. Projects receive external funding at the start-up stage, but they risk remaining mere experiments because follow-up funding is lacking.
DCU and UTA at ImageCLEFPhoto 2007
Dublin City University (DCU) and the University of Tampere (UTA) participated in the ImageCLEF 2007 photographic ad-hoc retrieval task with several monolingual and bilingual runs. Our approach was language independent: text retrieval based on fuzzy s-gram query translation was combined with visual retrieval. Data fusion between text and image content was performed using unsupervised query-time weight generation approaches. Our baseline was a combination of dictionary-based query translation and visual retrieval, which achieved the best result. The best mixed-modality runs using fuzzy s-gram translation achieved on average around 83% of the baseline's performance. Performance was more similar when only the top-rank precision levels P10 and P20 were considered. This suggests that fuzzy s-gram query translation combined with visual retrieval is a cheap alternative for cross-lingual image retrieval where only a small number of relevant items are required. Both sets of results emphasize the merit of our query-time weight generation schemes for data fusion: the fused runs exhibit marked performance increases over single modalities, and this is achieved without any prior training data.
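The fusion of text and image evidence mentioned above can be illustrated with a weighted linear combination of normalized scores. This is only a minimal sketch: the abstract does not describe how the unsupervised query-time weights are generated, so `text_weight` is passed in as a plain parameter, and the function names and sample scores are hypothetical.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    low = min(scores.values())
    span = max(scores.values()) - low
    return {d: (s - low) / span if span else 1.0 for d, s in scores.items()}


def fuse(text_scores, image_scores, text_weight):
    """Weighted sum of normalized text and image scores, best first.

    text_weight is assumed to lie in [0, 1]; how it would be derived at
    query time is not specified in the abstract and is left out here.
    """
    text_norm = min_max(text_scores)
    image_norm = min_max(image_scores)
    fused = {}
    for doc_id in set(text_norm) | set(image_norm):
        fused[doc_id] = (text_weight * text_norm.get(doc_id, 0.0)
                         + (1.0 - text_weight) * image_norm.get(doc_id, 0.0))
    # Highest fused score first; ties broken by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


# Hypothetical scores for one query from a text run and a visual run.
text_run = {"img1": 4.0, "img2": 2.0, "img3": 0.0}
visual_run = {"img2": 8.0, "img3": 4.0, "img4": 0.0}

print(fuse(text_run, visual_run, text_weight=0.5))
```

With equal weights, a document that is ranked reasonably well by both modalities (`img2` in the sample data) moves ahead of one that only the text run retrieved, which is the kind of gain over single modalities that fused runs aim for.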