Investigating Multilingual, Multi-script Support in Lucene/Solr Library Applications

Abstract

Over many years, Yale has developed a highly structured, high-quality multilingual catalog of bibliographic data. Almost 50% of the collection consists of non-English materials in over 650 languages and includes many different non-Roman scripts. Faculty, students, researchers, and staff would like to make full use of this original-script content for resource discovery. While the underlying textual data are in place, effective indexing, retrieval, and display functionality for the non-Roman script content is not available within our bibliographic discovery applications, Orbis and Yufind. Opportunities now exist in the Unicode, Lucene/Solr computing environment to bridge this functionality gap and achieve internationalization of the Yale Library catalog. While most parts of this study focus on the Yale environment, in the absence of other such studies it is hoped that the findings will be of interest to a much larger community.

Arcadia Foundatio
