6 research outputs found

    Matching Meaning for Cross-Language Information Retrieval

    Cross-language information retrieval concerns the problem of finding information in one language in response to search requests expressed in another language. The explosive growth of the World Wide Web, with access to information in many languages, has provided a substantial impetus for research on this important problem. In recent years, significant advances in cross-language retrieval effectiveness have resulted from the application of statistical techniques to estimate accurate translation probabilities for individual terms from automated analysis of human-prepared translations. With few exceptions, however, those results have been obtained by applying evidence about the meaning of terms to translation in one direction at a time (e.g., by translating the queries into the document language). This dissertation introduces a more general framework for the use of translation probability in cross-language information retrieval, based on the notion that information retrieval depends fundamentally upon matching what the searcher means with what the document author meant. This perspective yields a simple computational formulation that provides a natural way of combining what have traditionally been known as query translation and document translation. When combined with the use of synonym sets as a computational model of meaning, cross-language search results are obtained using English queries that approximate a strong monolingual baseline for both French and Chinese documents. Two well-known techniques (structured queries and probabilistic structured queries) are also shown to be special cases of this model under restrictive assumptions.

    A study of context influences in Arabic-English language translation technologies

    Social and cultural context is largely missing in current language translation systems. Dictionary-based systems translate a term in a source language to an equivalent term in a target language, but the translation is often inaccurate when context is not taken into consideration, or when an equivalent term in the target language does not exist. Domain knowledge and context can be made explicit by using ontologies, and ontology utilization would enable inclusion of semantic relations to other terms, leading to translation results that are more comprehensive than a single equivalent term. It is proposed that existing ontologies in the domain should be utilized and combined by ontology merging techniques, leveraging existing resources to form a basis ontology with contextual representation. This basis ontology can be further enhanced by applying machine translation techniques to existing corpora, appending further contextual information to the knowledge base. Statistical methods in machine translation could provide automated relevance determination of these existing machine-readable resources, and aid the human translator in establishing a domain-specific knowledge base for translation. Advancements in communication and technology have made the world smaller, and people of different regions and languages need to work together and interact. The accuracy of these translations is crucial, as inaccurate translations could lead to misunderstandings and possible conflict. While a single equivalent term in a target language can provide the gist of the meaning of a source language term, a semantic conceptualisation provided by an ontology could enable the term to be understood in the specific context in which it is being used.

    The Information-seeking Strategies of Humanities Scholars Using Resources in Languages Other Than English

    By Carol Sabbar, The University of Wisconsin-Milwaukee, 2016. Under the supervision of Dr. Iris Xie. This dissertation explores the information-seeking strategies used by scholars in the humanities who rely on resources in languages other than English. It investigates not only the strategies they choose but also the shifts that they make among strategies and the role that language, culture, and geography play in the information-seeking context. The study used purposive sampling to engage 40 human subjects, all of whom are post-doctoral humanities scholars based in the United States who conduct research in a variety of languages. Data were collected through semi-structured interviews and research diaries in order to answer three research questions: What information-seeking strategies are used by scholars conducting research in languages other than English? What shifts do scholars make among strategies in routine, disruptive, and/or problematic situations? In what ways do language, culture, and geography play a role in the information-seeking context, especially in problematic situations? The data were then analyzed using grounded theory and the constant comparative method. A new conceptual model, the information triangle, was used and is presented in this dissertation to categorize and visually map the strategies and shifts. Based on the data collected, thirty distinct strategies were identified and divided into four categories: formal system, informal resource, interactive human, and hybrid strategies. Three types of shifts were considered: planned, opportunistic, and alternative. Finally, factors related to language, culture, and geography were identified and analyzed according to their roles in the information-seeking context. This study is the first of its kind to combine the study of information-seeking behaviors with the factors of language, culture, and geography, and as such, it presents numerous methodological and practical implications along with many opportunities for future research.