8,597 research outputs found
User-centred interface design for cross-language information retrieval
This paper reports on the user-centered design methodology and techniques used for the elicitation of user requirements and how these requirements informed the first phase of the user interface design for a Cross-Language Information Retrieval System. We describe a set of factors involved in the analysis of the data collected and, finally, discuss the implications for user interface design based on the findings.
Introduction to the special issue on cross-language algorithms and applications
With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.
Postprint (published version
Teaching a New Dog Old Tricks: Resurrecting Multilingual Retrieval Using Zero-shot Learning
While billions of non-English speaking users rely on search engines every
day, the problem of ad-hoc information retrieval is rarely studied for
non-English languages. This is primarily due to a lack of data sets that are
suitable to train ranking algorithms. In this paper, we tackle the lack of data
by leveraging pre-trained multilingual language models to transfer a retrieval
system trained on English collections to non-English queries and documents. Our
model is evaluated in a zero-shot setting, meaning that we use it to predict
relevance scores for query-document pairs in languages never seen during
training. Our results show that the proposed approach can significantly
outperform unsupervised retrieval techniques for Arabic, Chinese Mandarin, and
Spanish. We also show that augmenting the English training collection with some
examples from the target language can sometimes improve performance.
Comment: ECIR 2020 (short
Observing Users - Designing clarity: a case study on the user-centred design of a cross-language information retrieval system
This paper presents a case study of the development of an interface to a novel and complex form of document retrieval: searching for texts written in foreign languages based on native language queries. Although the underlying technology for achieving such a search is relatively well understood, the appropriate interface design is not. A study involving users (with such searching needs) from the start of the design process is described, covering initial examination of user needs and tasks; preliminary design and testing of interface components; building, testing, and further refining an interface; before finally conducting usability tests of the system. Lessons are learned at every stage of the process, leading to a much more informed view of how such an interface should be built.
Cross-lingual document retrieval, categorisation and navigation based on distributed services
The widespread use of the Internet across countries has increased the need for access to document collections
that are often written in languages different from a user’s native language. In this paper we describe Clarity, a
Cross Language Information Retrieval (CLIR) system for English, Finnish, Swedish, Latvian and Lithuanian.
Clarity is a fully-fledged retrieval system that supports the user during the whole process of query formulation,
text retrieval and document browsing. We address four of the major aspects of Clarity: (i) the user-driven
methodology that formed the basis for the iterative design cycle and framework in the project, (ii) the system
architecture that was developed to support the interaction and coordination of Clarity’s distributed services, (iii)
the data resources and methods for query translation, and (iv) the support for Baltic languages. Clarity is an
example of a distributed CLIR system built with minimal translation resources and, to our knowledge, the only
such system that currently supports Baltic languages.
User requirement elicitation for cross-language information retrieval
Who are the users of a cross-language retrieval system? Under what circumstances do they need to perform such multi-language searches? How will the task and the context of use affect successful interaction with the system? Answers to these questions were explored in a user study performed as part of the design stages of Clarity, an EU-funded project on cross-language information retrieval. The findings resulted in a rethink of the planned user interface and a consequent expansion of the set of services offered. This paper reports on the methodology and techniques used for the elicitation of user requirements as well as how these were in turn transformed into new design solutions.
Language engineering - a champion for European culture
Language is key to culture. It is a direct cultural medium as well as a means of recording and providing access to non-lingual elements of culture. Language is also fundamental to a sense of cultural identity. For this reason, it is vital, in a changing Europe, that we preserve the multi-lingual character of our society in order to move successfully towards closer co-operation at a political, economic, and social level.
Language engineering is the application of knowledge of language to the development of computer software which can recognise, understand, interpret, and generate human language in all its forms.
The paper provides a high-level view of the ‘state of the art’ in language engineering and indicates ways in which it will have a profound impact on our culture in the future. It shows how advances in language engineering are an important aid in maintaining cultural diversity in a multi-lingual European society, while enabling the development of social cohesion across cultural and national divides. It addresses issues raised by the prospect of the Multi-lingual Information Society, including education, human communication with technology and information management, as well as aspects of digital cities such as tele-presence in digital libraries, virtual art galleries and electronic museums. The paper raises the issue of language as a factor in cultural domination, showing the contribution that language engineering can make towards countering it.
The paper also raises a number of controversial issues concerning the likely benefits arising from the ways in which language is likely to influence the culture of Europe.
Evaluation campaigns and TRECVid
The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005 and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection and the detection of story boundaries in broadcast TV news. This paper will give an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign, and this allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns and we present some of them in the paper, concluding that on balance they have had a very positive impact on research progress.