5 research outputs found

    Does document relevance affect the searcher's perception of time?

    Time plays an essential role in multiple areas of Information Retrieval (IR) studies such as search evaluation, user behavior analysis, temporal search result ranking and query understanding. In search evaluation studies in particular, time is usually adopted as a measure to quantify users' efforts in search processes. Psychological studies have reported that the time perception of human beings can be affected by many stimuli, such as attention and motivation, which are closely related to many cognitive factors in search. Considering the fact that users' search experiences are affected by their subjective feelings of time, rather than the objective time measured by timing devices, it is necessary to look into the different factors that have impacts on search users' perception of time. In this work, we make a first step towards revealing the time perception mechanism of search users with the following contributions: (1) We establish an experimental research framework to measure the subjective perception of time while reading documents in a search scenario, which originates from but is also different from traditional time perception measurements in psychological studies. (2) With the framework, we show that while users are reading result documents, document relevance has a small yet visible effect on search users' perception of time. By further examining the impact of other factors, we demonstrate that the effect on relevant documents can also be influenced by individuals and tasks. (3) We conduct a preliminary experiment in which the difference between perceived time and dwell time is taken into consideration in a search evaluation task. We found that the revised framework achieved a better correlation with users' satisfaction feedback. This work may help us better understand the time perception mechanism of search users and provide insights into how to better incorporate the time factor in search evaluation studies.

    Joint Upper & Lower Bound Normalization for IR Evaluation

    In this paper, we present a novel perspective towards IR evaluation by proposing a new family of evaluation metrics where the existing popular metrics (e.g., nDCG, MAP) are customized by introducing a query-specific lower-bound (LB) normalization term. While original nDCG, MAP etc. metrics are normalized in terms of their upper bounds based on an ideal ranked list, a corresponding LB normalization for them has not yet been studied. Specifically, we introduce two different variants of the proposed LB normalization, where the lower bound is estimated from a randomized ranking of the corresponding documents present in the evaluation set. We next conducted two case studies by instantiating the new framework for two popular IR evaluation metrics (with two variants, e.g., DCG_UL_V1,2 and MSP_UL_V1,2) and then comparing against the traditional metric without the proposed LB normalization. Experiments on two different data-sets with eight Learning-to-Rank (LETOR) methods demonstrate the following properties of the new LB normalized metric: 1) Statistically significant differences (between two methods) in terms of the original metric no longer remain statistically significant in terms of the Upper Lower (UL) Bound normalized version and vice-versa, especially for uninformative query-sets. 2) When compared against the original metric, our proposed UL normalized metrics demonstrate higher Discriminatory Power and better Consistency across different data-sets. These findings suggest that the IR community should consider UL normalization seriously when computing nDCG and MAP, and that a more in-depth study of UL normalization for general IR evaluation is warranted. Comment: 26 pages, 3 figures
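
    The abstract above does not give the exact form of its two LB variants, so the following is only a minimal sketch of the general idea, under stated assumptions: the upper bound is the DCG of the ideal ordering, the lower bound is a Monte Carlo estimate of the expected DCG over random shuffles of the same documents, and the score is rescaled as (DCG - LB) / (UB - LB). The function names, the linear gain, the log2 discount, and the rescaling formula are all illustrative choices, not the paper's definitions.

```python
import math
import random


def dcg(gains):
    """DCG of a ranked list of graded relevance labels (linear gain, log2 discount)."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def random_ranking_dcg(gains, trials=2000, seed=0):
    """Monte Carlo estimate of the expected DCG when the same documents are shuffled."""
    rng = random.Random(seed)
    pool = list(gains)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(pool)
        total += dcg(pool)
    return total / trials


def ul_normalized_dcg(gains, trials=2000, seed=0):
    """Hypothetical upper/lower-bound normalization: (DCG - LB) / (UB - LB).

    Assumption: this rescaling is for illustration only and is not taken
    from the paper. Returns 0.0 when the bounds coincide, i.e. when no
    ordering of the documents can beat a random one.
    """
    upper = dcg(sorted(gains, reverse=True))
    lower = random_ranking_dcg(gains, trials, seed)
    if upper - lower <= 1e-12:
        return 0.0
    return (dcg(gains) - lower) / (upper - lower)


if __name__ == "__main__":
    print(ul_normalized_dcg([3, 2, 0, 1]))  # near-ideal ranking
    print(ul_normalized_dcg([0, 1, 2, 3]))  # worst ranking, scores below random
    print(ul_normalized_dcg([1, 1, 1, 1]))  # uninformative query
```

    Under this sketch an ideal ranking scores 1, a ranking no better than random scores about 0, and a query whose documents all carry the same label scores 0 because ordering cannot matter, which is one way to read the abstract's remark about uninformative query-sets.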

    English as a second language user’s Information Interaction in an e-Governmental context

    The proliferation of web-based technologies has led most national governments to begin transitioning to a so-called “e-service,” where provision is made through purely digital means. Despite their obvious benefits for most users, these on-line systems present barriers of access. This research seeks to identify the current information seeking behaviours of English as a second language (ESL) users when performing e-government-related tasks, to ascertain where and why issues arise during this process. Utilising a multi-phase and integrated mixed methods approach, this research investigated how ESL users find information in an e-governmental context, how this differs from native users, and how differences can be supported by the system. The Participatory Design approach identified relevant search task topics, which were utilised during experiments in the second integrated mixed methods phase. Results from the mixed methods phase suggest that success may depend less on second language proficiency than on the search strategies employed and the fastidiousness of the user in assessing document relevance. There were a number of significant differences identified between ESL and native English participants, but also a number of similarities, as both groups were unable to consistently predict when they had not performed particularly well. In light of a solely e-government system, this raises significant concerns about users and the information they rely on to make judgements that can have real world implications. A number of participant recommendations are suggested, but one way of mitigating such concerns is to consider the use of system wizards. Performance was high in both groups when this system design was implemented, with positive sentiment (from both groups) towards such a tool, as it provides a clear and structured platform to information.

    Ephemeral relevance and user activities in a search session

    We study relevance judgment and user activities in a search session. We focus on ephemeral relevance—a contextual measurement regarding the amount of useful information a searcher acquired from a clicked result at a particular time—and two primary types of search activities—query reformulation and click. The purpose of the study is both explanatory and practical. First, we examine the influence of different factors on ephemeral relevance and user activities in a search session. Second, we leverage short-term search history and implicit feedback in a session to predict ephemeral relevance and future search activities. The main findings include: 1. As a contextual usefulness measurement, ephemeral relevance differs from both topical relevance judgment and context-independent usefulness assessment. We show ephemeral relevance significantly relates to a wide range of factors, including topical relevance, novelty, understandability, reliability, effort spent, and search task. The difference between ephemeral relevance and context-independent usefulness assessment is linked to judgment criteria, novelty, effort spent, and changes in the user’s perceptions of a search result. 2. Ephemeral relevance can be predicted accurately using implicit feedback signals without any manual explicit judgments. We generalize existing implicit feedback methods from using information related to a single result to those based on user activities in a whole session, achieving a correlation as high as 0.5 between the predicted and real judgments. 3. We show choices of word changes in query reformulation and click decisions significantly relate to recent search history, such as the contents and effectiveness of previous search queries, the contents of the results viewed and clicked in previous searches, etc. 4. Leveraging short-term search history in a session and other information, we can predict word changes in query reformulation and click decisions with different levels of accuracy.
    These findings help disclose and explain the dynamics of relevance and user activities in a search session. The developed techniques provide effective support for developing interactive IR systems.

    Spoken content retrieval beyond pipeline integration of automatic speech recognition and information retrieval

    The dramatic increase in the creation of multimedia content is leading to the development of large archives in which a substantial amount of the information is in spoken form. Efficient access to this information requires effective spoken content retrieval (SCR) methods. Traditionally, SCR systems have focused on a pipeline integration of two fundamental technologies: transcription using automatic speech recognition (ASR) and search supported using text-based information retrieval (IR). Existing SCR approaches estimate the relevance of a spoken retrieval item based on the lexical overlap between a user’s query and the textual transcriptions of the items. However, the speech signal contains other potentially valuable non-lexical information that remains largely unexploited by SCR approaches. In particular, acoustic correlates of speech prosody, which have been shown to be useful for identifying salient words and determining topic changes, have not been exploited by existing SCR approaches. In addition, the temporal nature of multimedia content means that accessing content is a user-intensive, time-consuming process. In order to minimise user effort in locating relevant content, SCR systems could suggest playback points in retrieved content indicating the locations where the system believes relevant information may be found. This typically requires adopting a segmentation mechanism for splitting documents into smaller “elements” to be ranked and from which suitable playback points could be selected. Existing segmentation approaches do not generalise well to every possible information need or provide robustness to ASR errors.
    This thesis extends SCR beyond the standard ASR and IR pipeline approach by: (i) exploring the utilisation of prosodic information as complementary evidence of topical relevance to enhance current SCR approaches; (ii) determining elements of content that, when retrieved, minimise user search effort and provide increased robustness to ASR errors; and (iii) developing enhanced evaluation measures that could better capture the factors that affect user satisfaction in SCR.