11,105 research outputs found

    Relevance judgments and the incremental presentation of document representations

    A new approach to the solicitation and measurement of relevance judgments is presented, which attempts to resolve some of the difficulties inherent in the nature of relevance and human judgment, and which further seeks to examine how users' judgments of document representations change as more information about documents is revealed to them. Subjects (university faculty and doctoral students) viewed three incremental versions of documents, and recorded ratio-level relevance judgments for each version. These judgments were analyzed by a variety of methods, including graphical inspection and examination of the number and degree of changes of judgments as new information is seen. A post questionnaire was also administered to obtain subjects' perceptions of the process and the individual fields of information presented. A consistent pattern of perception and importance of these fields is seen: Abstracts are by far the most important field and have the greatest impact, followed by titles, bibliographic information, and indexing. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/29634/1/0000723.pd

    Estimating Position Bias without Intrusive Interventions

    Presentation bias is one of the key challenges when learning from implicit feedback in search engines, as it confounds the relevance signal. While it was recently shown how counterfactual learning-to-rank (LTR) approaches (Joachims et al., 2017) can provably overcome presentation bias when observation propensities are known, it remains to show how to effectively estimate these propensities. In this paper, we propose the first method for producing consistent propensity estimates without manual relevance judgments, disruptive interventions, or restrictive relevance modeling assumptions. First, we show how to harvest a specific type of intervention data from historic feedback logs of multiple different ranking functions, and show that this data is sufficient for consistent propensity estimation in the position-based model. Second, we propose a new extremum estimator that makes effective use of this data. In an empirical evaluation, we find that the new estimator provides superior propensity estimates in two real-world systems: Arxiv Full-text Search and Google Drive Search. Beyond these two points, we find that the method is robust to a wide range of settings in simulation studies.

    How users assess web pages for information-seeking

    In this paper, we investigate the criteria used by online searchers when assessing the relevance of web pages for information-seeking tasks. Twenty-four participants were given three tasks each, and indicated the features of web pages which they employed when deciding about the usefulness of the pages in relation to the tasks. These tasks were presented within the context of a simulated work-task situation. We investigated the relative utility of features identified by participants (web page content, structure and quality), and how the importance of these features is affected by the type of information-seeking task performed and the stage of the search. The results of this study provide a set of criteria used by searchers to decide about the utility of web pages for different types of tasks. Such criteria can have implications for the design of systems that use or recommend web pages.

    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys.

    Aerospace medicine and biology: A continuing bibliography with indexes (supplement 341)

    This bibliography lists 133 reports, articles and other documents introduced into the NASA Scientific and Technical Information System during September 1990. Subject coverage includes: aerospace medicine and psychology, life support systems and controlled environments, safety equipment, exobiology and extraterrestrial life, and flight crew behavior and performance.

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use cases, and socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges.

    An Empirical Study of User Navigation during Document Triage

    Document triage is the moment in the information seeking process when the user first decides the relevance of a document to their information need [17]. This paper reports a study of user behaviour during document triage. The study reveals two main findings: first, that there is a small set of common navigational patterns; second, that certain document features strongly influence users’ navigation.