
    Implications of Inter-Rater Agreement on a Student Information Retrieval Evaluation

    This paper reports an information retrieval evaluation of three different retrieval-supporting services. All three services were designed to compensate for typical problems that arise in metadata-driven Digital Libraries and that are not adequately handled by simple tf-idf based retrieval. The services are: (1) a co-word analysis based query expansion mechanism and re-ranking via (2) Bradfordizing and (3) author centrality. The services are evaluated with relevance assessments conducted by 73 information science students. Since the students are neither information professionals nor domain experts, the question of inter-rater agreement is taken into consideration. Two important implications emerge: (1) the inter-rater agreement rates were mainly fair to moderate and (2) after a data-cleaning step which erased the assessments with poor agreement rates, the evaluation data shows that the three retrieval services returned disjoint but still relevant result sets. Comment: 7 pages, 3 figures, LWA 2010, Workshop I
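    As an illustration of what an agreement rate such as "fair to moderate" can mean, the sketch below computes Cohen's kappa for two raters and maps it to the Landis and Koch verbal bands. This is a minimal sketch under stated assumptions: the abstract does not say which agreement coefficient or which verbal scale the study used, and the names `cohen_kappa` and `landis_koch_label` and the sample judgments are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items.

    Assumption: the abstract does not name its agreement measure;
    Cohen's kappa is used here only as a common two-rater example.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal band after Landis and Koch (assumed scale, not from the abstract)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical binary relevance judgments (1 = relevant) on ten documents.
student_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
student_2 = [1, 0, 0, 0, 1, 1, 1, 1, 0, 1]
kappa = cohen_kappa(student_1, student_2)
print(round(kappa, 2), landis_koch_label(kappa))
```

    A data-cleaning step like the one described could then drop assessments whose kappa falls below a chosen threshold; the threshold itself would be a further assumption.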

    Human assessments of document similarity

    Two studies are reported that examined the reliability of human assessments of document similarity and the association between human ratings and the results of n-gram automatic text analysis (ATA). Human interassessor reliability (IAR) was moderate to poor. However, correlations between average human ratings and n-gram solutions were strong. The average correlation between ATA and individual human solutions was greater than IAR. N-gram length influenced the strength of association, but the optimum string length depended on the nature of the text (technical vs. nontechnical). We conclude that the methodology applied in previous studies may have led to overoptimistic views on human reliability, but that an optimal n-gram solution can provide a good approximation of the average human assessment of document similarity, a result that has important implications for the future development of document visualization systems.
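    To make the idea of an n-gram similarity score concrete, the sketch below compares two texts by the cosine similarity of their character n-gram counts, with the n-gram length as a parameter. This is a hedged illustration only: the abstract does not specify how the ATA tool builds or compares n-gram profiles, so the character-level n-grams, the cosine measure, and the names `char_ngrams` and `ngram_similarity` are all assumptions.

```python
import math
from collections import Counter


def char_ngrams(text, n):
    """Count overlapping character n-grams after lowercasing and
    collapsing whitespace (an assumed normalisation)."""
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + n] for i in range(len(normalized) - n + 1))


def ngram_similarity(text_a, text_b, n=3):
    """Cosine similarity of two character n-gram count profiles (0.0 to 1.0)."""
    profile_a, profile_b = char_ngrams(text_a, n), char_ngrams(text_b, n)
    dot = sum(profile_a[g] * profile_b[g] for g in profile_a.keys() & profile_b.keys())
    norm_a = math.sqrt(sum(v * v for v in profile_a.values()))
    norm_b = math.sqrt(sum(v * v for v in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text shorter than n has no n-grams
    return dot / (norm_a * norm_b)


# Hypothetical documents; varying n shows how string length changes the score.
doc_1 = "Retrieval evaluation with student relevance assessments"
doc_2 = "Evaluating retrieval using relevance assessments by students"
for n in (2, 3, 4):
    print(n, round(ngram_similarity(doc_1, doc_2, n), 3))
```

    Scores of this kind could then be correlated with averaged human similarity ratings, which is the comparison the abstract describes.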

    Applying Science Models for Search

    The paper proposes three different kinds of science models as value-added services that are integrated in the retrieval process to enhance retrieval quality. The paper discusses the approaches Search Term Recommendation, Bradfordizing and Author Centrality on a general level and addresses implementation issues of the models within a real-life retrieval environment. Comment: 14 pages, 3 figures, ISI 201

    The Impact of Residential Treatment on Emotionally Disturbed Boys

    Within the past four decades, social work has witnessed the development of increasingly specialized services for children, among these a sort of “total impact therapy” generally defined as residential treatment. In conjunction with the basic social work values of the bio-psycho-social nature of human maladjustment, residential centres have attempted to help the child effect a happier adjustment to his life situation by meeting some ungratified basic need. Institutions for dependent children complemented those for custodial care or even isolation; contemporary residential treatment centres are designed to meet a broader range of needs of the child than those of forty years ago through a variety of approaches, often referred to as milieu therapy. Consideration of the common needs of children is basic to questions concerning the place of institutional treatment and the particular type of child for which this social work service is the most appropriate one. The residential treatment centre addresses the whole gamut of a child’s needs from physical care to rehabilitation. Exposure to, and participation in, a group life experience simulating as closely as possible the family or community life experience is the element differentiating residential care from other treatment modes. By involvement in the realities of his daily situation and the working through or resolution of these, the child is helped to cope with his own growth and development: physical, emotional, and social. Problems and questions examined in this paper revolve around the residential treatment centre defined vaguely by the Child Welfare League of America as “A building....maintained and operated by a chartered agency, organization or institution, whose main purpose is to provide shelter and care to a group of unrelated children and youths up to eighteen years of age.” More specifically, the concern for research, the proposal and plans for implementation are focused on Mount St. Joseph, an autonomous, non-profit institution providing care for boys with moderate to severe emotional disturbances.

    A comparison of homonym meaning frequency estimates derived from movie and television subtitles, free association, and explicit ratings

    First Online: 10 September 2018. Most words are ambiguous, with interpretation dependent on context. Advancing theories of ambiguity resolution is important for any general theory of language processing, and for resolving inconsistencies in observed ambiguity effects across experimental tasks. Focusing on homonyms (words such as bank with unrelated meanings EDGE OF A RIVER vs. FINANCIAL INSTITUTION), the present work advances theories and methods for estimating the relative frequency of their meanings, a factor that shapes observed ambiguity effects. We develop a new method for estimating meaning frequency based on the meaning of a homonym evoked in lines of movie and television subtitles according to human raters. We also replicate and extend a measure of meaning frequency derived from the classification of free associates. We evaluate the internal consistency of these measures, compare them to published estimates based on explicit ratings of each meaning’s frequency, and compare each set of norms in predicting performance in lexical and semantic decision mega-studies. All measures have high internal consistency and show agreement, but each is also associated with unique variance, which may be explained by integrating cognitive theories of memory with the demands of different experimental methodologies. To derive frequency estimates, we collected manual classifications of 533 homonyms over 50,000 lines of subtitles, and of 357 homonyms across over 5000 homonym–associate pairs. This database, publicly available at www.blairarmstrong.net/homonymnorms/, constitutes a novel resource for computational cognitive modeling and computational linguistics, and we offer suggestions around good practices for its use in training and testing models on labeled data.

    Science Models as Value-Added Services for Scholarly Information Systems

    The paper introduces scholarly Information Retrieval (IR) as a further dimension that should be considered in the science modeling debate. The IR use case is seen as a validation model of the adequacy of science models in representing and predicting structure and dynamics in science. Particular conceptualizations of scholarly activity and structures in science are used as value-added search services to improve retrieval quality: a co-word model depicting the cognitive structure of a field (used for query expansion), the Bradford law of information concentration, and a model of co-authorship networks (both used for re-ranking search results). An evaluation of retrieval quality with the science model driven services showed that the proposed models do benefit retrieval quality. From an IR perspective, the models studied are therefore verified as expressive conceptualizations of central phenomena in science. Thus, it could be shown that the IR perspective can significantly contribute to a better understanding of scholarly structures and activities. Comment: 26 pages, to appear in Scientometric
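    Re-ranking by the Bradford law can be sketched as follows: count how many documents in the result set each journal contributes, then move documents from the most productive journals to the top. This is a minimal sketch of the general idea, not the paper's implementation; the function name `bradfordize`, the `(doc_id, journal)` tuple layout, and the tie handling (original rank order is kept) are assumptions.

```python
from collections import Counter


def bradfordize(results):
    """Re-rank a result list so documents from the journals that contribute
    the most documents to this result set come first.

    `results` is a list of (doc_id, journal) tuples in original rank order
    (hypothetical layout). Python's sort is stable, so documents whose
    journals have equal counts keep their original relative order.
    """
    journal_counts = Counter(journal for _, journal in results)
    return sorted(results, key=lambda item: -journal_counts[item[1]])


# Hypothetical result set: journal J_B contributes three documents,
# J_A two, and J_C one.
hits = [("d1", "J_A"), ("d2", "J_B"), ("d3", "J_B"),
        ("d4", "J_C"), ("d5", "J_B"), ("d6", "J_A")]
for doc_id, journal in bradfordize(hits):
    print(doc_id, journal)
```

    Counting within the result set, rather than using a global journal ranking, keeps the notion of a "core journal" specific to the query; that choice is part of the sketch and is not stated in the abstract.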

    Heuristic Principles and Differential Judgments in the Assessment of Information Quality

    Information quality (IQ) is a multidimensional construct and includes dimensions such as accuracy, completeness, objectivity, and representation that are difficult to measure. Recently, research has shown that independent assessors who rated IQ reached high inter-rater agreement on some information quality dimensions but not on others. In this paper, we explore the reasons that underlie the differences in the “measurability” of IQ. Employing Gigerenzer’s “building blocks” framework, we conjecture that the feasibility of using a set of heuristic principles consistently when assessing different dimensions of IQ is a key factor driving inter-rater agreement in IQ judgments. We report on two studies. In the first study, we qualitatively explored the manner in which participants applied the heuristic principles of search rules, stopping rules, and decision rules in assessing the IQ dimensions of accuracy, completeness, objectivity, and representation. In the second study, we investigated the extent to which participants could reach an agreement in rating the quality of Wikipedia articles along these dimensions. Our findings show an alignment between the consistent application of heuristic principles and inter-rater agreement levels found on particular dimensions of IQ judgments. Specifically, on the dimensions of completeness and representation, assessors applied the heuristic principles consistently and tended to agree in their ratings, whereas on the dimensions of accuracy and objectivity, they did not apply the heuristic principles in a uniform manner and inter-rater agreement was relatively low. We discuss the implications of our findings for research and practice.