214 research outputs found

    Investigating Bell Inequalities for Multidimensional Relevance Judgments in Information Retrieval

    Relevance judgment in Information Retrieval is influenced by multiple factors. These include not only the topicality of the documents but also other user-oriented factors such as trust and user interest. Recent works have identified and classified these factors into seven dimensions of relevance. In a previous work, these relevance dimensions were quantified and the user's cognitive state with respect to a document was represented as a state vector in a Hilbert space, with each relevance dimension representing a basis. It was observed that, for some documents, relevance dimensions are incompatible when making a judgment. Incompatibility being a fundamental feature of Quantum Theory, this motivated us to test the quantum nature of relevance judgments using Bell-type inequalities. However, none of the Bell-type inequalities tested have shown any violation. We discuss our methodology to construct incompatible bases for documents from real-world query log data, the experiments to test Bell inequalities on this dataset, and possible reasons for the lack of violation.
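    For orientation, the best-known Bell-type inequality is the CHSH inequality. The abstract does not say which inequalities were tested, so the form below is only an illustrative example, with assumed notation: a, a' and b, b' are two measurement settings on each of two systems, and E(x, y) is the expected product of the two ±1-valued outcomes.

```latex
\[
  S = E(a,b) - E(a,b') + E(a',b) + E(a',b'),
  \qquad |S| \le 2 .
\]
```

    Any local, classical model satisfies |S| ≤ 2, whereas quantum measurements on entangled states can reach |S| = 2√2 (the Tsirelson bound). A "violation" in the sense of the abstract would be an observed |S| above 2.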

    On crowdsourcing relevance magnitudes for information retrieval evaluation

    Magnitude estimation is a psychophysical scaling technique for the measurement of sensation, where observers assign numbers to stimuli in response to their perceived intensity. We investigate the use of magnitude estimation for judging the relevance of documents for information retrieval evaluation, carrying out a large-scale user study across 18 TREC topics and collecting over 50,000 magnitude estimation judgments using crowdsourcing. Our analysis shows that magnitude estimation judgments can be reliably collected using crowdsourcing, are competitive in terms of assessor cost, and are, on average, rank-aligned with ordinal judgments made by expert relevance assessors. We explore the application of magnitude estimation for IR evaluation, calibrating two gain-based effectiveness metrics, nDCG and ERR, directly from user-reported perceptions of relevance. A comparison of TREC system effectiveness rankings based on binary, ordinal, and magnitude estimation relevance shows substantial variation; in particular, the top systems ranked using magnitude estimation and ordinal judgments differ substantially. Analysis of the magnitude estimation scores shows that this effect is due in part to varying perceptions of relevance: different users have different perceptions of the impact of relative differences in document relevance. These results have direct implications for IR evaluation, suggesting that current assumptions about a single view of relevance being sufficient to represent a population of users are unlikely to hold.
    Maddalena, Eddy; Mizzaro, Stefano; Scholer, Falk; Turpin, Andrew
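    To make "gain-based effectiveness metric" concrete, here is a minimal sketch of nDCG in which the gain of each ranked document is supplied directly as a number, as a magnitude estimation score could be. The function names, the log2 discount, and the example gain values are assumptions for illustration and are not taken from the paper; the ideal ranking is also computed only from the documents in the list, which is a simplification.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain: gain at rank r is divided by log2(r + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains: Sequence[float]) -> float:
    """DCG normalized by the DCG of the same gains sorted in descending order.

    Assumption: the ideal ranking is built only from the documents present
    in `gains`, not from a larger pool of judged documents.
    """
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical per-document gains, listed in the order a system ranked them.
    ordinal_gains = [2, 3, 0, 1]            # e.g. graded relevance levels
    magnitude_gains = [10.0, 50.0, 0.5, 4.0]  # e.g. user-reported magnitudes
    print(f"nDCG with ordinal gains:   {ndcg(ordinal_gains):.3f}")
    print(f"nDCG with magnitude gains: {ndcg(magnitude_gains):.3f}")
```

    Because only the gain values change between the two calls, the sketch shows how the choice of relevance scale alone can shift a system's score, which is the kind of effect the abstract reports for system rankings.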

    Crowdsourcing for Engineering Design: Objective Evaluations and Subjective Preferences

    Crowdsourcing enables designers to reach out to large numbers of people who may not have been previously considered when designing a new product, and to listen to their input by aggregating their preferences and evaluations over potential designs, with the aim of improving "good" and catching "bad" design decisions during the early-stage design process. This approach puts human designers (be they industrial designers, engineers, marketers, or executives) at the forefront, with computational crowdsourcing systems on the backend to aggregate subjective preferences (e.g., which next-generation Brand A design best competes stylistically with next-generation Brand B designs?) or objective evaluations (e.g., which military vehicle design has the best situational awareness?). These crowdsourcing aggregation systems are built using probabilistic approaches that account for the irrationality of human behavior (i.e., violations of reflexivity, symmetry, and transitivity), approximated by modern machine learning algorithms and optimization techniques as necessitated by the scale of data (millions of data points, hundreds of thousands of dimensions). This dissertation presents research findings suggesting the unsuitability of current off-the-shelf crowdsourcing aggregation algorithms for real engineering design tasks due to the sparsity of expertise in the crowd, and methods that mitigate this limitation by incorporating appropriate information for expertise prediction. Next, we introduce and interpret a number of new probabilistic models for crowdsourced design to provide large-scale preference prediction and full design space generation, building on statistical and machine learning techniques such as sampling methods, variational inference, and deep representation learning. Finally, we show how these models and algorithms can advance crowdsourcing systems by abstracting away the underlying appropriate yet unwieldy mathematics, to easier-to-use visual interfaces practical for engineering design companies and governmental agencies engaged in complex engineering systems design.
    PhD, Design Science, University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/133438/1/aburnap_1.pd

    Study of Relevance and Effort across Devices

    Relevance judgements are essential for designing information retrieval systems. Traditionally, judgements have been gathered via desktop interfaces. However, with the rise in popularity of smaller devices for information access, it has become imperative to investigate whether desktop-based judgements are different from judgements gathered using mobiles. Recently, user effort and document usefulness have also emerged as important dimensions to optimize and evaluate information retrieval systems. Since existing work is limited to desktops, it remains to be seen how these judgements are affected by the user's search device. In this paper, we address these shortcomings by collecting and analyzing relevance, usefulness and effort judgements on mobiles and desktops. Analysis of these judgements indicates a high agreement rate between desktop and mobile judges for relevance, followed by usefulness and findability. We also found that desktop judges are likely to spend more time and examine documents in greater depth on non-relevant/not-useful/difficult documents compared to mobile judges. Based on our findings, we suggest that relevance judgements should be gathered via desktops and effort judgements should be collected on each device independently.

    Mining Crowdsourced First Impressions in Online Social Video

    While multimedia and social computing research have used crowdsourcing techniques to annotate objects, actions, and scenes in social video sites like YouTube, little work has addressed the crowdsourcing of personal and social traits in online social video or social media content in general. In this paper, we address the problems of (1) crowdsourcing the annotation of first impressions of video bloggers' (vloggers') personal and social traits in conversational YouTube videos, and (2) mining the impressions with the goal of modeling the interplay of different vlogger facets. First, we design a human annotation task to crowdsource impressions of vloggers that extends a tradition of studies of personality impressions with the addition of attractiveness and mood impressions. Second, we propose a probabilistic framework using Topic Models to discover prototypical impressions that are data driven, and that combine multiple facets of vloggers. Finally, we address the task of automatically predicting topic impressions using nonverbal and verbal content extracted from videos and comments. Our study of 442 YouTube vlogs and 2,210 annotations collected in Mechanical Turk supports recent literature showing the feasibility of crowdsourcing interpersonal human impressions with quality comparable to that reported in social psychology research, and provides insights on the interplay among human first impressions. We also show that topic models are useful to discover meaningful prototypical impressions that can be validated by humans, and that different topics can be predicted using different sources of information from vloggers' nonverbal and verbal content, as well as comments from the audience.

    Understanding relationship quality in hospitality services : A study based on text analytics and partial least squares

    Purpose – The purpose of this paper is to analyze the occurrence of terms to identify the relevant topics and then to investigate the area (based on topics) of hospitality services that is highly associated with relationship quality. This research represents an opportunity to fill the gap in the current literature, and clarify the understanding of guests' affective states by evaluating all aspects of their relationship with a hotel. Design/methodology/approach – This research focuses on natural opinions upon which machine-learning algorithms can be executed: text summarization, sentiment analysis and latent Dirichlet allocation (LDA). Our data set contains 47,172 reviews of 33 hotels located in Las Vegas and registered with Yelp. Component-based structural equation modeling (partial least squares (PLS)) is applied, with a dual – exploratory and predictive – purpose. Findings – To maintain a truly loyal relationship and to achieve competitive success, hospitality managers must take into account both tangible and intangible features when allocating their marketing efforts to satisfaction-, trust- and commitment-based cues. On the other hand, the application of the PLS predict algorithm demonstrates the predictive performance (out-of-sample prediction) of our model, which supports its ability to predict new and accurate values for individual cases when further samples are added. Originality/value – LDA and PLS produce relevant informative summaries of corpora, and confirm and address more specifically the results of the previous literature concerning relationship quality. Our results are more reliable and accurate (providing insights not indicated in guests' ratings into how hotels can improve their services) than prior statistical results based on limited sample data and on numerical satisfaction ratings alone.

    A Survey of Quantum Theory Inspired Approaches to Information Retrieval

    Since 2004, researchers have been using the mathematical framework of Quantum Theory (QT) in Information Retrieval (IR). QT offers a generalized probability and logic framework. Such a framework has been shown to be capable of unifying the representation, ranking and user cognitive aspects of IR, and helpful in developing more dynamic, adaptive and context-aware IR systems. Although quantum-inspired IR is still a growing area, a wide array of work on different aspects of IR has been done and has produced promising results. This paper presents a survey of the research done in this area, aiming to show the landscape of the field and draw a road-map of future directions.