15 research outputs found

    LIPIcs, Volume 251, ITCS 2023, Complete Volume


    Perspectives on Large Language Models for Relevance Judgment

    When asked, current large language models (LLMs) like ChatGPT claim that they can assist us with relevance judgments. Many researchers think this would not lead to credible IR research. In this perspective paper, we discuss possible ways for LLMs to assist human experts along with concerns and issues that arise. We devise a human-machine collaboration spectrum that allows categorizing different relevance judgment strategies, based on how much the human relies on the machine. For the extreme point of "fully automated assessment", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing two opposing perspectives - for and against the use of LLMs for automatic relevance judgments - and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers. We hope to start a constructive discussion within the community to avoid a stalemate during review, where work is damned if it uses LLMs for evaluation and damned if it doesn't.

    Budget-Feasible Mechanism Design for Non-monotone Submodular Objectives: Offline and Online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible, and O(1)-approximation mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). The fact that in our mechanism the agents are not ordered according to their marginal value per cost allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present, for example, at most k agents can be selected. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about nontrivial approximation guarantees in polynomial time, our results are, asymptotically, the best possible

    Budget-feasible mechanism design for non-monotone submodular objectives: Offline and online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible and O(1)-approximation mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Since the introduction of the problem by Singer [40], obtaining efficient mechanisms for objectives that go beyond the class of monotone submodular functions has been elusive. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). Ours is the first mechanism for the problem where, crucially, the agents are not ordered according to their marginal value per cost. This allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present, e.g., at most k agents can be selected. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about non-trivial approximation guaran

    Budget-Feasible Mechanism Design for Non-Monotone Submodular Objectives: Offline and Online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible and O(1)-approximate mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). Ours is the first mechanism for the problem where, crucially, the agents are not ordered with respect to their marginal value per cost. This allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about non-trivial approximation guarantees in polynomial time, our results are asymptotically best possible. Comment: Accepted to EC 201

    Plattformbasierte Erwerbsarbeit: Stand der empirischen Forschung

    This study summarizes the current state of empirical research in economics and social sciences on contract work mediated or provided by online platforms (online contract work). Based on a systematic literature review, this study discusses results on the diffusion of online platforms, the characteristics of workers as well as the motives for labor supply and the working conditions. The study considers services which can be provided from anywhere via the internet (online labor markets), as well as services which are mediated by online platforms but are provided at a predefined location (mobile labor markets). Besides a summary of existing research findings on the topic, this study also evaluates the quality of the empirical methods. The focus lies on the applied methods for data collection as well as the statistical analyses of the data. As a result, the current state of knowledge on online contract work can be regarded as fragmented. While for the United States several studies already exist on the diffusion of online contract work, there is a paucity of corresponding studies in Europe. A considerably higher number of studies deals with other aspects of online contract work, out of which, however, only a few focus on mobile labor markets. Administrative statistics and large-scale representative surveys do not yet contain information on online contract work. Existing research on the topic is therefore based on a variety of data sources and methodological approaches, which makes it difficult to compare empirical findings.

    Algorithms for assessing the quality and difficulty of multiple choice exam questions

    Multiple Choice Questions (MCQs) have long been the backbone of standardized testing in academia and industry. Correspondingly, there is a constant need for the authors of MCQs to write and refine new questions for new versions of standardized tests as well as to support measuring performance in the emerging massive open online courses (MOOCs). Research that explores what makes a question difficult, or what questions distinguish higher-performing students from lower-performing students, can aid in the creation of the next generation of teaching and evaluation tools. In the automated MCQ answering component of this thesis, algorithms query for definitions of scientific terms, process the returned web results, and compare the returned definitions to the original definition in the MCQ. This automated method for answering questions is then augmented with a model, based on human performance data from crowdsourced question sets, for analysis of question difficulty as well as the discrimination power of the non-answer alternatives. The crowdsourced question sets come from PeerWise, an open source online college-level question authoring and answering environment. The goal of this research is to create an automated method to both answer and assess the difficulty of multiple choice inverse definition questions in the domain of introductory biology. The results of this work suggest that human-authored question banks provide useful data for building gold standard human performance models. The methodology for building these performance models has value in other domains that test the difficulty of questions and the quality of the exam takers.