15 research outputs found

    Perspectives on Large Language Models for Relevance Judgment

    When asked, current large language models (LLMs) like ChatGPT claim that they can assist us with relevance judgments. Many researchers think this would not lead to credible IR research. In this perspective paper, we discuss possible ways for LLMs to assist human experts along with concerns and issues that arise. We devise a human-machine collaboration spectrum that allows categorizing different relevance judgment strategies, based on how much the human relies on the machine. For the extreme point of "fully automated assessment", we further include a pilot experiment on whether LLM-based relevance judgments correlate with judgments from trained human assessors. We conclude the paper by providing two opposing perspectives, for and against the use of LLMs for automatic relevance judgments, and a compromise perspective, informed by our analyses of the literature, our preliminary experimental evidence, and our experience as IR researchers. We hope to start a constructive discussion within the community to avoid a stalemate during review, where work is damned if it uses LLMs for evaluation and damned if it doesn't.

    Budget-Feasible Mechanism Design for Non-Monotone Submodular Objectives: Offline and Online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible and O(1)-approximate mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). Ours is the first mechanism for the problem where, crucially, the agents are not ordered with respect to their marginal value per cost. This allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about non-trivial approximation guarantees in polynomial time, our results are asymptotically best possible. Comment: Accepted to EC 201

    Budget-feasible mechanism design for non-monotone submodular objectives: Offline and online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible and O(1)-approximation mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Since the introduction of the problem by Singer [40], obtaining efficient mechanisms for objectives that go beyond the class of monotone submodular functions has been elusive. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). Ours is the first mechanism for the problem where, crucially, the agents are not ordered according to their marginal value per cost. This allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present, e.g., at most k agents can be selected. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about non-trivial approximation guaran

    Crowdsourcing for linguistic field research and e-learning

    Crowdsourcing denotes the transfer of work commonly carried out by single humans to a large group of people. Nowadays, crowdsourcing is employed for many purposes: people contribute their knowledge to Wikipedia, researchers predict diseases from data on Twitter, and players solve protein folding problems in games. Still, there are areas for which the application of crowdsourcing has not yet been investigated thoroughly. This thesis examines crowdsourcing for two such areas: empirical research in human-oriented sciences, with a focus on linguistic field research, and e-learning. Human-oriented sciences such as linguistics, sociology, or art history depend on empirical research. For example, in traditional linguistic field research, researchers ask questions and fill in forms. Such methods are time-consuming, costly, and not free of biases. This thesis proposes the application of crowdsourcing techniques to overcome these disadvantages and to make empirical research more efficient. To this end, the concept of a generic market for trading with symbolic goods and speculating on their characteristics in a playful manner, called Agora, is introduced. Agora aims to be an "operating system" for social media applications gathering data. Furthermore, the Web-based crowdsourcing platform metropolitalia has been established for hosting two social media applications based upon Agora: Mercato Linguistico and Poker Parole. These applications have been conceived as part of this thesis for gathering complementary data and meta-data on Italian language varieties. Mercato Linguistico incites players to express their own knowledge or beliefs, while Poker Parole incites players to make conjectures on the contributions of others. Thereby, the primary meta-data collected with Mercato Linguistico are enriched with secondary, reflexive meta-data from Poker Parole, which are needed for studies on the perception of languages. An evaluation of the data gathered on metropolitalia demonstrates the viability of the market-based approach of Agora and highlights its strengths.
    E-learning is concerned with the use of digital technology for learning, nowadays especially via the Internet. This thesis investigates how e-learning applications can support students with association-based learning and lecturers with teaching. For that, a game-like e-learning tool named Termina is proposed in this thesis. From the data collected with Termina, association maps are constructed. An association map is a simplified version of a concept map, in which concepts are represented as rectangles and relationships between concepts as links. Association maps constitute an abstract summary of a topic. Students profit from the association maps' availability, learn from other participating students, and can track their own learning progress. Lecturers gain insights into the knowledge and into potential misunderstandings of their students. An evaluation of Termina and the collected data over the course of a university course demonstrates Termina's usefulness for both students and lecturers. The main contributions of this thesis are (1) a literature review of collective intelligence, crowdsourcing, and related fields, (2) a model of a generic market for gathering data for empirical research efficiently, (3) two applications based on this model and results of an evaluation of the data gathered with them, (4) the game-like e-learning tool Termina together with insights from its evaluation, and (5) a generic software architecture for all aforementioned applications.

    Budget-Feasible Mechanism Design for Non-monotone Submodular Objectives: Offline and Online

    The framework of budget-feasible mechanism design studies procurement auctions where the auctioneer (buyer) aims to maximize his valuation function subject to a hard budget constraint. We study the problem of designing truthful mechanisms that have good approximation guarantees and never pay the participating agents (sellers) more than the budget. We focus on the case of general (non-monotone) submodular valuation functions and derive the first truthful, budget-feasible, and O(1)-approximation mechanisms that run in polynomial time in the value query model, for both offline and online auctions. Prior to our work, the only O(1)-approximation mechanism known for non-monotone submodular objectives required an exponential number of value queries. At the heart of our approach lies a novel greedy algorithm for non-monotone submodular maximization under a knapsack constraint. Our algorithm builds two candidate solutions simultaneously (to achieve a good approximation), yet ensures that agents cannot jump from one solution to the other (to implicitly enforce truthfulness). The fact that in our mechanism the agents are not ordered according to their marginal value per cost allows us to appropriately adapt these ideas to the online setting as well. To further illustrate the applicability of our approach, we also consider the case where additional feasibility constraints are present, for example, at most k agents can be selected. We obtain O(p)-approximation mechanisms for both monotone and non-monotone submodular objectives, when the feasible solutions are independent sets of a p-system. With the exception of additive valuation functions, no mechanisms were known for this setting prior to our work. Finally, we provide lower bounds suggesting that, when one cares about nontrivial approximation guarantees in polynomial time, our results are, asymptotically, the best possible

    Plattformbasierte Erwerbsarbeit: Stand der empirischen Forschung

    This study summarizes the current state of empirical research in economics and social sciences on contract work mediated or provided by online platforms (online contract work). Based on a systematic literature review, this study discusses results on the diffusion of online platforms, the characteristics of workers as well as the motives for labor supply and the working conditions. The study considers services which can be provided from anywhere via the internet (online labor markets), as well as services which are mediated by online platforms but are provided at a predefined location (mobile labor markets). Besides a summary of existing research findings on the topic, this study also evaluates the quality of the empirical methods. The focus lies on the applied methods for data collection as well as the statistical analyses of the data. As a result, the current state of knowledge on online contract work can be regarded as fragmented. While for the United States several studies already exist on the diffusion of online contract work, there is a paucity of corresponding studies in Europe. A considerably higher number of studies deals with other aspects of online contract work, out of which, however, only a few focus on mobile labor markets. Administrative statistics and large-scale representative surveys do not yet contain information on online contract work. Existing research on the topic is therefore based on a variety of data sources and methodological approaches, which makes it difficult to compare empirical findings.