
    Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems

    Crowdsourcing systems commonly face the problem of aggregating multiple judgments provided by potentially unreliable workers. In addition, several aspects of the design of efficient crowdsourcing processes, such as defining workers' bonuses, fair prices and time limits for tasks, require knowledge of the likely duration of the task at hand. Bringing this together, in this work we introduce a new time-sensitive Bayesian aggregation method that simultaneously estimates a task's duration and obtains reliable aggregations of crowdsourced judgments. Our method, called BCCTime, builds on the key insight that the time taken by a worker to perform a task is an important indicator of the likely quality of the produced judgment. To capture this, BCCTime uses latent variables to represent the uncertainty about the workers' completion times, the tasks' durations and the workers' accuracy. To relate the quality of a judgment to the time a worker spends on a task, our model assumes that each task is completed within a latent time window within which all workers with a propensity to genuinely attempt the labelling task (i.e., no spammers) are expected to submit their judgments. In contrast, workers with a lower propensity to valid labelling, such as spammers, bots or lazy labellers, are assumed to perform tasks considerably faster or slower than the time required by normal workers. Specifically, we use efficient message-passing Bayesian inference to learn approximate posterior probabilities of (i) the confusion matrix of each worker, (ii) the propensity to valid labelling of each worker, (iii) the unbiased duration of each task and (iv) the true label of each task. Using two real-world public datasets for entity linking tasks, we show that BCCTime produces up to 11% more accurate classifications and up to 100% more informative estimates of a task's duration compared to state-of-the-art methods.
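
    The following is a minimal generative sketch of the modelling assumptions described in the abstract (a latent task time window, a per-worker propensity to valid labelling, and a per-worker confusion matrix). It is an illustration only, not the authors' BCCTime model or its message-passing inference; all parameter names and values are hypothetical.

    # Sketch of the generative assumptions: genuine workers answer inside the
    # task's latent time window via an accurate confusion matrix; spammers,
    # bots and lazy labellers answer outside it, close to uniformly at random.
    import numpy as np

    rng = np.random.default_rng(0)
    K = 2                                   # number of label classes (hypothetical)
    true_label = rng.integers(K)            # latent true label of one task
    task_window = (30.0, 120.0)             # latent time window in seconds (hypothetical)

    def simulate_worker(genuine: bool):
        """Draw one worker's completion time and judgment."""
        if genuine:
            # Genuine workers finish inside the task's latent time window.
            t = rng.uniform(*task_window)
            confusion = np.full((K, K), 0.1 / (K - 1))
            np.fill_diagonal(confusion, 0.9)
        else:
            # Low-propensity workers finish far outside the window.
            t = rng.choice([rng.uniform(1, 10), rng.uniform(300, 600)])
            confusion = np.full((K, K), 1.0 / K)
        judgment = rng.choice(K, p=confusion[true_label])
        return t, judgment

    workers = [simulate_worker(genuine=rng.random() < 0.8) for _ in range(10)]
    print(true_label, workers)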

    A crowdsourcing study of perceived credibility of Reddit content based on a novel data scraping tool

    The internet is a forever growing trove of information. It is well known, both among the public and in academia, that in recent years unhindered access to the internet has given everyone more opportunities to post fake or misleading content. Furthermore, the growing interest in artificial intelligence and large language models has shown how easily people can be served misleading content, possibly by mistake, through AI tools. These tools are trained on content from the world wide web, but one might wonder: how do these AI tools know which content is believable and which is not? The current thesis aims to contribute to the credibility literature on online media through a crowdsourcing survey based on content gathered from the Reddit social media platform with a tool designed and built for the purposes of this thesis. The data gathering tool was designed to scrape Reddit and store historical data from Reddit posts, something that no other tool has done before, and using the scraped data it offers the possibility of creating surveys for assessing the credibility of Reddit posts. The thesis aimed to identify which features of Reddit posts affect credibility. Once the survey participants had assessed the credibility of multiple Reddit posts, both a quantitative and a qualitative analysis were conducted on the results. Findings show that popularity does not affect perceived credibility, whereas topic familiarity and experience of using Reddit have a weak positive effect on credibility. Furthermore, agreeable content and content that is easy to understand also affected credibility positively, whereas content that contained jargon, or that participants disagreed with or found offensive, affected credibility negatively. Among other findings, this thesis defines three types of credibility evaluation, "shallow evaluation", "in-depth evaluation" and "experience-based evaluation", which can help future research in understanding and designing credibility studies. The thesis brings several contributions to the literature. Firstly, it both complements and challenges past findings in credibility research of online media. Furthermore, the research puts forward the three levels of credibility evaluation, which can be used and analysed more thoroughly in future research. Finally, the artifact that was built for the study, the open-source data gathering tool, offers a new way for researchers to gather data from Reddit; it also makes it possible to store the historical data of a post, something that no other tool does, and enables new avenues for research in this direction.
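
    The following is a minimal sketch of the core idea of storing historical data for a Reddit post: periodically fetching a post's public JSON and appending a timestamped snapshot of its score and comment count. It is a hypothetical illustration of the approach, not the thesis's open-source tool; the post id, database name and polling interval are placeholders.

    # Periodically snapshot a Reddit post so its history can be analysed later.
    import sqlite3
    import time

    import requests

    POST_ID = "abc123"          # hypothetical Reddit post id
    HEADERS = {"User-Agent": "credibility-study-sketch/0.1"}

    db = sqlite3.connect("reddit_history.db")
    db.execute(
        "CREATE TABLE IF NOT EXISTS snapshots "
        "(post_id TEXT, taken_at REAL, score INTEGER, num_comments INTEGER)"
    )

    def snapshot(post_id: str) -> None:
        """Fetch the post's public JSON and store one timestamped snapshot."""
        url = f"https://www.reddit.com/comments/{post_id}.json"
        data = requests.get(url, headers=HEADERS, timeout=10).json()
        post = data[0]["data"]["children"][0]["data"]
        db.execute(
            "INSERT INTO snapshots VALUES (?, ?, ?, ?)",
            (post_id, time.time(), post["score"], post["num_comments"]),
        )
        db.commit()

    # Poll a few times; a real tool would schedule this over days or weeks.
    for _ in range(3):
        snapshot(POST_ID)
        time.sleep(60)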

    Crowdsourced Live Streaming over the Cloud

    Empowered by today's rich tools for media generation and distribution, and by convenient Internet access, crowdsourced streaming generalizes the single-source streaming paradigm by including massive numbers of contributors for a video channel. It calls for a joint optimization along the path from crowdsourcers, through streaming servers, to the end-users to minimize the overall latency. The dynamics of the video sources, together with the globalized request demands and the high computation demand from each sourcer, make crowdsourced live streaming challenging even with powerful support from modern cloud computing. In this paper, we present a generic framework that facilitates a cost-effective cloud service for crowdsourced live streaming. Through adaptive leasing, cloud servers can be provisioned at a fine granularity to accommodate geo-distributed video crowdsourcers. We present an optimal solution to deal with service migration among cloud instances of diverse lease prices, which also addresses the impact of location on streaming quality. To understand the performance of the proposed strategies in the real world, we have built a prototype system running over PlanetLab and the Amazon/Microsoft clouds. Our extensive experiments demonstrate the effectiveness of our solution in terms of deployment cost and streaming quality.
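
    The following is a back-of-the-envelope sketch of the trade-off such a framework has to weigh: whether migrating a channel to a cheaper or closer cloud instance pays off, given a one-time migration cost. This greedy check is an illustration only, not the paper's optimal migration algorithm; all prices, latencies and weights are hypothetical.

    # Compare the remaining-lease saving of a candidate instance against the
    # one-time cost of migrating a channel's streaming service to it.
    from dataclasses import dataclass

    @dataclass
    class Instance:
        name: str
        lease_price: float      # $ per hour
        latency_ms: float       # expected latency to the channel's viewers

    LATENCY_WEIGHT = 0.002      # hypothetical $ penalty per ms per streaming hour

    def hourly_cost(inst: Instance) -> float:
        """Combine lease price and a latency penalty into one hourly cost."""
        return inst.lease_price + LATENCY_WEIGHT * inst.latency_ms

    def should_migrate(current: Instance, candidate: Instance,
                       remaining_hours: float, migration_cost: float) -> bool:
        """Migrate only if the saving over the remaining lease exceeds the one-time cost."""
        saving = (hourly_cost(current) - hourly_cost(candidate)) * remaining_hours
        return saving > migration_cost

    current = Instance("us-east-on-demand", lease_price=0.40, latency_ms=120)
    candidate = Instance("eu-west-reserved", lease_price=0.25, latency_ms=60)
    print(should_migrate(current, candidate, remaining_hours=10, migration_cost=1.0))  # True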

    PEER RATINGS AND ASSESSMENT QUALITY IN CROWD-BASED INNOVATION PROCESSES

    Social networks – whether public or in enterprises – regularly ask users to rate their peers’ content using different voting techniques. When employed in innovation challenges, these rating procedures are part of an open, interactive, and continuous engagement among customers, employees, or citizens. In this regard, assessment accuracy (i.e., correctly identifying good and bad ideas) in crowdsourced evaluation processes may be influenced by the display of peer ratings. While it could sometimes be useful for users to follow their peers, it is not entirely clear under which circumstances this actually holds true. Thus, in this research-in-progress article, we propose a study design to systematically investigate the effect of peer ratings on assessment accuracy in crowdsourced idea evaluation processes. Based on the elaboration likelihood model and social psychology, we develop a research model that incorporates the mediating factors extraversion, locus of control, as well as peer rating quality (i.e., the ratings’ correlation with the evaluated content’s actual quality). We suggest that the availability of peer ratings decreases assessment accuracy and that rating quality, extraversion, as well as an internal locus of control mitigate this effect.
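
    The following is a small sketch that makes the two constructs above concrete on toy data: peer rating quality as the correlation between displayed ratings and the ideas' actual quality, and assessment accuracy as the share of ideas a rater correctly classifies as good or bad. The numbers are hypothetical and only illustrate the definitions, not the proposed study's measures.

    # Toy computation of rating quality and assessment accuracy.
    import numpy as np

    actual_quality = np.array([0.9, 0.2, 0.7, 0.4, 0.8])   # ground-truth idea quality
    peer_ratings   = np.array([4.0, 2.0, 4.0, 3.0, 5.0])   # displayed peer ratings
    user_verdicts  = np.array([1,   0,   1,   1,   1])      # rater's good(1)/bad(0) calls

    # Rating quality: Pearson correlation between peer ratings and actual quality.
    rating_quality = np.corrcoef(peer_ratings, actual_quality)[0, 1]

    # Assessment accuracy: agreement with a threshold on actual quality.
    good_ideas = (actual_quality >= 0.5).astype(int)
    assessment_accuracy = (user_verdicts == good_ideas).mean()

    print(f"rating quality = {rating_quality:.2f}")
    print(f"assessment accuracy = {assessment_accuracy:.2f}")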