Time-Sensitive Bayesian Information Aggregation for Crowdsourcing Systems
Crowdsourcing systems commonly face the problem of aggregating multiple
judgments provided by potentially unreliable workers. In addition, several
aspects of the design of efficient crowdsourcing processes, such as defining
worker's bonuses, fair prices and time limits of the tasks, involve knowledge
of the likely duration of the task at hand. Bringing this together, in this
work we introduce a new time-sensitive Bayesian aggregation method that
simultaneously estimates a task's duration and obtains reliable aggregations of
crowdsourced judgments. Our method, called BCCTime, builds on the key insight
that the time taken by a worker to perform a task is an important indicator of
the likely quality of the produced judgment. To capture this, BCCTime uses
latent variables to represent the uncertainty about the workers' completion
time, the tasks' duration and the workers' accuracy. To relate the quality of a
judgment to the time a worker spends on a task, our model assumes that each
task is completed within a latent time window within which all workers with a
propensity to genuinely attempt the labelling task (i.e., no spammers) are
expected to submit their judgments. In contrast, workers with a lower
propensity to valid labeling, such as spammers, bots or lazy labelers, are
assumed to perform tasks considerably faster or slower than the time required
by normal workers. Specifically, we use efficient message-passing Bayesian
inference to learn approximate posterior probabilities of (i) the confusion
matrix of each worker, (ii) the propensity to valid labeling of each worker,
(iii) the unbiased duration of each task and (iv) the true label of each task.
Using two real-world public datasets for entity linking tasks, we show that
BCCTime produces up to 11% more accurate classifications and up to 100% more
informative estimates of a task's duration compared to state-of-the-art
methods.
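The abstract's key idea, that completion time signals judgment quality, can be illustrated with a much simpler stand-in for BCCTime's Bayesian machinery: estimate a robust per-task time window, discard judgments submitted far outside it, and aggregate the rest by majority vote. This is a minimal sketch, not the paper's message-passing inference; the window rule (median plus or minus k times the MAD) and all function names are assumptions for illustration.

```python
from collections import Counter, defaultdict
from statistics import median

def robust_window(times, k=3.0):
    """Estimate a plausible completion-time window as median +/- k * MAD."""
    m = median(times)
    mad = median(abs(t - m) for t in times) or 1e-9
    return m - k * mad, m + k * mad

def aggregate(judgments, k=3.0):
    """judgments: list of (worker_id, task_id, label, seconds_spent).

    Returns {task_id: label} by majority vote, ignoring judgments whose
    completion time falls outside the per-task robust window -- a crude
    proxy for BCCTime's latent propensity-to-valid-labelling variable.
    """
    by_task = defaultdict(list)
    for worker, task, label, sec in judgments:
        by_task[task].append((worker, label, sec))
    result = {}
    for task, rows in by_task.items():
        lo, hi = robust_window([sec for _, _, sec in rows], k)
        votes = [label for _, label, sec in rows if lo <= sec <= hi]
        votes = votes or [label for _, label, _ in rows]  # fall back if all filtered
        result[task] = Counter(votes).most_common(1)[0][0]
    return result
```

With three genuine workers taking around 30 seconds (labels X, X, Y) and two one-second spammers voting Y, plain majority vote returns Y, while the time-filtered vote recovers X.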
A crowdsourcing study of perceived credibility of Reddit content based on a novel data scraping tool
The internet is an ever-growing trove of information. It is well known, both among the public and in academia, that in recent years unhindered access to the internet has given everyone more opportunities to post fake or misleading content. Furthermore, the growing interest in artificial intelligence and large language models has shown how easily people can be provided, possibly by mistake, with misleading content through AI tools. These tools are trained on content from the world wide web, but one might wonder: how might these AI tools know which content is believable and which is not?
The current thesis aims to contribute to the credibility literature on online media through a crowdsourcing survey based on content gathered from the Reddit social media platform using a tool designed and built for the purposes of this thesis. The data-gathering tool was designed to scrape Reddit and store historical data from Reddit posts, something no other tool has done before; using the scraped data, it offers the possibility of creating surveys for assessing the credibility of Reddit posts.
The thesis aimed to find which features of Reddit posts affect credibility. Once the survey participants had assessed the credibility of multiple Reddit posts, both a quantitative and a qualitative analysis were conducted on the results. Findings show that popularity does not affect perceived credibility, whereas topic familiarity and experience of using Reddit have a weak positive effect on it. Furthermore, agreeable content and content that is easy to understand also affected credibility positively, whereas content that contained jargon, or that participants disagreed with or found offensive, affected credibility negatively. Among other findings, this thesis defines three types of credibility evaluation, “shallow evaluation”, “in-depth evaluation” and “experience-based evaluation”, that can help future research in understanding and designing credibility studies.
The thesis makes several contributions to the literature. Firstly, it both complements and challenges past findings in credibility research on online media. Furthermore, the research puts forward the three levels of credibility evaluation, which can be used in future research and analyzed more thoroughly. Finally, the artifact built for the study, the open-source data-gathering tool, offers a new way for researchers to gather data from Reddit; it also makes it possible to store a post's historical data, something no other tool does, and opens possible new avenues for research in this direction.
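The tool's distinguishing feature, storing historical data of a post rather than a single snapshot, amounts to appending timestamped rows of a post's mutable metrics. The thesis does not describe its implementation; the following is a hypothetical sketch of that idea using SQLite, where the `post` dict and both function names are assumptions (a real scraper would fill the dict from the Reddit API, e.g. via PRAW).

```python
import sqlite3
import time

def snapshot(conn, post):
    """Append one timestamped snapshot of a post's mutable metrics.

    `post` is a dict with hypothetical keys ("id", "score",
    "num_comments"); values would come from a live Reddit fetch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "post_id TEXT, fetched_at REAL, score INTEGER, num_comments INTEGER)"
    )
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?)",
        (post["id"], time.time(), post["score"], post["num_comments"]),
    )
    conn.commit()

def history(conn, post_id):
    """Return the stored time series for one post, oldest first."""
    cur = conn.execute(
        "SELECT fetched_at, score, num_comments FROM snapshots "
        "WHERE post_id = ? ORDER BY fetched_at",
        (post_id,),
    )
    return cur.fetchall()
```

Running `snapshot` on a schedule yields a score/comment time series per post, which is exactly the kind of historical record a single API call cannot reconstruct.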
Crowdsourced Live Streaming over the Cloud
Empowered by today's rich tools for media generation and distribution, and
by convenient Internet access, crowdsourced streaming generalizes the
single-source streaming paradigm by including massive numbers of contributors
for a video channel. It calls for joint optimization along the path from
crowdsourcers, through streaming servers, to the end-users to minimize the
overall latency. The dynamics of the video sources, together with the
globalized request demands and the high computation demand from each sourcer,
make crowdsourced live streaming challenging even with powerful support from
modern cloud computing.
In this paper, we present a generic framework that facilitates a cost-effective
cloud service for crowdsourced live streaming. Through adaptive leasing, the
cloud servers can be provisioned at a fine granularity to accommodate
geo-distributed video crowdsourcers. We present an optimal solution to deal
with service migration among cloud instances with diverse lease prices, which
also addresses the impact of location on streaming quality. To understand the
performance of the proposed strategies in the real world, we have built a
prototype system running over PlanetLab and the Amazon/Microsoft clouds. Our
extensive experiments demonstrate the effectiveness of our solution in terms
of deployment cost and streaming quality.
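The migration problem the abstract alludes to, choosing which cloud instance hosts the stream in each time slot when lease prices differ and switching incurs a cost, has a natural dynamic-programming formulation. The sketch below is an illustrative simplification, not the paper's model: `prices` and `migrate_cost` are assumed inputs, and quality/location effects are ignored.

```python
def min_lease_cost(prices, migrate_cost):
    """prices[t][r]: lease price of region r in time slot t.

    Returns the minimal total cost of hosting the stream across all
    slots, paying `migrate_cost` each time the serving region changes.
    Dynamic program over (slot, region): for each slot we either stay
    in the same region or migrate from the cheapest predecessor.
    """
    n_regions = len(prices[0])
    best = list(prices[0])  # cost of ending slot 0 in each region
    for row in prices[1:]:
        cheapest_prev = min(best)  # best predecessor if we migrate
        best = [
            row[r] + min(best[r], cheapest_prev + migrate_cost)
            for r in range(n_regions)
        ]
    return min(best)
```

With prices `[[1, 5], [5, 1], [5, 1]]` and a migration cost of 2, it is worth paying the switch (total 1 + 2 + 1 + 1 = 5); with a prohibitive migration cost, staying in one region from the start is optimal.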
PEER RATINGS AND ASSESSMENT QUALITY IN CROWD-BASED INNOVATION PROCESSES
Social networks, whether public or in enterprises, regularly ask users to rate their peers’ content using different voting techniques. When employed in innovation challenges, these rating procedures are part of an open, interactive, and continuous engagement among customers, employees, or citizens. In this regard, assessment accuracy (i.e., correctly identifying good and bad ideas) in crowdsourced evaluation processes may be influenced by the display of peer ratings. While it could sometimes be useful for users to follow their peers, it is not entirely clear under which circumstances this actually holds true. Thus, in this research-in-progress article, we propose a study design to systematically investigate the effect of peer ratings on assessment accuracy in crowdsourced idea evaluation processes. Based on the elaboration likelihood model and social psychology, we develop a research model that incorporates the mediating factors extraversion, locus of control, and peer rating quality (i.e., the ratings’ correlation with the evaluated content’s actual quality). We suggest that the availability of peer ratings decreases assessment accuracy and that rating quality, extraversion, and an internal locus of control mitigate this effect.
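The abstract's two quantitative constructs are easy to make concrete: assessment accuracy is the fraction of good/bad judgments matching ground truth, and peer-rating quality is the correlation between ratings and actual idea quality. The helpers below are an illustrative sketch of those definitions only (Pearson correlation is an assumed choice; the article does not specify the coefficient), not the study's measurement instruments.

```python
def assessment_accuracy(judged_good, actually_good):
    """Fraction of ideas whose good/bad judgment matches ground truth."""
    hits = sum(j == a for j, a in zip(judged_good, actually_good))
    return hits / len(actually_good)

def rating_quality(peer_ratings, true_quality):
    """Pearson correlation between peer ratings and true idea quality."""
    n = len(peer_ratings)
    mean_x = sum(peer_ratings) / n
    mean_y = sum(true_quality) / n
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(peer_ratings, true_quality))
    sd_x = sum((x - mean_x) ** 2 for x in peer_ratings) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in true_quality) ** 0.5
    return cov / (sd_x * sd_y)
```

A rating quality near 1 means peers track actual quality closely, the regime in which the authors expect peer ratings to do the least harm to assessment accuracy.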