Globally Optimal Crowdsourcing Quality Management
We study crowdsourcing quality management, that is, given worker responses to
a set of tasks, our goal is to jointly estimate the true answers for the tasks,
as well as the quality of the workers. Prior work on this problem relies
primarily on applying Expectation-Maximization (EM) on the underlying maximum
likelihood problem to estimate true answers as well as worker quality.
Unfortunately, EM only provides a locally optimal solution rather than a
globally optimal one. Other solutions to the problem (that do not leverage EM)
fail to provide global optimality guarantees as well. In this paper, we focus
on filtering, where tasks require the evaluation of a yes/no predicate, and
rating, where tasks elicit integer scores from a finite domain. We design
algorithms for finding the global optimal estimates of correct task answers and
worker quality for the underlying maximum likelihood problem, and characterize
the complexity of these algorithms. Our algorithms conceptually consider all
mappings from tasks to true answers (typically a very large number), leveraging
two key ideas to reduce, by several orders of magnitude, the number of mappings
under consideration, while preserving optimality. We also demonstrate that
these algorithms often find more accurate estimates than EM-based algorithms.
This paper makes an important contribution towards understanding the inherent
complexity of globally optimal crowdsourcing quality management.
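The idea of conceptually considering all mappings from tasks to true answers can be illustrated with a brute-force sketch for the filtering (yes/no) case. This is not the paper's algorithm: it does none of the pruning the abstract describes, and it assumes a simplified model in which each worker has a single accuracy parameter constrained to be at least 0.5. The names `responses`, `log_likelihood`, and `brute_force_mle` are hypothetical.

```python
from itertools import product
from math import log


def log_likelihood(responses, truth):
    """Log-likelihood of a candidate truth mapping.

    Assumed model (not taken from the paper): each worker answers correctly
    with one accuracy p >= 0.5, independently per task. For a fixed mapping,
    the maximum likelihood p is the agreement rate clipped to 0.5.
    """
    total = 0.0
    for answers in responses.values():  # answers: {task: 0 or 1}
        n = len(answers)
        agree = sum(1 for task, a in answers.items() if truth[task] == a)
        p = max(agree / n, 0.5)
        if agree:
            total += agree * log(p)
        if n - agree:
            # p == 1.0 only when agree == n, so log(1 - p) is never log(0) here
            total += (n - agree) * log(1.0 - p)
    return total


def brute_force_mle(responses, tasks):
    """Enumerate all 2^len(tasks) mappings and keep the most likely one."""
    best_truth, best_ll = None, float("-inf")
    for labels in product((0, 1), repeat=len(tasks)):
        truth = dict(zip(tasks, labels))
        ll = log_likelihood(responses, truth)
        if ll > best_ll:
            best_truth, best_ll = truth, ll
    return best_truth, best_ll


responses = {
    "w1": {"a": 1, "b": 0, "c": 1},
    "w2": {"a": 1, "b": 0, "c": 1},
    "w3": {"a": 0, "b": 0, "c": 1},
}
truth, ll = brute_force_mle(responses, ["a", "b", "c"])
```

The enumeration grows exponentially with the number of tasks, which is why reducing the number of mappings under consideration matters in practice.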
Exploration of the Genetic Epidemiology of Asthma: A Review, With a Focus on Prevalence in Children and Adolescents in the Caribbean
Asthma is a chronic disease caused by the inflammation of the main air passages of the lungs. This paper outlines a review of the published literature on asthma. While a few studies show a trend of rising asthma cases in the Caribbean region, even fewer have explored the genetic epidemiological factors of asthma. This is a literature review that seeks to summarize the body of knowledge on the epidemiology of asthma. Specifically, the major objective of the literature review is to provide a unified information base on the current state of factors involved in the genetic epidemiology of asthma. The review is a simple, yet detailed summary of the literature sources and their methodology and findings on the genetic epidemiology of asthma. Further, it seeks to direct this effort to the Caribbean region. The paper then reviews a summarized and synthesized collection of the body of previous research. Of specific interest are peer-reviewed sources that have been published in recent times. The paper provides more recent insight and recapitulates the previous research, while tracing the intellectual progress of the debate. Where possible, reviewing and discussing the results of the previous literature, this review singles out the gaps and potential future research directions for studying the genetic epidemiology of asthma. Overall, we hope to contribute to a more synthesized knowledge and improved understanding of the previous literature and future potential direction of genetic and epidemiological asthma research.
Incidence of Pediatric Cannabis Exposure Among Children and Teenagers Aged 0 to 19 Years Before and After Medical Marijuana Legalization in Massachusetts
Importance Pediatric health care contacts due to cannabis exposure increased in Colorado and Washington State after cannabis (marijuana) policies became more liberal, but evidence from other US states is limited.
Objective To document the incidence of pediatric cannabis exposure cases reported to the Regional Center for Poison Control and Prevention (RPC) before and after medical marijuana legalization (MML) in Massachusetts.
Design, Setting, and Participants Cross-sectional comparison of pediatric cannabis exposure cases 4 years before and after MML in Massachusetts. The exposure cases included those of 218 children and teenagers aged between 0 and 19 years, as reported to the RPC from 2009 to 2016. Census data were used to determine the incidence. Data analysis was performed from November 12, 2018, to July 20, 2019.
Exposure Cannabis products.
Main Outcomes and Measures Incidence of RPC-reported cannabis exposure cases, both single substance and polysubstance, for the age group of 0 to 19 years, and cannabis product type, coingestants, and clinical effects.
Results During the 8-year study period (2009-2016), the RPC received 218 calls involving cannabis exposure (98 single substance, 120 polysubstance) in children and teenagers aged 0 to 19 years, representing 0.15% of all RPC calls in that age group for that period. Of the total exposure cases, males accounted for 132 (60.6%) and females 86 (39.4%). The incidence of single-substance cannabis calls increased from 0.4 per 100 000 population before MML to 1.1 per 100 000 population after (incidence rate ratio, 2.4; 95% CI, 1.5-3.9), a 140% increase. The age group of 15 to 19 years had the highest frequency of RPC-reported cannabis exposures (178 calls [81.7%]). The proportion of all RPC calls due to single-substance cannabis exposure increased overall for all age groups from 29 before MML to 69 afterward. Exposure to edible products increased after MML for most age groups.
Conclusions and Relevance Pediatric cannabis exposure cases increased in Massachusetts after medical marijuana was legalized in 2012, despite using childproof packaging and warning labels. This study provides additional evidence suggesting that MML may be associated with an increase in cannabis exposure cases among very young children, and extends prior work showing that teenagers are also experiencing increased cannabis-related health system contacts via the RPC. Additional efforts are needed to keep higher-potency edible products and concentrated extracts from children and teenagers, especially considering the MML and retail cannabis sales in an increasing number of US states.
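The arithmetic linking the figures in the Results can be checked with a short sketch. It uses only numbers stated in the abstract; the helper names `percent_increase_from_ratio` and `share` are hypothetical.

```python
def percent_increase_from_ratio(rate_ratio):
    """Convert an incidence rate ratio into a percent increase."""
    return (rate_ratio - 1.0) * 100.0


def share(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100.0 * count / total, 1)


# Reported incidence rate ratio of 2.4 corresponds to the stated 140% increase.
increase = round(percent_increase_from_ratio(2.4))

# Shares of the 218 exposure cases reported in the abstract.
male_share = share(132, 218)
female_share = share(86, 218)
teen_share = share(178, 218)
```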
Limitations of Majority Agreement in Crowdsourced Image Interpretation
Crowdsourcing can efficiently complete tasks that are difficult to automate, but the quality of crowdsourced data is tricky to evaluate. Algorithms to grade volunteer work often assume that all tasks are similarly difficult, an assumption that is frequently false. We use a cropland identification game with over 2,600 participants and 165,000 unique tasks to investigate how best to evaluate the difficulty of crowdsourced tasks and to what extent this is possible based on volunteer responses alone. Inter-volunteer agreement exceeded 90% for about 80% of the images and was negatively correlated with volunteer-expressed uncertainty about image classification. A total of 343 relatively difficult images were independently classified as cropland, non-cropland or impossible by two experts. The experts disagreed weakly (one said impossible while the other rated as cropland or non-cropland) on 27% of the images, but disagreed strongly (cropland vs. non-cropland) on only 7%. Inter-volunteer disagreement increased significantly with inter-expert disagreement. While volunteers agreed with expert classifications for most images, over 20% would have been mis-categorized if only the volunteers' majority vote was used. We end with a series of recommendations for managing the challenges posed by heterogeneous tasks in crowdsourcing campaigns.
Gluon helicity from global analysis of experimental data and lattice QCD Ioffe time distributions
We perform a new global analysis of spin-dependent parton distribution
functions with the inclusion of Ioffe time pseudo-distributions computed in
lattice QCD (LQCD), which are directly sensitive to the gluon helicity
distribution, Δg. These lattice data have an analogous relationship to
parton distributions as do experimental cross sections, and can be readily
included in global analyses. We focus in particular on the constraining
capability of current LQCD data on the sign of Δg at intermediate
parton momentum fractions x, which was recently brought into question by
analysis of data in the absence of parton positivity constraints. We find that
present LQCD data cannot discriminate between positive and negative Δg
solutions, although significant changes in the solutions for both the gluon and
quark sectors are observed. Comment: 24 pages, 7 figures
Efficient crowdsourcing for multi-class labeling
Crowdsourcing systems like Amazon's Mechanical Turk have emerged as an effective large-scale human-powered platform for performing tasks in domains such as image classification, data entry, recommendation, and proofreading. Since workers are low-paid (a few cents per task) and tasks performed are monotonous, the answers obtained are noisy and hence unreliable. To obtain reliable estimates, it is essential to utilize appropriate inference algorithms (e.g. majority voting) coupled with structured redundancy through task assignment. Our goal is to obtain the best possible trade-off between reliability and redundancy. In this paper, we consider a general probabilistic model for noisy observations for crowd-sourcing systems and pose the problem of minimizing the total price (i.e. redundancy) that must be paid to achieve a target overall reliability. Concretely, we show that it is possible to obtain an answer to each task correctly with probability 1-ε as long as the redundancy per task is O((K/q) log (K/ε)), where each task can have any of the K distinct answers equally likely, and q is the crowd-quality parameter that is defined through a probabilistic model. Further, effectively this is the best possible redundancy-accuracy trade-off any system design can achieve. Such a single-parameter crisp characterization of the (order-)optimal trade-off between redundancy and reliability has various useful operational consequences. Further, we analyze the robustness of our approach in the presence of adversarial workers and provide a bound on their influence on the redundancy-accuracy trade-off.
Unlike recent prior work [GKM11, KOS11, KOS11], our result applies to non-binary (i.e. K>2) tasks. In effect, we utilize algorithms for binary tasks (with inhomogeneous error model unlike that in [GKM11, KOS11, KOS11]) as key subroutine to obtain answers for K-ary tasks. Technically, the algorithm is based on low-rank approximation of weighted adjacency matrix for a random regular bipartite graph, weighted according to the answers provided by the workers. National Science Foundation (U.S.)
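To get a feel for how the stated redundancy bound scales, the expression inside the O(·) can be evaluated directly. This is only a sketch: the big-O notation hides a constant that the abstract does not give, so the parameter `c` below (defaulting to 1) is an assumption, as is the function name `redundancy_scale`. The result shows relative scaling, not an actual number of workers to hire.

```python
from math import ceil, log


def redundancy_scale(K, q, eps, c=1.0):
    """Evaluate c * (K / q) * log(K / eps), rounded up.

    K: number of distinct answers per task
    q: crowd-quality parameter (assumed positive here)
    eps: target per-task error probability, so success probability is 1 - eps
    c: unknown constant hidden by the O(.) notation (assumed, not from the source)
    """
    if K < 2 or q <= 0 or not 0 < eps < 1:
        raise ValueError("expected K >= 2, q > 0 and 0 < eps < 1")
    return ceil(c * (K / q) * log(K / eps))


baseline = redundancy_scale(K=2, q=0.5, eps=0.1)
lower_quality = redundancy_scale(K=2, q=0.25, eps=0.1)
more_classes = redundancy_scale(K=4, q=0.5, eps=0.1)
```

Halving the crowd quality q roughly doubles the value, while the target error ε enters only logarithmically.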
Leveraging Affect Transfer Learning for Behavior Prediction in an Intelligent Tutoring System
In the context of building an intelligent tutoring system (ITS), which
improves student learning outcomes by intervention, we set out to improve
prediction of student problem outcome. In essence, we want to predict the
outcome of a student answering a problem in an ITS from a video feed by
analyzing their face and gestures. For this, we present a novel transfer
learning facial affect representation and a user-personalized training scheme
that unlocks the potential of this representation. We model the temporal
structure of video sequences of students solving math problems using a
recurrent neural network architecture. Additionally, we extend the largest
dataset of student interactions with an intelligent online math tutor by a
factor of two. Our final model, coined ATL-BP (Affect Transfer Learning for
Behavior Prediction) achieves an increase in mean F-score over state-of-the-art
of 45% on this new dataset in the general case and 50% in a more challenging
leave-users-out experimental setting when we use a user-personalized training
scheme.