67 research outputs found
InterPoll: Crowd-Sourced Internet Polls
Crowd-sourcing is increasingly being used to provide answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, provide little to help the survey-maker or pollster in obtaining statistically significant results devoid of even the obvious selection biases.
This paper proposes InterPoll, a platform for programming of crowd-sourced polls. Pollsters express polls as embedded LINQ queries and the runtime correctly reasons about uncertainty in those polls, only polling as many people as required to meet statistical guarantees. To optimize the cost of polls, InterPoll performs query optimization, as well as bias correction and power analysis. The goal of InterPoll is to provide a system that can be reliably used for research into marketing, social and political science questions.
This paper highlights some of the existing challenges and how InterPoll is designed to address most of them.
In this paper we summarize some of the work we have already done and give an outline for future work
Recommended from our members
Experimental Evidence of Chaotic Dynamics in Computer Hardware ; CU-CS-1031-07
Recommended from our members
Observer Effect and Measurement Bias in Performance Analysis ; CU-CS-1042-08
CodeExp: Explanatory Code Document Generation
Developing models that can automatically generate detailed code explanation
can greatly benefit software maintenance and programming education. However,
existing code-to-text generation models often produce only high-level summaries
of code that do not capture implementation-level choices essential for these
scenarios. To fill in this gap, we propose the code explanation generation
task. We first conducted a human study to identify the criteria for
high-quality explanatory docstring for code. Based on that, we collected and
refined a large-scale code docstring corpus and formulated automatic evaluation
metrics that best match human assessments. Finally, we present a multi-stage
fine-tuning strategy and baseline models for the task. Our experiments show
that (1) our refined training dataset lets models achieve better performance in
the explanation generation tasks compared to larger unrefined data (15x
larger), and (2) fine-tuned models can generate well-structured long docstrings
comparable to human-written ones. We envision our training dataset,
human-evaluation protocol, recommended metrics, and fine-tuning strategy can
boost future code explanation research. The code and annotated data are
available at https://github.com/subercui/CodeExp.Comment: Accepted in Findings of EMNLP 202
- …