SurveyMan: Programming and Automatically Debugging Surveys
Surveys can be viewed as programs, complete with logic, control flow, and
bugs. Word choice or the order in which questions are asked can unintentionally
bias responses. Vague, confusing, or intrusive questions can cause respondents
to abandon a survey. Surveys can also have runtime errors: inattentive
respondents can taint results. This effect is especially problematic when
deploying surveys in uncontrolled settings, such as on the web or via
crowdsourcing platforms. Because the results of surveys drive business
decisions and inform scientific conclusions, it is crucial to make sure they
are correct.
We present SurveyMan, a system for designing, deploying, and automatically
debugging surveys. Survey authors write their surveys in a lightweight
domain-specific language aimed at end users. SurveyMan statically analyzes the
survey to provide feedback to survey authors before deployment. It then
compiles the survey into JavaScript and deploys it either to the web or a
crowdsourcing platform. SurveyMan's dynamic analyses automatically find survey
bugs and control for the quality of responses. We evaluate SurveyMan's
algorithms analytically and empirically, demonstrating its effectiveness with
case studies of social science surveys conducted via Amazon's Mechanical Turk.
Comment: Submitted version; accepted to OOPSLA 201
Prioritized Garbage Collection: Explicit GC Support for Software Caches
Programmers routinely trade space for time to increase performance, often in
the form of caching or memoization. In managed languages like Java or
JavaScript, however, this space-time tradeoff is complex. Using more space
translates into higher garbage collection costs, especially at the limit of
available memory. Existing runtime systems provide limited support for
space-sensitive algorithms, forcing programmers into difficult and often
brittle choices about provisioning.
This paper presents prioritized garbage collection, a cooperative programming
language and runtime solution to this problem. Prioritized GC provides an
interface similar to soft references, called priority references, which
identify objects that the collector can reclaim eagerly if necessary. The key
difference is an API for defining the policy that governs when priority
references are cleared and in what order. Application code specifies a priority
value for each reference and a target memory bound. The collector reclaims
references, lowest priority first, until the total memory footprint of the
cache fits within the bound. We use this API to implement a space-aware
least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement
for existing caches, such as Google's Guava library. The garbage collector
automatically grows and shrinks the Sache in response to available memory and
workload with minimal provisioning information from the programmer. Using a
Sache, it is almost impossible for an application to experience a memory leak,
memory pressure, or an out-of-memory crash caused by software caching.
Comment: to appear in OOPSLA 201
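The reclamation policy described in the abstract above (clear references lowest priority first until the cache fits within a target bound) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names Entry and reclaim are hypothetical, sizes are supplied by hand, and nothing here reflects the paper's actual API or how a real collector measures memory.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    key: str
    size: int      # bytes held by the cached object (assumed known)
    priority: int  # higher value = more worth keeping


def reclaim(entries, bound):
    """Simulate clearing entries, lowest priority first, until the
    total size of the remaining entries fits within `bound`."""
    kept = sorted(entries, key=lambda e: e.priority)  # lowest priority first
    total = sum(e.size for e in kept)
    reclaimed = []
    while kept and total > bound:
        victim = kept.pop(0)          # current lowest-priority entry
        total -= victim.size
        reclaimed.append(victim.key)
    return [e.key for e in kept], reclaimed


# Hypothetical cache contents: 120 bytes total against a 60-byte bound.
cache = [Entry("a", 40, 3), Entry("b", 30, 1), Entry("c", 50, 2)]
kept, reclaimed = reclaim(cache, bound=60)
print(kept, reclaimed)
```

For an LRU-style cache, the priority could simply be a last-access timestamp, so that the least recently used entries are cleared first.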
Tea: A High-level Language and Runtime System for Automating Statistical Analysis
Though statistical analyses are centered on research questions and
hypotheses, current statistical analysis tools are not. Users must first
translate their hypotheses into specific statistical tests and then perform API
calls with functions and parameters. To do so accurately requires that users
have statistical expertise. To lower this barrier to valid, replicable
statistical analysis, we introduce Tea, a high-level declarative language and
runtime system. In Tea, users express their study design, any parametric
assumptions, and their hypotheses. Tea compiles these high-level specifications
into a constraint satisfaction problem that determines the set of valid
statistical tests, and then executes them to test the hypothesis. We evaluate
Tea using a suite of statistical analyses drawn from popular tutorials. We show
that Tea generally matches the choices of experts while automatically switching
to non-parametric tests when parametric assumptions are not met. We simulate
the effect of mistakes made by non-expert users and show that Tea automatically
avoids both false negatives and false positives that could be produced by the
application of incorrect statistical tests.
Comment: 11 page
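The idea of deriving the set of valid tests from declared study properties, as in the Tea abstract above, can be approximated with a simple subset check. This is a toy sketch, not Tea's language or solver: the table TESTS, the property names, and the function valid_tests are all assumptions made for illustration, and plain set inclusion stands in for a real constraint satisfaction problem.

```python
# Hypothetical table: each test lists the properties it requires of the data.
TESTS = {
    "students_t": {"two_groups", "independent", "continuous",
                   "normal", "equal_variance"},
    "welchs_t": {"two_groups", "independent", "continuous", "normal"},
    "mann_whitney_u": {"two_groups", "independent", "ordinal_or_continuous"},
}


def valid_tests(facts):
    """Return the tests whose requirements are all satisfied by `facts`."""
    return sorted(name for name, required in TESTS.items()
                  if required <= facts)


# Properties that hold for the study design itself (assumed for the example).
design = {"two_groups", "independent", "continuous", "ordinal_or_continuous"}

# When the parametric assumptions hold, every test in the table is valid.
print(valid_tests(design | {"normal", "equal_variance"}))

# When normality is not met, only the non-parametric test remains.
print(valid_tests(design))
```

The second call shows the fallback behaviour the abstract describes: once the normality assumption is dropped from the known facts, the parametric tests no longer qualify.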