Search CORE

2,018 research outputs found

SurveyMan: Programming and Automatically Debugging Surveys

Author: Berger Emery D.
Tosch Emma
Publication venue
Publication date: 20/06/2014
Field of study

Surveys can be viewed as programs, complete with logic, control flow, and bugs. Word choice or the order in which questions are asked can unintentionally bias responses. Vague, confusing, or intrusive questions can cause respondents to abandon a survey. Surveys can also have runtime errors: inattentive respondents can taint results. This effect is especially problematic when deploying surveys in uncontrolled settings, such as on the web or via crowdsourcing platforms. Because the results of surveys drive business decisions and inform scientific conclusions, it is crucial to make sure they are correct. We present SurveyMan, a system for designing, deploying, and automatically debugging surveys. Survey authors write their surveys in a lightweight domain-specific language aimed at end users. SurveyMan statically analyzes the survey to provide feedback to survey authors before deployment. It then compiles the survey into JavaScript and deploys it either to the web or a crowdsourcing platform. SurveyMan's dynamic analyses automatically find survey bugs, and control for the quality of responses. We evaluate SurveyMan's algorithms analytically and empirically, demonstrating its effectiveness with case studies of social science surveys conducted via Amazon's Mechanical Turk.Comment: Submitted version; accepted to OOPSLA 201

arXiv.org e-Print Archive

CiteSeerX

Prioritized Garbage Collection: Explicit GC Support for Software Caches

Author: Berger Emery D.
Guyer Samuel Z.
Nunez Diogenes
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/10/2016
Field of study

Programmers routinely trade space for time to increase performance, often in the form of caching or memoization. In managed languages like Java or JavaScript, however, this space-time tradeoff is complex. Using more space translates into higher garbage collection costs, especially at the limit of available memory. Existing runtime systems provide limited support for space-sensitive algorithms, forcing programmers into difficult and often brittle choices about provisioning. This paper presents prioritized garbage collection, a cooperative programming language and runtime solution to this problem. Prioritized GC provides an interface similar to soft references, called priority references, which identify objects that the collector can reclaim eagerly if necessary. The key difference is an API for defining the policy that governs when priority references are cleared and in what order. Application code specifies a priority value for each reference and a target memory bound. The collector reclaims references, lowest priority first, until the total memory footprint of the cache fits within the bound. We use this API to implement a space-aware least-recently-used (LRU) cache, called a Sache, that is a drop-in replacement for existing caches, such as Google's Guava library. The garbage collector automatically grows and shrinks the Sache in response to available memory and workload with minimal provisioning information from the programmer. Using a Sache, it is almost impossible for an application to experience a memory leak, memory pressure, or an out-of-memory crash caused by software caching.Comment: to appear in OOPSLA 201

arXiv.org e-Print Archive

Crossref

Tea: A High-level Language and Runtime System for Automating Statistical Analysis

Author: Berger Emery D.
Chasins Sarah E.
Daum Maureen
Jun Eunice
Just Rene
Reinecke Katharina
Roesch Jared
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/04/2019
Field of study

Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests, and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.Comment: 11 page

arXiv.org e-Print Archive

Crossref