14 research outputs found
Estimating Software Task Effort in Crowds
A key task during software maintenance is the refinement and elaboration of emerging software issues, such as feature implementations and bug resolution. It includes the annotation of software tasks with additional information, such as criticality, assignee and estimated cost of resolution. This paper reports on a first study to investigate the feasibility of using crowd workers supplied with limited information about an issue and project to provide comparably accurate estimates using planning poker. The paper describes our adaptation of planning poker to crowdsourcing and our initial trials. The results demonstrate the feasibility and potential efficiency of using crowds to deliver estimates. We also review the additional benefit that asking crowds for an estimate brings, in terms of further elaboration of the details of an issue. Finally, we outline our plans for a more extensive evaluation of planning poker in crowds
Evaluation Measures for Relevance and Credibility in Ranked Lists
Recent discussions on alternative facts, fake news, and post truth politics
have motivated research on creating technologies that allow people not only to
access information, but also to assess the credibility of the information
presented to them by information retrieval systems. Whereas technology is in
place for filtering information according to relevance and/or credibility, no
single measure currently exists for evaluating the accuracy or precision (and
more generally effectiveness) of both the relevance and the credibility of
retrieved results. One obvious way of doing so is to measure relevance and
credibility effectiveness separately, and then consolidate the two measures
into one. There at least two problems with such an approach: (I) it is not
certain that the same criteria are applied to the evaluation of both relevance
and credibility (and applying different criteria introduces bias to the
evaluation); (II) many more and richer measures exist for assessing relevance
effectiveness than for assessing credibility effectiveness (hence risking
further bias).
Motivated by the above, we present two novel types of evaluation measures
that are designed to measure the effectiveness of both relevance and
credibility in ranked lists of retrieval results. Experimental evaluation on a
small human-annotated dataset (that we make freely available to the research
community) shows that our measures are expressive and intuitive in their
interpretation
Playing Planning Poker in Crowds: Human Computation of Software Effort Estimates
Reliable cost effective effort estimation remains a considerable challenge for software projects. Recent work has demonstrated that the popular Planning Poker practice can produce reliable estimates when undertaken within a software team of knowledgeable domain experts. However, the process depends on the availability of experts and can be time-consuming to perform, making it impractical for large scale or open source projects that may curate many thousands of outstanding tasks. This paper reports on a full study to investigate the feasibility of using crowd workers supplied with limited information about a task to provide comparably accurate estimates using Planning Poker. We describe the design of a Crowd Planning Poker (CPP) process implemented on Amazon Mechanical Turk and the results of a substantial set of trials, involving more than 5000 crowd workers and 39 diverse software tasks. Our results show that a carefully organised and selected crowd of workers can produce effort estimates that are of similar accuracy to those of a single expert
String Sanitization Under Edit Distance: Improved and Generalized
Let W be a string of length n over an alphabet Σ, k be a positive integer, and S be a set of length-k substrings of W. The ETFS problem asks us to construct a string XED such that: (i) no string of S occurs in XED; (ii) the order of all other length-k substrings over Σ is the same in W and in XED; and (iii) XED has minimal edit distance to W. When W represents an individual's data and S represents a set of confidential patterns, the ETFS problem asks for transforming W to preserve its privacy and its utility [Bernardini et al., ECML PKDD 2019].
ETFS can be solved in O(n2k) time [Bernardini et al., CPM 2020]. The same paper shows that ETFS cannot be solved in O(n2−δ) time, for any δ>0, unless the Strong Exponential Time Hypothesis (SETH) is false. Our main results can be summarized as follows: (i) an O(n2log2k)-time algorithm to solve ETFS; and (ii) an O(n2log2n)-time algorithm to solve AETFS, a generalization of ETFS in which the elements of S can have arbitrary lengths. Our algorithms are thus optimal up to polylogarithmic factors, unless SETH fails. Let us also stress that our algorithms work under edit distance with arbitrary weights at no extra cost. As a bonus, we show how to modify some known techniques, which speed up the standard edit distance computation, to be applied to our problems. Beyond string sanitization, our techniques may inspire solutions to other problems related to regular expressions or context-free grammars