On statistics, computation and scalability
How should statistical procedures be designed so as to be scalable
computationally to the massive datasets that are increasingly the norm? When
coupled with the requirement that an answer to an inferential question be
delivered within a certain time budget, this question has significant
repercussions for the field of statistics. With the goal of identifying
"time-data tradeoffs," we investigate some of the statistical consequences of
computational perspectives on scalability, in particular divide-and-conquer
methodology and hierarchies of convex relaxations.
Comment: Published at http://dx.doi.org/10.3150/12-BEJSP17 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Budget Feasible Mechanisms for Experimental Design
In the classical experimental design setting, an experimenter E has access to
a population of $n$ potential experiment subjects $i \in \{1,\ldots,n\}$, each
associated with a vector of features $x_i \in \mathbb{R}^d$. Conducting an experiment
with subject $i$ reveals an unknown value $y_i \in \mathbb{R}$ to E. E typically assumes
some hypothetical relationship between the $x_i$'s and $y_i$'s, e.g., $y_i \approx \beta^{T} x_i$, and estimates $\beta$ from experiments, e.g., through linear
regression. As a proxy for various practical constraints, E may select only a
subset of subjects on which to conduct the experiment.
We initiate the study of budgeted mechanisms for experimental design. In this
setting, E has a budget $B$. Each subject $i$ declares an associated cost $c_i > 0$
to be part of the experiment, and must be paid at least her cost. In
particular, the Experimental Design Problem (EDP) is to find a set $S$ of
subjects for the experiment that maximizes
$V(S) = \log\det(I_d + \sum_{i\in S} x_i x_i^{T})$ under the constraint
$\sum_{i\in S} c_i \leq B$; our objective
function corresponds to the information gain in the parameter $\beta$ that is
learned through linear regression methods, and is related to the so-called
$D$-optimality criterion. Further, the subjects are strategic and may lie about
their costs.
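To make the objective concrete, the following is a minimal sketch of how V(S) and the budget constraint could be evaluated numerically. It assumes the feature vectors are stored as rows of a NumPy array and that the declared costs are taken at face value. The function names (`information_gain`, `is_budget_feasible`, `greedy_edp`) are hypothetical, and the greedy selection is only a simple illustrative heuristic: it is not the mechanism proposed in the abstract and it carries no truthfulness or approximation guarantee.

```python
import numpy as np


def information_gain(X, S):
    """Evaluate V(S) = log det(I_d + sum_{i in S} x_i x_i^T).

    X is an (n, d) array whose i-th row is the feature vector x_i;
    S is an iterable of subject indices.
    """
    d = X.shape[1]
    idx = np.array(sorted(S), dtype=int)
    # X[idx].T @ X[idx] equals the sum of outer products x_i x_i^T over S.
    A = np.eye(d) + X[idx].T @ X[idx]
    _, logdet = np.linalg.slogdet(A)  # A is positive definite, so the sign is +1
    return logdet


def is_budget_feasible(costs, S, budget):
    """Check the constraint sum_{i in S} c_i <= B."""
    return sum(costs[i] for i in S) <= budget


def greedy_edp(X, costs, budget):
    """Illustrative heuristic (an assumption, not the paper's mechanism):
    repeatedly add the affordable subject with the largest marginal gain
    in V per unit of declared cost."""
    S, spent = set(), 0.0
    while True:
        base = information_gain(X, S)
        best, best_ratio = None, 0.0
        for i in range(len(costs)):
            if i in S or spent + costs[i] > budget:
                continue
            ratio = (information_gain(X, S | {i}) - base) / costs[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            return S
        S.add(best)
        spent += costs[best]
```

For example, with two subjects whose feature vectors are the standard basis vectors of R^2, selecting one subject gives V = log 2 and selecting both gives V = log 4. Because the heuristic trusts the declared costs, a strategic subject could still benefit from misreporting, which is the issue the budget feasible mechanism is designed to address.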
We present a deterministic, polynomial time, budget feasible mechanism
scheme that is approximately truthful and yields a constant factor
approximation to EDP. In particular, for any small $\delta > 0$ and $\epsilon > 0$, we can construct a (12.98, $\epsilon$)-approximate mechanism that is
$\delta$-truthful and runs in polynomial time in both $n$ and
$\log\log\frac{B}{\epsilon\delta}$. We also establish that no truthful,
budget-feasible algorithm is possible within a factor 2 approximation, and
show how to generalize our approach to a wide class of learning problems,
beyond linear regression.
- …