Budget Feasible Mechanisms for Experimental Design

Author: A. Archer
A. Atkinson
A.A. Ageev
G. Calinescu
J. Ginebra
L. Vandenberghe
M. Sviridenko
R. Lavi
R. Myerson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 11/07/2013

In the classical experimental design setting, an experimenter E has access to a population of $n$ potential experiment subjects $i\in \{1,\ldots,n\}$, each associated with a vector of features $x_i\in \mathbb{R}^d$. Conducting an experiment with subject $i$ reveals an unknown value $y_i\in \mathbb{R}$ to E. E typically assumes some hypothetical relationship between $x_i$'s and $y_i$'s, e.g., $y_i \approx \beta x_i$, and estimates $\beta$ from experiments, e.g., through linear regression. As a proxy for various practical constraints, E may select only a subset of subjects on which to conduct the experiment. We initiate the study of budgeted mechanisms for experimental design. In this setting, E has a budget $B$. Each subject $i$ declares an associated cost $c_i > 0$ to be part of the experiment, and must be paid at least her cost. In particular, the Experimental Design Problem (EDP) is to find a set $S$ of subjects for the experiment that maximizes $V(S) = \log\det(I_d+\sum_{i\in S}x_i x_i^{T})$ under the constraint $\sum_{i\in S}c_i\leq B$; our objective function corresponds to the information gain in parameter $\beta$ that is learned through linear regression methods, and is related to the so-called $D$-optimality criterion. Further, the subjects are strategic and may lie about their costs. We present a deterministic, polynomial time, budget feasible mechanism scheme that is approximately truthful and yields a constant factor approximation to EDP. In particular, for any small $\delta > 0$ and $\epsilon > 0$, we can construct a $(12.98, \epsilon)$-approximate mechanism that is $\delta$-truthful and runs in polynomial time in both $n$ and $\log\log\frac{B}{\epsilon\delta}$. We also establish that no truthful, budget-feasible algorithm is possible within a factor 2 approximation, and show how to generalize our approach to a wide class of learning problems, beyond linear regression.
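To make the EDP objective concrete, the following is a minimal sketch that evaluates V(S) = log det(I_d + sum over i in S of x_i x_i^T) and finds the best budget-feasible set by exhaustive search. It is an illustration only and not the mechanism from the paper: it ignores strategic behaviour and payments, and the function names (log_det, value, best_feasible_set) and the toy data are assumptions made for this example.

```python
import math
from itertools import combinations


def log_det(matrix):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    d = len(matrix)
    chol = [[0.0] * d for _ in range(d)]
    total = 0.0
    for i in range(d):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                chol[i][j] = math.sqrt(s)
                # det(M) is the product of squared diagonal Cholesky entries.
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def value(features, subset):
    """V(S) = log det(I_d + sum_{i in S} x_i x_i^T)."""
    d = len(features[0])
    m = [[1.0 if r == c else 0.0 for c in range(d)] for r in range(d)]
    for i in subset:
        x = features[i]
        for r in range(d):
            for c in range(d):
                m[r][c] += x[r] * x[c]
    return log_det(m)


def best_feasible_set(features, costs, budget):
    """Exhaustive search over subsets with total cost <= budget (tiny n only)."""
    n = len(features)
    best, best_val = (), 0.0
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if sum(costs[i] for i in subset) <= budget:
                v = value(features, subset)
                if v > best_val:
                    best, best_val = subset, v
    return best, best_val


# Hypothetical toy instance: three subjects in R^2, unit costs, budget 2.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
costs = [1.0, 1.0, 1.0]
chosen, chosen_value = best_feasible_set(features, costs, budget=2.0)
```

In this toy instance the search prefers two subjects with orthogonal feature vectors over two subjects with identical ones, which matches the intuition that V(S) rewards diverse features. The exhaustive search is exponential in n and is used here only to keep the example short.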

arXiv.org e-Print Archive

Crossref

Online Convex Optimization with Binary Constraints

Author: Callaway Duncan S.
Lesage-Landry Antoine
Taylor Joshua A.
Publication date: 01/01/2021

We consider online optimization with binary decision variables and convex loss functions. We design a new algorithm, binary online gradient descent (bOGD), and bound its expected dynamic regret. We provide a regret bound that holds for any time horizon and a specialized bound for finite time horizons. First, we present the regret as the sum of the relaxed, continuous round optimum tracking error and the rounding error of our update, in which the former asymptotically decreases with time under certain conditions. Then, we derive a finite-time bound that is sublinear in time and linear in the cumulative variation of the relaxed, continuous round optima. We apply bOGD to demand response with thermostatically controlled loads, in which binary constraints model discrete on/off settings. We also model uncertainty and varying load availability, which depend on temperature deadbands, lockout of cooling units and manual overrides. We test the performance of bOGD in several simulations based on demand response. The simulations corroborate that the use of randomization in bOGD does not significantly degrade performance while making the problem more tractable.
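The abstract does not spell out the bOGD update, so the sketch below only illustrates the general idea it mentions: take a gradient step on a continuous relaxation, then round to a binary decision using randomization. Everything here is an assumption for illustration and should not be read as the authors' algorithm: the quadratic tracking loss, the step size, evaluating the gradient at the relaxed point, and the function names.

```python
import random


def clip01(v):
    """Project a scalar onto the interval [0, 1]."""
    return min(1.0, max(0.0, v))


def relaxed_step(x, grad, step_size):
    """Gradient step on the continuous relaxation, projected onto [0, 1]^n."""
    return [clip01(xi - step_size * gi) for xi, gi in zip(x, grad)]


def randomized_round(x, rng):
    """Set each coordinate to 1 with probability x_i, so E[b_i] = x_i."""
    return [1 if rng.random() < xi else 0 for xi in x]


def run(targets, step_size=0.5, seed=0):
    """Track time-varying targets with the assumed loss
    f_t(x) = sum_i (x_i - target_t[i])^2."""
    rng = random.Random(seed)
    x = [0.5] * len(targets[0])
    decisions = []
    for target in targets:
        # Play a binary decision obtained by rounding the relaxed iterate.
        decisions.append(randomized_round(x, rng))
        # Assumption: the gradient is evaluated at the relaxed iterate.
        grad = [2.0 * (xi - ti) for xi, ti in zip(x, target)]
        x = relaxed_step(x, grad, step_size)
    return decisions, x


# Hypothetical usage: two on/off loads, with a constant target of (on, off).
decisions, final_x = run([[1.0, 0.0]] * 3)
```

Because each coordinate is rounded up with probability equal to its relaxed value, the binary decision matches the relaxed iterate in expectation, which is one simple way to separate a tracking error on the relaxation from a rounding error.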

arXiv.org e-Print Archive

PolyPublie

Learning in the Real World: Constraints on Cost, Space, and Privacy

Author: Kusner Matt J
Publication venue: Washington University Open Scholarship
Publication date: 01/08/2016

The sheer demand for machine learning in fields as varied as healthcare, web-search ranking, factory automation, collision prediction, spam filtering, and many others frequently outpaces the intended use-case of machine learning models. In fact, a growing number of companies hire machine learning researchers to rectify this very problem: to tailor and/or design new state-of-the-art models to the setting at hand. However, we can generalize a large set of the machine learning problems encountered in practical settings into three categories: cost, space, and privacy. The first category (cost) considers problems that need to balance the accuracy of a machine learning model with the cost required to evaluate it. These include problems in web-search, where results need to be delivered to a user in under a second and be as accurate as possible. The second category (space) collects problems that require running machine learning algorithms on low-memory computing devices. For instance, in search-and-rescue operations we may opt to use many small unmanned aerial vehicles (UAVs) equipped with machine learning algorithms for object detection to find a desired search target. These algorithms should be small to fit within the physical memory limits of the UAV (and be energy efficient) while reliably detecting objects. The third category (privacy) considers problems where one wishes to run machine learning algorithms on sensitive data. It has been shown that seemingly innocuous analyses on such data can be exploited to reveal data individuals would prefer to keep private. Thus, nearly any algorithm that runs on patient or economic data falls under this set of problems. We devise solutions for each of these problem categories including (i) a fast tree-based model for explicitly trading off accuracy and model evaluation time, (ii) a compression method for the k-nearest neighbor classifier, and (iii) a private causal inference algorithm that protects sensitive data.

Washington University St. Louis: Open Scholarship

Privacy Tradeoffs in Predictive Analytics

Author: Bhagat Smriti
Fawaz Nadia
Ioannidis Stratis
Montanari Andrea
Taft Nina
Weinsberg Udi
Publication date: 01/01/2014

Online services routinely mine user data to predict user preferences, make recommendations, and place targeted ads. Recent research has demonstrated that several private user attributes (such as political affiliation, sexual orientation, and gender) can be inferred from such data. Can a privacy-conscious user benefit from personalization while simultaneously protecting her private attributes? We study this question in the context of a rating prediction service based on matrix factorization. We construct a protocol of interactions between the service and users that has remarkable optimality properties: it is privacy-preserving, in that no inference algorithm can succeed in inferring a user's private attribute with a probability better than random guessing; it has maximal accuracy, in that no other privacy-preserving protocol improves rating prediction; and, finally, it involves a minimal disclosure, as the prediction accuracy strictly decreases when the service reveals less information. We extensively evaluate our protocol using several rating datasets, demonstrating that it successfully blocks the inference of gender, age and political affiliation, while incurring less than 5% decrease in the accuracy of rating prediction.
Comment: Extended version of the paper appearing in SIGMETRICS 201

arXiv.org e-Print Archive

CiteSeerX

Crossref