ActiveRemediation: The Search for Lead Pipes in Flint, Michigan
We detail our ongoing work in Flint, Michigan to detect pipes made of lead
and other hazardous metals. After elevated levels of lead were detected in
residents' drinking water, followed by an increase in blood lead levels in area
children, the state and federal governments directed over $125 million to
replace water service lines, the pipes connecting each home to the water
system. In the absence of accurate records, and with the high cost of
determining buried pipe materials, we put forth a number of predictive and
procedural tools to aid in the search and removal of lead infrastructure.
Alongside these statistical and machine learning approaches, we describe our
interactions with government officials in recommending homes for both
inspection and replacement, with a focus on the statistical model that adapts
to incoming information. Finally, in light of discussions about increased
spending on infrastructure development by the federal government, we explore
how our approach generalizes beyond Flint to other municipalities nationwide.
Comment: 10 pages, 10 figures. To appear in KDD 2018. For associated
promotional video, see https://www.youtube.com/watch?v=YbIn_axYu9
A Contextual Bandit Bake-off
Contextual bandit algorithms are essential for solving many real-world
interactive machine learning problems. Despite multiple recent successes on
statistically and computationally efficient methods, the practical behavior of
these algorithms is still poorly understood. We leverage the availability of
large numbers of supervised learning datasets to empirically evaluate
contextual bandit algorithms, focusing on practical methods that learn by
relying on optimization oracles from supervised learning. We find that a recent
method (Foster et al., 2018) using optimism under uncertainty works the best
overall. A surprisingly close second is a simple greedy baseline that only
explores implicitly through the diversity of contexts, followed by a variant of
Online Cover (Agarwal et al., 2014) which tends to be more conservative but
robust to problem specification by design. Along the way, we also evaluate
various components of contextual bandit algorithm design such as loss
estimators. Overall, this is a thorough study and review of contextual bandit
methodology.
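To make the greedy baseline mentioned in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a synthetic multiclass dataset turned into bandit feedback (only the loss of the chosen action is revealed) and one linear loss regressor per action trained by plain SGD. The names make_dataset, GreedyBandit and run are hypothetical, introduced only for illustration.

```python
import random


def make_dataset(n, num_actions, seed=0):
    """Synthetic multiclass data (an assumption for this sketch).

    Each context is a bias term followed by num_actions Gaussian features;
    the correct action is the index of the largest non-bias feature.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        feats = [rng.gauss(0.0, 1.0) for _ in range(num_actions)]
        label = max(range(num_actions), key=lambda a: feats[a])
        data.append(([1.0] + feats, label))
    return data


class GreedyBandit:
    """One linear loss regressor per action; no explicit exploration."""

    def __init__(self, num_actions, dim, lr=0.05):
        self.w = [[0.0] * dim for _ in range(num_actions)]
        self.lr = lr

    def predict_loss(self, x, a):
        return sum(wi * xi for wi, xi in zip(self.w[a], x))

    def act(self, x):
        # Greedy choice: the action with the lowest predicted loss.
        return min(range(len(self.w)), key=lambda a: self.predict_loss(x, a))

    def update(self, x, a, loss):
        # SGD step on squared error, for the chosen action only.
        err = self.predict_loss(x, a) - loss
        self.w[a] = [wi - self.lr * err * xi for wi, xi in zip(self.w[a], x)]


def run(data, num_actions):
    """Play through the data once and return the average 0/1 loss."""
    learner = GreedyBandit(num_actions, dim=num_actions + 1)
    total = 0.0
    for x, y in data:
        a = learner.act(x)
        loss = 0.0 if a == y else 1.0  # bandit feedback: only this loss is seen
        learner.update(x, a, loss)
        total += loss
    return total / len(data)


if __name__ == "__main__":
    print(run(make_dataset(5000, num_actions=3), num_actions=3))
```

Any exploration here comes only from the variety of contexts changing which action looks best, which is the sense in which the abstract says the greedy baseline explores implicitly.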
Auditing: Active Learning with Outcome-Dependent Query Costs
We propose a learning setting in which unlabeled data is free, and the cost
of a label depends on its value, which is not known in advance. We study binary
classification in an extreme case, where the algorithm only pays for negative
labels. Our motivation is applications such as fraud detection, in which
investigating an honest transaction should be avoided if possible. We term the
setting auditing, and consider the auditing complexity of an algorithm: the
number of negative labels the algorithm requires in order to learn a hypothesis
with low relative error. We design auditing algorithms for simple hypothesis
classes (thresholds and rectangles), and show that with these algorithms, the
auditing complexity can be significantly lower than the active label
complexity. We also discuss a general competitive approach for auditing and
possible modifications to the framework.
Comment: Corrections in section
- …
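As an illustration of the auditing setting for thresholds, the sketch below compares two ways of locating a threshold on a pool of points when only negative labels cost anything. It assumes noise-free labels and a classifier that is positive at or above a hidden threshold. It is one simple strategy under those assumptions, not necessarily the algorithm from the paper, and the names Oracle, audit_scan and binary_search are hypothetical.

```python
import random


class Oracle:
    """Labels points with a hidden threshold and counts what was paid."""

    def __init__(self, threshold):
        self._t = threshold
        self.queries = 0
        self.negatives_paid = 0

    def label(self, x):
        self.queries += 1
        positive = x >= self._t
        if not positive:
            self.negatives_paid += 1  # only negative labels cost anything
        return positive


def audit_scan(points, oracle):
    """Query from the largest point downward and stop at the first negative.

    Returns the smallest point observed to be positive, or None.
    """
    smallest_pos = None
    for x in sorted(points, reverse=True):
        if oracle.label(x):
            smallest_pos = x
        else:
            break
    return smallest_pos


def binary_search(points, oracle):
    """Active-learning style binary search for the smallest positive point."""
    pts = sorted(points)
    lo, hi = 0, len(pts)  # invariant: pts[:lo] negative, pts[hi:] positive
    while lo < hi:
        mid = (lo + hi) // 2
        if oracle.label(pts[mid]):
            hi = mid
        else:
            lo = mid + 1
    return pts[lo] if lo < len(pts) else None


if __name__ == "__main__":
    rng = random.Random(0)
    pool = [rng.random() for _ in range(1000)]
    for strategy in (audit_scan, binary_search):
        oracle = Oracle(threshold=0.3)
        found = strategy(pool, oracle)
        print(strategy.__name__, found, oracle.queries, oracle.negatives_paid)
```

The scan issues many more queries than binary search, but nearly all of them are free positives and it pays for at most one negative label. Binary search uses few queries overall, yet each probe below the threshold is a paid negative.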