Self-Updating Models with Error Remediation
Many environments currently employ machine learning models for data
processing and analytics that were built using a limited number of training
data points. Once deployed, the models are exposed to significant amounts of
previously-unseen data, not all of which is representative of the original,
limited training data. However, updating these deployed models can be difficult
due to logistical, bandwidth, time, hardware, and/or data sensitivity
constraints. We propose a framework, Self-Updating Models with Error
Remediation (SUMER), in which a deployed model updates itself as new data
becomes available. SUMER uses techniques from semi-supervised learning and
noise remediation to iteratively retrain a deployed model using
intelligently-chosen predictions from the model as the labels for new training
iterations. A key component of SUMER is the notion of error remediation, since
self-labeled data is susceptible to the propagation of errors. We
investigate the use of SUMER across various data sets and iterations. We find
that self-updating models (SUMs) generally perform better than models that do
not attempt to self-update when presented with additional previously-unseen
data. This performance gap is accentuated when only limited amounts of initial
training data are available. We also find that SUMER generally outperforms
plain SUMs, demonstrating a benefit in
applying error remediation. Consequently, SUMER can autonomously enhance the
operational capabilities of existing data processing systems by intelligently
updating models in dynamic environments.

Comment: 17 pages, 13 figures; published in the proceedings of the Artificial
Intelligence and Machine Learning for Multi-Domain Operations Applications II
conference at the SPIE Defense + Commercial Sensing 2020 symposium.
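To make the self-labeling loop concrete, here is a minimal sketch in Python,
assuming a scikit-learn-style classifier. The confidence threshold and the
agreement-based remediation rule (dropping pseudo-labels the retrained model
no longer reproduces) are illustrative assumptions, not the remediation
criteria from the paper itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_update(model, X_labeled, y_labeled, X_stream,
                confidence=0.95, n_iterations=5):
    """Self-training loop with a simple error-remediation pass.

    Pseudo-labels are accepted only above a confidence threshold, and
    previously accepted pseudo-labels are re-checked after each retrain,
    discarding any the updated model disputes.
    """
    X_pseudo = np.empty((0, X_labeled.shape[1]))
    y_pseudo = np.empty((0,), dtype=y_labeled.dtype)

    for _ in range(n_iterations):
        model.fit(np.vstack([X_labeled, X_pseudo]),
                  np.concatenate([y_labeled, y_pseudo]))

        # Remediation: drop pseudo-labels the refreshed model no longer
        # agrees with, limiting the propagation of earlier labeling errors.
        if len(X_pseudo):
            keep = model.predict(X_pseudo) == y_pseudo
            X_pseudo, y_pseudo = X_pseudo[keep], y_pseudo[keep]

        if not len(X_stream):
            break

        # Self-labeling: adopt confident predictions on previously-unseen data.
        proba = model.predict_proba(X_stream)
        confident = proba.max(axis=1) >= confidence
        new_labels = model.classes_[proba.argmax(axis=1)][confident]
        X_pseudo = np.vstack([X_pseudo, X_stream[confident]])
        y_pseudo = np.concatenate([y_pseudo, new_labels])
        X_stream = X_stream[~confident]

    return model
```

In the bandwidth- and hardware-constrained deployments the abstract describes,
a loop like this would run on-device against incoming data, with X_labeled
frozen at whatever the original, limited training set was.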
Online Decision Mediation
Consider learning a decision support assistant to serve as an intermediary
between (oracle) expert behavior and (imperfect) human behavior: At each time,
the algorithm observes an action chosen by a fallible agent, and decides
whether to *accept* that agent's decision, *intervene* with an alternative, or
*request* the expert's opinion. For instance, in clinical diagnosis,
fully-autonomous machine behavior is often beyond ethical affordances, so
real-world decision support is typically limited to monitoring and forecasting.
Instead, such an intermediary would strike a prudent balance between the former
(purely prescriptive) and the latter (purely descriptive) approaches, while
providing an efficient interface between human mistakes and expert feedback. In
this work, we first formalize the sequential problem of *online decision
mediation* -- that is, of simultaneously learning and evaluating mediator
policies from scratch with *abstentive feedback*: In each round, deferring to
the oracle obviates the risk of error, but incurs an upfront penalty, and
reveals the otherwise hidden expert action as a new training data point.
Second, we motivate and propose a solution that seeks to trade off (immediate)
loss terms against (future) improvements in generalization error; in doing so,
we identify why conventional bandit algorithms may fail. Finally, through
experiments and sensitivity analyses on a variety of datasets, we illustrate
consistent gains over applicable benchmarks on performance measures with
respect to the mediator policy, the learned model, and the decision-making
system as a whole.
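As a rough illustration of the accept / intervene / request loop, here is a
minimal Python sketch. The confidence thresholds and the logistic-regression
mediator model are assumptions for the sake of the example, not the authors'
algorithm; the one structural point carried over from the abstract is that
deferring to the oracle is costly but reveals the expert action as a new
training point.

```python
from sklearn.linear_model import LogisticRegression

class Mediator:
    """Accept the agent's action, intervene with our own, or ask the expert."""

    def __init__(self, accept_conf=0.8, defer_conf=0.6):
        self.model = LogisticRegression(max_iter=1000)
        self.X, self.y = [], []          # expert-labeled pairs from deferrals
        self.accept_conf = accept_conf   # trust the agent above this confidence
        self.defer_conf = defer_conf     # below this, pay the cost and defer
        self.fitted = False

    def step(self, context, agent_action, oracle):
        if self.fitted:
            proba = self.model.predict_proba([context])[0]
            own_action = self.model.classes_[proba.argmax()]
            conf = proba.max()
            if conf >= self.accept_conf and agent_action == own_action:
                return agent_action      # accept: agent agrees with a confident model
            if conf >= self.defer_conf:
                return own_action        # intervene with the model's alternative
        # Request: incur the upfront penalty, but reveal the otherwise hidden
        # expert action as a new training point (the abstract's abstentive feedback).
        expert_action = oracle(context)
        self.X.append(context)
        self.y.append(expert_action)
        if len(set(self.y)) > 1:         # need at least two classes to fit
            self.model.fit(self.X, self.y)
            self.fitted = True
        return expert_action
```

A real mediator would also weigh the expected future value of the revealed
label against the immediate deferral cost, which is exactly the trade-off
between immediate loss terms and future generalization that the abstract
raises; this sketch hard-codes that trade-off into fixed thresholds.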