Coresets for Wasserstein Distributionally Robust Optimization Problems

Ding, Hu; Huang, Jiawei; Huang, Ruomin; Liu, Wenjie

Coresets for Wasserstein Distributionally Robust Optimization Problems

Authors: Hu Ding
Jiawei Huang
Ruomin Huang
Wenjie Liu
Publication date: 9 October 2022
Publisher

Abstract

Wasserstein distributionally robust optimization (\textsf{WDRO}) is a popular model to enhance the robustness of machine learning with ambiguous data. However, the complexity of \textsf{WDRO} can be prohibitive in practice since solving its ``minimax'' formulation requires a great amount of computation. Recently, several fast \textsf{WDRO} training algorithms for some specific machine learning tasks (e.g., logistic regression) have been developed. However, the research on designing efficient algorithms for general large-scale \textsf{WDRO}s is still quite limited, to the best of our knowledge. \textit{Coreset} is an important tool for compressing large dataset, and thus it has been widely applied to reduce the computational complexities for many optimization problems. In this paper, we introduce a unified framework to construct the

\epsilon

-coreset for the general \textsf{WDRO} problems. Though it is challenging to obtain a conventional coreset for \textsf{WDRO} due to the uncertainty issue of ambiguous data, we show that we can compute a ``dual coreset'' by using the strong duality property of \textsf{WDRO}. Also, the error introduced by the dual coreset can be theoretically guaranteed for the original \textsf{WDRO} objective. To construct the dual coreset, we propose a novel grid sampling approach that is particularly suitable for the dual formulation of \textsf{WDRO}. Finally, we implement our coreset approach and illustrate its effectiveness for several \textsf{WDRO} problems in the experiments

Similar works

Full text

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2210.04260

Last time updated on 24/11/2022