A Novel Sequential Coreset Method for Gradient Descent Algorithms
A wide range of optimization problems arising in machine learning can be
solved by gradient descent algorithms, and a central question in this area is
how to efficiently compress a large-scale dataset so as to reduce the
computational complexity. The coreset is a popular data compression technique that has been extensively studied. However, most existing coreset methods are problem-dependent and cannot be used as a general tool for a
broader range of applications. A key obstacle is that they often rely on pseudo-dimension and total-sensitivity bounds that can be very high or hard to obtain. In this paper, based on the "locality" property of gradient descent algorithms, we propose a new framework, termed "sequential coreset", which effectively avoids these obstacles. Moreover, our method is particularly suitable for sparse optimization, where the coreset size can be further reduced
to be only poly-logarithmically dependent on the dimension. In practice, the
experimental results suggest that our method can save a large amount of running time compared with the baseline algorithms.
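To make the setup concrete, here is a minimal sketch of the general coreset-for-gradient-descent recipe on a least-squares objective, with uniform sampling and n/m weights standing in for the paper's sequential construction; all function names are illustrative assumptions, not the authors' code.

```python
# Minimal sketch: compress the dataset into a small weighted subset, then
# run plain gradient descent on the weighted objective. Uniform sampling
# is a stand-in for the paper's sequential construction.
import numpy as np

def build_uniform_coreset(X, y, m, rng):
    """Sample m points uniformly; weight each by n/m."""
    n = X.shape[0]
    idx = rng.choice(n, size=m, replace=False)
    return X[idx], y[idx], np.full(m, n / m)

def weighted_least_squares_grad(theta, X, y, w):
    """Gradient of sum_i w_i * (x_i^T theta - y_i)^2."""
    r = X @ theta - y
    return 2 * X.T @ (w * r)

def gd_on_coreset(X, y, m=100, lr=1e-3, steps=500, seed=0):
    rng = np.random.default_rng(seed)
    Xc, yc, w = build_uniform_coreset(X, y, m, rng)
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta -= lr * weighted_least_squares_grad(theta, Xc, yc, w)
    return theta
```

The weights keep the subset gradient an unbiased estimate of the full-data gradient, which is the property any coreset construction for gradient descent aims to preserve more tightly than uniform sampling does.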
Black-box Coreset Variational Inference
Recent advances in coreset methods have shown that a selection of
representative datapoints can replace massive volumes of data for Bayesian
inference, preserving the relevant statistical information and significantly
accelerating subsequent downstream tasks. Existing variational coreset
constructions rely on either selecting subsets of the observed datapoints, or
jointly performing approximate inference and optimizing pseudodata in the
observed space, akin to inducing point methods in Gaussian processes. So far, both approaches are limited by the complexity of evaluating their objectives for general-purpose models, and require generating samples from a typically
intractable posterior over the coreset throughout inference and testing. In
this work, we present a black-box variational inference framework for coresets
that overcomes these constraints and enables principled application of
variational coresets to intractable models, such as Bayesian neural networks.
We apply our techniques to supervised learning problems, and compare them with
existing approaches in the literature for data summarization and inference.
Comment: NeurIPS 202
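As a rough illustration of the black-box ingredient, the sketch below runs reparameterization-gradient variational inference against a weighted-coreset posterior for Bayesian logistic regression with a standard normal prior. Here the coreset (Xc, yc, w) is assumed fixed in advance; the paper additionally learns the pseudodata and weights, which this sketch omits.

```python
# Hedged sketch: black-box (reparameterization-gradient) VI against a
# fixed weighted-coreset posterior for Bayesian logistic regression.
import torch
import torch.nn.functional as F

def weighted_log_joint(theta, Xc, yc, w):
    """N(0, I) log-prior plus the w-weighted Bernoulli log-likelihood."""
    logits = Xc @ theta
    ll = w * (yc * F.logsigmoid(logits) + (1 - yc) * F.logsigmoid(-logits))
    return ll.sum() - 0.5 * theta.pow(2).sum()

def fit_q(Xc, yc, w, iters=2000, lr=1e-2, seed=0):
    """Fit a diagonal-Gaussian q(theta) by maximizing a one-sample ELBO."""
    torch.manual_seed(seed)
    d = Xc.shape[1]
    mu = torch.zeros(d, requires_grad=True)
    log_sig = torch.zeros(d, requires_grad=True)
    opt = torch.optim.Adam([mu, log_sig], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        eps = torch.randn(d)
        theta = mu + log_sig.exp() * eps   # reparameterized sample
        # ELBO = E_q[log joint] + entropy(q), up to an additive constant
        elbo = weighted_log_joint(theta, Xc, yc, w) + log_sig.sum()
        (-elbo).backward()
        opt.step()
    return mu.detach(), log_sig.exp().detach()
```

Because the gradient only needs the log joint to be differentiable and evaluable, the same loop applies to essentially any likelihood, which is the sense in which such an approach is black-box.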
Soft-Label Anonymous Gastric X-ray Image Distillation
This paper presents a soft-label anonymous gastric X-ray image distillation
method based on a gradient descent approach. Sharing medical data is in demand for constructing high-accuracy computer-aided diagnosis (CAD) systems. However, the large size of medical datasets and privacy protection remain obstacles to medical data sharing, which hinders research on CAD systems. The idea of our distillation method is to extract the valid
information of the medical dataset and generate a tiny distilled dataset that
has a different data distribution. Unlike model distillation, our method aims to find the optimal distilled images, distilled labels, and learning rate. Experimental results show that the proposed method can
not only effectively compress the medical dataset but also anonymize medical
images to protect patients' private information. The proposed approach can improve the efficiency and security of medical data sharing.
Comment: Published as a conference paper at ICIP 202
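The distillation loop itself can be sketched as a bilevel optimization: an inner gradient step on the distilled data, and an outer step that updates the distilled images, soft labels, and learned learning rate by differentiating through it. The sketch below uses a linear classifier and a single inner step as simplifying assumptions; it is not the paper's implementation.

```python
# Hedged sketch of gradient-based dataset distillation with learnable
# images, soft labels, and inner learning rate (single inner GD step).
import torch
import torch.nn.functional as F

def distill(X_real, y_real, n_classes, n_distilled=10, outer_iters=1000):
    d_in = X_real.shape[1]
    X_syn = torch.randn(n_distilled, d_in, requires_grad=True)       # distilled images
    Y_syn = torch.zeros(n_distilled, n_classes, requires_grad=True)  # soft-label logits
    log_lr = torch.tensor(-2.0, requires_grad=True)                  # learned step size
    opt = torch.optim.Adam([X_syn, Y_syn, log_lr], lr=1e-3)
    for _ in range(outer_iters):
        # fresh random linear model each outer iteration
        W = (0.01 * torch.randn(d_in, n_classes)).requires_grad_(True)
        # inner step: one differentiable GD update on the distilled data
        # (soft targets in cross_entropy need torch >= 1.10)
        inner_loss = F.cross_entropy(X_syn @ W, Y_syn.softmax(dim=1))
        (g,) = torch.autograd.grad(inner_loss, W, create_graph=True)
        W1 = W - log_lr.exp() * g
        # outer step: the updated model should fit the real data
        outer_loss = F.cross_entropy(X_real @ W1, y_real)
        opt.zero_grad()
        outer_loss.backward()
        opt.step()
    return X_syn.detach(), Y_syn.softmax(dim=1).detach(), log_lr.exp().item()
```

Since only the distilled tensors leave the loop, the real images never need to be shared, which is consistent with the anonymization benefit the abstract claims.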
Coreset Clustering on Small Quantum Computers
Many quantum algorithms for machine learning require access to classical data
in superposition. However, for many natural data sets and algorithms, the
overhead required to load the data set in superposition can erase any potential
quantum speedup over classical algorithms. Recent work by Harrow introduces a
new paradigm in hybrid quantum-classical computing to address this issue,
relying on coresets to minimize the data loading overhead of quantum
algorithms. We investigate using this paradigm to perform k-means clustering on near-term quantum computers, by casting it as a QAOA optimization instance over a small coreset. We compare the performance of this approach to classical k-means clustering both numerically and experimentally on IBM Q hardware. We are able to find data sets where coresets work well relative to random sampling and where QAOA could potentially outperform standard k-means on a coreset. However, finding data sets where both coresets and QAOA work well, which is necessary for a quantum advantage over k-means on the entire data set, appears to be challenging.
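For reference, the classical half of this pipeline is just weighted k-means run on the coreset, as sketched below; the quantum contribution replaces the partition search with a QAOA instance, which requires a quantum simulator or hardware and is omitted here.

```python
# Hedged sketch: weighted Lloyd's k-means over (point, weight) coreset pairs.
import numpy as np

def weighted_kmeans(P, w, k=2, iters=50, seed=0):
    """P: (m, d) coreset points; w: (m,) coreset weights."""
    rng = np.random.default_rng(seed)
    C = P[rng.choice(len(P), size=k, replace=False)].astype(float)  # init centers
    for _ in range(iters):
        # assign each coreset point to its nearest center
        d2 = ((P[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        a = d2.argmin(axis=1)
        # recompute each center as the weighted mean of its points
        for j in range(k):
            mask = a == j
            if mask.any():
                C[j] = np.average(P[mask], axis=0, weights=w[mask])
    return C, a
```

Because the coreset has only tens of points, even brute-force or QAOA-based partitioning over it is feasible, which is the premise of the hybrid approach.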
Stochastic Subset Selection for Efficient Training and Inference of Neural Networks
Current machine learning algorithms are designed to work with huge volumes of high-dimensional data such as images. However, these algorithms are increasingly being deployed to resource-constrained systems such as mobile devices
and embedded systems. Even in cases where large computing infrastructure is
available, the size of each data instance, as well as datasets, can be a
bottleneck in data transfer across communication channels. Also, there is a strong incentive, in both energy and monetary terms, to reduce the computational and memory requirements of these algorithms. For nonparametric models that must leverage the stored training data at inference time, the increased cost in memory and computation can be even more problematic. In
this work, we aim to reduce the volume of data these algorithms must process
through an end-to-end two-stage neural subset selection model. We first
efficiently obtain a subset of candidate elements by sampling a mask from a
conditionally independent Bernoulli distribution, and then autoregressively construct a subset consisting of the most task-relevant elements by sampling
the elements from a conditional Categorical distribution. We validate our
method on set reconstruction and classification tasks with feature selection as
well as the selection of representative samples from a given dataset, on which
our method outperforms relevant baselines. We also show in our experiments that
our method enhances scalability of nonparametric models such as Neural
Processes.
Comment: 19 page
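A minimal sketch of the two-stage sampler follows, with placeholder MLP score networks and hypothetical class and parameter names; unlike the paper's model, the second-stage scores here are not conditioned on the elements already chosen, and the relaxations needed for end-to-end gradient training are omitted.

```python
import torch

class TwoStageSelector(torch.nn.Module):
    """Stage 1: conditionally independent Bernoulli candidate mask.
    Stage 2: autoregressive categorical draws without replacement."""
    def __init__(self, d, hidden=64):
        super().__init__()
        mlp = lambda: torch.nn.Sequential(
            torch.nn.Linear(d, hidden), torch.nn.ReLU(), torch.nn.Linear(hidden, 1))
        self.stage1, self.stage2 = mlp(), mlp()

    def forward(self, X, k):
        """X: (n, d) set elements; returns indices of a size-k subset."""
        # stage 1: sample an independent Bernoulli mask over all elements
        p = torch.sigmoid(self.stage1(X)).squeeze(-1)
        idx = torch.bernoulli(p).nonzero(as_tuple=True)[0]
        if idx.numel() < k:                  # fallback if too few candidates
            idx = torch.arange(X.shape[0])
        # stage 2: draw k elements one at a time from a categorical
        remaining, chosen = idx.tolist(), []
        for _ in range(k):
            logits = self.stage2(X[remaining]).squeeze(-1)
            j = torch.distributions.Categorical(logits=logits).sample().item()
            chosen.append(remaining.pop(j))
        return torch.tensor(chosen)
```

The candidate stage prunes most of the set cheaply and in parallel, while the sequential stage buys dependence between picks, which is the division of labor the abstract describes.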