Sparsifying generalized linear models
Authors
Arun Jambulapati
James R. Lee
Yang P. Liu
Aaron Sidford
Publication date
29 November 2023
View on arXiv
Abstract

We consider the sparsification of sums $F : \mathbb{R}^n \to \mathbb{R}$ where $F(x) = f_1(\langle a_1, x\rangle) + \cdots + f_m(\langle a_m, x\rangle)$ for vectors $a_1, \ldots, a_m \in \mathbb{R}^n$ and functions $f_1, \ldots, f_m : \mathbb{R} \to \mathbb{R}_+$. We show that $(1+\varepsilon)$-approximate sparsifiers of $F$ with support size $\frac{n}{\varepsilon^2} (\log \frac{n}{\varepsilon})^{O(1)}$ exist whenever the functions $f_1, \ldots, f_m$ are symmetric, monotone, and satisfy natural growth bounds. Additionally, we give efficient algorithms to compute such a sparsifier assuming each $f_i$ can be evaluated efficiently. Our results generalize the classic case of $\ell_p$ sparsification, where $f_i(z) = |z|^p$ for $p \in (0, 2]$, and give the first near-linear-size sparsifiers in the well-studied setting of the Huber loss function and its generalizations, e.g., $f_i(z) = \min\{|z|^p, |z|^2\}$ for $0 < p \leq 2$. Our sparsification algorithm can be applied to give near-optimal reductions for optimizing a variety of generalized linear models, including $\ell_p$ regression for $p \in (1, 2]$ to high accuracy, via solving $(\log n)^{O(1)}$ sparse regression instances with $m \le n (\log n)^{O(1)}$, plus runtime proportional to the number of nonzero entries in the vectors $a_1, \dots, a_m$.
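To make the objects in the abstract concrete, here is a toy sketch of sparsifying a sum $F(x) = \sum_i f(\langle a_i, x\rangle)$ with a generalized Huber loss. The sampling scheme below uses plain $\ell_2$ leverage scores as importance weights; this is a crude illustrative stand-in, not the paper's algorithm, and all function names are ours.

```python
import numpy as np

def huber(z, p=1.0):
    # Generalized Huber loss from the abstract: min{|z|^p, |z|^2}.
    return np.minimum(np.abs(z) ** p, np.abs(z) ** 2)

def sample_sparsifier(A, num_samples, rng=None):
    """Toy importance-sampling sparsifier for F(x) = sum_i f(<a_i, x>).

    Samples rows of A with probability proportional to their l2
    leverage scores and reweights so the sparse sum is unbiased.
    Returns (indices, weights) defining F~(x) = sum_j w_j f(<a_{i_j}, x>).
    """
    rng = np.random.default_rng(rng)
    # l2 leverage scores: tau_i = a_i^T (A^T A)^+ a_i
    G = np.linalg.pinv(A.T @ A)
    tau = np.einsum('ij,jk,ik->i', A, G, A)
    probs = tau / tau.sum()
    idx = rng.choice(len(A), size=num_samples, p=probs)
    weights = 1.0 / (num_samples * probs[idx])
    return idx, weights

def eval_F(A, x, f=huber):
    # Dense objective: sum_i f(<a_i, x>).
    return f(A @ x).sum()

def eval_sparse_F(A, x, idx, weights, f=huber):
    # Sparsified objective: only the sampled terms, reweighted.
    return (weights * f(A[idx] @ x)).sum()
```

For $f(z) = |z|^2$ this is exactly $\ell_2$ row sampling, where leverage scores are the right importances; for the Huber losses the paper shows a more careful sampling scheme is needed to reach near-linear support size with a $(1+\varepsilon)$ guarantee.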
oai:arXiv.org:2311.18145
Last updated on 10/05/2024