Sublinear Time Numerical Linear Algebra for Structured Matrices
We show how to solve a number of problems in numerical linear algebra, such
as least squares regression, \ell_p-regression for any p \geq 1, low rank
approximation, and kernel regression, in time T(A) \poly(\log(nd)), where for
a given input matrix A \in \mathbb{R}^{n \times d}, T(A) is the time needed
to compute A \cdot y for an arbitrary vector y \in \mathbb{R}^d. Since T(A)
\leq O(\nnz(A)), where \nnz(A) denotes the number of non-zero entries of
A, the time is no worse, up to polylogarithmic factors, than all of the recent
advances for such problems that run in input-sparsity time. However, for many
applications, T(A) can be much smaller than \nnz(A), yielding significantly
sublinear time algorithms. For example, in the overconstrained
(1+\epsilon)-approximate polynomial interpolation problem, A is a
Vandermonde matrix and T(A) = O(n \log n); in this case our running time is
n \cdot \poly(\log n) + \poly(d/\epsilon) and we recover the results of
\cite{avron2013sketching} as a special case. For overconstrained
autoregression, which is a common problem arising in dynamical systems,
T(A) = O(n \log n), and we immediately obtain n \cdot \poly(\log n) +
\poly(d/\epsilon) time. For kernel autoregression, we significantly improve
the running time of prior algorithms for general kernels. For the important
case of autoregression with the polynomial kernel and arbitrary target vector
b \in \mathbb{R}^n, we obtain even faster algorithms. Our algorithms show that,
perhaps surprisingly, most of these optimization problems do not require much
more time than that of a polylogarithmic number of matrix-vector
multiplications.
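To make the central primitive concrete: every running time above is driven by T(A), the cost of one matrix-vector product with the structured input. The sketch below is ours rather than the paper's; it shows the classical O(n \log n) Toeplitz matrix-vector product via circulant embedding and the FFT, the Toeplitz structure being the one underlying the autoregression design matrix. All names in the code are illustrative.

```python
import numpy as np

def toeplitz_matvec(c, r, y):
    """Compute T @ y in O(m log m) time, where T is the n x d Toeplitz
    matrix with first column c and first row r (assumes r[0] == c[0])."""
    n, d = len(c), len(r)
    m = n + d - 1
    # Embed T in an m x m circulant matrix; a circulant matvec is a circular
    # convolution, which the FFT evaluates in O(m log m).
    col = np.concatenate([c, r[:0:-1]])          # first column of the circulant
    y_pad = np.concatenate([y, np.zeros(m - d)])
    return np.fft.ifft(np.fft.fft(col) * np.fft.fft(y_pad))[:n].real

# Sanity check against the dense product, whose cost is nnz(T) = n * d.
rng = np.random.default_rng(0)
n, d = 512, 64
c, r, y = rng.standard_normal(n), rng.standard_normal(d), rng.standard_normal(d)
r[0] = c[0]
T = np.array([[c[i - j] if i >= j else r[j - i] for j in range(d)] for i in range(n)])
assert np.allclose(T @ y, toeplitz_matvec(c, r, y))
```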
Input Sparsity and Hardness for Robust Subspace Approximation
In the subspace approximation problem, we seek a k-dimensional subspace F of
R^d that minimizes the sum of p-th powers of Euclidean distances to a given set
of n points a_1, ..., a_n in R^d, for p >= 1. More generally than minimizing
sum_i dist(a_i,F)^p, we may wish to minimize sum_i M(dist(a_i,F)) for some loss
function M(), for example, M-Estimators, which include the Huber and Tukey loss
functions. Such subspaces provide alternatives to the singular value
decomposition (SVD), which is the p=2 case, finding such an F that minimizes
the sum of squares of distances. For p in [1,2), and for typical M-Estimators,
the minimizing F gives a solution that is more robust to outliers than that
provided by the SVD. We give several algorithmic and hardness results for these
robust subspace approximation problems.
We think of the n points as forming an n x d matrix A, and let nnz(A)
denote the number of non-zero entries of A. Our results hold for p in [1,2). We
use poly(n) to denote n^{O(1)} as n -> infty. We obtain: (1) for minimizing
sum_i dist(a_i,F)^p, we give an algorithm running in O(nnz(A) +
(n+d) poly(k/eps) + exp(poly(k/eps))) time; (2) we show that the problem of
minimizing sum_i dist(a_i,F)^p is NP-hard, even to output a
(1+1/poly(d))-approximation, answering a question of Kannan and Vempala, and
complementing prior results which held for p > 2; (3) for loss functions from a
wide class of M-Estimators, we give a problem-size reduction: for a parameter
K = (log n)^{O(log k)}, our reduction takes O(nnz(A) log n + (n+d) poly(K/eps))
time to reduce the problem to a constrained version involving matrices whose
dimensions are poly(K eps^{-1} log n), and we also give bicriteria solutions;
(4) our techniques lead to the first O(nnz(A) + poly(d/eps)) time algorithms for
(1+eps)-approximate regression for a wide class of convex M-Estimators.
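As a concrete reference point for the objective (this merely evaluates it; it is not the paper's nnz(A)-time algorithm), the sketch below computes sum_i dist(a_i,F)^p and a Huber-loss variant for a subspace F given by an orthonormal basis; names are ours.

```python
import numpy as np

def subspace_cost(A, V, p=1.0):
    """sum_i dist(a_i, F)^p, where the rows of A are the points a_i and
    the columns of V are an orthonormal basis of F."""
    resid = A - (A @ V) @ V.T                       # residual after projecting onto F
    return np.sum(np.linalg.norm(resid, axis=1) ** p)

def huber_subspace_cost(A, V, tau=1.0):
    """sum_i M(dist(a_i, F)) for the Huber loss M with threshold tau."""
    dist = np.linalg.norm(A - (A @ V) @ V.T, axis=1)
    return np.sum(np.where(dist <= tau, 0.5 * dist**2, tau * (dist - 0.5 * tau)))

# For p = 2 the minimizer is the top-k right singular subspace (the SVD case);
# p in [1,2) and the Huber loss penalize large residuals less, hence robustness.
rng = np.random.default_rng(1)
A = rng.standard_normal((100, 10))
V = np.linalg.svd(A, full_matrices=False)[2][:3].T  # k = 3 SVD subspace
print(subspace_cost(A, V, p=2.0), subspace_cost(A, V, p=1.0))
```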
Pruning Neural Networks via Coresets and Convex Geometry: Towards No Assumptions
Pruning is one of the predominant approaches for compressing deep neural
networks (DNNs). Lately, coresets (provable data summarizations) were leveraged
for pruning DNNs, adding the advantage of theoretical guarantees on the
trade-off between the compression rate and the approximation error. However,
coresets in this domain were either data-dependent or generated under
restrictive assumptions on both the model's weights and inputs. In real-world
scenarios, such assumptions are rarely satisfied, limiting the applicability of
coresets. To address this, we propose a novel and robust framework for computing
such coresets under mild assumptions on the model's weights and without any
assumption on the training data. The idea is to compute the importance of each
neuron in each layer with respect to the output of the following layer. This is
achieved by a combination of the Löwner ellipsoid and Carathéodory's theorem. Our
method is simultaneously data-independent, applicable to various networks and
datasets (due to the simplified assumptions), and theoretically supported.
Experimental results show that our method outperforms existing coreset-based
neural pruning approaches across a wide range of networks and datasets. For
example, our method achieved a substantial compression rate on ResNet50 on
ImageNet with only a minor drop in accuracy.
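Of the two ingredients named above, Carathéodory's theorem has a classical constructive proof that is easy to sketch. The code below is a generic illustration of that construction only, not the paper's pruning pipeline: it rewrites a convex combination of n points in R^d as an equivalent combination supported on at most d + 1 of the points.

```python
import numpy as np

def caratheodory(X, u, tol=1e-12):
    """Given points X (n x d) and weights u >= 0 summing to 1, return
    (indices, weights) expressing the same point p = u @ X as a convex
    combination of at most d + 1 of the points."""
    idx = np.arange(len(X))
    u = u.astype(float).copy()
    d = X.shape[1]
    while len(idx) > d + 1:
        # Find v != 0 with sum_i v_i x_i = 0 and sum_i v_i = 0: a null-space
        # vector of the (d+1) x m matrix stacking the points over a row of ones.
        A = np.vstack([X[idx].T, np.ones(len(idx))])
        v = np.linalg.svd(A)[2][-1]            # m > d + 1, so A has a null space
        pos = v > tol                          # sum v_i = 0, so some entries are positive
        alpha = np.min(u[idx][pos] / v[pos])   # largest step keeping weights >= 0
        u[idx] -= alpha * v                    # zeroes at least one weight; p and
        idx = idx[u[idx] > tol]                # the total weight are unchanged
    return idx, u[idx]

rng = np.random.default_rng(2)
X, u = rng.standard_normal((200, 5)), rng.random(200)
u /= u.sum()
idx, w = caratheodory(X, u)
assert len(idx) <= 6 and np.allclose(w @ X[idx], u @ X) and np.isclose(w.sum(), 1.0)
```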