Matrix Infinitely Divisible Series: Tail Inequalities and Applications in Optimization
In this paper, we study tail inequalities of the largest eigenvalue of a
matrix infinitely divisible (i.d.) series, which is a finite sum of fixed
matrices weighted by i.d. random variables. We obtain several types of tail
inequalities, including Bennett-type and Bernstein-type inequalities. This
allows us to further bound the expectation of the spectral norm of a matrix
i.d. series. Moreover, by developing a new lower-bound function for $\phi(x) = (1+x)\log(1+x) - x$, which appears in the Bennett-type inequality, we derive a tighter tail inequality of the largest eigenvalue of the matrix i.d. series than the Bernstein-type inequality when the matrix dimension is high. The resulting lower-bound function is of independent interest and can improve any Bennett-type concentration inequality that involves the function $\phi$. The
class of i.d. probability distributions is large and includes Gaussian and
Poisson distributions, among many others. Therefore, our results encompass the
existing work \cite{tropp2012user} on matrix Gaussian series as a special case.
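For context, the standard statements from \cite{tropp2012user} (recalled here with their usual assumptions, not the new bounds of this paper) take the following shape: for independent, centered, self-adjoint $d \times d$ random matrices $X_k$ with $\lambda_{\max}(X_k) \le R$ almost surely and $\sigma^2 = \big\| \sum_k \mathbb{E}[X_k^2] \big\|$,
\[
\mathbb{P}\Big\{ \lambda_{\max}\Big( \sum_k X_k \Big) \ge t \Big\} \le d \cdot \exp\Big( -\frac{\sigma^2}{R^2}\, \phi\Big( \frac{R t}{\sigma^2} \Big) \Big),
\]
while for a matrix Gaussian series $\sum_k \gamma_k A_k$, with fixed self-adjoint $A_k$ and i.i.d. standard normal $\gamma_k$, the tail bound becomes $d\, e^{-t^2/(2\sigma^2)}$ with $\sigma^2 = \| \sum_k A_k^2 \|$.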
Lastly, we show that the tail inequalities of a matrix i.d. series have applications in several optimization problems, including the chance-constrained optimization problem and the quadratic optimization problem with orthogonality constraints.
Comment: Comments welcome
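A quick Monte Carlo sketch of the Gaussian special case may help fix ideas. The dimensions and coefficient matrices below are made up for the demo, and the bound checked is the standard matrix Gaussian series inequality of \cite{tropp2012user}, not this paper's i.d. bounds:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials, t = 10, 30, 20000, 12.0

# Fixed self-adjoint coefficient matrices A_k for the series sum_k gamma_k A_k.
A = rng.normal(size=(n, d, d))
A = (A + A.transpose(0, 2, 1)) / (2 * np.sqrt(d))

# Matrix variance parameter sigma^2 = ||sum_k A_k^2|| (spectral norm).
sigma2 = np.linalg.norm(np.einsum("kij,kjl->il", A, A), ord=2)

# Empirical tail of the largest eigenvalue of the Gaussian series.
gammas = rng.normal(size=(trials, n))
series = np.einsum("tk,kij->tij", gammas, A)
lam_max = np.linalg.eigvalsh(series)[:, -1]
empirical = float(np.mean(lam_max >= t))

bound = d * np.exp(-t**2 / (2 * sigma2))   # d * exp(-t^2 / (2 sigma^2))
print(f"empirical tail {empirical:.4f} <= bound {bound:.4f}")
```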
GBHT: Gradient Boosting Histogram Transform for Density Estimation
In this paper, we propose a density estimation algorithm called \textit{Gradient Boosting Histogram Transform} (GBHT), which adopts the \textit{negative log-likelihood} as the loss function to make the boosting procedure applicable to unsupervised tasks. From a learning theory
viewpoint, we first prove fast convergence rates for GBHT with the smoothness
assumption that the underlying density function lies in the Hölder space $C^{0,\alpha}$. Then, when the target density function lies in the space $C^{1,\alpha}$, we present an upper bound for GBHT that is smaller than the lower bound of its corresponding base learner, in the sense of convergence
rates. To the best of our knowledge, we make the first attempt to theoretically
explain why boosting can enhance the performance of its base learners for
density estimation problems. In experiments, we not only conduct performance comparisons with the widely used kernel density estimation (KDE), but also apply GBHT to anomaly detection to showcase a further application.
Comment: Accepted to ICML 2021. arXiv admin note: text overlap with arXiv:2106.0198
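As a rough, self-contained illustration of the ingredients (histogram base learners combined additively on the log-density scale), here is a toy sketch. It is not the authors' GBHT algorithm, which greedily boosts random histogram transforms against the negative log-likelihood; every constant and helper below is made up for the demo:

```python
import numpy as np

def hist_logdensity(x, edges, counts, n):
    """Log-density of a fixed-bin histogram estimator at the points x."""
    widths = np.diff(edges)
    dens = counts / (n * widths)                          # per-bin density
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1,
                  0, len(widths) - 1)
    return np.log(np.maximum(dens[idx], 1e-12))           # floor empty bins

rng = np.random.default_rng(0)
data = rng.normal(size=2000)

# Average the log-densities of T randomly shifted histograms (a geometric-mean
# ensemble), then renormalize; GBHT instead fits each histogram transform
# greedily under the negative log-likelihood loss.
grid = np.linspace(-5.0, 5.0, 2001)
dx = grid[1] - grid[0]
T = 10
log_p = np.zeros_like(grid)
for _ in range(T):
    shift = rng.uniform(0.0, 0.5)                         # random bin offset
    edges = np.arange(-6.0, 6.51, 0.5) + shift
    counts, _ = np.histogram(data, bins=edges)
    log_p += hist_logdensity(grid, edges, counts, len(data)) / T

log_p -= np.log(np.sum(np.exp(log_p)) * dx)               # renormalize on grid
print("integral ~", round(float(np.sum(np.exp(log_p)) * dx), 4))
```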