Matrix Infinitely Divisible Series: Tail Inequalities and Applications in Optimization
In this paper, we study tail inequalities of the largest eigenvalue of a
matrix infinitely divisible (i.d.) series, which is a finite sum of fixed
matrices weighted by i.d. random variables. We obtain several types of tail
inequalities, including Bennett-type and Bernstein-type inequalities. This
allows us to further bound the expectation of the spectral norm of a matrix
i.d. series. Moreover, by developing a new lower-bound function for $\phi(x) = (1+x)\log(1+x) - x$, which appears in the Bennett-type inequality, we derive a tighter tail inequality of the largest eigenvalue of the matrix i.d. series than the Bernstein-type inequality when the matrix dimension is high. The resulting lower-bound function is of independent interest and can improve any Bennett-type concentration inequality that involves the function $\phi$. The
class of i.d. probability distributions is large and includes Gaussian and
Poisson distributions, among many others. Therefore, our results encompass the
existing work \cite{tropp2012user} on matrix Gaussian series as a special case.
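For context, the standard statements from \cite{tropp2012user} (recalled here with their usual assumptions, not the new bounds of this paper) take the following shape: for independent, centered, self-adjoint $d \times d$ random matrices $X_k$ with $\lambda_{\max}(X_k) \le R$ almost surely and $\sigma^2 = \big\| \sum_k \mathbb{E}[X_k^2] \big\|$,
\[
\mathbb{P}\Big\{ \lambda_{\max}\Big( \sum_k X_k \Big) \ge t \Big\} \le d \cdot \exp\Big( -\frac{\sigma^2}{R^2}\, \phi\Big( \frac{R t}{\sigma^2} \Big) \Big),
\]
while for a matrix Gaussian series $\sum_k \gamma_k A_k$, with fixed self-adjoint $A_k$ and i.i.d. standard normal $\gamma_k$, the tail bound becomes $d\, e^{-t^2/(2\sigma^2)}$ with $\sigma^2 = \| \sum_k A_k^2 \|$.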
Lastly, we show that the tail inequalities of a matrix i.d. series have applications in several optimization problems, including the chance-constrained optimization problem and the quadratic optimization problem with orthogonality constraints.
Comment: Comments welcome
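A quick Monte Carlo sketch of the Gaussian special case may help fix ideas. The dimensions and coefficient matrices below are made up for the demo, and the bound checked is the standard matrix Gaussian series inequality of \cite{tropp2012user}, not this paper's i.d. bounds:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, trials, t = 10, 30, 20000, 12.0

# Fixed self-adjoint coefficient matrices A_k for the series sum_k gamma_k A_k.
A = rng.normal(size=(n, d, d))
A = (A + A.transpose(0, 2, 1)) / (2 * np.sqrt(d))

# Matrix variance parameter sigma^2 = ||sum_k A_k^2|| (spectral norm).
sigma2 = np.linalg.norm(np.einsum("kij,kjl->il", A, A), ord=2)

# Empirical tail of the largest eigenvalue of the Gaussian series.
gammas = rng.normal(size=(trials, n))
series = np.einsum("tk,kij->tij", gammas, A)
lam_max = np.linalg.eigvalsh(series)[:, -1]
empirical = float(np.mean(lam_max >= t))

bound = d * np.exp(-t**2 / (2 * sigma2))   # d * exp(-t^2 / (2 sigma^2))
print(f"empirical tail {empirical:.4f} <= bound {bound:.4f}")
```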
GBHT: Gradient Boosting Histogram Transform for Density Estimation
In this paper, we propose a density estimation algorithm called \textit{Gradient Boosting Histogram Transform} (GBHT), which adopts the \textit{negative log-likelihood} as the loss function to make the boosting procedure applicable to unsupervised tasks. From a learning theory
viewpoint, we first prove fast convergence rates for GBHT with the smoothness
assumption that the underlying density function lies in the Hölder space $C^{0,\alpha}$. Then, when the target density function lies in the space $C^{1,\alpha}$, we present an upper bound for GBHT that is smaller than the lower bound of its corresponding base learner, in the sense of convergence
rates. To the best of our knowledge, we make the first attempt to theoretically
explain why boosting can enhance the performance of its base learners for
density estimation problems. In experiments, we not only conduct performance comparisons with the widely used kernel density estimation (KDE), but also apply GBHT to anomaly detection to showcase a further application.
Comment: Accepted to ICML 2021. arXiv admin note: text overlap with arXiv:2106.0198
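As a rough, self-contained illustration of the ingredients (histogram base learners combined additively on the log-density scale), here is a toy sketch. It is not the authors' GBHT algorithm, which greedily boosts random histogram transforms against the negative log-likelihood; every constant and helper below is made up for the demo:

```python
import numpy as np

def hist_logdensity(x, edges, counts, n):
    """Log-density of a fixed-bin histogram estimator at the points x."""
    widths = np.diff(edges)
    dens = counts / (n * widths)                          # per-bin density
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1,
                  0, len(widths) - 1)
    return np.log(np.maximum(dens[idx], 1e-12))           # floor empty bins

rng = np.random.default_rng(0)
data = rng.normal(size=2000)

# Average the log-densities of T randomly shifted histograms (a geometric-mean
# ensemble), then renormalize; GBHT instead fits each histogram transform
# greedily under the negative log-likelihood loss.
grid = np.linspace(-5.0, 5.0, 2001)
dx = grid[1] - grid[0]
T = 10
log_p = np.zeros_like(grid)
for _ in range(T):
    shift = rng.uniform(0.0, 0.5)                         # random bin offset
    edges = np.arange(-6.0, 6.51, 0.5) + shift
    counts, _ = np.histogram(data, bins=edges)
    log_p += hist_logdensity(grid, edges, counts, len(data)) / T

log_p -= np.log(np.sum(np.exp(log_p)) * dx)               # renormalize on grid
print("integral ~", round(float(np.sum(np.exp(log_p)) * dx), 4))
```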