Scalable Parallel Factorizations of SDD Matrices and Efficient Sampling for Gaussian Graphical Models

Authors: Cheng Dehua, Cheng Yu, Liu Yan, Peng Richard, Teng Shang-Hua
Publication date: 20/10/2014

Motivated by a sampling problem basic to computational statistical inference, we develop a nearly optimal algorithm for a fundamental problem in spectral graph theory and numerical analysis. Given an $n \times n$ SDDM matrix $\mathbf{M}$ and a constant $-1 \leq p \leq 1$, our algorithm gives efficient access to a sparse $n \times n$ linear operator $\tilde{\mathbf{C}}$ such that $\mathbf{M}^{p} \approx \tilde{\mathbf{C}} \tilde{\mathbf{C}}^\top$. The solution is based on factoring $\mathbf{M}$ into a product of simple and sparse matrices using squaring and spectral sparsification. For $\mathbf{M}$ with $m$ non-zero entries, our algorithm takes work nearly-linear in $m$, and polylogarithmic depth on a parallel machine with $m$ processors. This gives the first sampling algorithm that only requires nearly linear work and $n$ i.i.d. random univariate Gaussian samples to generate i.i.d. random samples for $n$-dimensional Gaussian random fields with SDDM precision matrices. For sampling this natural subclass of Gaussian random fields, it is optimal in the randomness and nearly optimal in the work and parallel complexity. In addition, our sampling algorithm can be directly extended to Gaussian random fields with SDD precision matrices.
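The link between the factorization and sampling can be illustrated for the case $p = -1$: if $\mathbf{C}\mathbf{C}^\top = \mathbf{M}^{-1}$ and $z$ holds $n$ i.i.d. standard Gaussians, then $x = \mathbf{C}z$ has covariance $\mathbf{M}^{-1}$. The sketch below is not the paper's algorithm. It substitutes a dense Cholesky factor for the sparse factor chain, so it costs cubic rather than nearly linear work, and the names `path_sddm` and `sample_gaussian_field` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def path_sddm(n):
    """Build a small SDDM test matrix: path-graph Laplacian plus identity.

    Assumption for illustration only; any symmetric, strictly diagonally
    dominant matrix with non-positive off-diagonals would do.
    """
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A + np.eye(n)


def sample_gaussian_field(M, num_samples, rng):
    """Draw columns x ~ N(0, M^{-1}) for a dense SPD precision matrix M.

    Dense Cholesky is a stand-in for the sparse operator described in the
    abstract; it does not achieve nearly linear work.
    """
    n = M.shape[0]
    L = np.linalg.cholesky(M)  # M = L L^T, L lower triangular
    # n i.i.d. standard normal draws per sample
    Z = rng.standard_normal((n, num_samples))
    # Solve L^T X = Z, so X = L^{-T} Z and Cov(X) = L^{-T} L^{-1} = M^{-1}
    return np.linalg.solve(L.T, Z)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = path_sddm(5)
    X = sample_gaussian_field(M, 100_000, rng)
    empirical_cov = X @ X.T / X.shape[1]
    # Largest entrywise gap between empirical and exact covariance
    print(np.abs(empirical_cov - np.linalg.inv(M)).max())
```

Each sample consumes exactly $n$ standard Gaussian draws, matching the randomness count stated in the abstract; only the cost of applying the factor differs from the method described there.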

arXiv.org e-Print Archive

CiteSeerX

Model Selection for Topic Models via Spectral Decomposition

Authors: Dehua Cheng, Xinran He, Yan Liu
Publication date: 03/04/2020

Topic models have achieved significant success in analyzing large-scale text corpora. In practical applications, we are always confronted with the challenge of model selection, i.e., how to appropriately set the number of topics. Following the recent advances in topic models via tensor decomposition, we make a first attempt to provide a theoretical analysis of model selection in latent Dirichlet allocation. Under mild conditions, we derive upper and lower bounds on the number of topics given a text collection of finite size. Experimental results demonstrate that our bounds are correct and tight. Furthermore, using the Gaussian mixture model as an example, we show that our methodology can be easily generalized to model selection analysis in other latent models.

CiteSeerX

Towards Automated Neural Interaction Discovery for Click-Through Rate Prediction

Authors: Cheng Dehua, Hu Xia, Song Qingquan, Tian Yuandong, Yang Jiyan, Zhou Hanning
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 29/06/2020

Click-Through Rate (CTR) prediction is one of the most important machine learning tasks in recommender systems, driving personalized experience for billions of consumers. Neural architecture search (NAS), as an emerging field, has demonstrated its capabilities in discovering powerful neural network architectures, which motivates us to explore its potential for CTR prediction. Due to 1) diverse unstructured feature interactions, 2) heterogeneous feature space, and 3) high data volume and intrinsic data randomness, it is challenging to construct, search, and compare different architectures effectively for recommendation models. To address these challenges, we propose an automated interaction architecture discovery framework for CTR prediction named AutoCTR. By modularizing simple yet representative interactions as virtual building blocks and wiring them into a space of directed acyclic graphs, AutoCTR performs evolutionary architecture exploration with learning-to-rank guidance at the architecture level and achieves acceleration using a low-fidelity model. Empirical analysis demonstrates the effectiveness of AutoCTR on different datasets compared to human-crafted architectures. The discovered architecture also enjoys generalizability and transferability among different datasets.

arXiv.org e-Print Archive

Crossref