Characteristic Kernels and Infinitely Divisible Distributions

Authors: Fukumizu Kenji, Nishiyama Yu
Publication date: 24/10/2016

We connect shift-invariant characteristic kernels to infinitely divisible distributions on $\mathbb{R}^{d}$. Characteristic kernels play an important role in machine learning applications because their kernel means distinguish any two probability measures. The contribution of this paper is two-fold. First, we show, using the Lévy-Khintchine formula, that any shift-invariant kernel given by a bounded, continuous and symmetric probability density function (pdf) of an infinitely divisible distribution on $\mathbb{R}^{d}$ is characteristic. We also present some closure properties of such characteristic kernels under addition, pointwise product, and convolution. Second, in developing various kernel mean algorithms, it is fundamental to compute the following values: (i) kernel mean values $m_P(x)$, $x \in \mathcal{X}$, and (ii) kernel mean RKHS inner products $\langle m_P, m_Q \rangle_{\mathcal{H}}$, for probability measures $P, Q$. If $P$, $Q$, and the kernel $k$ are Gaussians, then computations (i) and (ii) result in Gaussian pdfs, which are tractable. We generalize this Gaussian combination to more general cases in the class of infinitely divisible distributions. We then introduce a conjugate kernel and a convolution trick, so that (i) and (ii) above have the same pdf form, which we expect to make computation tractable at least in some cases. As specific instances, we explore $\alpha$-stable distributions and a rich class of generalized hyperbolic distributions, which includes the Laplace, Cauchy and Student-t distributions.
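The Gaussian case mentioned in the abstract can be illustrated with a minimal one-dimensional sketch. It assumes, for illustration only, that the kernel is the Gaussian pdf $k(x, y) = \mathcal{N}(x - y; 0, \sigma_k^2)$, that $P = \mathcal{N}(\mu_P, \sigma_P^2)$ and that $Q = \mathcal{N}(\mu_Q, \sigma_Q^2)$. Under those assumptions the standard Gaussian convolution identity gives $m_P(x) = \mathcal{N}(x; \mu_P, \sigma_P^2 + \sigma_k^2)$ and $\langle m_P, m_Q \rangle_{\mathcal{H}} = \mathcal{N}(\mu_P - \mu_Q; 0, \sigma_P^2 + \sigma_Q^2 + \sigma_k^2)$. All function names and parameter values below are hypothetical and are not taken from the paper; the Monte Carlo estimates are included only as a sanity check on the closed forms.

```python
import math
import random


def gauss_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian N(mean, var) at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def kernel(x, y, var_k):
    """Shift-invariant kernel given by a Gaussian pdf (illustrative choice)."""
    return gauss_pdf(x - y, 0.0, var_k)


def kernel_mean_closed_form(x, mean_p, var_p, var_k):
    """(i) m_P(x): convolving two Gaussians adds their variances."""
    return gauss_pdf(x, mean_p, var_p + var_k)


def inner_product_closed_form(mean_p, var_p, mean_q, var_q, var_k):
    """(ii) <m_P, m_Q>: a Gaussian pdf evaluated at the difference of means."""
    return gauss_pdf(mean_p - mean_q, 0.0, var_p + var_q + var_k)


def kernel_mean_monte_carlo(x, sample, var_k):
    """Empirical kernel mean: average of k(x, y) over the sample from P."""
    return sum(kernel(x, y, var_k) for y in sample) / len(sample)


def inner_product_monte_carlo(sample_p, sample_q, var_k):
    """Empirical inner product: average of k(x, y) over all sample pairs."""
    total = sum(kernel(x, y, var_k) for x in sample_p for y in sample_q)
    return total / (len(sample_p) * len(sample_q))


# Hypothetical parameters: P = N(0, 1), Q = N(1, 0.5), kernel variance 0.25.
rng = random.Random(0)
var_k = 0.25
sample_p = [rng.gauss(0.0, 1.0) for _ in range(800)]
sample_q = [rng.gauss(1.0, math.sqrt(0.5)) for _ in range(800)]

mean_exact = kernel_mean_closed_form(0.5, 0.0, 1.0, var_k)
mean_mc = kernel_mean_monte_carlo(0.5, sample_p, var_k)
ip_exact = inner_product_closed_form(0.0, 1.0, 1.0, 0.5, var_k)
ip_mc = inner_product_monte_carlo(sample_p, sample_q, var_k)

print("m_P(0.5):    closed form", mean_exact, "Monte Carlo", mean_mc)
print("<m_P, m_Q>:  closed form", ip_exact, "Monte Carlo", ip_mc)
```

The closed forms need no sampling, which is the sense in which the Gaussian combination is tractable; the sketch does not cover the conjugate kernels or the non-Gaussian families discussed in the abstract.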

arXiv.org e-Print Archive

CiteSeerX

Towards a Learning Theory of Cause-Effect Inference

Authors: Lopez-Paz David, Muandet Krikamol, Schölkopf Bernhard, Tolstikhin Ilya
Publication date: 18/05/2015

We pose causal inference as the problem of learning to classify probability distributions. In particular, we assume access to a collection $\{(S_i, l_i)\}_{i=1}^n$, where each $S_i$ is a sample drawn from the probability distribution of $X_i \times Y_i$, and $l_i$ is a binary label indicating whether "$X_i \to Y_i$" or "$X_i \leftarrow Y_i$". Given these data, we build a causal inference rule in two steps. First, we featurize each $S_i$ using the kernel mean embedding associated with some characteristic kernel. Second, we train a binary classifier on such embeddings to distinguish between causal directions. We present generalization bounds showing the statistical consistency and learning rates of the proposed approach, and provide a simple implementation that achieves state-of-the-art cause-effect inference. Furthermore, we extend our ideas to infer causal relationships between more than two variables.
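The two steps described above can be sketched roughly as follows. This is an illustrative toy, not the authors' implementation. It assumes random Fourier features as a finite-dimensional approximation of a Gaussian kernel's mean embedding and a plain logistic regression as the binary classifier; both choices are substitutions made here. The data generator `make_toy_pair`, the feature count, the bandwidth and the learning rate are all hypothetical.

```python
import math
import random


def make_random_features(num_features, dim, bandwidth, rng):
    """Frequencies and phases for random Fourier features of a Gaussian kernel."""
    freqs = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def mean_embedding(sample, freqs, phases):
    """Step 1: approximate kernel mean embedding, the feature map averaged over S_i."""
    scale = math.sqrt(2.0 / len(freqs))
    emb = [0.0] * len(freqs)
    for point in sample:
        for j, (w, b) in enumerate(zip(freqs, phases)):
            proj = sum(wi * zi for wi, zi in zip(w, point))
            emb[j] += scale * math.cos(proj + b)
    return [v / len(sample) for v in emb]


def make_toy_pair(rng, n_points):
    """Hypothetical generator: label 1 means the first coordinate causes the second."""
    label = rng.randint(0, 1)
    points = []
    for _ in range(n_points):
        cause = rng.gauss(0.0, 1.0)
        effect = math.tanh(2.0 * cause) + 0.1 * rng.gauss(0.0, 1.0)
        points.append((cause, effect) if label == 1 else (effect, cause))
    return points, label


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, feature):
    """Estimated probability that the label is 1."""
    return sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)


def train_logistic(features, labels, epochs=200, lr=0.5):
    """Step 2: logistic regression fitted by stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for feature, label in zip(features, labels):
            err = predict_proba(weights, bias, feature) - label
            weights = [w - lr * err * f for w, f in zip(weights, feature)]
            bias -= lr * err
    return weights, bias


rng = random.Random(0)
freqs, phases = make_random_features(num_features=50, dim=2, bandwidth=1.0, rng=rng)
pairs = [make_toy_pair(rng, n_points=100) for _ in range(40)]
features = [mean_embedding(points, freqs, phases) for points, _ in pairs]
labels = [label for _, label in pairs]
weights, bias = train_logistic(features, labels)
probs = [predict_proba(weights, bias, f) for f in features]
```

Each sample $S_i$ is reduced to one fixed-length vector regardless of its size, so any off-the-shelf classifier can consume it. The toy data here are far simpler than real cause-effect pairs, so the sketch says nothing about the accuracy reported in the abstract.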

arXiv.org e-Print Archive

MPG.PuRe