We provide a theoretical foundation for non-parametrically estimating functions of random variables using kernel mean embeddings. We show that for any continuous function f, consistent estimators of the mean embedding of a random variable X lead to consistent estimators of the mean embedding of f(X). For Gaussian kernels and sufficiently smooth functions, we also provide rates of convergence. Our results apply to functions of multiple random variables as well. If the variables are dependent, we require an estimator of the mean embedding of their joint distribution as a starting point; if they are independent, separate mean embeddings of their marginal distributions suffice. In either case, our results cover both mean embeddings based on i.i.d. samples and reduced set expansions in terms of dependent expansion points. The latter justifies using such expansions to limit memory resources when the approach serves as a basis for probabilistic programming.

Carl-Johann Simon-Gabriel is supported by a Google European Fellowship in Causal Inference.
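The central claim can be illustrated with a small numerical sketch. Assuming a Gaussian kernel and an arbitrary continuous function (here `np.sin`, chosen purely for illustration), the plug-in estimator of the mean embedding of f(X) is obtained by applying f to the expansion points of an empirical mean embedding of X while keeping the weights. The sketch below (not the paper's code) compares this plug-in estimate against an embedding built from fresh i.i.d. samples of f(X):

```python
import numpy as np

def gauss_kernel(a, b, sigma=1.0):
    # Gaussian kernel k(a, b) = exp(-(a - b)^2 / (2 sigma^2)) on 1-d points,
    # returning the full pairwise kernel matrix via broadcasting.
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

def embedding_eval(points, weights, query, sigma=1.0):
    # Evaluate the expansion sum_i w_i k(x_i, .) at each query point.
    return gauss_kernel(query, points, sigma) @ weights

rng = np.random.default_rng(0)
f = np.sin  # stand-in for "any continuous function f"

# Empirical mean embedding of X from i.i.d. samples with uniform weights.
x = rng.normal(size=5000)
w = np.full(len(x), 1.0 / len(x))

# Plug-in estimator of the embedding of f(X): map the expansion points
# through f and reuse the same weights.
t = np.linspace(-2.0, 2.0, 9)
plug_in = embedding_eval(f(x), w, t)

# Reference: embedding built from fresh i.i.d. samples of f(X).
fresh = embedding_eval(f(rng.normal(size=5000)), w, t)

# The two estimates agree up to sampling error, consistent with the claim
# that a consistent estimator of the embedding of X yields one for f(X).
print(np.max(np.abs(plug_in - fresh)))
```

The same recipe applies verbatim to a reduced set expansion: only the expansion points and their (non-uniform) weights change, not the mapping step.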