We focus on kernel methods for set-valued inputs and their application to
Bayesian set optimization, notably combinatorial optimization. We investigate
two classes of set kernels that both rely on Reproducing Kernel Hilbert Space
embeddings, namely the ``Double Sum'' (DS) kernels recently considered in
Bayesian set optimization, and a class introduced here called ``Deep
Embedding'' (DE) kernels, which essentially consists of applying a radial kernel
on Hilbert space to the canonical distance induced by another kernel, such as
a DS kernel. We establish in particular that while DS kernels typically
suffer from a lack of strict positive definiteness, vast subclasses of DE
kernels built upon DS kernels do possess this property, in turn enabling
combinatorial optimization without the need to introduce a jitter parameter.
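To make the two kernel classes concrete, here is a minimal NumPy sketch under illustrative assumptions: a Gaussian base kernel on points, a DS kernel taken as the (normalized) double sum of the base kernel over all pairs, and a DE kernel obtained by applying a Gaussian radial kernel to the canonical RKHS distance induced by the DS kernel. The function names and the hyperparameters `ell` and `theta` are ours, not from the paper.

```python
import numpy as np

def rbf(x, y, ell=1.0):
    # Base kernel on individual points (Gaussian; lengthscale `ell` is illustrative).
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2 * ell**2))

def k_ds(A, B, ell=1.0):
    # "Double Sum" kernel: normalized double sum of the base kernel over all
    # pairs, i.e. the inner product of the empirical mean embeddings of A and B.
    return np.mean([[rbf(a, b, ell) for b in B] for a in A])

def k_de(A, B, ell=1.0, theta=1.0):
    # "Deep Embedding" kernel: a radial (here Gaussian) kernel on Hilbert space
    # applied to the canonical distance induced by the DS kernel.
    d2 = k_ds(A, A, ell) + k_ds(B, B, ell) - 2.0 * k_ds(A, B, ell)
    return np.exp(-max(d2, 0.0) / (2 * theta**2))
```

Sets are represented as lists of points; by construction `k_de(A, A) = 1` and the kernel is symmetric, while strict positive definiteness for suitable subclasses is the object of the theoretical results summarized above.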
Proofs of theoretical results about the considered kernels are complemented by
practical considerations regarding hyperparameter fitting. We furthermore demonstrate
the applicability of our approach in prediction and optimization tasks, relying
both on toy examples and on two test cases from mechanical engineering and
hydrogeology, respectively. Experimental results highlight the relevance and
comparative merits of the considered approaches, while opening new perspectives
in prediction and sequential design with set inputs.