Numerous models describing human emotional states have been built by the
psychology community. In parallel, Deep Neural Networks (DNNs) are achieving
excellent performance and are becoming effective feature extraction tools
in many computer vision tasks. Inspired by work from the psychology community,
we first study the link between the compact two-dimensional representation of
emotion known as arousal-valence and the discrete emotion classes (e.g., anger,
happiness, sadness) used in the computer vision community. This enables us to
assess the benefits, in terms of discrete emotion inference, of adding an
extra dimension to arousal-valence (usually named dominance). Building on these
observations, we propose CAKE, a 3-dimensional representation of emotion
learned in a multi-domain fashion, achieving accurate emotion recognition on
several public datasets. Moreover, we visualize how emotion boundaries are
organized inside DNN representations and show that DNNs implicitly learn
arousal-valence-like descriptions of emotions. Finally, we use the CAKE
representation to compare the annotation quality of different public datasets.