Detecting test data deviating from training data is a central problem for
safe and robust machine learning. Likelihoods learned by a generative model,
e.g., a normalizing flow trained with the standard log-likelihood objective,
perform poorly as outlier scores. We propose to use an unlabelled auxiliary dataset and a
probabilistic outlier score for outlier detection. We use a self-supervised
feature extractor trained on the auxiliary dataset and train a normalizing flow
on the extracted features by maximizing the likelihood on in-distribution data
and minimizing the likelihood on the contrastive dataset. We show that this is
equivalent to learning the normalized positive difference between the
in-distribution and the contrastive feature densities. We conduct experiments on
benchmark datasets and compare against the likelihood, the likelihood ratio, and
state-of-the-art anomaly detection methods.
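
As an illustrative sketch of the objective described above (in our own notation, which need not match the paper's exact formulation), let $p_{\mathrm{in}}$ and $p_{\mathrm{c}}$ denote the in-distribution and contrastive densities of the extracted features $z$, and $p_{\theta}$ the density modelled by the normalizing flow. One plausible way to write the contrastive training and the stated equivalence is:

% Illustrative sketch only: the flow is trained to assign high likelihood to
% in-distribution features and low likelihood to contrastive features.
\begin{equation*}
  \max_{\theta}\;
  \mathbb{E}_{z \sim p_{\mathrm{in}}}\!\bigl[\log p_{\theta}(z)\bigr]
  \;-\;
  \mathbb{E}_{z \sim p_{\mathrm{c}}}\!\bigl[\log p_{\theta}(z)\bigr],
\end{equation*}
% and, per the equivalence stated in the abstract, the learned density corresponds
% to the normalized positive difference of the two feature densities:
\begin{equation*}
  p_{\theta}(z) \;\approx\;
  \frac{\max\bigl(p_{\mathrm{in}}(z) - p_{\mathrm{c}}(z),\, 0\bigr)}
       {\int \max\bigl(p_{\mathrm{in}}(u) - p_{\mathrm{c}}(u),\, 0\bigr)\,\mathrm{d}u}.
\end{equation*}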