In real-world scenarios, recommenders face non-functional requirements of a technical nature and must handle dynamic data in the form
of sequential streams. Evaluation of recommender systems must
take these issues into account in order to be maximally informative.
In this paper, we present Idomaar, a framework that enables the
efficient multi-dimensional benchmarking of recommender algorithms. Idomaar goes beyond current academic research practices
by creating a realistic evaluation environment and computing both
effectiveness and technical metrics for stream-based as well as set-based evaluation. A scenario focussing on the “research to prototyping
to productization” cycle at a company illustrates Idomaar’s potential.
We show that Idomaar simplifies testing with varying configurations
and supports flexible integration of different data