We discuss the problem of experimentally evaluating linear-time temporal
logic (LTL) synthesis tools for reactive systems. We first survey previous such
work for the currently publicly available synthesis tools, and then draw
conclusions by deriving useful schemes for future such evaluations.
In particular, we explain why previous tools have incompatible scopes and
semantics and provide a framework that reduces the impact of this problem for
future experimental comparisons of such tools. Furthermore, we discuss which
difficulties the complex workflows that begin to appear in modern synthesis
tools induce on experimental evaluations and give answers to the question how
convincing such evaluations can still be performed in such a setting.Comment: In Proceedings iWIGP 2011, arXiv:1102.374