We present the design and a first performance evaluation of Thrill -- a
prototype of a general-purpose big data processing framework with a convenient
data-flow style programming interface. Thrill resembles Apache Spark and
Apache Flink but differs from them in at least two main ways. First, Thrill is
based on C++, which enables performance advantages due to direct native code
compilation, a more cache-friendly memory layout, and explicit memory
management. In particular, Thrill uses template meta-programming to compile
chains of subsequent local operations into a single binary routine without
intermediate buffering and with minimal indirections. Second, Thrill uses
arrays rather than multisets as its primary data structure, which enables
additional operations like sorting, prefix sums, window scans, or combining
corresponding fields of several arrays (zipping). We compare Thrill with Apache
Spark and Apache Flink using five kernels from the HiBench suite. Thrill is
consistently faster and often several times faster than the other frameworks.
At the same time, the source code has a similar level of simplicity and
abstraction.
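
To illustrate the idea of compiling a chain of local operations into a single routine, the following is a minimal C++14 sketch. It is not Thrill's actual API or implementation: the names `fuse`, `map_stage`, `filter_stage`, and `sum_of_even_squares` are hypothetical and chosen only for this example. Each stage receives an item and a reference to the next stage, so the whole chain is resolved at compile time and items flow from stage to stage without an intermediate buffer.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// NOTE: all names below are hypothetical; this is not Thrill's API.

// Base case: a chain consisting only of the final sink.
template <typename Sink>
auto fuse(Sink sink) {
    return sink;
}

// Recursive case: wrap the first stage around the fused remainder.
// All types are known statically, so the compiler can inline the
// whole chain into one routine.
template <typename Stage, typename... Rest>
auto fuse(Stage stage, Rest... rest) {
    auto next = fuse(std::move(rest)...);
    return [stage, next](auto&& item) mutable { stage(item, next); };
}

// A map stage applies f to the item and pushes the result onward.
template <typename F>
auto map_stage(F f) {
    return [f](auto&& item, auto& next) { next(f(item)); };
}

// A filter stage forwards the item only if the predicate p holds.
template <typename P>
auto filter_stage(P p) {
    return [p](auto&& item, auto& next) {
        if (p(item)) next(item);
    };
}

// Example chain: square every element, keep the even squares, sum them.
// No temporary vector is created between the stages.
std::int64_t sum_of_even_squares(const std::vector<std::int64_t>& input) {
    std::int64_t sum = 0;
    auto pipeline = fuse(
        map_stage([](std::int64_t x) { return x * x; }),
        filter_stage([](std::int64_t x) { return x % 2 == 0; }),
        [&sum](std::int64_t x) { sum += x; });
    for (std::int64_t x : input) pipeline(x);
    return sum;
}
```

Calling `sum_of_even_squares` on the values 1, 2, 3, 4 squares them to 1, 4, 9, 16, keeps 4 and 16, and returns 20. Because every stage is a distinct lambda type passed by template parameter, there are no virtual calls or `std::function` indirections in the chain; whether a given compiler fully inlines it depends on its optimization settings.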