Continual learning with an increasing number of classes is a challenging
task. The difficulty grows when each example is presented exactly once, which
requires the model to learn online. Recent methods based on classic parameter
optimization procedures have been shown to struggle in such setups, or they
suffer from limitations such as non-differentiable components or memory
buffers. For this reason, we present a fully differentiable ensemble method
that allows us to efficiently train an ensemble of neural networks end-to-end.
The proposed technique achieves state-of-the-art (SOTA) results without a
memory buffer and clearly outperforms the reference methods. Our experiments
also show a significant performance gain for small ensembles, which
demonstrates that relatively high classification accuracy can be obtained with
a reduced number of classifiers.
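
To make the idea concrete, below is a minimal sketch (not the paper's exact method) of a fully differentiable ensemble in PyTorch: member logits are combined by a differentiable operation (here, averaging), so a single loss backpropagates through every member at once, and each mini-batch is consumed exactly once with no replay buffer, matching the online regime described above. The member architecture, layer sizes, and the averaging rule are all illustrative assumptions.

```python
# Minimal sketch (assumed details, not the paper's exact architecture):
# a fully differentiable ensemble trained end-to-end from a single loss.
import torch
import torch.nn as nn

class DifferentiableEnsemble(nn.Module):
    def __init__(self, num_members: int, in_dim: int, num_classes: int):
        super().__init__()
        # Each member is a small independent classifier (illustrative choice).
        self.members = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                          nn.Linear(128, num_classes))
            for _ in range(num_members)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Averaging member logits keeps the combination differentiable,
        # so one backward pass updates every member jointly.
        logits = torch.stack([m(x) for m in self.members])  # (M, B, C)
        return logits.mean(dim=0)                           # (B, C)

ensemble = DifferentiableEnsemble(num_members=5, in_dim=512, num_classes=10)
optimizer = torch.optim.SGD(ensemble.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

def online_step(x: torch.Tensor, y: torch.Tensor) -> float:
    # Online regime: each mini-batch is seen exactly once, no memory buffer.
    optimizer.zero_grad()
    loss = criterion(ensemble(x), y)
    loss.backward()  # gradients flow through the ensemble combination
    optimizer.step()
    return loss.item()
```

Because the combination step is itself differentiable, the ensemble needs no separate per-member training loop or non-differentiable voting stage; standard gradient descent on the combined output trains all members in one pass.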