Neural Architecture for Online Ensemble Continual Learning
Continual learning with an increasing number of classes is a challenging
task. The difficulty rises when each example is presented exactly once, which
requires the model to learn online. Recent methods with classic parameter
optimization procedures have been shown to struggle in such setups or have
limitations such as non-differentiable components or memory buffers. For this
reason, we present a fully differentiable ensemble method that allows us to
efficiently train an ensemble of neural networks end to end. The
proposed technique achieves SOTA results without a memory buffer and clearly
outperforms the reference methods. The conducted experiments have also shown a
significant increase in performance for small ensembles, which demonstrates
that relatively high classification accuracy can be obtained with a
reduced number of classifiers.
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform
Production deployments in complex systems require ML architectures to be
highly efficient and usable across multiple tasks. Particularly demanding are
classification problems in which data arrives in a streaming fashion and each
class is presented separately. Recent methods with stochastic gradient learning
have been shown to struggle in such setups or to have limitations, such as memory
buffers or restriction to specific domains, that prevent their use in
real-world scenarios. For this reason, we present a fully differentiable
architecture based on the Mixture of Experts model that enables the training
of high-performance classifiers when examples from each class are presented
separately. We conducted exhaustive experiments that demonstrated its applicability
in various domains and its ability to learn online in production environments. The
proposed technique achieves SOTA results without a memory buffer and clearly
outperforms the reference methods.
Comment: arXiv admin note: text overlap with arXiv:2211.1496