Distributed Equivalent Substitution Training for Large-Scale Recommender Systems
We present Distributed Equivalent Substitution (DES) training, a novel
distributed training framework for large-scale recommender systems with dynamic
sparse features. By reducing communication, DES brings fully synchronous
training to large-scale recommendation systems for the first time, making the
training of commercial recommender systems converge faster and reach a better
click-through rate (CTR). DES requires far less communication by substituting
weights-rich operators with computationally equivalent sub-operators and
aggregating partial results instead of transmitting the huge sparse weights
directly over the network. Thanks to synchronous training on large-scale Deep
Learning Recommendation Models (DLRMs), DES achieves a higher AUC (Area Under
the ROC Curve). We successfully apply DES training to multiple popular DLRMs
from industrial scenarios. Experiments show that our implementation outperforms
the state-of-the-art PS-based training framework, achieving up to 68.7%
communication savings and higher throughput than other PS-based recommender
systems.

Comment: Accepted by SIGIR '2020. Proceedings of the 43rd International ACM
SIGIR Conference on Research and Development in Information Retrieval. 202
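For intuition, the following is a minimal sketch of the partial-result-aggregation idea described in the abstract: each worker keeps its shard of the sparse embedding table local, computes a partial sum, and only the small dense partials are aggregated. The sharding scheme, variable names, and NumPy simulation are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch: aggregating partial results from weight shards is computationally
# equivalent to pulling the full sparse weight rows, but communicates only
# one dim-sized vector per worker instead of every needed embedding row.
import numpy as np

rng = np.random.default_rng(0)

vocab, dim, n_workers = 1000, 8, 4
weights = rng.normal(size=(vocab, dim))      # full sparse-feature embedding table
feature_ids = np.array([3, 17, 256, 999])    # active sparse features in one sample

# Baseline (PS-style): pull every needed weight row over the network,
# then reduce locally. Communication ~ len(feature_ids) * dim floats.
pulled_rows = weights[feature_ids]           # simulated network transfer
baseline_sum = pulled_rows.sum(axis=0)

# DES-style: each worker owns a contiguous shard of the table, computes a
# partial sum locally, and only the partial results are aggregated.
# Communication ~ n_workers * dim floats.
shards = np.array_split(np.arange(vocab), n_workers)
partials = []
for shard_ids in shards:
    local_ids = feature_ids[np.isin(feature_ids, shard_ids)]
    partials.append(weights[local_ids].sum(axis=0))  # stays on the worker
des_sum = np.sum(partials, axis=0)           # simulated all-reduce of partials

assert np.allclose(baseline_sum, des_sum)    # computationally equivalent
```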