Data Leakage via Access Patterns of Sparse Features in Deep Learning-based Recommendation Systems
Online personalized recommendation services are generally hosted in the cloud, where users query the cloud-based model to receive recommended content such as merchandise of interest or news feeds. State-of-the-art recommendation models rely on sparse and dense features to represent users' profile information and the items they interact with. Although sparse features account for 99% of the total model size, little attention has been paid to the potential information leakage through sparse features. These sparse features are employed
to track users' behavior, e.g., their click history, object interactions, etc.,
potentially carrying each user's private information. Sparse features are
represented as learned embedding vectors that are stored in large tables, and
personalized recommendation is performed by using a specific user's sparse
feature to index into the tables. Even with recently proposed methods that hide the computation happening in the cloud, an attacker in the cloud may still be able to track the access patterns to the embedding tables. This paper
explores the private information that may be learned by tracking a
recommendation model's sparse feature access patterns. We first characterize
the types of attacks that can be carried out on sparse features in
recommendation models in an untrusted cloud, followed by a demonstration of how
each of these attacks leads to extracting users' private information or
tracking users by their behavior over time.
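
To make the leakage channel concrete, the following is a minimal sketch of our own (not code from the paper; the table, the lookup function, and the item IDs are hypothetical stand-ins): embedding a sparse feature reduces to gathering rows by index, so anyone who can log the row indices recovers the user's interaction history even if the arithmetic itself is hidden.

    # Minimal illustrative sketch, not the paper's code: embedding a sparse
    # feature is a row gather, so the accessed row indices ARE the private data.
    import numpy as np

    NUM_ITEMS, DIM = 1_000_000, 16
    embedding_table = np.random.randn(NUM_ITEMS, DIM).astype(np.float32)

    observed_accesses = []  # what an attacker in the cloud could log

    def lookup(click_history):
        """Embed a user's sparse feature (the IDs of items they clicked)."""
        observed_accesses.extend(click_history)   # the access-pattern side channel
        rows = embedding_table[click_history]     # gather rows by index
        return rows.mean(axis=0)                  # pool into one dense vector

    user_vector = lookup([42, 7, 99_831])  # private click history goes in...
    print(observed_accesses)               # ...and comes out as logged indices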
Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems
Modern deep learning-based recommendation systems exploit hundreds to
thousands of different categorical features, each with millions of different
categories ranging from clicks to posts. To respect the natural diversity
within the categorical data, embeddings map each category to a unique dense
representation within an embedded space. Since each categorical feature could
take on as many as tens of millions of different possible categories, the
embedding tables form the primary memory bottleneck during both training and
inference. We propose a novel approach for reducing the embedding size in an
end-to-end fashion by exploiting complementary partitions of the category set
to produce a unique embedding vector for each category without explicit
definition. By storing multiple smaller embedding tables based on each
complementary partition and combining embeddings from each table, we define a
unique embedding for each category at smaller memory cost. This approach may be
interpreted as using a specific fixed codebook to ensure uniqueness of each
category's representation. Our experimental results demonstrate the
effectiveness of our approach over the hashing trick for reducing the size of
the embedding tables in terms of model loss and accuracy, while retaining a
similar reduction in the number of parameters.
Comment: 11 pages, 7 figures, 1 table.
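
As an illustration, here is a minimal sketch (ours, assuming the quotient-remainder construction, which is one instance of complementary partitions; table and variable names are hypothetical): two tables of roughly sqrt(|C|) rows each replace a single table with one row per category.

    # Hedged sketch of the quotient-remainder instance of complementary
    # partitions; this is our illustration, not the paper's reference code.
    import numpy as np

    NUM_CATEGORIES, DIM = 10_000_000, 16
    M = int(np.ceil(np.sqrt(NUM_CATEGORIES)))  # ~3163 rows per small table

    # Two M-row tables (2*M rows total) replace one NUM_CATEGORIES-row table.
    table_q = np.random.randn(M, DIM).astype(np.float32)
    table_r = np.random.randn(M, DIM).astype(np.float32)

    def embed(category_id):
        # (q, r) is unique for every category, so combining one row from each
        # table yields a distinct vector without storing one row per category.
        q, r = divmod(category_id, M)
        return table_q[q] * table_r[r]  # element-wise product as the combiner

    v = embed(1_234_567)  # a unique embedding from ~1,500x fewer table rows

The element-wise product is one of the combining operations considered; addition or concatenation, and more than two complementary partitions, follow the same pattern.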
Clustering the Sketch: A Novel Approach to Embedding Table Compression
Embedding tables are used by machine learning systems to work with
categorical features. In modern Recommendation Systems, these tables can be
very large, necessitating the development of new methods for fitting them in
memory, even during training. We suggest Clustered Compositional Embeddings (CCE), which combines clustering-based compression, such as quantization to codebooks, with dynamic methods such as the hashing trick and Compositional Embeddings (Shi et al., 2020). Experimentally, CCE achieves the best of both worlds: the high compression rate of codebook-based quantization, while remaining *dynamic* like hashing-based methods, so it can be used during training. Theoretically, we prove that CCE is guaranteed to converge to the optimal codebook and give a tight bound on the number of iterations required.
Distributed Equivalent Substitution Training for Large-Scale Recommender Systems
We present Distributed Equivalent Substitution (DES) training, a novel
distributed training framework for large-scale recommender systems with dynamic
sparse features. DES introduces fully synchronous training to large-scale recommendation systems for the first time by reducing communication, thus making the training of commercial recommender systems converge faster and reach a better click-through rate (CTR). DES requires much less communication by substituting weights-rich operators with computationally equivalent sub-operators and aggregating partial results instead of transmitting the huge sparse weights directly
through the network. Due to the use of synchronous training on large-scale Deep
Learning Recommendation Models (DLRMs), DES achieves a higher AUC (Area Under the ROC curve). We successfully apply DES training to multiple popular DLRMs in
industrial scenarios. Experiments show that our implementation outperforms the
state-of-the-art parameter server (PS) based training framework, achieving up to 68.7% communication savings and higher throughput compared to other PS-based recommender systems.
Comment: Accepted by SIGIR 2020, Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2020.
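
The substitution idea can be illustrated with a toy dot-product operator (our sketch, not the DES code; the sharding layout and the CTR head are assumptions for illustration): each worker holds a shard of a huge weight vector and sends only its scalar partial result, rather than the weights themselves.

    # Our toy illustration of "aggregate partial results instead of
    # transmitting sparse weights"; not the DES implementation.
    import numpy as np

    N_WORKERS, N_FEATURES = 4, 1_000_000

    # Weights and the matching input slices, sharded across workers.
    w_shards = np.array_split(np.random.randn(N_FEATURES).astype(np.float32), N_WORKERS)
    x_shards = np.array_split(np.random.randn(N_FEATURES).astype(np.float32), N_WORKERS)

    # Each worker runs the computationally equivalent sub-operator locally:
    # a partial dot product over its own shard.
    partials = [w @ x for w, x in zip(w_shards, x_shards)]

    # Aggregating N_WORKERS scalars replaces shipping N_FEATURES weights.
    logit = float(np.sum(partials))
    ctr = 1.0 / (1.0 + np.exp(-logit))  # e.g., a CTR prediction head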