Collaborative Deep Learning for Recommender Systems
Collaborative filtering (CF) is a successful approach commonly used by many
recommender systems. Conventional CF-based methods use the ratings given to
items by users as the sole source of information for learning to make
recommendations. However, the ratings are often very sparse in many
applications, causing CF-based methods to degrade significantly in their
recommendation performance. To address this sparsity problem, auxiliary
information such as item content information may be utilized. Collaborative
topic regression (CTR) is an appealing recent method that takes this approach
by tightly coupling the two components that learn from the two different
sources of information. Nevertheless, the latent representation learned by CTR may not be
very effective when the auxiliary information is very sparse. To address this
problem, we generalize recent advances in deep learning from i.i.d. input to
non-i.i.d. (CF-based) input and propose in this paper a hierarchical Bayesian
model called collaborative deep learning (CDL), which jointly performs deep
representation learning for the content information and collaborative filtering
for the ratings (feedback) matrix. Extensive experiments on three real-world
datasets from different domains show that CDL can significantly advance the
state of the art.
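The coupling the abstract describes can be illustrated with a toy joint objective: a matrix-factorization term for the ratings, an autoencoder reconstruction term for the content, and a term tying the item factors to the learned content representation. This is a minimal NumPy sketch under assumed shapes and a one-layer encoder, not the paper's hierarchical Bayesian model; all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): m users, n items, d-dim content, k latent factors.
m, n, d, k = 4, 5, 8, 3
R = rng.integers(0, 2, size=(m, n)).astype(float)  # sparse ratings/feedback matrix
X = rng.normal(size=(n, d))                        # item content (e.g., bag-of-words)

U = rng.normal(scale=0.1, size=(m, k))             # user latent factors
V = rng.normal(scale=0.1, size=(n, k))             # item latent factors
W1 = rng.normal(scale=0.1, size=(d, k))            # encoder weight (stand-in for the deep part)
W2 = rng.normal(scale=0.1, size=(k, d))            # decoder weight

def joint_loss(U, V, W1, W2, lam_v=1.0, lam_r=1.0):
    enc = np.tanh(X @ W1)                    # content representation from the encoder
    rec = enc @ W2                           # content reconstruction
    l_cf = np.sum((R - U @ V.T) ** 2)        # collaborative-filtering term on the ratings
    l_tie = lam_v * np.sum((V - enc) ** 2)   # couples item factors to the content encoding
    l_ae = lam_r * np.sum((X - rec) ** 2)    # autoencoder reconstruction term
    return l_cf + l_tie + l_ae

loss = joint_loss(U, V, W1, W2)
```

Minimizing all three terms jointly is what makes the learning "collaborative": the tie term lets sparse ratings inform the representation, and the representation fills in for items with few ratings.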
A Survey on Bayesian Deep Learning
A comprehensive artificial intelligence system needs to not only perceive the
environment with different `senses' (e.g., seeing and hearing) but also infer
the world's conditional (or even causal) relations and corresponding
uncertainty. The past decade has seen major advances in many perception tasks
such as visual object recognition and speech recognition using deep learning
models. For higher-level inference, however, probabilistic graphical models
with their Bayesian nature are still more powerful and flexible. In recent
years, Bayesian deep learning has emerged as a unified probabilistic framework
to tightly integrate deep learning and Bayesian models. In this general
framework, the perception of text or images using deep learning can boost the
performance of higher-level inference and in turn, the feedback from the
inference process is able to enhance the perception of text or images. This
survey provides a comprehensive introduction to Bayesian deep learning and
reviews its recent applications on recommender systems, topic models, control,
etc. Besides, we also discuss the relationship and differences between Bayesian
deep learning and other related topics such as Bayesian treatment of neural
networks. Comment: To appear in ACM Computing Surveys (CSUR) 202
Transfer Meets Hybrid: A Synthetic Approach for Cross-Domain Collaborative Filtering with Text
Collaborative filtering (CF) is the key technique for recommender systems
(RSs). CF exploits user-item behavior interactions (e.g., clicks) only and
hence suffers from the data sparsity issue. One research thread is to integrate
auxiliary information such as product reviews and news titles, leading to
hybrid filtering methods. Another thread is to transfer knowledge from other
source domains, such as improving movie recommendation with knowledge from the
book domain, leading to transfer learning methods. In real life, no single
service can satisfy all of a user's information needs, which motivates us to
exploit both auxiliary and source information for RSs in this paper. We
propose a novel neural model to smoothly enable Transfer Meeting Hybrid (TMH)
methods for cross-domain recommendation with unstructured text in an end-to-end
manner. TMH attentively extracts useful content from unstructured text via a
memory module and selectively transfers knowledge from a source domain via a
transfer network. On two real-world datasets, TMH shows better performance in
terms of three ranking metrics by comparing with various baselines. We conduct
thorough analyses to understand how the text content and transferred knowledge
help the proposed model. Comment: 11 pages, 7 figures, a full version of the WWW 2019 short paper
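The "attentively extracts useful content from unstructured text via a memory module" step can be sketched as a soft-attention read: score each memory slot (a text embedding) against a query, softmax the scores, and return the weighted sum. A minimal NumPy sketch with hypothetical shapes, not the TMH architecture itself:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical shapes: L memory slots holding text embeddings of width d.
L, d = 6, 4
memory = rng.normal(size=(L, d))   # embeddings of unstructured text (e.g., review words)
query = rng.normal(size=(d,))      # user-item query vector

def attentive_read(query, memory):
    scores = memory @ query                 # relevance of each memory slot
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                # softmax attention weights
    return weights @ memory, weights        # weighted content summary, weights

summary, weights = attentive_read(query, memory)
```

The attention weights make the extraction selective: slots irrelevant to the current user-item pair receive near-zero weight and contribute little to the summary.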
Pyramid: Enhancing Selectivity in Big Data Protection with Count Featurization
Protecting vast quantities of data poses a daunting challenge for the growing
number of organizations that collect, stockpile, and monetize it. The ability
to distinguish data that is actually needed from data collected "just in case"
would help these organizations to limit the latter's exposure to attack. A
natural approach might be to monitor data use and retain only the working-set
of in-use data in accessible storage; unused data can be evicted to a highly
protected store. However, many of today's big data applications rely on machine
learning (ML) workloads that are periodically retrained by accessing, and thus
exposing to attack, the entire data store. Training set minimization methods,
such as count featurization, are often used to limit the data needed to train
ML workloads to improve performance or scalability. We present Pyramid, a
limited-exposure data management system that builds upon count featurization to
enhance data protection. As such, Pyramid uniquely introduces both the idea and
proof-of-concept for leveraging training set minimization methods to instill
rigor and selectivity into big data management. We integrated Pyramid into
Spark Velox, a framework for ML-based targeting and personalization. We
evaluate it on three applications and show that Pyramid approaches
state-of-the-art models while training on less than 1% of the raw data.
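Count featurization, the training-set minimization method the abstract builds on, replaces a raw high-cardinality categorical value with compact statistics (e.g., a smoothed positive rate and a total count), so the model can train on small count tables instead of the full raw store. A minimal sketch with hypothetical data and smoothing constants:

```python
from collections import defaultdict

# value -> [positive count, total count]
counts = defaultdict(lambda: [0, 0])

def update(value, label):
    """Fold one labeled observation into the count table."""
    stats = counts[value]
    stats[0] += label
    stats[1] += 1

def featurize(value, alpha=1.0, beta=2.0):
    """Replace the raw value with a smoothed rate and a raw count."""
    pos, total = counts[value]
    return [(pos + alpha) / (total + beta), total]

# Hypothetical click log: (item id, clicked?)
for value, label in [("ad_1", 1), ("ad_1", 0), ("ad_2", 1)]:
    update(value, label)

feat = featurize("ad_1")  # smoothed rate and count for "ad_1"
```

Only the count table needs to stay in accessible storage for retraining; the raw observations behind it can be evicted to protected storage, which is the selectivity Pyramid exploits.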
Automated Machine Learning for Deep Recommender Systems: A Survey
Deep recommender systems (DRS), which address the issue of information overload
by recommending items tailored to users' interests and preferences, are
critical for current commercial online service providers. They offer
unprecedented effectiveness in feature representation and the capacity to model
non-linear relationships between users and items. Despite their
advancements, DRS models, like other deep learning models, employ sophisticated
neural network architectures and other vital components that are typically
designed and tuned by human experts. This article will give a comprehensive
summary of automated machine learning (AutoML) for developing DRS models. We
first provide an overview of AutoML for DRS models and the related techniques.
Then we discuss the state-of-the-art AutoML approaches that automate the
feature selection, feature embeddings, feature interactions, and system design
in DRS. Finally, we discuss appealing research directions and summarize the
survey.
PEPNet: Parameter and Embedding Personalized Network for Infusing with Personalized Prior Information
With the increase of content pages and interactive buttons in online services
such as online-shopping and video-watching websites, industrial-scale
recommender systems face challenges in multi-domain and multi-task
recommendations. The core of multi-task and multi-domain recommendation is to
accurately capture user interests in multiple scenarios given multiple user
behaviors. In this paper, we propose a plug-and-play \textit{\textbf{P}arameter
and \textbf{E}mbedding \textbf{P}ersonalized \textbf{Net}work
(\textbf{PEPNet})} for multi-domain and multi-task recommendation. PEPNet takes
personalized prior information as input and dynamically scales the bottom-level
Embedding and top-level DNN hidden units through gate mechanisms.
\textit{Embedding Personalized Network (EPNet)} performs personalized selection
on Embedding to fuse features with different importance for different users in
multiple domains. \textit{Parameter Personalized Network (PPNet)} executes
personalized modification on DNN parameters to balance targets with different
sparsity for different users in multiple tasks. We have made a series of
special engineering optimizations combining the Kuaishou training framework and
the online deployment environment. By infusing personalized selection of
Embedding and personalized modification of DNN parameters, PEPNet tailored to
the interests of each individual obtains significant performance gains, with
online improvements exceeding 1\% in multiple task metrics across multiple
domains. We have deployed PEPNet in Kuaishou apps, serving over 300 million
users every day. Comment: Accepted by KDD 202
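The gate mechanism the abstract describes, dynamically scaling embeddings from personalized prior information, can be sketched as a small gate network whose sigmoid output (scaled to a fixed range) rescales an embedding element-wise. A minimal NumPy sketch; the shapes, the scale constant, and the single linear layer are assumptions, not the PEPNet architecture:

```python
import numpy as np

rng = np.random.default_rng(2)

d = 8                                   # embedding width (hypothetical)
prior = rng.normal(size=(3,))           # personalized prior input (e.g., user/domain ids, embedded)
emb = rng.normal(size=(d,))             # a bottom-level feature embedding
Wg = rng.normal(scale=0.1, size=(3, d)) # gate network weight

def gate(prior, scale=2.0):
    # Sigmoid gate in (0, scale): values below 1 suppress an embedding
    # dimension, values above 1 amplify it.
    return scale / (1.0 + np.exp(-(prior @ Wg)))

gated_emb = gate(prior) * emb           # element-wise personalized rescaling
```

The same pattern applied to DNN hidden units instead of embeddings gives the parameter-side personalization: one shared network, with per-user, per-domain modulation injected through the gates.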