12,160 research outputs found
LambdaOpt: Learn to Regularize Recommender Models in Finer Levels
Recommendation models mainly deal with categorical variables, such as
user/item ID and attributes. Besides the high-cardinality issue, the
interactions among such categorical variables are usually long-tailed, with the
head made up of highly frequent values and a long tail of rare ones. This
phenomenon results in the data sparsity issue, making it essential to
regularize the models to ensure generalization. The common practice is to
employ grid search to manually tune regularization hyperparameters based on the
validation data. However, it requires non-trivial efforts and large computation
resources to search the whole candidate space; even so, it may not lead to the
optimal choice, for which different parameters should have different
regularization strengths. In this paper, we propose a hyperparameter
optimization method, LambdaOpt, which automatically and adaptively enforces
regularization during training. Specifically, it updates the regularization
coefficients based on the performance of validation data. With LambdaOpt, the
notorious tuning of regularization hyperparameters can be avoided; more
importantly, it allows fine-grained regularization (i.e. each parameter can
have an individualized regularization coefficient), leading to better
generalized models. We show how to employ LambdaOpt on matrix factorization, a
classical model that is representative of a large family of recommender models.
Extensive experiments on two public benchmarks demonstrate the superiority of
our method in boosting the performance of top-K recommendation.Comment: Accepted by KDD 201
Joint Deep Modeling of Users and Items Using Reviews for Recommendation
A large amount of information exists in reviews written by users. This source
of information has been ignored by most of the current recommender systems
while it can potentially alleviate the sparsity problem and improve the quality
of recommendations. In this paper, we present a deep model to learn item
properties and user behaviors jointly from review text. The proposed model,
named Deep Cooperative Neural Networks (DeepCoNN), consists of two parallel
neural networks coupled in the last layers. One of the networks focuses on
learning user behaviors exploiting reviews written by the user, and the other
one learns item properties from the reviews written for the item. A shared
layer is introduced on the top to couple these two networks together. The
shared layer enables latent factors learned for users and items to interact
with each other in a manner similar to factorization machine techniques.
Experimental results demonstrate that DeepCoNN significantly outperforms all
baseline recommender systems on a variety of datasets.Comment: WSDM 201
Attentive Aspect Modeling for Review-aware Recommendation
In recent years, many studies extract aspects from user reviews and integrate
them with ratings for improving the recommendation performance. The common
aspects mentioned in a user's reviews and a product's reviews indicate indirect
connections between the user and product. However, these aspect-based methods
suffer from two problems. First, the common aspects are usually very sparse,
which is caused by the sparsity of user-product interactions and the diversity
of individual users' vocabularies. Second, a user's interests on aspects could
be different with respect to different products, which are usually assumed to
be static in existing methods. In this paper, we propose an Attentive
Aspect-based Recommendation Model (AARM) to tackle these challenges. For the
first problem, to enrich the aspect connections between user and product,
besides common aspects, AARM also models the interactions between synonymous
and similar aspects. For the second problem, a neural attention network which
simultaneously considers user, product and aspect information is constructed to
capture a user's attention towards aspects when examining different products.
Extensive quantitative and qualitative experiments show that AARM can
effectively alleviate the two aforementioned problems and significantly
outperforms several state-of-the-art recommendation methods on top-N
recommendation task.Comment: Camera-ready manuscript for TOI
Improving Usability And Scalability Of Big Data Workflows In The Cloud
Big data workflows have recently emerged as the next generation of data-centric workflow technologies to address the five “V” challenges of big data: volume, variety, velocity, veracity, and value. More formally, a big data workflow is the computerized modeling and automation of a process consisting of a set of computational tasks and their data interdependencies to process and analyze data of ever increasing in scale, complexity, and rate of acquisition. The convergence of big data and workflows creates new challenges in workflow community.
First, the variety of big data results in a need for integrating large number of remote Web services and other heterogeneous task components that can consume and produce data in various formats and models into a uniform and interoperable workflow. Existing approaches fall short in addressing the so-called shimming problem only in an adhoc manner and unable to provide a generic solution. We automatically insert a piece of code called shims or adaptors in order to resolve the data type mismatches.
Second, the volume of big data results in a large number of datasets that needs to be queried and analyzed in an effective and personalized manner. Further, there is also a strong need for sharing, reusing, and repurposing existing tasks and workflows across different users and institutes. To overcome such limitations, we propose a folksonomy- based social workflow recommendation system to improve workflow design productivity and efficient dataset querying and analyzing.
Third, the volume of big data results in the need to process and analyze data of ever increasing in scale, complexity, and rate of acquisition. But a scalable distributed data model is still missing that abstracts and automates data distribution, parallelism, and scalable processing. We propose a NoSQL collectional data model that addresses this limitation.
Finally, the volume of big data combined with the unbound resource leasing capability foreseen in the cloud, facilitates data scientists to wring actionable insights from the data in a time and cost efficient manner. We propose BARENTS scheduler that supports high-performance workflow scheduling in a heterogeneous cloud-computing environment with a single objective to minimize the workflow makespan under a user provided budget constraint
- …