11,748 research outputs found
Collaborative Location Recommendation by Integrating Multi-dimensional Contextual Information
Point-of-Interest (POI) recommendation is a new type of recommendation task that has emerged alongside the prevalence of location-based social networks and services in recent years. Compared with traditional recommendation tasks, POI recommendation focuses more on making personalized and context-aware recommendations to improve the user experience. Traditionally, the most commonly used contextual information has been geographical and social context. However, the increasing availability of check-in data makes it possible to design more effective location recommendation applications by modeling and integrating comprehensive types of contextual information, especially temporal information. In this paper, we propose a collaborative filtering method based on Tensor Factorization, a generalization of the Matrix Factorization approach, to model multi-dimensional contextual information. Tensor Factorization naturally extends Matrix Factorization by increasing the dimensionality under consideration, with the three-dimensional model being the most widely used. Our method exploits a high-order tensor to fuse heterogeneous contextual information about users' check-ins, instead of the traditional two-dimensional user-location matrix. The factorization of this tensor leads to a more compact model of the data, which is naturally suitable for integrating contextual information to make POI recommendations. Based on this model, we further improve recommendation accuracy by utilizing the internal relations among users and locations to regularize the latent factors. Experimental results on a large real-world dataset demonstrate the effectiveness of our approach.
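The three-dimensional case this abstract describes can be illustrated with a plain CP (CANDECOMP/PARAFAC) factorization of a user × location × time check-in tensor, trained by stochastic gradient descent on observed entries only. This is a minimal sketch of the general technique, not the paper's exact model (which additionally regularizes the latent factors with user and location relations); all dimensions, hyperparameters, and data below are synthetic.

```python
import numpy as np

# Hypothetical dimensions: users x locations x time slots, latent rank.
n_users, n_locs, n_times, rank = 50, 40, 24, 8
rng = np.random.default_rng(0)

# Sparse observed check-in counts: (user, location, time) -> count.
observed = {(rng.integers(n_users), rng.integers(n_locs),
             rng.integers(n_times)): rng.poisson(3) + 1.0
            for _ in range(2000)}

# Latent factor matrices of the CP decomposition.
U = 0.1 * rng.standard_normal((n_users, rank))
L = 0.1 * rng.standard_normal((n_locs, rank))
T = 0.1 * rng.standard_normal((n_times, rank))

lr, reg = 0.01, 0.05
for epoch in range(30):
    for (i, j, k), x in observed.items():
        # CP model: x_ijk is approximated by sum_r U_ir * L_jr * T_kr.
        pred = np.sum(U[i] * L[j] * T[k])
        err = x - pred
        # SGD step with L2 regularization; copy old values so all three
        # gradients are computed at the same point.
        ui, lj, tk = U[i].copy(), L[j].copy(), T[k].copy()
        U[i] += lr * (err * lj * tk - reg * ui)
        L[j] += lr * (err * ui * tk - reg * lj)
        T[k] += lr * (err * ui * lj - reg * tk)

def score(user, loc, time):
    """Predicted affinity of a user for a (location, time) pair."""
    return float(np.sum(U[user] * L[loc] * T[time]))
```

Ranking unvisited locations for a user at a given time then reduces to sorting candidate locations by `score`.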
Tensor Learning for Recovering Missing Information: Algorithms and Applications on Social Media
Real-time social systems like Facebook, Twitter, and Snapchat have been growing rapidly, producing exabytes of data in different views or aspects. Coupled with the increasingly GPS-enabled sharing of videos, images, blogs, and tweets that provide valuable information regarding "who", "where", "when", and "what", these real-time human sensor data promise new research opportunities to uncover models of user behavior, mobility, and information sharing. The real-time dynamics in social systems usually come in multiple aspects, which can help us better understand the social interactions of the underlying network. However, these multi-aspect datasets are often raw and incomplete owing to various unpredictable or unavoidable reasons; for instance, API limitations and data-sampling policies can lead to an incomplete (and often biased) perspective on these datasets. Such missing data can raise serious concerns, such as biased estimation of the structural properties of the network and of the properties of information cascades in social networks. In order to recover missing values or information in social systems, we identify four "4S" challenges: extreme sparsity of the observed multi-aspect datasets, adoption of rich side information that can describe the similarities of entities, generation of robust models rather than models limited to specific applications, and scalability to real, large-scale datasets (billions of observed entries). With these challenges in mind, this dissertation aims to develop scalable and interpretable tensor-based frameworks, algorithms, and methods for recovering missing information on social media. In particular, this dissertation research makes four unique contributions:
- The first research contribution of this dissertation research is to propose a scalable framework based on low-rank tensor learning in the presence of incomplete information. Concretely, we formally define the problem of recovering the spatio-temporal dynamics of online memes and tackle it by proposing a novel tensor-based factorization approach built on the alternating direction method of multipliers (ADMM), integrating latent relationships derived from contextual information among locations, memes, and times (a simplified sketch of such an ADMM-based completion step follows this list).
- The second research contribution of this dissertation research is to evaluate the generalization of the proposed tensor learning framework and extend it to the recommendation problem. In particular, we develop a novel tensor-based approach to solve the personalized expert recommendation problem by integrating both the latent relationships between homogeneous entities (e.g., users and users, experts and experts) and the relationships between heterogeneous entities (e.g., users and experts, topics and experts) from the geo-spatial, topical, and social contexts.
- The third research contribution of this dissertation research is to extend the proposed tensor learning framework to the user topical profiling problem. Specifically, we propose a tensor-based contextual regularization model embedded in a matrix factorization framework, which leverages the social, textual, and behavioral contexts across users in order to overcome the identified challenges.
- The fourth research contribution of this dissertation research is to scale the proposed tensor learning framework up to real, large-scale datasets that are too big to fit in the main memory of a single machine. In particular, we propose a novel distributed tensor completion algorithm with trace-based regularization of the auxiliary information, based on ADMM, under the proposed tensor learning framework; it is designed to scale to real, large-scale tensors (e.g., billions of entries) by efficiently computing auxiliary variables, minimizing intermediate data, and reducing the workload of updating new tensors.
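As a rough illustration of the kind of ADMM-based low-rank tensor completion these contributions build on, the following sketch implements a HaLRTC-style formulation: minimize a weighted sum of nuclear norms over the mode unfoldings, with observed entries held fixed. This is a generic textbook variant, not the dissertation's algorithm, and it omits the side-information regularization and distributed execution described above; all names and parameters are illustrative.

```python
import numpy as np

def unfold(X, mode):
    """Mode-n unfolding of a tensor into a matrix."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def fold(mat, mode, shape):
    """Inverse of unfold for the given original tensor shape."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(mat.reshape(full), 0, mode)

def svt(mat, tau):
    """Singular value thresholding: prox operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(mat, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def halrtc(T, mask, rho=1e-2, n_iter=100):
    """ADMM-style low-rank tensor completion (HaLRTC-like sketch)."""
    X = T * mask
    shape, ndim = T.shape, T.ndim
    Ms = [np.zeros(shape) for _ in range(ndim)]  # auxiliary low-rank tensors
    Ys = [np.zeros(shape) for _ in range(ndim)]  # dual variables
    alpha = np.ones(ndim) / ndim                 # per-mode weights
    for _ in range(n_iter):
        for i in range(ndim):
            # M-update: SVT applied to each mode-i unfolding.
            Ms[i] = fold(svt(unfold(X + Ys[i] / rho, i), alpha[i] / rho),
                         i, shape)
        # X-update: average the auxiliaries, then re-impose observations.
        X = sum(Ms[i] - Ys[i] / rho for i in range(ndim)) / ndim
        X[mask] = T[mask]
        for i in range(ndim):
            Ys[i] += rho * (X - Ms[i])           # dual ascent step
    return X

# Toy usage: recover a random rank-3 tensor from 30% observed entries.
rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((s, 3)) for s in (20, 25, 30))
T = np.einsum('ir,jr,kr->ijk', A, B, C)
mask = rng.random(T.shape) < 0.3
X_hat = halrtc(T, mask)
```

In practice `rho` is usually increased over iterations; tuning is left aside here to keep the sketch short.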
Budget-Constrained Item Cold-Start Handling in Collaborative Filtering Recommenders via Optimal Design
It is well known that collaborative filtering (CF) based recommender systems provide better modeling of users and items associated with considerable rating history. The lack of historical ratings results in the user and item cold-start problems; the latter is the main focus of this work. Most of the current literature addresses this problem by integrating content-based recommendation techniques to model the new item. However, in many cases such content is not available, and the question arises whether this problem can be mitigated using CF techniques only. We formalize this as an optimization problem: given a new item, a pool of available users, and a budget constraint, select which users to assign the task of rating the new item so as to minimize the prediction error of our model. We show that the objective function is monotone-supermodular, and propose efficient optimal-design-based algorithms that attain an approximation to its optimum. Our findings are verified by an empirical study using the Netflix dataset, where the proposed algorithms outperform several baselines for the problem at hand.
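One classical way to instantiate the optimal-design idea, assuming user latent factors from an already-trained CF model are available, is a greedy D-optimal selection: pick the users whose factors most increase the log-determinant of the information matrix, which controls the variance of the new item's estimated factor. This is a standard heuristic sketch under those assumptions, not the authors' algorithm; `U`, `budget`, and `lam` are illustrative.

```python
import numpy as np

def greedy_d_optimal(U, budget, lam=0.1):
    """Greedily pick `budget` users maximizing log det(A), where
    A = lam*I + sum of outer products u u^T over chosen users."""
    n, d = U.shape
    A = lam * np.eye(d)
    chosen, remaining = [], set(range(n))
    for _ in range(budget):
        A_inv = np.linalg.inv(A)
        # Marginal gain of adding user i is log(1 + u_i^T A^{-1} u_i),
        # so it suffices to maximize the quadratic form itself.
        best = max(remaining, key=lambda i: U[i] @ A_inv @ U[i])
        chosen.append(best)
        remaining.remove(best)
        A += np.outer(U[best], U[best])
    return chosen

# Toy usage: 500 users with 10-dim CF factors, budget of 20 raters.
rng = np.random.default_rng(1)
U = rng.standard_normal((500, 10))
print(greedy_d_optimal(U, budget=20))
```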
From Common to Special: When Multi-Attribute Learning Meets Personalized Opinions
Visual attributes, which refer to human-labeled semantic annotations, have gained increasing popularity in a wide range of real-world applications. Generally, existing attribute learning methods fall into two categories: one focuses on learning user-specific labels separately for different attributes, while the other focuses on learning crowd-sourced global labels jointly for multiple attributes. However, both categories ignore the joint effect of two factors: the personal diversity with respect to the global consensus, and the intrinsic correlation among multiple attributes. To overcome this challenge, we propose a novel model that learns user-specific predictors across multiple attributes. In our proposed model, the diversity of personalized opinions and the intrinsic relationship among multiple attributes are unified in a common-to-special manner. To this end, we adopt a three-component decomposition: our model integrates a common cognition factor, an attribute-specific bias factor, and a user-specific bias factor. Meanwhile, Lasso and group-Lasso penalties are adopted to enable efficient feature selection. Furthermore, theoretical analysis is conducted to show that our proposed method can reach reasonable performance. Finally, the empirical study carried out in this paper demonstrates the effectiveness of our proposed method.
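A minimal sketch of the common-to-special idea: the linear predictor for each (user, attribute) pair is the sum of a shared factor, an attribute-specific bias, and a user-specific bias, with the two special factors shrunk by a Lasso penalty via proximal gradient steps. This is an illustrative simplification under assumed names and hyperparameters (the paper additionally uses a group-Lasso term), not the authors' exact formulation.

```python
import numpy as np

def soft_threshold(w, t):
    """Prox operator of the L1 (Lasso) penalty."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_common_to_special(X, y, users, attrs, n_users, n_attrs,
                          lam1=0.01, lam2=0.01, lr=0.05, epochs=200):
    """Squared-loss proximal gradient for the three-component model:
    predictor for (user u, attribute a) is w0 + v[a] + p[u]."""
    d = X.shape[1]
    w0 = np.zeros(d)            # common cognition factor (unpenalized)
    v = np.zeros((n_attrs, d))  # attribute-specific bias factors
    p = np.zeros((n_users, d))  # user-specific bias factors
    for _ in range(epochs):
        w = w0 + v[attrs] + p[users]       # per-sample predictors
        resid = (w * X).sum(axis=1) - y    # squared-loss residuals
        g = resid[:, None] * X / len(y)    # per-sample gradients
        w0 -= lr * g.sum(axis=0)
        np.add.at(v, attrs, -lr * g)       # accumulate per attribute
        np.add.at(p, users, -lr * g)       # accumulate per user
        v = soft_threshold(v, lr * lam1)   # sparsify the special factors
        p = soft_threshold(p, lr * lam2)
    return w0, v, p

# Toy usage: 200 labeled (user, attribute) pairs with 15 features.
rng = np.random.default_rng(3)
n, d, n_users, n_attrs = 200, 15, 10, 4
X = rng.standard_normal((n, d))
users = rng.integers(n_users, size=n)
attrs = rng.integers(n_attrs, size=n)
y = rng.standard_normal(n)
w0, v, p = fit_common_to_special(X, y, users, attrs, n_users, n_attrs)
```

Leaving the common factor unpenalized while shrinking the bias factors is what pushes shared structure into `w0` and only genuinely personal or attribute-specific deviations into `v` and `p`.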
Quantifying Model Complexity via Functional Decomposition for Better Post-Hoc Interpretability
Post-hoc model-agnostic interpretation methods such as partial dependence plots can be employed to interpret complex machine learning models. While these interpretation methods can be applied regardless of model complexity, they can produce misleading and verbose results if the model is too complex, especially with respect to feature interactions. To quantify the complexity of arbitrary machine learning models, we propose model-agnostic complexity measures based on functional decomposition: the number of features used, interaction strength, and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures is more reliable and compact. Furthermore, we demonstrate the application of these measures in a multi-objective optimization approach which simultaneously minimizes loss and complexity.
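The first of the three measures, the number of features used, can be approximated model-agnostically by permuting one feature column at a time and checking whether any prediction changes, as in this simplified sketch (the paper's exact estimator may differ; the tolerance, model, and data here are illustrative).

```python
import numpy as np

def n_features_used(predict, X, tol=1e-8, n_perm=3, seed=0):
    """Count features the model actually uses: permute each column and
    check whether predictions change beyond a numeric tolerance."""
    rng = np.random.default_rng(seed)
    base = predict(X)
    used = 0
    for j in range(X.shape[1]):
        changed = False
        for _ in range(n_perm):      # repeat to avoid unlucky permutations
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            if np.max(np.abs(predict(Xp) - base)) > tol:
                changed = True
                break
        used += changed
    return used

# Toy usage: a model that ignores its third feature.
X = np.random.default_rng(4).standard_normal((100, 3))
model = lambda X: 2.0 * X[:, 0] - X[:, 1]
print(n_features_used(model, X))   # -> 2
```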