Localized Lasso for High-Dimensional Regression
We introduce the localized Lasso, which is suited for learning models that
are both interpretable and have high predictive power in problems with high
dimensionality and small sample size. More specifically, we consider a
function defined by local sparse models, one at each data point. We introduce
sample-wise network regularization to borrow strength across the models, and
sample-wise exclusive group sparsity (a.k.a., the ℓ1,2 norm) to introduce
diversity into the choice of feature sets in the local models. The local models
are interpretable in terms of the similarity of their sparsity patterns. The cost
function is convex, and thus has a globally optimal solution. Moreover, we
propose a simple yet efficient iterative least-squares based optimization
procedure for the localized Lasso, which does not need a tuning parameter and
is guaranteed to converge to a globally optimal solution. The solution is
empirically shown to outperform alternatives on both simulated and genomic
personalized medicine data.
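The two sample-wise regularizers described above can be sketched numerically. The following is a minimal illustration, not the paper's implementation: it assumes `W` holds one local weight vector per sample and `R` is a given nonnegative sample-similarity matrix, and it only evaluates the penalty terms rather than solving the full convex problem.

```python
import numpy as np

def localized_lasso_penalty(W, R, lam_net, lam_exc):
    """Evaluate the two regularizers described in the abstract.

    W       : (n, d) array, one local sparse model per data point
    R       : (n, n) nonnegative sample-similarity matrix (assumed given)
    lam_net : weight of the sample-wise network term
    lam_exc : weight of the sample-wise exclusive group-sparsity term
    """
    n, d = W.shape
    # Sample-wise network regularization: borrows strength across the
    # local models by penalizing differences between similar samples.
    net = 0.0
    for i in range(n):
        for j in range(n):
            net += R[i, j] * np.linalg.norm(W[i] - W[j])
    # Sample-wise exclusive group sparsity (ℓ1,2 norm): the squared
    # l1 norm of each sample's weights, summed over samples, which
    # pushes different samples toward different feature sets.
    exc = np.sum(np.abs(W).sum(axis=1) ** 2)
    return lam_net * net + lam_exc * exc
```

For two samples that use disjoint features, e.g. `W = [[1, 0], [0, 1]]` with an all-ones `R`, the network term is 2·√2 and the exclusive term is 2, so the penalty trades off pulling the models together against keeping their sparsity patterns diverse.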
Using Posters to Recommend Anime and Mangas in a Cold-Start Scenario
Item cold-start is a classical issue in recommender systems that affects
anime and manga recommendations as well. This problem can be framed as follows:
how to predict whether a user will like a manga that received few ratings from
the community? Content-based techniques can alleviate this issue but require
extra information that is usually expensive to gather. In this paper, we use a
deep learning technique, Illustration2Vec, to easily extract tag information
from manga and anime posters (e.g., sword, or ponytail). We propose BALSE
(Blended Alternate Least Squares with Explanation), a new model for
collaborative filtering that benefits from this extra information to recommend
mangas. We show, using real data from an online manga recommender system called
Mangaki, that our model substantially improves the quality of recommendations,
especially for less-known manga, and is able to provide an interpretation of
users' tastes.

Comment: 6 pages, 3 figures, 1 table, accepted at the MANPU 2017 workshop,
co-located with ICDAR 2017 in Kyoto on November 10, 2017
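The cold-start idea of blending collaborative filtering with poster-derived tags can be illustrated with a toy gate. This is an assumed illustrative form, not the actual BALSE model: the gating function `alpha`, the parameter `gamma`, and the two input scores are all hypothetical.

```python
import numpy as np

def blended_prediction(cf_score, tag_score, n_ratings, gamma=0.1):
    """Illustrative blend for item cold-start (hypothetical form, not
    the paper's BALSE formulation).

    cf_score  : rating predicted by a collaborative-filtering model
    tag_score : rating predicted from poster tags (e.g. Illustration2Vec)
    n_ratings : how many community ratings the item has received
    """
    # Trust collaborative filtering more as ratings accumulate; for an
    # item with zero ratings, fall back entirely on the content model.
    alpha = 1.0 - np.exp(-gamma * n_ratings)
    return alpha * cf_score + (1.0 - alpha) * tag_score
```

With zero ratings the blend returns the tag-based score only, and as `n_ratings` grows it converges to the collaborative-filtering score, which is the qualitative behavior a cold-start blend needs.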
Multi-Task Learning Regression via Convex Clustering
Multi-task learning (MTL) is a methodology that aims to improve the general
performance of estimation and prediction by sharing common information among
related tasks. In MTL, several assumptions can be made about the relationships
among tasks, together with methods to incorporate them. A natural assumption in
practical situations is that tasks are grouped into clusters according to their
characteristics. Under this assumption, the group fused regularization approach
clusters the tasks by shrinking the differences among them, which enables us to
transfer common information within the same cluster. However, this approach
also transfers information between different clusters, which worsens estimation
and prediction. To overcome this problem, we propose an MTL method with a
centroid parameter representing the cluster center of each task. Because this
model separates the parameters for regression from the parameters for
clustering, we can improve the estimation and prediction accuracy of the
regression coefficient vectors. We show the effectiveness of the proposed
method through Monte Carlo simulations and applications to real data.

Comment: 18 pages, 4 tables
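The contrast drawn in the abstract, between penalizing all pairwise task differences and shrinking each task only toward its own cluster center, can be sketched as follows. This is a minimal sketch of the two penalty shapes under assumed notation (`W` stacks task coefficient vectors, `U` holds centroids, `assign` maps tasks to clusters), not the authors' estimator.

```python
import numpy as np

def group_fused_penalty(W):
    """Sum of pairwise differences among task coefficient vectors.
    Shrinking this clusters the tasks, but it also couples tasks that
    belong to different clusters, which the abstract identifies as the
    source of degraded estimation."""
    K = W.shape[0]
    return sum(np.linalg.norm(W[k] - W[l])
               for k in range(K) for l in range(k + 1, K))

def centroid_penalty(W, U, assign):
    """Centroid variant sketched from the abstract: each task k is
    shrunk only toward the center U[assign[k]] of its own cluster, so
    no information is transferred between different clusters."""
    return sum(np.linalg.norm(W[k] - U[assign[k]]) for k in range(len(W)))
```

In the fused penalty, tasks in different clusters still appear in shared terms; in the centroid penalty they never do, which reflects the separation of regression parameters from clustering parameters described above.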