165 research outputs found
Design and Analysis of a Robust Sybil Attack Defense Algorithm Using Information Level
Ph.D. dissertation, Seoul National University Graduate School, Department of Electrical and Computer Engineering, August 2014. Advisor: Chong-kwon Kim.
As the major function of Recommender Systems (RSs) is recommending commercial items to potential consumers (i.e., system users), providing correct information
of RS is crucial to both RS providers and system users. The influence of RS over Online Social Networks (OSNs) is expanding rapidly, whereas malicious users continuously
try to attack the RSs with fake identities (i.e., Sybils) by manipulating the information in the RS adversely. In this thesis, we propose a novel robust recommendation
algorithm called RobuRec which exploits a distinctive feature, admission control. RobuRec provides highly trusted recommendation results, since it predicts appropriate recommendations regardless of whether the ratings are given by honest users or by Sybils, thanks to the power of admission control. To demonstrate the performance of RobuRec, we have conducted extensive experiments with various datasets as well as diverse attack scenarios. The evaluation results confirm that RobuRec significantly outperforms comparable schemes such as Principal Component Analysis (PCA) and Least Trimmed Squared Matrix Factorization (LTSMF) in terms of Prediction Shift (PS) and Hit Ratio (HR).
Chapter 1 Introduction 1
1.1 Background . . . . . . . . . . . . . . . . . . . 1
1.2 Goal and Contribution . . . . . . . . . . . . . . 3
1.3 Thesis Organization . . . . . . . . . . . . . . . 6
Chapter 2 Related Work 7
2.1 RS approaches . . . . . . . . . . . . . . . . . . 7
2.2 Sybil Attack Defense . . . . . . . . . . . . . . 9
2.3 Robust RS Approaches . . . . . . . . . . . . . . 10
Chapter 3 System Model 13
3.1 Target Applications . . . . . . . . . . . . . . 17
3.2 Strong Attacker . . . . . . . . . . . . . . . . 17
3.3 Attack Model . . . . . . . . . . . . . . . . . . 18
3.4 Model Assumptions . . . . . . . . . . . . . . . 21
Chapter 4 RobuRec Design 23
4.1 Algorithm Intuition . . . . . . . . . . . . . . 23
4.2 Initialization Phase . . . . . . . . . . . . . . 25
4.3 Admission Control Phase . . . . . . . . . . . . 26
4.4 Rating Prediction Phase . . . . . . . . . . . . 30
4.5 Dynamic Parameter Control . . . . . . . . . . . 35
4.5.1 Simplifying Control Parameters . . . . . . . . 36
4.5.2 Dynamic Cmax Control . . . . . . . . . . . . . 37
4.5.3 Dynamic Global and Local Control . . . . . . 42
Chapter 5 Evaluation and Analysis 45
5.1 Evaluation Metrics . . . . . . . . . . . . . . . 45
5.2 Parameter (alpha) Study . . . . . . . . . . . . 47
5.3 Datasets and Setup . . . . . . . . . . . . . . . 48
5.4 Results and Analysis . . . . . . . . . . . . . . 52
5.4.1 Performance on PS . . . . . . . . . . . . . . 52
5.4.2 Impact of Filler Size . . . . . . . . . . . . 55
5.4.3 Impact of Target Selection Strategy . . . . . 58
5.4.4 Dynamic Parameter Control . . . . . . . . . . 59
5.4.5 Performance on HR . . . . . . . . . . . . . . 62
5.4.6 Analysis on Escaping Probability . . . . . . . 63
Chapter 6 Conclusion 67
Context-aware food recommendation system
Recommendation systems are commonly used in websites with large datasets, frequently in e-commerce or multimedia streaming services. These systems effectively help users find items of interest, while also being helpful from the perspective of the service or product provider. However, successful applications to other domains are less common, and the number of personalized food recommendation systems is surprisingly small, although this particular domain could benefit significantly from recommendation knowledge. This work proposes a context-aware food recommendation system for well-being care applications, using mobile devices, beacons, medical records, and a recommender engine. Users passing near a food place receive food recommendations in real time, based on available offers ordered by how appropriate each food is for the health of everyone at the table. We also use a new robust recipe recommendation method based on matrix factorization and feature engineering, both supported by contextual information and statistical aggregation of information from users and items. The results obtained from applying this method to three heterogeneous datasets of user ratings of recipes show that gains in recommendation performance are achieved independently of the dataset size, the items' textual properties, or even the distribution of rating values.
Learning informative priors from heterogeneous domains to improve recommendation in cold-start user domains
© 2016 ACM. In the real-world environment, users have sufficient experience in their focused domains but lack experience in other domains. Recommender systems are very helpful for recommending potentially desirable items to users in unfamiliar domains, and cross-domain collaborative filtering is therefore an important emerging research topic. However, it is inevitable that the cold-start issue will be encountered in unfamiliar domains due to the lack of feedback data. The Bayesian approach shows that priors play an important role when there are insufficient data, which implies that recommendation performance can be significantly improved in cold-start domains if informative priors can be provided. Based on this idea, we propose a Weighted Irregular Tensor Factorization (WITF) model to leverage multi-domain feedback data across all users to learn the cross-domain priors w.r.t. both users and items. The features learned from WITF serve as the informative priors on the latent factors of users and items in terms of weighted matrix factorization models. Moreover, WITF is a unified framework for dealing with both explicit feedback and implicit feedback. To prove the effectiveness of our approach, we studied three typical real-world cases in which a collection of empirical evaluations were conducted on real-world datasets to compare the performance of our model and other state-of-the-art approaches. The results show the superiority of our model over comparison models.
Non-convex Optimization for Machine Learning
A vast majority of machine learning algorithms train their models and perform
inference by solving optimization problems. In order to capture the learning
and prediction problems accurately, structural constraints such as sparsity or
low rank are frequently imposed or else the objective itself is designed to be
a non-convex function. This is especially true of algorithms that operate in
high-dimensional spaces or that train non-linear models such as tensor models
and deep networks.
The freedom to express the learning problem as a non-convex optimization
problem gives immense modeling power to the algorithm designer, but often such
problems are NP-hard to solve. A popular workaround to this has been to relax
non-convex problems to convex ones and use traditional methods to solve the
(convex) relaxed optimization problems. However this approach may be lossy and
nevertheless presents significant challenges for large scale optimization.
On the other hand, direct approaches to non-convex optimization have met with
resounding success in several domains and remain the methods of choice for the
practitioner, as they frequently outperform relaxation-based techniques -
popular heuristics include projected gradient descent and alternating
minimization. However, these are often poorly understood in terms of their
convergence and other properties.
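As a concrete illustration of one heuristic in this family, the following is a minimal sketch of projected gradient descent for sparsity-constrained least squares (iterative hard thresholding). The problem setup, function name, and parameters are our own example, not taken from the monograph.

```python
import numpy as np

def iht(X, y, k, iters=500, step=None):
    """Iterative hard thresholding: minimize ||y - X b||^2 s.t. ||b||_0 <= k."""
    n, d = X.shape
    if step is None:
        step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / (largest singular value)^2
    b = np.zeros(d)
    for _ in range(iters):
        b = b + step * (X.T @ (y - X @ b))      # gradient step on the LS loss
        # Projection onto the non-convex set {b : ||b||_0 <= k}:
        # keep the k largest-magnitude coordinates, zero out the rest.
        b[np.argsort(np.abs(b))[:-k]] = 0.0
    return b
```

The projection step is exactly the kind of operation that makes the overall problem non-convex yet keeps each iteration cheap.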
This monograph presents a selection of recent advances that bridge a
long-standing gap in our understanding of these heuristics. The monograph will
lead the reader through several widely used non-convex optimization techniques,
as well as applications thereof. The goal of this monograph is to both,
introduce the rich literature in this area, as well as equip the reader with
the tools and techniques needed to analyze these simple procedures for
non-convex problems.
Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005
Recommender Systems and their Security Concerns
Instead of simply using two-dimensional User × Item features, advanced recommender systems rely on additional dimensions (e.g., time, location, social network) in order to provide better recommendation services. In the first part of this paper, we survey a variety of dimension features and show how they are integrated into the recommendation process. As service providers collect more and more personal information, this raises great privacy concerns for the public. On the other side, service providers can also suffer from attacks launched by malicious users who want to bias the recommendations. In the second part of this paper, we survey attacks from and against recommender service providers, and existing solutions.
Statistical Models and Optimization Algorithms for High-Dimensional Computer Vision Problems
Data-driven and computational approaches are showing significant promise in solving several challenging problems in various fields such as bioinformatics, finance and many branches of engineering. In this dissertation, we explore the potential of these approaches, specifically statistical data models and optimization algorithms, for solving several challenging problems in computer vision. In doing so, we contribute to the literatures of both statistical data models and computer vision. In the context of statistical data models, we propose principled approaches for solving robust regression problems, both linear and kernel, and missing data matrix factorization problem. In computer vision, we propose statistically optimal and efficient algorithms for solving the remote face recognition and structure from motion (SfM) problems.
The goal of robust regression is to estimate the functional relation between two variables from a given data set which might be contaminated with outliers. Under the reasonable assumption that there are fewer outliers than inliers in a data set, we formulate the robust linear regression problem as a sparse learning problem, which can be solved using efficient polynomial-time algorithms. We also provide sufficient conditions under which the proposed algorithms correctly solve the robust regression problem. We then extend our robust formulation to the case of kernel regression, specifically to propose a robust version for relevance vector machine (RVM) regression.
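The "fewer outliers than inliers" assumption above suggests modeling outliers as a sparse corruption term e in y = Xb + e and penalizing its L1 norm. The sketch below is a generic alternating solver for that convex formulation, written for illustration; the names, the penalty choice, and the solver are our own, not the dissertation's exact algorithm or its recovery conditions.

```python
import numpy as np

def robust_lstsq(X, y, lam=1.0, iters=100):
    """Minimize 0.5*||y - X b - e||^2 + lam*||e||_1 over (b, e); return b."""
    e = np.zeros(len(y))
    for _ in range(iters):
        # With e fixed, b is an ordinary least squares fit to corrected targets.
        b, *_ = np.linalg.lstsq(X, y - e, rcond=None)
        # With b fixed, the optimal e soft-thresholds the residual,
        # absorbing gross outliers while leaving small residuals alone.
        r = y - X @ b
        e = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)
    return b
```

Each half-step is a convex subproblem solved in closed form, which is what makes this polynomial-time despite the combinatorial flavor of outlier identification.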
Matrix factorization is used for finding a low-dimensional representation for data embedded in a high-dimensional space. Singular value decomposition is the standard algorithm for solving this problem. However, when the matrix has many missing elements this is a hard problem to solve. We formulate the missing data matrix factorization problem as a low-rank semidefinite programming problem (essentially a rank constrained SDP), which allows us to find accurate and efficient solutions for large-scale factorization problems.
Face recognition from remotely acquired images is a challenging problem because of variations due to blur and illumination. Using the convolution model for blur, we show that the set of all images obtained by blurring a given image forms a convex set. We then use convex optimization techniques to find the distances between a given blurred (probe) image and the gallery images to find the best match. Further, using a low-dimensional linear subspace model for illumination variations, we extend our theory in a similar fashion to recognize blurred and poorly illuminated faces.
Bundle adjustment is the final optimization step of the SfM problem where the goal is to obtain the 3-D structure of the observed scene and the camera parameters from multiple images of the scene. The traditional bundle adjustment algorithm, based on minimizing the l_2 norm of the image re-projection error, has cubic complexity in the number of unknowns. We propose an algorithm, based on minimizing the l_infinity norm of the re-projection error, that has quadratic complexity in the number of unknowns. This is achieved by reducing the large-scale optimization problem into many small-scale sub-problems, each of which can be solved using second-order cone programming.
Low-rank Matrix Completion using Alternating Minimization
Alternating minimization represents a widely applicable and empirically
successful approach for finding low-rank matrices that best fit the given data.
For example, for the problem of low-rank matrix completion, this method is
believed to be one of the most accurate and efficient, and formed a major
component of the winning entry in the Netflix Challenge.
In the alternating minimization approach, the low-rank target matrix is
written in a bi-linear form, i.e. X = U V^T; the algorithm then alternates
between finding the best U and the best V. Typically, each alternating step
in isolation is convex and tractable. However, the overall problem becomes
non-convex and there has been almost no theoretical understanding of when this
approach yields a good result.
In this paper we present the first theoretical analysis of the performance of
alternating minimization for matrix completion, and the related problem of
matrix sensing. For both these problems, celebrated recent results have shown
that they become well-posed and tractable once certain (now standard)
conditions are imposed on the problem. We show that alternating minimization
also succeeds under similar conditions. Moreover, compared to existing results,
our paper shows that alternating minimization guarantees faster (in particular,
geometric) convergence to the true matrix, while allowing a simpler analysis.
Untargeted Attack against Federated Recommendation Systems via Poisonous Item Embeddings and the Defense
Federated recommendation (FedRec) can train personalized recommenders without
collecting user data, but the decentralized nature makes it susceptible to
poisoning attacks. Most previous studies focus on the targeted attack to
promote certain items, while the untargeted attack that aims to degrade the
overall performance of the FedRec system remains less explored. In fact,
untargeted attacks can disrupt the user experience and bring severe financial
loss to the service provider. However, existing untargeted attack methods are
either inapplicable or ineffective against FedRec systems. In this paper, we
delve into the untargeted attack and its defense for FedRec systems. (i) We
propose ClusterAttack, a novel untargeted attack method. It uploads poisonous
gradients that converge the item embeddings into several dense clusters, which
make the recommender generate similar scores for these items in the same
cluster and perturb the ranking order. (ii) We propose a uniformity-based
defense mechanism (UNION) to protect FedRec systems from such attacks. We
design a contrastive learning task that regularizes the item embeddings toward
a uniform distribution. Then the server filters out these malicious gradients
by estimating the uniformity of updated item embeddings. Experiments on two
public datasets show that ClusterAttack can effectively degrade the performance
of FedRec systems while circumventing many defense methods, and UNION can
improve the resistance of the system against various untargeted attacks,
including our ClusterAttack.
Comment: Accepted by AAAI 202
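The defense above hinges on scoring how uniformly item embeddings spread over the representation space. The paper does not spell out its statistic here, so the following is only an assumed sketch using a common log-mean-exp Gaussian-potential definition of uniformity; the function name and parameters are ours.

```python
import numpy as np

def uniformity(Z, t=2.0):
    """Log mean pairwise Gaussian potential on the unit sphere.

    Lower (more negative) values mean the embeddings are spread more
    uniformly; embeddings collapsed into dense clusters score near 0.
    """
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)   # project to unit sphere
    sq = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(Z), k=1)                  # distinct pairs only
    return np.log(np.mean(np.exp(-t * sq[iu])))
```

A server could, in this spirit, flag an update as suspicious when applying it pushes the score sharply upward (toward clustering).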
Matrix Completion With Noise
On the heels of compressed sensing, a remarkable new field has very recently
emerged. This field addresses a broad range of problems of significant
practical interest, namely, the recovery of a data matrix from what appears to
be incomplete, and perhaps even corrupted, information. In its simplest form,
the problem is to recover a matrix from a small sample of its entries, and
comes up in many areas of science and engineering including collaborative
filtering, machine learning, control, remote sensing, and computer vision to
name a few.
This paper surveys the novel literature on matrix completion, which shows
that under some suitable conditions, one can recover an unknown low-rank matrix
from a nearly minimal set of entries by solving a simple convex optimization
problem, namely, nuclear-norm minimization subject to data constraints.
Further, this paper introduces novel results showing that matrix completion is
provably accurate even when the few observed entries are corrupted with a small
amount of noise. A typical result is that one can recover an unknown n x n
matrix of low rank r from just about nr log^2 n noisy samples with an error
which is proportional to the noise level. We present numerical results which
complement our quantitative analysis and show that, in practice, nuclear norm
minimization accurately fills in the many missing entries of large low-rank
matrices from just a few noisy samples. Some analogies between matrix
completion and compressed sensing are discussed throughout.
Comment: 11 pages, 4 figures, 1 table
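One standard iteration for the nuclear-norm program surveyed above alternates imputing the missing entries with soft-thresholding the singular values (the proximal operator of the nuclear norm). The sketch below is a generic soft-impute-style illustration; the name, parameters, and stopping rule are ours, not the paper's numerical method.

```python
import numpy as np

def soft_impute(M, mask, lam=0.01, iters=300):
    """Nuclear-norm regularized completion: impute, then shrink the spectrum."""
    X = np.zeros_like(M)
    for _ in range(iters):
        Z = np.where(mask, M, X)                 # keep observed, impute the rest
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        X = (U * np.maximum(s - lam, 0.0)) @ Vt  # soft-threshold singular values
    return X
```

The shrinkage step is what makes the recovered matrix low-rank: singular values below lam are driven exactly to zero.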