1,578 research outputs found

Document Clustering Based On Max-Correntropy Non-Negative Matrix Factorization

Authors: Li Le, Qin Zhen, Xu Yang, Yang Jianjun, Zhang Honggang
Publication date: 03/10/2014

Nonnegative matrix factorization (NMF) has been successfully applied to many areas for classification and clustering. Commonly used NMF algorithms mainly minimize the $l_2$ distance or the Kullback-Leibler (KL) divergence, which may not be suitable for the nonlinear case. In this paper, we propose a new decomposition method for document clustering that maximizes the correntropy between the original matrix and the product of two low-rank matrices. This method also allows us to learn the new basis vectors of the semantic feature space from the data. To our knowledge, no prior work has maximized correntropy in NMF to cluster high-dimensional document data. Our experimental results show the superiority of the proposed method over other variants of the NMF algorithm on the Reuters21578 and TDT2 datasets. Comment: International Conference of Machine Learning and Cybernetics (ICMLC) 201
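As a rough illustration of why a correntropy objective behaves differently from the $l_2$ distance, the sketch below computes an empirical correntropy between a matrix and a reconstruction. The abstract does not state the kernel, so the Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions made for this example only.

```python
import numpy as np

def correntropy(X, Y, sigma=1.0):
    """Empirical correntropy: mean Gaussian kernel of the entrywise residuals.

    Assumption: Gaussian kernel with bandwidth sigma, normalization constant dropped.
    """
    diff = np.asarray(X, dtype=float) - np.asarray(Y, dtype=float)
    return float(np.mean(np.exp(-diff**2 / (2.0 * sigma**2))))

def squared_l2(X, Y):
    """Sum of squared entrywise residuals, for comparison."""
    diff = np.asarray(X, dtype=float) - np.asarray(Y, dtype=float)
    return float(np.sum(diff**2))

X = np.ones((4, 4))
outlier = X.copy()
outlier[0, 0] = 100.0  # one grossly corrupted entry

clean_score = correntropy(X, X)          # perfect match
outlier_score = correntropy(X, outlier)  # one of 16 entries contributes ~0
outlier_l2 = squared_l2(X, outlier)      # dominated by the single outlier
```

Under these assumptions, a single corrupted entry lowers the correntropy by at most 1/16 here, while the squared $l_2$ error grows with the square of the corruption.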

arXiv.org e-Print Archive

CiteSeerX

Computing a Nonnegative Matrix Factorization -- Provably

Authors: Arora Sanjeev, Ge Rong, Kannan Ravi, Moitra Ankur
Publication date: 03/11/2011

In the Nonnegative Matrix Factorization (NMF) problem we are given an $n \times m$ nonnegative matrix $M$ and an integer $r > 0$. Our goal is to express $M$ as $AW$, where $A$ and $W$ are nonnegative matrices of size $n \times r$ and $r \times m$ respectively. In some applications, it makes sense to ask instead for the product $AW$ to approximate $M$, i.e. (approximately) minimize $\|M - AW\|_F$, where $\|\cdot\|_F$ denotes the Frobenius norm; we refer to this as Approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where $A$ and $W$ are computed using a variety of local search heuristics. Vavasis proved that this problem is NP-complete. We initiate a study of when this problem is solvable in polynomial time:
1. We give a polynomial-time algorithm for exact and approximate NMF for every constant $r$. Indeed NMF is most interesting in applications precisely when $r$ is small.
2. We complement this with a hardness result: if exact NMF can be solved in time $(nm)^{o(r)}$, then 3-SAT has a sub-exponential time algorithm. This rules out substantial improvements to the above algorithm.
3. We give an algorithm that runs in time polynomial in $n$, $m$ and $r$ under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input, and we believe that this will be an interesting and important direction for future work.
Comment: 29 pages, 3 figures
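The Approximate NMF objective $\|M - AW\|_F$ can be made concrete with one of the local search heuristics the abstract alludes to. The sketch below uses the well-known multiplicative update rule for the Frobenius objective; it is not the provable algorithm of the paper, and the function name, iteration count, and `eps` safeguard are assumptions for illustration.

```python
import numpy as np

def nmf_multiplicative(M, r, n_iter=200, eps=1e-9, seed=0):
    """Heuristic NMF via multiplicative updates (a local search, no guarantee).

    Returns nonnegative A (n x r) and W (r x m) with A @ W approximating M.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    A = rng.random((n, r)) + eps
    W = rng.random((r, m)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors nonnegative because it only
        # multiplies by ratios of nonnegative quantities.
        W *= (A.T @ M) / (A.T @ A @ W + eps)
        A *= (M @ W.T) / (A @ W @ W.T + eps)
    return A, W

# Hypothetical test matrix with an exact rank-2 nonnegative factorization.
rng = np.random.default_rng(1)
M = rng.random((6, 2)) @ rng.random((2, 5))

A, W = nmf_multiplicative(M, r=2)
error = np.linalg.norm(M - A @ W, "fro")
```

Such heuristics usually reduce the error quickly but can stall in local optima, which is the gap the paper's polynomial-time results address.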

arXiv.org e-Print Archive

CiteSeerX

Princeton University Open Access Repository

Graph Regularized Non-negative Matrix Factorization By Maximizing Correntropy

Authors: Fan Zhuoyi, Li Le, Xu Yang, Yang Jianjun, Zhang Honggang, Zhao Kaili
Publication date: 09/05/2014

Non-negative matrix factorization (NMF) has proved effective in many clustering and classification tasks. The classic ways to measure the error between the original and the reconstructed matrix are the $l_2$ distance and the Kullback-Leibler (KL) divergence. However, these error measures do not properly handle nonlinear cases. As a consequence, alternative measures based on nonlinear kernels, such as correntropy, have been proposed. However, current correntropy-based NMF targets only low-level features without considering the intrinsic geometrical distribution of the data. In this paper, we propose a new NMF algorithm that preserves local invariance by adding graph regularization to max-correntropy-based matrix factorization. Meanwhile, each feature can learn a corresponding kernel from the data. Experimental results on Caltech101 and Caltech256 show the benefits of this combination over other NMF algorithms for unsupervised image clustering.
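The abstract does not spell out the regularizer. A common form of graph regularization, assumed here, penalizes $\mathrm{Tr}(V^\top L V)$ with graph Laplacian $L = D - S$, which equals half the weighted sum of squared distances between the representations of neighboring samples. The sketch below checks that identity on a tiny hypothetical graph; all names and data are illustrative.

```python
import numpy as np

def graph_laplacian(S):
    """Unnormalized Laplacian L = D - S for a symmetric affinity matrix S."""
    S = np.asarray(S, dtype=float)
    return np.diag(S.sum(axis=1)) - S

def smoothness_penalty(V, S):
    """Tr(V^T L V), where row i of V is the low-dimensional code of sample i."""
    return float(np.trace(V.T @ graph_laplacian(S) @ V))

def pairwise_penalty(V, S):
    """Equivalent form: 0.5 * sum_ij S_ij * ||v_i - v_j||^2."""
    diffs = V[:, None, :] - V[None, :, :]
    sq_dists = (diffs**2).sum(axis=2)
    return float(0.5 * (np.asarray(S, dtype=float) * sq_dists).sum())

# Hypothetical 3-sample chain graph: 0 - 1 - 2.
S = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
# Samples 0 and 1 share a code; sample 2 differs from its neighbor 1.
V = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])

trace_form = smoothness_penalty(V, S)
pair_form = pairwise_penalty(V, S)
```

Only the edge between samples 1 and 2 contributes to the penalty, which is how such a term encourages neighboring samples to receive similar codes.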

arXiv.org e-Print Archive

CiteSeerX

A Broad Learning Approach for Context-Aware Mobile Application Recommendation

Authors: Anmin Lei, Chuanying Pan, Shan Lv, Wenxian Zeng, Xiaoteng Fan, Yang Pan, Zhanjun Ren, Zhendong Zhu
Publication date: 10/07/2017

With the rapid development of mobile apps, the large number of apps available in application stores makes it challenging to locate appropriate apps for users. Providing accurate mobile app recommendations has become an imperative task. Conventional approaches mainly focus on learning users' preferences and app features to predict user-app ratings. However, most of them do not consider the interactions among the context information of apps. To address this issue, we propose a broad learning approach for Context-Aware app recommendation with Tensor Analysis (CATA). Specifically, we utilize a tensor-based framework to effectively integrate users' preferences, app category information and multi-view features to improve the performance of app rating prediction. The multidimensional structure is employed to capture the hidden relationships between multiple app categories and multi-view features. We develop an efficient factorization method which applies Tucker decomposition to learn the full-order interactions within multiple categories and features. Furthermore, we employ a group $\ell_{1}$-norm regularization to learn the group-wise feature importance of each view with respect to each app category. Experiments on two real-world mobile app datasets demonstrate the effectiveness of the proposed method.
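To illustrate what a Tucker decomposition computes, the sketch below factors a small three-way tensor into a core tensor and one factor matrix per mode using a truncated higher-order SVD. This is a standard textbook construction swapped in for illustration, not the CATA training procedure, and the tensor dimensions (users, categories, features) and all names are hypothetical.

```python
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along the given mode."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd(T, ranks):
    """Truncated higher-order SVD of a 3-way tensor: returns (core, factors)."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])  # leading left singular vectors of this mode
    core = np.einsum("ijk,ia,jb,kc->abc", T, *factors)
    return core, factors

def tucker_to_tensor(core, factors):
    """Rebuild the tensor: core multiplied by a factor matrix in every mode."""
    return np.einsum("abc,ia,jb,kc->ijk", core, *factors)

# Hypothetical 4 users x 5 categories x 3 features tensor of multilinear rank (2, 2, 2).
rng = np.random.default_rng(0)
true_core = rng.random((2, 2, 2))
true_factors = [rng.random((4, 2)), rng.random((5, 2)), rng.random((3, 2))]
T = tucker_to_tensor(true_core, true_factors)

core, factors = hosvd(T, ranks=(2, 2, 2))
T_hat = tucker_to_tensor(core, factors)
```

The core tensor couples every combination of per-mode components, which is the sense in which a Tucker model can represent full-order interactions across modes.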

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare