Manifold Elastic Net: A Unified Framework for Sparse Dimension Reduction
It is difficult to find the optimal sparse solution of a manifold learning
based dimensionality reduction algorithm. A lasso or elastic net penalized,
manifold learning based dimensionality reduction is not directly a lasso
penalized least squares problem, and thus the least angle regression (LARS)
(Efron et al. \cite{LARS}), one of the most popular algorithms in sparse
learning, cannot be applied. Most current approaches therefore resort to
indirect formulations or restrictive settings, which can be inconvenient in
applications. In this paper, we propose the manifold elastic net (MEN). MEN
incorporates the merits of both the manifold learning based dimensionality
reduction and the sparse learning based dimensionality reduction. By using a
series of equivalent transformations, we show MEN is equivalent to the lasso
penalized least squares problem, and thus LARS can be adopted to obtain the optimal
sparse solution of MEN. In particular, MEN has the following advantages for
subsequent classification: 1) the local geometry of samples is well preserved
for low dimensional data representation, 2) both the margin maximization and
the classification error minimization are considered for sparse projection
calculation, 3) the projection matrix of MEN improves the parsimony in
computation, 4) the elastic net penalty reduces the over-fitting problem, and
5) the projection matrix of MEN can be interpreted psychologically and
physiologically. Experimental evidence on face recognition over various popular
datasets suggests that MEN is superior to leading dimensionality reduction
algorithms.

Comment: 33 pages, 12 figures
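The key step above reduces MEN to a lasso penalized least squares problem, min_w 0.5||y - Xw||^2 + lam*||w||_1. As a minimal sketch of what solving such a problem looks like (using plain coordinate descent rather than LARS, for brevity, and entirely synthetic data), the soft-thresholding update drives most coefficients exactly to zero, which is the sparsity MEN relies on:

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of the l1 norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Solve min_w 0.5*||y - Xw||^2 + lam*||w||_1 by coordinate descent.
    (A stand-in for LARS: both return the lasso solution path endpoint.)"""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual with feature j's contribution removed
            r = y - X @ w + X[:, j] * w[j]
            w[j] = soft_threshold(X[:, j] @ r, lam) / col_sq[j]
    return w

# Synthetic data: only the first 3 of 10 features matter
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]
y = X @ true_w + 0.01 * rng.standard_normal(100)

w = lasso_cd(X, y, lam=5.0)
print(np.round(w, 2))  # irrelevant coefficients are exactly zero
```

The l1 penalty, unlike ridge, zeroes coefficients exactly, which is why the equivalence to a lasso problem matters for obtaining a genuinely sparse projection.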
Non-negative Matrix Factorization: A Survey
Learning Ideological Latent space in Twitter
People are shifting from traditional news sources to online news at an incredibly fast rate. However, the technology behind online news consumption confines users to content that conforms to their own point of view. This has led to social phenomena such as polarization of viewpoints and intolerance towards opposing views. In this thesis we study information filter bubbles from a mathematical standpoint. We use data mining techniques to learn a liberal-conservative ideology space in Twitter and present a case study on how such a latent space can be used to tackle the filter bubble problem on social networks.
We model the problem of learning liberal-conservative ideology as a constrained optimization problem. Using matrix factorization, we uncover an ideological latent space for the content consumption and social interaction habits of users on Twitter. We validate our model on real-world Twitter data on three controversial topics: "Obamacare", "gun control" and "abortion". Using the proposed technique, we are able to separate users by their ideology with 95% purity. Our analysis shows a very high correlation (0.8 - 0.9) between the ideology estimated by machine learning and the true ideology collected from various sources.
Finally, we re-examine the learnt latent space and present a case study showcasing how this ideological latent space can be used to develop exploratory and interactive interfaces that can help in defusing the information filter bubble. Our matrix factorization based model for learning an ideology latent space, along with the case studies, provides a theoretically solid as well as practically interesting perspective on online polarization. Further, it provides a strong foundation and suggests several avenues for future work in multiple emerging interdisciplinary research areas, for instance humanly interpretable and explanatory machine learning, transparent recommendations, and a new field that we coin Next Generation Social Networks.
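The thesis's exact constrained objective is not reproduced in this abstract, but the core idea (recovering a one-dimensional ideology axis from an interaction matrix via factorization) can be sketched on synthetic data. Here a rank-1 factorization A ~ u v^T is fit by alternating least squares, and users separate by the sign of their latent coordinate; all data and the ALS choice are illustrative assumptions, not the thesis's method:

```python
import numpy as np

# Synthetic ground truth: 40 users and 30 items, each side split into
# two ideological camps (-1 liberal, +1 conservative, say)
rng = np.random.default_rng(1)
ideology = np.concatenate([-np.ones(20), np.ones(20)])
items = np.concatenate([-np.ones(15), np.ones(15)])

# User-by-content interaction matrix: like-minded pairs interact more
A = np.outer(ideology, items) + 0.1 * rng.standard_normal((40, 30))

# Rank-1 factorization A ~ u v^T by alternating least squares
u = rng.standard_normal(40)
v = rng.standard_normal(30)
for _ in range(50):
    u = A @ v / (v @ v)   # least-squares update for u given v
    v = A.T @ u / (u @ u) # least-squares update for v given u
u /= np.linalg.norm(u)    # fix scale; sign remains ambiguous

# Purity of the sign-based split against the hidden ideology
purity = max(np.mean(np.sign(u) == ideology),
             np.mean(np.sign(u) == -ideology))
print(purity)
```

The global sign of a factorization is arbitrary, so purity is measured up to a sign flip; anchoring the axis to known liberal/conservative seed users would resolve the ambiguity in practice.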
Ranking Preserving Nonnegative Matrix Factorization
Nonnegative matrix factorization (NMF), a well-known technique for finding parts-based representations of nonnegative data, has been widely studied. In reality, ordinal relations often exist among data, such as "sample i is more related to j than to q". Such relative order is naturally available and, more importantly, truly reflects the latent data structure. Preserving the ordinal relations enables us to find structured representations of data that are faithful to the relative order, so that the learned representations become more discriminative. However, this cannot be achieved by current NMF methods. In this paper, we make the first attempt to incorporate ordinal relations and propose a novel ranking preserving nonnegative matrix factorization (RPNMF) approach, which enforces the learned representations to be ranked according to the relations. We derive iterative update rules to solve RPNMF's objective function with guaranteed convergence. Experimental results on several datasets for clustering and classification demonstrate that RPNMF outperforms state-of-the-art methods, not only in accuracy but also in the interpretation of ordinal data structure.
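RPNMF's ranking-preserving term and its specific update rules are not given in this abstract, so the sketch below shows only the plain NMF baseline it builds on: the standard Lee and Seung multiplicative updates for the Frobenius objective, run on synthetic nonnegative data. The updates keep both factors elementwise nonnegative by construction:

```python
import numpy as np

def nmf(V, k, n_iter=300, eps=1e-9):
    """Lee-Seung multiplicative updates for min ||V - W H||_F^2 with W, H >= 0.
    (Plain NMF only; RPNMF adds a ranking-preserving penalty on top of this.)"""
    n, m = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W, H stay >= 0
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic nonnegative data matrix
V = np.random.default_rng(2).random((20, 15))
W, H = nmf(V, k=4)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(err)  # relative reconstruction error of the rank-4 factorization
```

Incorporating ordinal constraints would add a penalty comparing rows of W for each triple (i, j, q), which changes the update ratios but not the multiplicative structure.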