A deep matrix factorization method for learning attribute representations
Semi-Non-negative Matrix Factorization is a technique that learns a
low-dimensional representation of a dataset that lends itself to a clustering
interpretation. However, the mapping between this new representation and the
original data matrix may contain rather complex hierarchical information with
implicit lower-level hidden attributes that classical one-level clustering
methodologies cannot interpret. In this work we propose a novel model, Deep
Semi-NMF, that is able to learn such hidden representations, which lend
themselves to a clustering interpretation according to different, unknown
attributes of a given dataset. We also present a semi-supervised version of
the algorithm, named Deep WSF, which incorporates (partial) prior information
for each of the known attributes of a dataset and allows the model to be used
on datasets with mixed attribute knowledge. Finally, we show that our models
learn low-dimensional representations that are better suited not only for
clustering but also for classification, outperforming Semi-Non-negative
Matrix Factorization as well as other state-of-the-art methodologies and
their variants.
Comment: Submitted to TPAMI (16-Mar-2015)
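To make the layered model concrete: Deep Semi-NMF factorizes the data as
X ≈ Z1 Z2 ... Zm Hm, with each intermediate representation kept non-negative.
Below is a minimal numpy sketch of greedy layer-wise pretraining built on the
standard Semi-NMF multiplicative updates of Ding et al.; it is our
illustration, not the authors' implementation, and omits the joint
fine-tuning stage.

    import numpy as np

    def semi_nmf(X, k, iters=200, eps=1e-9):
        # Plain Semi-NMF: X ~ Z @ H with H >= 0 (Z unconstrained).
        rng = np.random.default_rng(0)
        H = np.abs(rng.standard_normal((k, X.shape[1])))
        for _ in range(iters):
            Z = X @ np.linalg.pinv(H)            # least-squares update for Z
            A, B = Z.T @ X, Z.T @ Z
            Ap, An = np.maximum(A, 0), np.maximum(-A, 0)
            Bp, Bn = np.maximum(B, 0), np.maximum(-B, 0)
            H *= np.sqrt((Ap + Bn @ H) / (An + Bp @ H + eps))  # keeps H >= 0
        return Z, H

    def deep_semi_nmf(X, layer_sizes):
        # Greedy pretraining of X ~ Z1 Z2 ... Zm Hm: factorize each H again.
        Zs, H = [], X
        for k in layer_sizes:
            Z, H = semi_nmf(H, k)
            Zs.append(Z)
        return Zs, H

Each intermediate H then provides a clustering at its own level of the
attribute hierarchy.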
Conic Optimization Theory: Convexification Techniques and Numerical Algorithms
Optimization is at the core of control theory and appears in several areas of
this field, such as optimal control, distributed control, system
identification, robust control, state estimation, model predictive control and
dynamic programming. The recent advances in various topics of modern
optimization have also been revamping the area of machine learning. Motivated
by the crucial role of optimization theory in the design, analysis, control and
operation of real-world systems, this tutorial paper offers a detailed overview
of some major advances in this area, namely conic optimization and its emerging
applications. First, we discuss the importance of conic optimization in
different areas. Then, we explain seminal results on the design of hierarchies
of convex relaxations for a wide range of nonconvex problems. Finally, we study
different numerical algorithms for large-scale conic optimization problems.
Comment: 18 pages
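As a concrete instance of such a relaxation hierarchy, consider the first
(Shor) level applied to a nonconvex quadratic program with constraints
x_i^2 = 1: the rank-one matrix x x^T is lifted to a positive semidefinite
variable X and the rank constraint is dropped. A minimal cvxpy sketch (our
illustration; the cost matrix Q is arbitrary, not taken from the paper):

    import cvxpy as cp
    import numpy as np

    # Nonconvex QCQP: min x^T Q x  s.t.  x_i^2 = 1 for all i.
    n = 5
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((n, n)); Q = (Q + Q.T) / 2   # symmetric cost

    # Shor relaxation: lift x x^T to a PSD matrix X, drop rank(X) == 1.
    X = cp.Variable((n, n), symmetric=True)
    prob = cp.Problem(cp.Minimize(cp.trace(Q @ X)),
                      [X >> 0, cp.diag(X) == 1])
    prob.solve()
    print("SDP lower bound:", prob.value)   # bounds the nonconvex optimum

Higher levels of the hierarchy tighten this bound by adding moment or
sum-of-squares constraints.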
A Comparative Study of Pairwise Learning Methods based on Kernel Ridge Regression
Many machine learning problems can be formulated as predicting labels for a
pair of objects. Problems of that kind are often referred to as pairwise
learning, dyadic prediction or network inference problems. During the last
decade kernel methods have played a dominant role in pairwise learning. They
still obtain state-of-the-art predictive performance, but a theoretical
analysis of their behavior has remained underexplored in the machine learning
literature.
In this work we review and unify existing kernel-based algorithms that are
commonly used in different pairwise learning settings, ranging from matrix
filtering to zero-shot learning. To this end, we focus on closed-form efficient
instantiations of Kronecker kernel ridge regression. We show that independent
task kernel ridge regression, two-step kernel ridge regression and a linear
matrix filter all arise naturally as special cases of Kronecker kernel ridge
regression, implying that all these methods implicitly minimize a squared loss.
In addition, we analyze universality, consistency and spectral filtering
properties. Our theoretical results provide valuable insights for assessing
the advantages and limitations of existing pairwise learning methods.
Comment: arXiv admin note: text overlap with arXiv:1606.0427
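For intuition, the Kronecker closed form fits in a few lines: with row-kernel
matrix K = U diag(d) U^T and column-kernel matrix G = V diag(e) V^T, the dual
coefficients solve (G ⊗ K + λI) vec(A) = vec(Y) without ever forming the
Kronecker product. A rough numpy sketch of this "vec trick" (our
reconstruction, not the paper's code):

    import numpy as np

    def kron_krr_fit(K, G, Y, lam):
        # Eigendecompose the two kernel matrices once.
        d, U = np.linalg.eigh(K)    # kernel between row objects
        e, V = np.linalg.eigh(G)    # kernel between column objects
        # Solve (G (x) K + lam*I) vec(A) = vec(Y) in the eigenbasis.
        C = (U.T @ Y @ V) / (np.outer(d, e) + lam)
        return U @ C @ V.T          # dual coefficient matrix A

    def kron_krr_predict(A, k_new, g_new):
        # Score a new pair via its kernel evaluations with training objects.
        return k_new @ A @ g_new

Two-step kernel ridge regression replaces the coupled shrinkage factor
1 / (d_i e_j + lam) by the separable factor 1 / ((d_i + lam_k)(e_j + lam_g)),
which is one way to see the two methods as close siblings.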
Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem
In this paper, we consider the symmetric multi-type non-negative matrix
tri-factorization problem (SNMTF), which attempts to factorize several
symmetric non-negative matrices simultaneously. This can be considered as a
generalization of the classical non-negative matrix tri-factorization problem
and includes a non-convex objective function, which is a multivariate
sixth-degree polynomial, and a convex feasibility set. It has special importance
in data science, since it serves as a mathematical model for the fusion of
different data sources in data clustering.
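In one standard formulation (our reconstruction; the notation is assumed
rather than quoted from the paper), given symmetric non-negative matrices
A_1, ..., A_p one seeks a shared non-negative factor G and per-matrix middle
factors S_i minimizing

    \min_{G \ge 0,\; S_1,\dots,S_p \ge 0} \;\; \sum_{i=1}^{p} \left\| A_i - G S_i G^{\top} \right\|_F^2

Each entry of G S_i G^T is cubic in the unknowns, so the squared residual is
a polynomial of degree six, matching the description above.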
We develop four methods to solve the SNMTF. They are based on four
theoretical approaches known from the literature: the fixed point method (FPM),
the block-coordinate descent with projected gradient (BCD), the gradient method
with exact line search (GM-ELS) and the adaptive moment estimation method
(ADAM). For each of these methods we offer a software implementation: for the
former two methods we use Matlab, and for the latter two, Python with the
TensorFlow library.
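To make the updates concrete, here is a rough numpy sketch of projected
gradient descent on the objective above (our illustration; it uses a fixed
step size rather than the exact line search or per-block schedules of the
paper's methods):

    import numpy as np

    def snmtf_pgd(As, k, steps=500, lr=1e-3, seed=0):
        # Minimize sum_i ||A_i - G S_i G^T||_F^2 over G, S_i >= 0.
        n = As[0].shape[0]
        rng = np.random.default_rng(seed)
        G = np.abs(rng.standard_normal((n, k)))
        Ss = [np.abs(rng.standard_normal((k, k))) for _ in As]
        for _ in range(steps):
            grad_G = np.zeros_like(G)
            for A, S in zip(As, Ss):
                R = G @ S @ G.T - A                  # residual
                grad_G += 2 * (R @ G @ S.T + R.T @ G @ S)
            G = np.maximum(G - lr * grad_G, 0.0)     # step + projection
            for i, A in enumerate(As):
                R = G @ Ss[i] @ G.T - A
                Ss[i] = np.maximum(Ss[i] - lr * 2 * (G.T @ R @ G), 0.0)
        return G, Ss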
We test these methods on three data-sets: one is a synthetic data-set that we
generated, while the other two capture real-life similarities between
different objects.
Extensive numerical results show that, given sufficient computing time, all
four methods perform satisfactorily, and that ADAM most often yields the best
mean square error (MSE). However, if the computation time is limited, FPM
gives the best MSE, because it shows the fastest convergence at the
beginning.
All data-sets and codes are publicly available on our GitLab profile
- …