Low-Rank Boolean Matrix Approximation by Integer Programming

Author: Gunluk Oktay
Hauser Raphael
Kovacs Reka
Publication date: 01/01/2018

Low-rank approximations of data matrices are an important dimensionality reduction tool in machine learning and regression analysis. We consider the case of categorical variables, where the task can be formulated as the problem of finding low-rank approximations to Boolean matrices. In this paper we give what is, to the best of our knowledge, the first integer programming formulation that relies on only polynomially many variables and constraints. We discuss how to solve it computationally and report numerical tests on synthetic and real-world data.
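To make the objective concrete, the following is a minimal sketch of the Boolean matrix product and the entrywise approximation error that a rank-k Boolean approximation tries to minimize. It is not the paper's integer programming formulation; the function names `boolean_product` and `approximation_error` and the small example matrices are assumptions introduced for illustration only.

```python
import numpy as np

def boolean_product(A, B):
    """Boolean matrix product: entry (i, j) is 1 if A[i, l] = B[l, j] = 1 for some l."""
    return (A.astype(int) @ B.astype(int) > 0).astype(int)

def approximation_error(X, A, B):
    """Number of entries where X differs from the Boolean product of A and B."""
    return int(np.sum(X != boolean_product(A, B)))

# Hypothetical 3x3 binary data matrix and a candidate rank-2 Boolean factorization.
X = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
A = np.array([[1, 0],
              [1, 1],
              [0, 1]])
B = np.array([[1, 1, 0],
              [0, 1, 1]])

error = approximation_error(X, A, B)
print(error)
```

In this example the middle row of X is the entrywise OR of the two rows of B, so two Boolean factors suffice even though X has rank 3 over the reals.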

arXiv.org e-Print Archive

Oxford University Research Archive

Boolean Matrix Factorization Meets Consecutive Ones Property

Author: Miettinen P.
Tatti N.
Publication date: 01/01/2019

Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization in which we additionally require that the factor matrices have the consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered on a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We show that OBMF can be used for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but also that no polynomial-time algorithm can yield a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm where at each step we look for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm where we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using PQ-trees. We also extend the problem to the cyclic ones property and to symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
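The "fix one side, find the other, reverse the roles, and repeat" idea can be sketched as follows. This is a simplified illustration under stated assumptions: it omits the consecutive ones constraint and the PQ-tree step described in the abstract, and simply alternates exact updates of an unconstrained rank-1 Boolean factor. The names `update_side` and `rank1_alternating`, the starting vector, and the example matrix are hypothetical.

```python
import numpy as np

def update_side(X, v):
    """With the column indicator v fixed, pick each row indicator to minimize Hamming error."""
    ones_covered = X @ v                    # 1s of each row inside the selected columns
    zeros_covered = v.sum() - ones_covered  # 0s of each row that would be wrongly covered
    # Selecting a row pays off only if it covers more 1s than 0s.
    return (ones_covered > zeros_covered).astype(int)

def rank1_alternating(X, v0, max_iters=50):
    """Alternate between updating u (rows) and v (columns) until neither changes."""
    v = v0.copy()
    u = update_side(X, v)
    for _ in range(max_iters):
        v_new = update_side(X.T, u)   # reverse the roles: fix u, find v
        u_new = update_side(X, v_new)
        if np.array_equal(u_new, u) and np.array_equal(v_new, v):
            break
        u, v = u_new, v_new
    return u, v

# Hypothetical binary matrix: a dense block in the top-left corner plus noise.
X = np.array([[1, 1, 1, 0],
              [1, 1, 1, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 1]])

u, v = rank1_alternating(X, v0=X[0])
error = int(np.sum(X != np.outer(u, v)))
print(u, v, error)
```

Each update is optimal for the side being changed, so the error never increases, but the procedure can stall in a local optimum that depends on the starting vector.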

MPG.PuRe

Using Underapproximations for Sparse Nonnegative Matrix Factorization

Author: Nicolas Gillis
François Glineur
Publication venue: 'Elsevier BV'
Publication date: 01/01/2008

Nonnegative Matrix Factorization (NMF) consists in (approximately) factorizing a nonnegative data matrix as the product of two low-rank nonnegative matrices. It has been successfully applied as a data analysis technique in numerous domains, e.g., text mining, image processing, microarray data analysis, and collaborative filtering. We introduce a novel approach to solving NMF problems, based on an underapproximation technique, and show its effectiveness in obtaining sparse solutions. This approach, based on Lagrangian relaxation, allows NMF problems to be solved recursively. We also prove that the underapproximation problem is NP-hard for any fixed factorization rank, using a reduction from the maximum edge biclique problem in bipartite graphs. We test two variants of our underapproximation approach on several standard image datasets and show that they provide sparse part-based representations with low reconstruction error. Our results are comparable and sometimes superior to those obtained by two standard Sparse Nonnegative Matrix Factorization techniques.

Comment: Version 2 removed the section about convex reformulations, which was not central to the development of our main results; added material to the introduction; added a review of previous related work (section 2.3); completely rewrote the last part (section 4) to provide extensive numerical results supporting our claims. Accepted in J. of Pattern Recognition
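As a reading aid, the underapproximation idea can be written as an optimization problem. The notation below ($M$ for the data matrix, $U$ and $V$ for the factors, $r$ for the rank, $R_k$ for the residual) is an assumption chosen for illustration and may differ from the paper's own symbols.

```latex
% Nonnegative matrix underapproximation (sketch, assumed notation):
% the product UV must stay entrywise below the data matrix M.
\min_{U \in \mathbb{R}^{m \times r},\; V \in \mathbb{R}^{r \times n}}
  \; \lVert M - UV \rVert_F^2
\quad \text{subject to} \quad
  U \ge 0, \quad V \ge 0, \quad UV \le M .

% Recursive use with rank-one factors: because u_k v_k^T \le R_{k-1},
% each residual is again nonnegative and can be factorized in turn.
R_0 = M, \qquad
R_k = R_{k-1} - u_k v_k^{\top} \ge 0, \qquad k = 1, \dots, r .
```

The upper bound $UV \le M$ is what makes the recursion possible: subtracting an underapproximation leaves a nonnegative residual, and it also pushes the factors toward zero wherever $M$ has zeros, which is consistent with the sparsity the abstract reports.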

arXiv.org e-Print Archive

CiteSeerX

Crossref

DIAL UCLouvain

Detection of Review Abuse via Semi-Supervised Binary Multi-Target Tensor Decomposition

Author: Feng S.
Hooi B.
Hu C.
Li H.
Rai P.
Ye J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/05/2019

Product reviews and ratings on e-commerce websites provide customers with detailed insights about various aspects of a product, such as quality and usefulness. Since they influence customers' buying decisions, product reviews have become fertile ground for abuse by sellers (colluding with reviewers) to promote their own products or to tarnish the reputation of competitors' products. In this paper, our focus is on detecting such abusive entities (both sellers and reviewers) by applying tensor decomposition to product review data. While tensor decomposition is mostly unsupervised, we formulate our problem as a semi-supervised binary multi-target tensor decomposition to take advantage of currently known abusive entities. We empirically show that our multi-target semi-supervised model achieves higher precision and recall in detecting abusive entities than unsupervised techniques. Finally, we show that our proposed stochastic partial natural gradient inference for our model empirically achieves faster convergence than stochastic gradient and Online-EM with sufficient statistics.

Comment: Accepted to the 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019. Contains supplementary material. arXiv admin note: text overlap with arXiv:1804.0383

arXiv.org e-Print Archive

Crossref

A Map of the Inorganic Ternary Metal Nitrides

Author: Arca Elisabetta
Bartel Christopher
Bauers Sage
Ceder Gerbrand
Chen Bor-Rong
Holder Aaron
Lany Stephan
Matthews Bethany
Orvañanos Bernardo
Schelhas Laura T.
Sun Wenhao
Tate Janet
Toney Michael F.
Tumas William
Zakutayev Andriy
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/09/2018

Exploratory synthesis in novel chemical spaces is the essence of solid-state chemistry. However, uncharted chemical spaces can be difficult to navigate, especially when materials synthesis is challenging. Nitrides represent one such space, where stringent synthesis constraints have limited the exploration of this important class of functional materials. Here, we employ a suite of computational materials discovery and informatics tools to construct a large stability map of the inorganic ternary metal nitrides. Our map clusters the ternary nitrides into chemical families with distinct stability and metastability, and highlights hundreds of promising new ternary nitride spaces for experimental investigation, from which we experimentally realized 7 new Zn- and Mg-based ternary nitrides. By extracting the mixed metallicity, ionicity, and covalency of solid-state bonding from the DFT-computed electron density, we reveal the complex interplay between chemistry, composition, and electronic structure in governing large-scale stability trends in ternary nitride materials.

arXiv.org e-Print Archive

eScholarship - University of California