99 research outputs found

    Constrained Low-Rank Matrix/Tensor Factorisation

    University of Technology Sydney. Faculty of Engineering and Information Technology. Constrained low-rank matrix and tensor factorisation (MF/TF) has been widely used in machine learning and data analytics. Studying how constraints are modelled and how the resulting optimisation tasks are solved provides theoretical support for applications such as image clustering, recommender systems and data compression. This thesis studies three algorithms for constrained low-rank MF/TF. Imposing constraints on each feature vector of the factor matrices is common practice in many constrained low-rank MF algorithms. However, in many real scenarios, the relationships among features can influence the factorisation results as well. To better characterise these relationships, a novel MF algorithm, Relative Pairwise Relationship Constrained Non-negative Matrix Factorisation, is proposed. It places soft constraints on the relative pairwise distances among features, as regularisation terms, so that the expected relationships are retained after factorisation. It conforms to the so-called “multiplicative update rules”, and detailed convergence proofs are provided. Experiments on both synthetic and real datasets have verified that imposing such constraints keeps most expected relationships unchanged after factorisation. Applied directly to tensor data, low-rank TF can effectively avoid the information loss caused by matricisation. The relationships among features of the factor matrices in TF have practical meanings in many real scenarios. To describe such relative relationships in low-rank TF, this thesis proposes Relative Pairwise Relationship Constrained Non-negative Tensor Factorisation. It handles both the Candecomp/Parafac and Tucker decomposition schemes, and both squared Euclidean distance and divergence measures. Using the matricised form of the tensor factorisation equations simplifies the update rules and greatly improves computational efficiency. Experiments have demonstrated that the proposed algorithm achieves higher accuracy in tensor applications. Applying low-rank MF to recommender systems can produce out-of-bounds and fluctuating predicted values. The commonly used solutions, truncation and imposing penalties, can reduce the number of effective predictions and harm recommendation accuracy. This thesis proposes Magnitude Bounded Matrix Factorisation (MBMF) to handle this problem by imposing magnitude constraints for the first time. It first converts the original quadratically constrained quadratic programming task into an unconstrained one, which is then solved by the well-known stochastic gradient descent. An acceleration approach for improving computational efficiency, a method for extracting magnitude constraints and a variant of MBMF for non-negative data are also introduced. Experiments have demonstrated that the algorithm is superior to existing bounding algorithms in both computational efficiency and recommendation performance.

    BLC: Private Matrix Factorization Recommenders via Automatic Group Learning

    We propose a privacy-enhanced matrix factorization recommender that exploits the fact that users can often be grouped together by interest. This allows a form of “hiding in the crowd” privacy. We introduce a novel matrix factorization approach suited to making recommendations in a shared group (or “nym”) setting and the BLC algorithm for carrying out this matrix factorization in a privacy-enhanced manner. We demonstrate that the increased privacy does not come at the cost of reduced recommendation accuracy.

    A scalable recommender system: using latent topics and alternating least squares techniques

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. A recommender system is one of the major techniques for handling the information overload problem in Information Retrieval. It improves access and proactively recommends relevant information to each user, based on their preferences and objectives. During the implementation and planning phases, designers have to cope with several issues and challenges that need proper attention. This thesis aims to show the issues and challenges in developing high-quality recommender systems. The work addresses a current research problem in the field of job recommendations using a distributed algorithmic framework built on top of Spark for parallel computation, which allows the algorithm to scale linearly with the growing number of users. The final solution consists of two different recommenders which could be utilised for different purposes. The first method is mainly driven by latent topics among users, while the second utilises a latent factor algorithm that directly addresses the preference-confidence paradigm.

    An Oracle Inequality for Quasi-Bayesian Non-Negative Matrix Factorization

    The aim of this paper is to provide some theoretical understanding of quasi-Bayesian aggregation methods for non-negative matrix factorization. We derive an oracle inequality for an aggregated estimator. This result holds for a very general class of prior distributions and shows how the prior affects the rate of convergence. Comment: This is the corrected version of the published paper P. Alquier, B. Guedj, An Oracle Inequality for Quasi-Bayesian Non-negative Matrix Factorization, Mathematical Methods of Statistics, 2017, vol. 26, no. 1, pp. 55-67. Since then Arnak Dalalyan (ENSAE) found a mistake in the proofs. We fixed the mistake at the price of a slightly different logarithmic term in the bound.

    Implications of sparsity and high triangle density for graph representation learning

    Recent work has shown that sparse graphs containing many triangles cannot be reproduced using a finite-dimensional representation of the nodes, in which link probabilities are inner products. Here, we show that such graphs can be reproduced using an infinite-dimensional inner product model, where the node representations lie on a low-dimensional manifold. Recovering a global representation of the manifold is impossible in a sparse regime. However, we can zoom in on local neighbourhoods, where a lower-dimensional representation is possible. As our constructions allow the points to be uniformly distributed on the manifold, we find evidence against the common perception that triangles imply community structure.