613 research outputs found

    Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

    Get PDF
    Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research demonstrating that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
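    The two-stage scheme described in this abstract (a randomized range finder followed by deterministic post-processing of the reduced matrix) is compact enough to illustrate in a few lines of NumPy. The sketch below is a minimal illustration of the idea, not the authors' reference implementation; the oversampling and power-iteration parameters are common defaults chosen here for illustration.

```python
# Minimal sketch of a randomized partial SVD: sample the range of A with a
# Gaussian test matrix, orthonormalize, compress A to that subspace, then
# factor the small reduced matrix deterministically.
import numpy as np

def randomized_svd(A, k, oversample=10, n_power_iter=2, rng=None):
    """Approximate rank-k SVD of A via random sampling of its range."""
    rng = np.random.default_rng(rng)
    _, n = A.shape
    ell = k + oversample                    # oversampled sketch size

    # Stage A: find an orthonormal basis Q capturing most of the action of A.
    Omega = rng.standard_normal((n, ell))   # Gaussian test matrix
    Y = A @ Omega                           # sample the range of A
    for _ in range(n_power_iter):           # optional power iterations to
        Y = A @ (A.T @ Y)                   #   sharpen a slowly decaying spectrum
    Q, _ = np.linalg.qr(Y)                  # m x ell orthonormal basis

    # Stage B: compress A to the subspace and factor it deterministically.
    B = Q.T @ A                             # ell x n reduced matrix
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_hat)[:, :k], s[:k], Vt[:k, :]
```

    With high probability the resulting factorization is close to the best rank-k approximation, and a few power iterations are typically enough when the singular values decay slowly.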

    PICAE – Intelligent publication of audiovisual and editorial contents

    Get PDF
    The development of internet infrastructure and technology over the last two decades has given users and retailers the possibility to purchase and sell items online. This has broadened the horizons of what products can be offered outside of the traditional trading sense, to the point where virtually any product can be offered. These massive online markets have had a considerable impact on the habits of consumers, providing them access to a greater variety of products and information on these goods. This variety has made online commerce into a multi-billion dollar industry, but it has also put the customer in a position where it is increasingly difficult to select the products that best fit their individual needs. In the same vein, the growth in both the availability of data and the amounts that computers can process has made many computationally expensive solutions feasible, and recommender systems are no exception. These systems are well suited to overcoming the information overload problem since they provide automated and personalized suggestions to consumers. The PICAE project tackles the recommendation problem in the audiovisual sector. The vast amount of audiovisual content available to users nowadays can be overwhelming, which is why recommenders have been growing in popularity in this sector, Netflix being the biggest example. PICAE seeks to provide insightful and personalized recommendations to users in a public TV setting. The project develops new models and analytical tools for recommending audiovisual and editorial content, based on the user's profile and environment, with the aim of improving the user experience and the level of satisfaction and loyalty. These new tools represent a qualitative improvement in the state of the art of television and editorial content recommendation. The project also improves the digital consumption index of these contents based on the identification of the products that these new forms of consumption demand and how they must be produced, distributed and promoted to respond to the needs of this emerging market. The main challenge of the PICAE project is to resolve two aspects that differentiate it from other existing solutions: the variety and dynamism of the content, which requires real-time analysis of the recommendations, and the lack of available information about the user, who in this setting is reluctant to register, making identification difficult in multi-device consumption. This document explains the contributions made in the development of the project, which can be divided in two: the development of a recommender system that takes into account information about both users and items, and a deep analysis of the current metrics used to assess the performance of a recommender system.
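    One of the two contributions mentioned above is an analysis of the metrics commonly used to evaluate recommender systems. As a point of reference, the sketch below shows how two standard ranking metrics, precision@k and recall@k, are typically computed; it is a generic illustration under standard definitions, not the PICAE evaluation code.

```python
# Generic precision@k / recall@k computation for a top-k recommender
# (illustrative only, not part of the PICAE project code).
from typing import Iterable, Sequence

def precision_recall_at_k(recommended: Sequence[str],
                          relevant: Iterable[str],
                          k: int) -> tuple[float, float]:
    """Fraction of the top-k recommendations that are relevant, and the
    fraction of all relevant items that appear in the top-k list."""
    top_k = list(recommended[:k])
    relevant = set(relevant)
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Example: 2 of the top-3 recommendations are relevant.
p, r = precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=3)
print(p, r)   # 0.666..., 0.666...
```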

    Recommender systems in industrial contexts

    Full text link
    This thesis consists of four parts:
    - An analysis of the core functions and prerequisites for recommender systems in an industrial context: we identify four core functions for recommender systems: Help to Decide, Help to Compare, Help to Explore, and Help to Discover. The implementation of these functions has implications for the choices at the heart of algorithmic recommender systems.
    - A state of the art covering the main techniques used in automated recommender systems: the two most commonly used algorithmic methods, K-Nearest-Neighbor (KNN) methods and fast factorization methods, are detailed. The state of the art also presents purely content-based methods, hybridization techniques, and the classical performance metrics used to evaluate recommender systems. It then gives an overview of several systems, both from academia and industry (Amazon, Google ...).
    - An analysis of the performance and implications of a recommendation system developed during this thesis: this system, Reperio, is a hybrid recommender engine using KNN methods. We study the performance of the KNN methods, including the impact of the similarity functions used, and then study the performance of the KNN method in critical use cases in cold-start situations.
    - A methodology for analyzing the performance of recommender systems in an industrial context: this methodology assesses the added value of algorithmic strategies and recommendation systems according to their core functions.
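    The abstract centres on KNN-based recommendation and the choice of similarity function. The following sketch shows a basic item-based KNN prediction step with cosine similarity on a user-item rating matrix; it is a generic illustration under standard assumptions, not the Reperio implementation, and the function and parameter names are placeholders.

```python
# Minimal item-based KNN rating prediction with cosine similarity.
# Rows of R are users, columns are items; 0 means "not rated".
import numpy as np

def predict_item_knn(R, user, item, k=5, eps=1e-9):
    """Predict R[user, item] from the user's ratings of the k items most
    similar (by cosine similarity) to the target item."""
    target = R[:, item]
    norms = np.linalg.norm(R, axis=0) + eps
    sims = (R.T @ target) / (norms * norms[item])   # cosine similarity to every item
    sims[item] = -np.inf                            # exclude the item itself
    rated = np.flatnonzero(R[user] > 0)             # items this user has rated
    if rated.size == 0:
        return 0.0                                  # cold-start user: no signal at all
    neighbours = rated[np.argsort(sims[rated])[::-1][:k]]
    weights = np.clip(sims[neighbours], 0, None)
    if weights.sum() < eps:
        return float(R[user, rated].mean())         # fall back to the user's mean rating
    return float(weights @ R[user, neighbours] / weights.sum())

# Tiny example: 3 users x 4 items.
R = np.array([[5, 3, 0, 1],
              [4, 0, 4, 1],
              [1, 1, 5, 4]], dtype=float)
print(predict_item_knn(R, user=0, item=2, k=2))
```

    Swapping the cosine similarity for another measure (Pearson correlation, adjusted cosine, etc.) only changes the `sims` line, which is why the choice of similarity function is a natural axis of study for this kind of engine.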

    Efficient implementation of high-order finite elements for Helmholtz problems

    No full text
    Computational modeling remains key to the acoustic design of various applications, but it is constrained by the cost of solving large Helmholtz problems at high frequencies. This paper presents an efficient implementation of the high-order Finite Element Method for tackling large-scale engineering problems arising in acoustics. A key feature of the proposed method is the ability to select automatically the order of interpolation in each element so as to obtain a target accuracy while minimising the cost. This is achieved using a simple local a priori error indicator. For simulations involving several frequencies, the use of hierarchic shape functions leads to an efficient strategy to accelerate the assembly of the finite element model. The intrinsic performance of the high-order FEM for 3D Helmholtz problems is assessed, and an error indicator is devised to select the polynomial order in each element. A realistic 3D application is presented in detail to demonstrate the reduction in computational costs and the robustness of the a priori error indicator. For this test case, the proposed method accelerates the simulation by an order of magnitude and requires less than a quarter of the memory needed by the standard FEM.
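    The central idea above is to pick the interpolation order element by element from a target accuracy. The sketch below illustrates that idea with a generic resolution rule derived from dispersion analysis for Helmholtz problems; this rule, the constant C, and the example element sizes are assumptions chosen for illustration, not the specific a priori error indicator developed in the paper.

```python
# Illustrative per-element polynomial order selection for a high-order
# Helmholtz FEM, using a generic dispersion-based resolution heuristic
# (roughly 2p + 1 >= kh + C * (kh)^(1/3)); not the paper's error indicator.
import math

def select_order(k, h, C=2.0, p_min=1, p_max=10):
    """Pick the lowest order p resolving wavenumber k on an element of size h."""
    kh = k * h
    p_target = 0.5 * (kh + C * kh ** (1.0 / 3.0) - 1.0)
    return min(max(p_min, math.ceil(p_target)), p_max)

# Example: f = 2 kHz in air (c = 343 m/s), elements between 5 cm and 30 cm.
k = 2.0 * math.pi * 2000.0 / 343.0
for h in (0.05, 0.10, 0.20, 0.30):
    print(f"h = {h:.2f} m  ->  p = {select_order(k, h)}")
```

    The point of such a rule is that large elements at high frequency automatically receive high orders while small or low-frequency elements stay cheap, which is what makes a uniform mesh with variable order competitive with classical h-refinement.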

    Improving Code Generation by Training with Natural Language Feedback

    Full text link
    The potential for pre-trained large language models (LLMs) to use natural language feedback at inference time has been an exciting recent development. We build upon this observation by formalizing an algorithm for learning from natural language feedback at training time instead, which we call Imitation learning from Language Feedback (ILF). ILF requires only a small amount of human-written feedback during training and does not require the same feedback at test time, making it both user-friendly and sample-efficient. We further show that ILF can be seen as a form of minimizing the KL divergence to the ground truth distribution and demonstrate a proof-of-concept on a neural program synthesis task. We use ILF to improve a CodeGen-Mono 6.1B model's pass@1 rate by 38% relative (and 10% absolute) on the Mostly Basic Python Problems (MBPP) benchmark, outperforming both fine-tuning on MBPP and fine-tuning on repaired programs written by humans. Overall, our results suggest that learning from human-written natural language feedback is both more effective and sample-efficient than training exclusively on demonstrations for improving an LLM's performance on code generation tasks.
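    At a high level, ILF collects human feedback on incorrect programs, generates refinements that incorporate the feedback, filters the refinements by unit tests, and fine-tunes on the refinements that pass, so no feedback is needed at inference time. The pseudocode below is a schematic reconstruction of that loop from the abstract; the callables it takes (generate, collect_feedback, refine, passes_tests, finetune) are hypothetical placeholders for model sampling, human annotation, and training infrastructure, not the authors' API.

```python
# Schematic sketch of one ILF round: feedback -> refinement -> test filtering
# -> fine-tuning. All callables passed in are hypothetical placeholders.
from typing import Callable, List, Tuple

def ilf_round(tasks: List[dict],
              generate: Callable[[str], str],
              collect_feedback: Callable[[str, str], str],
              refine: Callable[[str, str, str], str],
              passes_tests: Callable[[str, list], bool],
              finetune: Callable[[List[Tuple[str, str]]], None]) -> None:
    pairs: List[Tuple[str, str]] = []
    for task in tasks:
        program = generate(task["prompt"])
        if passes_tests(program, task["tests"]):
            continue                                  # already correct, no feedback needed
        feedback = collect_feedback(task["prompt"], program)   # human-written critique
        fixed = refine(task["prompt"], program, feedback)      # refinement model output
        if passes_tests(fixed, task["tests"]):
            pairs.append((task["prompt"], fixed))     # keep only verified repairs
    # Fine-tune only on (prompt, verified refinement) pairs, so the trained
    # model needs no feedback at test time.
    finetune(pairs)
```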