
    A Study of Boolean Matrix Factorization Under Supervised Settings

    Boolean matrix factorization is a generally accepted approach used in data analysis to explain data. It is commonly used in unsupervised settings or for data preprocessing in supervised settings. In this paper we study factors under supervised settings. We provide experimental evidence that factors are able to explain not only the data as a whole but also the classes in the data.
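
As a minimal sketch of what "factors explaining data" means in Boolean matrix factorization: a binary object-attribute matrix is written as the Boolean product of an object-factor matrix and a factor-attribute matrix. The matrices and the helper name `boolean_product` below are illustrative assumptions, not taken from the paper.

```python
def boolean_product(A, B):
    """Boolean matrix product: result[i][j] = OR over k of (A[i][k] AND B[k][j])."""
    rows, inner, cols = len(A), len(B), len(B[0])
    return [
        [int(any(A[i][k] and B[k][j] for k in range(inner))) for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical 4x4 object-attribute matrix (rows: objects, columns: attributes).
data = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hypothetical factorization with two factors.
object_factor = [  # which objects each factor applies to
    [1, 0],
    [1, 1],
    [0, 1],
    [0, 1],
]
factor_attribute = [  # which attributes each factor carries
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

reconstructed = boolean_product(object_factor, factor_attribute)
print(reconstructed == data)
```

Here two factors reproduce the toy matrix exactly; on real data a factorization is usually approximate, and how well individual factors align with class labels is the kind of question the abstract raises.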

    MLI: An API for Distributed Machine Learning

    MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability.

    Nonexistence Certificates for Ovals in a Projective Plane of Order Ten

    In 1983, a computer search was performed for ovals in a projective plane of order ten. The search was exhaustive and negative, implying that such ovals do not exist. However, no nonexistence certificates were produced by this search, and to the best of our knowledge the search has never been independently verified. In this paper, we rerun the search for ovals in a projective plane of order ten and produce a collection of nonexistence certificates that, when taken together, imply that such ovals do not exist. Our search program uses the cube-and-conquer paradigm from the field of satisfiability (SAT) checking, coupled with a programmatic SAT solver and the nauty symbolic computation library for removing symmetries from the search. Comment: Appears in the Proceedings of the 31st International Workshop on Combinatorial Algorithms (IWOCA 2020).

    Machine-Learning-based Prediction of Sepsis Events from Vertical Clinical Trial Data: a Naïve Approach

    Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection in which the afflicted body attacks its own tissues, sometimes to the point of organ failure and, in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the currently accepted diagnostic framework of the disease. This study attempted first to establish an understanding of past definitions of sepsis, and then to recommend machine learning as an integral part of an eventual amended disease definition. Longitudinal clinical trial data (n_trials = 30,915) were vectorized into a machine-readable format compatible with predictive modeling, selected and reduced in dimension, and used to predict incidences of sepsis with several machine learning models: logistic regression, support vector machines (SVM), naïve Bayes classifiers, decision trees, and random forests. The intent of the study was to identify possible predictive features for sepsis via comparative analysis of different machine learning models, and to recommend subsequent study of sepsis prediction by applying the trained model to new (non-clinical-trial-derived) data in the same format. If the models can be generalized to new data, it stands to reason that they could eventually become clinically useful. Judged by F1 and recall scores, the random forest classifier was the best performer among this cohort of models.
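
For reference, the recall and F1 scores used to rank the models can be computed from binary labels as in the sketch below. The function name `recall_and_f1` and the label vectors are hypothetical illustrations, not data or code from the study.

```python
def recall_and_f1(y_true, y_pred):
    """Return (recall, F1) for binary labels, where 1 marks a sepsis event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return recall, f1


# Made-up labels for eight hypothetical trials.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

recall, f1 = recall_and_f1(y_true, y_pred)
print(f"recall={recall:.2f}, F1={f1:.2f}")
```

Recall matters in this setting because a missed sepsis case (a false negative) is typically costlier than a false alarm, while F1 balances recall against precision.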