1,132 research outputs found
A Study of Boolean Matrix Factorization Under Supervised Settings
International audienceBoolean matrix factorization is a generally accepted approach used in data analysis to explain data. It is commonly used under unsu-pervised setting or for data preprocessing under supervised settings. In this paper we study factors under supervised settings. We provide an experimental proof that factors are able to explain not only data as a whole but also classes in the data
MLI: An API for Distributed Machine Learning
MLI is an Application Programming Interface designed to address the
challenges of building Machine Learn- ing algorithms in a distributed setting
based on data-centric computing. Its primary goal is to simplify the
development of high-performance, scalable, distributed algorithms. Our initial
results show that, relative to existing systems, this interface can be used to
build distributed implementations of a wide variety of common Machine Learning
algorithms with minimal complexity and highly competitive performance and
scalability
Nonexistence Certificates for Ovals in a Projective Plane of Order Ten
In 1983, a computer search was performed for ovals in a projective plane of
order ten. The search was exhaustive and negative, implying that such ovals do
not exist. However, no nonexistence certificates were produced by this search,
and to the best of our knowledge the search has never been independently
verified. In this paper, we rerun the search for ovals in a projective plane of
order ten and produce a collection of nonexistence certificates that, when
taken together, imply that such ovals do not exist. Our search program uses the
cube-and-conquer paradigm from the field of satisfiability (SAT) checking,
coupled with a programmatic SAT solver and the nauty symbolic computation
library for removing symmetries from the search.Comment: Appears in the Proceedings of the 31st International Workshop on
Combinatorial Algorithms (IWOCA 2020
Machine-Learning-based Prediction of Sepsis Events from Vertical Clinical Trial Data: a Naïve Approach
Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection by which the afflicted body attacks its own tissues, sometimes to the point of organ failure, and in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC) Sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the current accepted diagnostic framework of the disease.
This study attempted to first establish an understanding of past definitions of sepsis, and to then recommend use of machine learning as integral in an eventual amended disease definition. Longitudinal clinical trial data (ntrials=30,915) were vectorized into a machine-readable format compatible with predictive modeling, selected and reduced in dimension, and used to predict incidences of sepsis via application of several machine learning models: logistic regression, support vector machines (SVM), naïve Bayes Classifier, decision trees, and random forests. The intent of the study was to identify possible predictive features for sepsis via comparative analysis of different machine learning models, and to recommend subsequent study of sepsis prediction using the training model on new data (non-clinical-trial-derived) in the same format. If the models can be generalized to new data, it stands to assume they could eventually become clinically useful. In referencing F1 scores and recall scores, the random forest classifier was the best performer among this cohort of models
- …