Graph Construction for Manifold Discovery

Author: Carey CJ
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/11/2017

Manifold learning is a class of machine learning methods that exploits the observation that high-dimensional data tend to lie on a smooth lower-dimensional manifold. Manifold discovery is the essential first component of manifold learning methods, in which the manifold structure is inferred from available data. This task is typically posed as a graph construction problem: selecting a set of vertices and edges that most closely approximates the true underlying manifold. The quality of this learned graph is critical to the overall accuracy of the manifold learning method. Thus, it is essential to develop accurate, efficient, and reliable algorithms for constructing manifold approximation graphs. To aid in this investigation of graph construction methods, we propose new methods for evaluating graph quality. These quality measures act as a proxy for ground-truth manifold approximation error and are applicable even when prior information about the dataset is limited. We then develop an incremental update scheme for some quality measures, demonstrating their usefulness for efficient parameter tuning. Next, we propose two novel methods for graph construction, the Manifold Spanning Graph and the Mutual Neighbors Graph algorithms. Each method leverages assumptions about the structure of both the input data and the subsequent manifold learning task. The algorithms are experimentally validated against state-of-the-art graph construction techniques on a multi-disciplinary set of application domains, including image classification, directional audio prediction, and spectroscopic analysis. The final contribution of the thesis is a method for aligning sequential datasets while still respecting each set’s internal manifold structure. The use of high-quality manifold approximation graphs enables accurate alignments with few ground-truth correspondences.

ScholarWorks@UMass Amherst
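
The abstract above frames manifold discovery as choosing vertices and edges for a graph. As a rough illustration of that framing, the sketch below builds a plain mutual k-nearest-neighbor graph, a common baseline in which an edge is kept only when each endpoint lists the other among its k nearest points. This is not the thesis's Mutual Neighbors Graph or Manifold Spanning Graph algorithm, whose details the abstract does not give; the function names and the toy points are assumptions made for the example.

```python
import math


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest other points."""
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by Euclidean distance (ties broken by index).
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append({j for _, j in dists[:k]})
    return neighbors


def mutual_knn_graph(points, k):
    """Return undirected edges (i, j), i < j, where i and j are mutual k-NN."""
    nbrs = knn_indices(points, k)
    edges = set()
    for i, ns in enumerate(nbrs):
        for j in ns:
            if i in nbrs[j]:  # keep the edge only if the relation is mutual
                edges.add((min(i, j), max(i, j)))
    return edges


# Hypothetical toy data: three nearby points on a line plus one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
edges = mutual_knn_graph(points, k=2)
```

With these toy points and k=2, the distant point at x=10 ends up with no edges, because none of the other points counts it among its two nearest neighbors. The neighborhood size k is exactly the kind of parameter the abstract's graph quality measures are meant to help tune.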

Design of New Dispersants Using Machine Learning and Visual Analytics

Author: Bernabei Marco; Campillo Nuria E.; Franco Mario; Gómez Arrayas Ramón Jesús; Kim-Lee Shin Ho; Lozano Ordoñez Héctor; Martínez María Jimena; Mauleón Pérez Pablo; Naveiro Roi; Ponzoni Ignacio; Revilla López Guillermo; Soto Axel J.; Talavante Pablo
Publication venue: MDPI AG
Publication date: 01/03/2023

Artificial intelligence (AI) is an emerging technology that is revolutionizing the discovery of new materials. One key application of AI is virtual screening of chemical libraries, which enables the accelerated discovery of materials with desired properties. In this study, we developed computational models to predict the dispersancy efficiency of oil and lubricant additives, a critical property in their design that can be estimated through a quantity named blotter spot. We propose a comprehensive approach that combines machine learning techniques with visual analytics strategies in an interactive tool that supports domain experts’ decision-making. We evaluated the proposed models quantitatively and illustrated their benefits through a case study. Specifically, we analyzed a series of virtual polyisobutylene succinimide (PIBSI) molecules derived from a known reference substrate. Our best-performing probabilistic model was Bayesian Additive Regression Trees (BART), which achieved a mean absolute error of (Formula presented.) and a root mean square error of (Formula presented.), as estimated through 5-fold cross-validation. To facilitate future research, we have made the dataset, including the potential dispersants used for modeling, publicly available. Our approach can help accelerate the discovery of new oil and lubricant additives, and our interactive tool can aid domain experts in making informed decisions based on blotter spot and other key properties.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

Digital.CSIC

Biblos-e Archivo
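
The dispersants abstract above reports a mean absolute error and a root mean square error estimated through 5-fold cross-validation. The sketch below shows how those two metrics can be pooled over held-out folds. It is only an illustration: a trivial predict-the-training-mean model stands in for BART, the folds are contiguous rather than shuffled, and the function names and target values are made up for the example.

```python
import math


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for f in range(k):
        size = n // k + (1 if f < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate_mean_model(y, k=5):
    """Return (MAE, RMSE) pooled over all held-out predictions.

    Placeholder model (an assumption): predict the mean of the training
    targets. A real study would fit a regressor on the training fold here.
    """
    abs_errs, sq_errs = [], []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train = [v for i, v in enumerate(y) if i not in held_out]
        prediction = sum(train) / len(train)
        for i in test_idx:
            err = y[i] - prediction
            abs_errs.append(abs(err))
            sq_errs.append(err * err)
    mae = sum(abs_errs) / len(abs_errs)
    rmse = math.sqrt(sum(sq_errs) / len(sq_errs))
    return mae, rmse


# Hypothetical target values, not taken from the published dataset.
y = [62.0, 55.5, 71.2, 48.9, 66.3, 59.1, 73.4, 51.0, 64.8, 57.7]
mae, rmse = cross_validate_mean_model(y, k=5)
```

RMSE is never smaller than MAE on the same errors, so a large gap between the two suggests that a few held-out points are predicted much worse than the rest.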

A very simple safe-Bayesian random forest

Author: Ghahramani Zoubin; Quadrianto Novi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/06/2015

Random forests work by averaging several predictions of de-correlated trees. We show a conceptually radical approach to generating a random forest: randomly sampling many trees from a prior distribution, and subsequently performing a weighted ensemble of predictive probabilities. Our approach uses priors that allow sampling of decision trees even before looking at the data, and a power likelihood that explores the space spanned by combinations of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore, strictly speaking, not Bayesian. Nonetheless, we refer to it as a Bayesian random forest, but with built-in safety. The safety comes from its good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC- or SMC-based Bayesian decision trees in terms of speed and accuracy, and achieves performance competitive with entropy- or Gini-optimised random forests, yet is very simple to construct.

Crossref

Sussex Research Online
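
The random forest abstract above describes aggregating trees through a weighted ensemble of predictive probabilities, with weights driven by a power likelihood rather than the ordinary likelihood. The sketch below shows one way such a weighting can look. It assumes each sampled tree gets a weight proportional to its training likelihood raised to a power eta, with a uniform prior over the sampled trees. The paper's exact scheme may differ, and the function names, eta values, and numbers here are illustrative only.

```python
import math


def power_likelihood_weights(log_liks, eta):
    """Normalized weights proportional to exp(eta * log-likelihood).

    eta = 1 recovers ordinary likelihood weighting; smaller eta flattens
    the weights towards a uniform average over trees.
    """
    scaled = [eta * ll for ll in log_liks]
    shift = max(scaled)  # subtract the max for numerical stability
    unnorm = [math.exp(s - shift) for s in scaled]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def ensemble_predict(class_probs, weights):
    """Weighted average of per-tree class-probability vectors."""
    n_classes = len(class_probs[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, class_probs))
        for c in range(n_classes)
    ]


# Hypothetical numbers: two trees, the second four times less likely.
log_liks = [0.0, math.log(0.25)]
tree_probs = [[0.9, 0.1], [0.2, 0.8]]

weights = power_likelihood_weights(log_liks, eta=0.5)
prediction = ensemble_predict(tree_probs, weights)
```

Lowering eta tempers how strongly the ensemble commits to the best-fitting trees. This gives a rough intuition for the safety the abstract mentions when the underlying probabilistic model is wrong.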