29 research outputs found

Feature Selection Based on the Shapley Value

Authors: Cohen S. B.; Dror G.; Ruppin E.
Publication date: 01/01/2005

Influence in Classification via Cooperative Game Theory

Authors: Datta Amit; Datta Anupam; Procaccia Ariel D.; Zick Yair
Publication date: 30/04/2015

A dataset has been classified by some unknown classifier into two types of points. What were the most important factors in determining the classification outcome? In this work, we employ an axiomatic approach in order to uniquely characterize an influence measure: a function that, given a set of classified points, outputs a value for each feature corresponding to its influence in determining the classification outcome. We show that our influence measure takes on an intuitive form when the unknown classifier is linear. Finally, we employ our influence measure in order to analyze the effects of user profiling on Google's online display advertising. Comment: accepted to IJCAI 201

arXiv.org e-Print Archive

CiteSeerX

Public spending impact on short term growth : a machine learning approach

Author: Santos Lucas Dierings Tanus dos
Publication date: 01/01/2021

The public spending multiplier has long been a subject of analysis, with discussion centered on how its size varies across economic contexts. The article that makes up this dissertation introduces a causal machine learning technique as a tool to estimate the public spending multiplier and make individual predictions based on each country's economic context. We propose to model the multiplier with a causal random forest, developed by Wager and Athey (2018), uncovering possible heterogeneous treatment effects. We apply this methodology to a dataset provided by the International Monetary Fund, including data from 35 developed countries for the years 2000 to 2020. The multiplier estimates obtained with this methodology are between 1.7 and 2.7. In addition, we use this methodology to uncover which features are important to the multiplier heterogeneity.

Lume 5.8

Elucidating the Auxetic Behavior of Cementitious Cellular Composites Using Finite Element Analysis and Interpretable Machine Learning

Authors: Das Sumanta; Donor Sami; Kelter Nora-Kristin; Krishnan N. M. Anoop; Lyngdoh Gideon A.
Publication venue: Digital Commons @ George Fox University
Publication date: 01/01/2022

With the advent of 3D printing, auxetic cellular cementitious composites (ACCCs) have recently garnered significant attention owing to their unique mechanical performance. To enable seamless performance prediction of the ACCCs, interpretable machine learning (ML)-based approaches can provide efficient means. However, the prediction of Poisson's ratio using such ML approaches requires large and consistent datasets, which are not readily available for ACCCs. To address this challenge, this paper synergistically integrates a finite element analysis (FEA)-based framework with ML to predict the Poisson's ratios. In particular, the FEA-based approach is used to generate a dataset containing 850 combinations of different mesoscale architectural void features. The dataset is leveraged to develop an ML-based prediction tool using a feed-forward multilayer perceptron-based neural network (NN) approach, which shows excellent prediction efficacy. To shed light on the relative influence of the design parameters on the auxetic behavior of the ACCCs, Shapley additive explanations (SHAP) is employed, which establishes the volume fraction of voids as the most influential parameter in inducing auxetic behavior. Overall, this paper develops an efficient approach to evaluate geometry-dependent auxetic behaviors for cementitious materials, which can be used as a starting point toward the design and development of auxetic behavior in cementitious composites.

Digital Commons @ George Fox University

DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation

Authors: Garrido-Lucero Felipe; Heymann Benjamin; Loiseau Patrick; Perchet Vianney; Vono Maxime
Publication date: 03/06/2023

Many machine learning problems require performing dataset valuation, i.e., quantifying the incremental gain, with respect to some relevant pre-defined utility, of aggregating an individual dataset with others. As seminal examples, dataset valuation has been leveraged in collaborative and federated learning to create incentives for data sharing across several data owners. The Shapley value has recently been proposed as a principled tool to achieve this goal due to its formal axiomatic justification. Since its computation often requires exponential time, standard approximation strategies based on Monte Carlo integration have been considered. Such generic approximation methods, however, remain expensive in some cases. In this paper, we exploit knowledge about the structure of the dataset valuation problem to devise more efficient Shapley value estimators. We propose a novel approximation of the Shapley value, referred to as discrete uniform Shapley (DU-Shapley), which is expressed as an expectation under a discrete uniform distribution with support of reasonable size. We justify the relevance of the proposed framework via asymptotic and non-asymptotic theoretical guarantees and show that DU-Shapley tends towards the Shapley value when the number of data owners is large. The benefits of the proposed framework are finally illustrated on several dataset valuation benchmarks. DU-Shapley outperforms other Shapley value approximations, even when the number of data owners is small. Comment: 22 page

arXiv.org e-Print Archive
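
The abstract above contrasts exact Shapley value computation, which takes exponential time, with generic Monte Carlo approximation. A minimal sketch of both follows, assuming a hypothetical toy setting that is not taken from the paper: three data owners with made-up dataset sizes (`SIZES`) and a made-up diminishing-returns utility (`toy_utility`). The Monte Carlo routine is the standard permutation-sampling estimator, not DU-Shapley itself.

```python
import itertools
import math
import random


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating every coalition (exponential time)."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = (math.factorial(k) * math.factorial(n - k - 1)
                      / math.factorial(n))
            for subset in itertools.combinations(others, k):
                s = frozenset(subset)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def monte_carlo_shapley(players, utility, num_permutations=2000, seed=0):
    """Permutation-sampling estimate: average marginal gain over random orders."""
    rng = random.Random(seed)
    values = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(num_permutations):
        rng.shuffle(order)
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            coalition = coalition | {p}
            current = utility(coalition)
            values[p] += current - previous
            previous = current
    return {p: v / num_permutations for p, v in values.items()}


# Hypothetical toy problem (assumption, not from the paper): each owner holds
# a dataset of the given size; utility grows with diminishing returns.
SIZES = {"A": 100, "B": 400, "C": 900}


def toy_utility(coalition):
    return math.sqrt(sum(SIZES[p] for p in coalition))


exact = exact_shapley(list(SIZES), toy_utility)
approx = monte_carlo_shapley(list(SIZES), toy_utility)

if __name__ == "__main__":
    for owner in SIZES:
        print(owner, round(exact[owner], 3), round(approx[owner], 3))
```

Both estimators satisfy the efficiency property: the values sum to the utility of the full coalition. The exact routine visits 2^(n-1) coalitions per owner, which is why approximations become necessary as the number of data owners grows.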

Elucidating the Constitutive Relationship of Calcium-Silicate-Hydrate Gel Using High Throughput Reactive Molecular Simulations and Machine Learning

Authors: Das Sumanta; Krishnan N. M. Anoop; Li Hewenxuan; Lyngdoh Gideon A.; Zaki Mohd
Publication venue: Digital Commons @ George Fox University
Publication date: 01/01/2020

Prediction of material behavior using machine learning (ML) requires large, consistent, accurate, and representative datasets for training. However, such consistent and reliable experimental datasets are not always available for materials. To address this challenge, we synergistically integrate ML with high-throughput reactive molecular dynamics (MD) simulations to elucidate the constitutive relationship of calcium–silicate–hydrate (C–S–H) gel, the primary binding phase in concrete formed via the hydration of ordinary Portland cement. Specifically, a highly consistent dataset on the nine elastic constants of more than 300 compositions of C–S–H gel is developed using high-throughput reactive simulations. From a comparative analysis of various ML algorithms including neural networks (NN) and Gaussian process (GP), we observe that NN provides excellent predictions. To interpret the predicted results from NN, we employ SHapley Additive exPlanations (SHAP), which reveals that the influence of the silicate network on all the elastic constants of C–S–H is significantly higher than that of water and CaO content. Additionally, the water content is found to have a more prominent influence on the shear components than the normal components along the direction of the interlayer spaces within C–S–H. This result suggests that the in-plane elastic response is controlled by water molecules whereas the transverse response is mainly governed by the silicate network. Overall, by seamlessly integrating MD simulations with ML, this paper can be used as a starting point toward accelerated optimization of C–S–H nanostructures to design efficient cementitious binders with targeted properties.

Digital Commons @ George Fox University

Elucidating the constitutive relationship of calcium–silicate–hydrate gel using high throughput reactive molecular simulations and machine learning

Authors: Das Sumanta; Krishnan N. M. Anoop; Li Hewenxuan; Lyngdoh Gideon A.; Zaki Mohd
Publication venue: DigitalCommons@URI
Publication date: 01/01/2020

Prediction of material behavior using machine learning (ML) requires large, consistent, accurate, and representative datasets for training. However, such consistent and reliable experimental datasets are not always available for materials. To address this challenge, we synergistically integrate ML with high-throughput reactive molecular dynamics (MD) simulations to elucidate the constitutive relationship of calcium–silicate–hydrate (C–S–H) gel, the primary binding phase in concrete formed via the hydration of ordinary Portland cement. Specifically, a highly consistent dataset on the nine elastic constants of more than 300 compositions of C–S–H gel is developed using high-throughput reactive simulations. From a comparative analysis of various ML algorithms including neural networks (NN) and Gaussian process (GP), we observe that NN provides excellent predictions. To interpret the predicted results from NN, we employ SHapley Additive exPlanations (SHAP), which reveals that the influence of the silicate network on all the elastic constants of C–S–H is significantly higher than that of water and CaO content. Additionally, the water content is found to have a more prominent influence on the shear components than the normal components along the direction of the interlayer spaces within C–S–H. This result suggests that the in-plane elastic response is controlled by water molecules whereas the transverse response is mainly governed by the silicate network. Overall, by seamlessly integrating MD simulations with ML, this paper can be used as a starting point toward accelerated optimization of C–S–H nanostructures to design efficient cementitious binders with targeted properties.

DigitalCommons@URI

Digital Commons @ George Fox University