29 research outputs found

    Influence in Classification via Cooperative Game Theory

    A dataset has been classified by some unknown classifier into two types of points. What were the most important factors in determining the classification outcome? In this work, we employ an axiomatic approach in order to uniquely characterize an influence measure: a function that, given a set of classified points, outputs a value for each feature corresponding to its influence in determining the classification outcome. We show that our influence measure takes on an intuitive form when the unknown classifier is linear. Finally, we employ our influence measure in order to analyze the effects of user profiling on Google's online display advertising. Comment: accepted to IJCAI 201

    Public spending impact on short term growth : a machine learning approach

    The public spending multiplier has long been a subject of analysis, with discussion centered on how its size varies across economic contexts. The article that forms part of this dissertation introduces a causal machine learning technique as a tool to estimate the public spending multiplier and make individual predictions based on each country's economic context. We propose to model the multiplier with a causal random forest, developed by Wager and Athey (2018), to uncover possible heterogeneous treatment effects. We apply this methodology to a dataset provided by the International Monetary Fund, covering 35 developed countries for the years 2000 to 2020. The multiplier estimates obtained with this methodology are between 1.7 and 2.7. In addition, we use this methodology to uncover which features are important to the multiplier heterogeneity.

    Elucidating the Auxetic Behavior of Cementitious Cellular Composites Using Finite Element Analysis and Interpretable Machine Learning

    With the advent of 3D printing, auxetic cellular cementitious composites (ACCCs) have recently garnered significant attention owing to their unique mechanical performance. Interpretable machine learning (ML)-based approaches can provide an efficient means of predicting the performance of ACCCs. However, predicting Poisson's ratio with such ML approaches requires large and consistent datasets, which are not readily available for ACCCs. To address this challenge, this paper integrates a finite element analysis (FEA)-based framework with ML to predict the Poisson's ratios. In particular, the FEA-based approach is used to generate a dataset containing 850 combinations of different mesoscale architectural void features. The dataset is leveraged to develop an ML-based prediction tool using a feed-forward multilayer perceptron-based neural network (NN) approach, which shows excellent prediction efficacy. To shed light on the relative influence of the design parameters on the auxetic behavior of the ACCCs, Shapley additive explanations (SHAP) is employed, which establishes the volume fraction of voids as the most influential parameter in inducing auxetic behavior. Overall, this paper develops an efficient approach to evaluate geometry-dependent auxetic behaviors for cementitious materials, which can be used as a starting point toward the design and development of auxetic behavior in cementitious composites

    DU-Shapley: A Shapley Value Proxy for Efficient Dataset Valuation

    Many machine learning problems require performing dataset valuation, i.e., quantifying the incremental gain, with respect to some relevant pre-defined utility, of aggregating an individual dataset with others. As seminal examples, dataset valuation has been leveraged in collaborative and federated learning to create incentives for data sharing across several data owners. The Shapley value has recently been proposed as a principled tool to achieve this goal due to its formal axiomatic justification. Since its computation often requires exponential time, standard approximation strategies based on Monte Carlo integration have been considered. Such generic approximation methods, however, remain expensive in some cases. In this paper, we exploit knowledge about the structure of the dataset valuation problem to devise more efficient Shapley value estimators. We propose a novel approximation of the Shapley value, referred to as discrete uniform Shapley (DU-Shapley), which is expressed as an expectation under a discrete uniform distribution with support of reasonable size. We justify the relevance of the proposed framework via asymptotic and non-asymptotic theoretical guarantees and show that DU-Shapley tends towards the Shapley value when the number of data owners is large. The benefits of the proposed framework are finally illustrated on several dataset valuation benchmarks. DU-Shapley outperforms other Shapley value approximations, even when the number of data owners is small. Comment: 22 page
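
    To make concrete the cost trade-off this abstract mentions, the following is a minimal sketch of the exact Shapley value (exponential in the number of data owners) next to a generic Monte Carlo permutation estimate. It is not the DU-Shapley estimator, whose construction is not given in the abstract. The owner names, dataset sizes, and the square-root utility are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def exact_shapley(players, utility):
    """Exact Shapley values by enumerating every coalition (exponential time)."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for coalition in itertools.combinations(others, k):
                s = frozenset(coalition)
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


def monte_carlo_shapley(players, utility, num_permutations, seed=0):
    """Generic permutation-sampling estimate of the Shapley values."""
    rng = random.Random(seed)
    values = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = frozenset()
        previous = utility(coalition)
        for p in order:
            # Marginal gain of adding p to the owners that precede it.
            coalition = coalition | {p}
            current = utility(coalition)
            values[p] += current - previous
            previous = current
    return {p: v / num_permutations for p, v in values.items()}


# Hypothetical data owners and dataset sizes (assumptions, not from the paper).
SIZES = {"A": 100, "B": 400, "C": 900}


def toy_utility(coalition):
    """Assumed utility: concave in the total number of pooled data points."""
    return math.sqrt(sum(SIZES[p] for p in coalition))


exact = exact_shapley(list(SIZES), toy_utility)
approx = monte_carlo_shapley(list(SIZES), toy_utility, num_permutations=2000)
print(exact)
print(approx)
```

    Both functions distribute the utility of the full coalition across the owners; the sampled version trades exactness for a number of utility evaluations that is linear in the number of sampled permutations.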

    Elucidating the Constitutive Relationship of Calcium-Silicate-Hydrate Gel Using High Throughput Reactive Molecular Simulations and Machine Learning

    Prediction of material behavior using machine learning (ML) requires large, consistent, accurate, and representative datasets for training. However, such consistent and reliable experimental datasets are not always available for materials. To address this challenge, we integrate ML with high-throughput reactive molecular dynamics (MD) simulations to elucidate the constitutive relationship of calcium–silicate–hydrate (C–S–H) gel, the primary binding phase in concrete formed via the hydration of ordinary Portland cement. Specifically, a highly consistent dataset on the nine elastic constants of more than 300 compositions of C–S–H gel is developed using high-throughput reactive simulations. From a comparative analysis of various ML algorithms, including neural networks (NN) and Gaussian process (GP), we observe that NN provides excellent predictions. To interpret the predicted results from NN, we employ SHapley Additive exPlanations (SHAP), which reveals that the influence of the silicate network on all the elastic constants of C–S–H is significantly higher than that of water and CaO content. Additionally, the water content is found to have a more prominent influence on the shear components than on the normal components along the direction of the interlayer spaces within C–S–H. This result suggests that the in-plane elastic response is controlled by water molecules whereas the transverse response is mainly governed by the silicate network. Overall, by integrating MD simulations with ML, this paper can be used as a starting point toward accelerated optimization of C–S–H nanostructures to design efficient cementitious binders with targeted properties
