189 research outputs found

    Axiomatic Attribution for Multilinear Functions

    We study the attribution problem, that is, the problem of attributing a change in the value of a characteristic function to its independent variables. We make three contributions. First, we propose a formalization of the problem based on a standard cost sharing model. Second, we show that there is a unique attribution method that satisfies Dummy, Additivity, Conditional Nonnegativity, Affine Scale Invariance, and Anonymity for all characteristic functions that are the sum of a multilinear function and an additive function. We term this the Aumann-Shapley-Shubik method. Conversely, we show that such a uniqueness result does not hold for characteristic functions outside this class. Third, we study multilinear characteristic functions in detail; we describe a computationally efficient implementation of the Aumann-Shapley-Shubik method and discuss practical applications to pay-per-click advertising and portfolio analysis. Comment: 21 pages, 2 figures, updated version for EC '1
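    As a concrete illustration of this style of attribution (a minimal sketch, not the paper's efficient implementation, and with an illustrative characteristic function), the change f(y) - f(x) can be split among the variables by averaging their marginal contributions over all orderings in which the variables are switched from their values in x to their values in y:

```python
from itertools import permutations

def shapley_shubik_attribution(f, x, y):
    """Attribute the change f(y) - f(x) to individual variables by averaging,
    over all orderings, the marginal change caused by switching each variable
    from its value in x to its value in y. Exponential cost; illustration only."""
    n = len(x)
    attrib = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        z = list(x)
        for i in order:
            before = f(z)
            z[i] = y[i]
            attrib[i] += f(z) - before
    return [a / len(orderings) for a in attrib]

# Illustrative multilinear characteristic function of three variables.
f = lambda z: z[0] * z[1] + 2 * z[1] * z[2]
print(shapley_shubik_attribution(f, [0, 0, 0], [1, 1, 1]))  # sums to f(y) - f(x) = 3
```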

    The Shapley Value in Machine Learning


    Model free variable importance for high dimensional data

    A model-agnostic variable importance method can be used with arbitrary prediction functions. Here we present some model-free methods that do not require access to the prediction function. This is useful when that function is proprietary and not available, or just extremely expensive. It is also useful when studying residuals from a model. The cohort Shapley (CS) method is model-free but has exponential cost in the dimension of the input space. A supervised on-manifold Shapley method from Frye et al. (2020) is also model-free but requires as input a second black-box model that has to be trained for the Shapley value problem. We introduce an integrated gradient (IG) version of cohort Shapley, called IGCS, with cost $\mathcal{O}(nd)$. We show that, over the vast majority of the relevant unit cube, the IGCS value function is close to a multilinear function for which IGCS matches CS. Another benefit of IGCS is that it allows IG methods to be used with binary predictors. We use some area between curves (ABC) measures to quantify the performance of IGCS. On a problem from high energy physics we verify that IGCS has nearly the same ABCs as CS does. We also use it on a problem from computational chemistry in 1024 variables. We see there that IGCS attains much higher ABCs than we get from Monte Carlo sampling. The code is publicly available at https://github.com/cohortshapley/cohortintgra
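    For readers unfamiliar with integrated gradients, here is a minimal, generic sketch of the IG attribution itself (not the paper's IGCS, and with an illustrative model and baseline), approximating the path integral with a midpoint Riemann sum:

```python
import numpy as np

def integrated_gradients(grad_f, x, baseline, steps=64):
    """Integrated gradients: IG_i = (x_i - baseline_i) times the average of
    df/dx_i along the straight line from the baseline to x, approximated here
    with a midpoint Riemann sum. grad_f(z) must return the model gradient at z."""
    x, baseline = np.asarray(x, float), np.asarray(baseline, float)
    alphas = (np.arange(steps) + 0.5) / steps          # midpoints in (0, 1)
    grads = np.stack([grad_f(baseline + a * (x - baseline)) for a in alphas])
    return (x - baseline) * grads.mean(axis=0)

# Illustrative quadratic model with a known gradient.
f = lambda z: z[0] * z[1] + z[2] ** 2
grad_f = lambda z: np.array([z[1], z[0], 2 * z[2]])
attrib = integrated_gradients(grad_f, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
print(attrib, attrib.sum())  # the attributions sum to f(x) - f(baseline) = 11
```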

    Computing Solution Operators of Boundary-value Problems for Some Linear Hyperbolic Systems of PDEs

    We discuss the possibility of applying numerical analysis methods to prove computability, in the sense of the TTE approach, of solution operators of boundary-value problems for systems of PDEs. We prove computability of the solution operator for a symmetric hyperbolic system with computable real coefficients and dissipative boundary conditions, and of the Cauchy problem for the same system (we also prove computable dependence on the coefficients) in a cube $Q \subseteq \mathbb{R}^m$. Such systems describe a wide variety of physical processes (e.g. elasticity, acoustics, Maxwell's equations). Moreover, many boundary-value problems for the wave equation can also be reduced to this case; thus we partially answer a question raised in Weihrauch and Zhong (2002). Compared with most other existing methods of proving computability for PDEs, this method does not require the existence of explicit solution formulas and is thus applicable to a broader class of (systems of) equations. Comment: 31 pages
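    As a small illustration of the kind of finite-difference machinery such arguments build on (a sketch only, not the paper's construction: it uses periodic rather than dissipative boundary conditions and a fixed illustrative system), here is a Lax-Friedrichs scheme for a 1-D symmetric hyperbolic system u_t + A u_x = 0, the wave equation in first-order form:

```python
import numpy as np

c = 1.0
A = np.array([[0.0, c], [c, 0.0]])       # symmetric coefficient matrix
nx, L = 400, 1.0
dx = L / nx
dt = 0.8 * dx / c                        # CFL condition: dt <= dx / max |eigenvalue of A|
x = np.linspace(0.0, L, nx, endpoint=False)
u = np.zeros((nx, 2))
u[:, 0] = np.exp(-200 * (x - 0.5) ** 2)  # initial bump in the first component

for _ in range(int(0.25 / dt)):
    up, um = np.roll(u, -1, axis=0), np.roll(u, 1, axis=0)
    # Lax-Friedrichs update: average of neighbours minus a centred flux difference.
    u = 0.5 * (up + um) - (dt / (2 * dx)) * (up - um) @ A.T

print("max |u| after t = 0.25:", np.abs(u).max())
```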

    Consistent and Truthful Interpretation with Fourier Analysis

    For many interdisciplinary fields, ML interpretations need to be consistent with what-if scenarios related to the current case, i.e., if one factor changes, how does the model react? Although attribution methods are supported by elegant axiomatic systems, they mainly focus on individual inputs and are generally inconsistent. To support what-if scenarios, we introduce a new notion called truthful interpretation, and we apply Fourier analysis of Boolean functions to obtain rigorous guarantees. Experimental results show that, for neighborhoods with various radii, our method achieves 2x to 50x lower interpretation error compared with other methods.
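    For a self-contained illustration of the underlying tool (a brute-force sketch of the Fourier expansion of a function on the Boolean cube, not the paper's interpretation method, and with an illustrative function f), the Fourier coefficients can be computed by enumeration:

```python
from itertools import product, combinations
import math

def fourier_coefficients(f, n):
    """Fourier coefficients of f : {-1, 1}^n -> R, i.e.
    f_hat(S) = E_x[ f(x) * prod_{i in S} x_i ] under the uniform distribution.
    Brute force over all 2^n points; illustration only."""
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            coeffs[S] = sum(f(x) * math.prod(x[i] for i in S) for x in points) / len(points)
    return coeffs

# Illustrative 3-variable function; only its nonzero coefficients appear with
# nonzero values in the returned dictionary.
f = lambda x: 0.5 * x[0] + 0.25 * x[0] * x[1] - x[2]
print(fourier_coefficients(f, 3))
```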

    Root Cause Analysis Using Graph Representation of Constraints

    When several quantitative variables are related through constraints and objectives, it can be difficult to understand why a certain quantity changes in magnitude after certain changes, or why a certain number seems larger or smaller than expected (compared to a reference value). Large organizations that seek to optimize very large numbers of parameters to satisfy constraints such as supply-demand matching face such problems, where the relationships between variables are controlled by a mixture of human processes and software algorithms. This disclosure describes scalable, flexible frameworks and searching techniques that improve transparency, e.g., enable root-cause analysis, for large families of variables. The techniques enable understanding of the difference between two data-generating flows (versions) with comparable inputs, intermediate variables, and output variables.
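    A minimal sketch of the general idea (assumed for illustration, not the disclosure's implementation; the variable names and graph are made up): represent each variable as a node whose parents are the variables it is computed from, then walk upstream from a changed output, keeping only branches whose values differ between the two versions:

```python
from collections import deque

# Hypothetical dependency graph: variable -> the variables it is computed from.
edges = {
    "output": ["intermediate_a", "intermediate_b"],
    "intermediate_a": ["input_1", "input_2"],
    "intermediate_b": ["input_3"],
}
version_1 = {"output": 12, "intermediate_a": 7, "intermediate_b": 5,
             "input_1": 3, "input_2": 4, "input_3": 5}
version_2 = {"output": 15, "intermediate_a": 10, "intermediate_b": 5,
             "input_1": 6, "input_2": 4, "input_3": 5}

def root_causes(target):
    """Breadth-first walk over the ancestors of `target`, reporting the leaf
    variables whose values changed between the two versions."""
    changed_leaves, queue, seen = [], deque([target]), set()
    while queue:
        node = queue.popleft()
        if node in seen or version_1[node] == version_2[node]:
            continue                      # unchanged branches cannot explain the delta
        seen.add(node)
        parents = edges.get(node, [])
        if not parents:                   # a changed leaf is a candidate root cause
            changed_leaves.append(node)
        queue.extend(parents)
    return changed_leaves

print(root_causes("output"))  # -> ['input_1']
```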

    The consumer’s demand functions defined to study contingent consumption plans. Summarized probability distributions: a mathematical application to contingent consumption choices

    Given two probability distributions expressing returns on two single risky assets of a portfolio, we innovatively define two consumer’s demand functions connected with two contingent consumption plans. This is possible whenever we coherently summarize every probability distribution chosen by the consumer. Since prevision choices are consumption choices made by the consumer inside a metric space, we show that prevision choices can be studied by means of the standard economic model of consumer behavior. Such a model implies that we consider all coherent previsions of a joint distribution; they are decomposed inside a metric space that coincides with the consumer’s consumption space. In this paper, we do not consider a joint distribution only; it follows that we innovatively define a stand-alone, double risky asset. Different summary measures of it, characterizing the consumption choices made by the consumer, can then be studied inside a linear space over ℝ. We show that it is possible to obtain different summary measures of probability distributions by using two different quadratic metrics. Our results are based on a particular approach to the origin of the variability of probability distributions: we realize that it is not standardized, but always depends on the state of information and knowledge of the consumer.
