189 research outputs found
Axiomatic Attribution for Multilinear Functions
We study the attribution problem, that is, the problem of attributing a
change in the value of a characteristic function to its independent variables.
We make three contributions. First, we propose a formalization of the problem
based on a standard cost sharing model. Second, we show that there is a unique
attribution method that satisfies Dummy, Additivity, Conditional Nonnegativity,
Affine Scale Invariance, and Anonymity for all characteristic functions that
are the sum of a multilinear function and an additive function. We term this
the Aumann-Shapley-Shubik method. Conversely, we show that such a uniqueness
result does not hold for characteristic functions outside this class. Third, we
study multilinear characteristic functions in detail; we describe a
computationally efficient implementation of the Aumann-Shapley-Shubik method
and discuss practical applications to pay-per-click advertising and portfolio
analysis.

Comment: 21 pages, 2 figures, updated version for EC '1
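As a hedged illustration of the cost-sharing setting above (a minimal sketch of the classical Shapley value by coalition enumeration, not the paper's Aumann-Shapley-Shubik implementation; the function `f` and the points `x0`, `x1` are illustrative), a change f(x1) - f(x0) can be attributed to the coordinates as follows:

```python
from itertools import combinations
from math import factorial

def shapley_attributions(f, x0, x1):
    """Attribute the change f(x1) - f(x0) to each coordinate via Shapley values.

    For a coalition S, the characteristic value is f evaluated with the
    coordinates in S taken from x1 and the remaining coordinates from x0.
    """
    n = len(x0)
    players = range(n)

    def v(S):
        x = [x1[i] if i in S else x0[i] for i in players]
        return f(x)

    phi = [0.0] * n
    for i in players:
        others = [j for j in players if j != i]
        for k in range(n):
            for S in combinations(others, k):
                # Weight of a coalition of size k in the Shapley formula.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

# Illustrative multilinear-plus-additive function: f(x) = x[0]*x[1] + x[2].
f = lambda x: x[0] * x[1] + x[2]
attr = shapley_attributions(f, [0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

By the efficiency property, the attributions sum to the total change f(x1) - f(x0); the enumeration is exponential in the number of variables, which is the cost the paper's efficient implementation avoids for multilinear functions.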
Model free variable importance for high dimensional data
A model-agnostic variable importance method can be used with arbitrary
prediction functions. Here we present some model-free methods that do not
require access to the prediction function. This is useful when that function is
proprietary and not available, or just extremely expensive. It is also useful
when studying residuals from a model. The cohort Shapley (CS) method is
model-free but has exponential cost in the dimension of the input space. A
supervised on-manifold Shapley method from Frye et al. (2020) is also
model-free but requires as input a second black box model that has to be trained
for the Shapley value problem. We introduce an integrated gradient (IG) version
of cohort Shapley, called IGCS, with cost . We show that over the
vast majority of the relevant unit cube the IGCS value function is close
to a multilinear function for which IGCS matches CS. Another benefit of IGCS is
that it allows IG methods to be used with binary predictors. We use some area
between curves (ABC) measures to quantify the performance of IGCS. On a problem
from high energy physics we verify that IGCS has nearly the same ABCs as CS
does. We also use it on a problem from computational chemistry in 1024
variables. We see there that IGCS attains much higher ABCs than we get from
Monte Carlo sampling. The code is publicly available at
https://github.com/cohortshapley/cohortintgra
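For background (a hedged sketch of standard integrated gradients, not the code in the linked repository; the quadratic `f` and its gradient are illustrative stand-ins), IG attributes f(x) - f(x') along the straight path from a baseline x' to x via a Riemann sum:

```python
import numpy as np

def integrated_gradients(f, grad_f, x, baseline, steps=100):
    """Midpoint-rule approximation of
    IG_i = (x_i - x'_i) * integral_0^1 df/dx_i(x' + a*(x - x')) da."""
    alphas = (np.arange(steps) + 0.5) / steps  # midpoints of [0, 1]
    total = np.zeros_like(x)
    for a in alphas:
        total += grad_f(baseline + a * (x - baseline))
    return (x - baseline) * total / steps

# Illustrative function f(x) = x_0^2 + 3*x_1 with its analytic gradient.
f = lambda x: x[0] ** 2 + 3 * x[1]
grad_f = lambda x: np.array([2 * x[0], 3.0])
x = np.array([1.0, 2.0])
baseline = np.zeros(2)
ig = integrated_gradients(f, grad_f, x, baseline)
# Completeness: the attributions sum to f(x) - f(baseline).
```

Completeness is the IG analogue of Shapley efficiency: here the attributions sum to f(x) - f(baseline) = 7, with the path integral exact because the integrand is linear in each coordinate.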
Computing Solution Operators of Boundary-value Problems for Some Linear Hyperbolic Systems of PDEs
We discuss possibilities of application of Numerical Analysis methods to
proving computability, in the sense of the TTE approach, of solution operators
of boundary-value problems for systems of PDEs. We prove computability of the
solution operator for a symmetric hyperbolic system with computable real
coefficients and dissipative boundary conditions, and of the Cauchy problem for
the same system (we also prove computable dependence on the coefficients) in a
cube . Such systems describe a wide variety of physical
processes (e.g. elasticity, acoustics, Maxwell equations). Moreover, many
boundary-value problems for the wave equation can also be reduced to this case;
thus we partially answer a question raised in Weihrauch and Zhong (2002).
Compared with most other existing methods of proving computability for PDEs,
this method does not require existence of explicit solution formulas and is
thus applicable to a broader class of (systems of) equations.

Comment: 31 pages
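For background (the standard textbook form, with notation that is an assumption here rather than the paper's own), a first-order symmetric hyperbolic system in $m$ space variables reads:

```latex
A(t,x)\,\frac{\partial u}{\partial t}
  + \sum_{i=1}^{m} B_i(t,x)\,\frac{\partial u}{\partial x_i}
  + C(t,x)\,u = f(t,x),
```

where $A$ and each $B_i$ are symmetric matrices and $A$ is positive definite. The wave equation, the equations of acoustics and linear elasticity, and Maxwell's equations can all be cast in this form, which is why computability results for this class cover the physical examples mentioned above.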
Consistent and Truthful Interpretation with Fourier Analysis
For many interdisciplinary fields, ML interpretations need to be consistent
with what-if scenarios related to the current case, i.e., if one factor
changes, how does the model react? Although attribution methods are
supported by elegant axiomatic systems, they mainly focus on individual
inputs and are generally inconsistent. To support what-if scenarios, we
introduce a new notion called truthful interpretation, and apply Fourier
analysis of Boolean functions to obtain rigorous guarantees. Experimental
results show that for neighborhoods with various radii, our method achieves
2x - 50x lower interpretation error compared with other methods.
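As a hedged aside (standard background on Fourier analysis of Boolean functions, not the paper's algorithm; the majority function is an illustrative example), the Fourier coefficient of f on a subset S of coordinates is the average of f(x) times the product of the coordinates in S, computable by enumeration for small n:

```python
from itertools import product, combinations

def prod_subset(x, S):
    """Product of the coordinates of x indexed by S (the character chi_S)."""
    p = 1
    for i in S:
        p *= x[i]
    return p

def fourier_coefficients(f, n):
    """Return {S: f_hat(S)} for f defined on {-1, +1}^n, where
    f_hat(S) = average over all x of f(x) * prod_{i in S} x[i]."""
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for k in range(n + 1):
        for S in combinations(range(n), k):
            total = sum(f(x) * prod_subset(x, S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs

# Majority on 3 bits: all Fourier weight sits on odd-degree coefficients.
maj = lambda x: 1 if sum(x) > 0 else -1
c = fourier_coefficients(maj, 3)
```

For 3-bit majority the expansion is (1/2)(x1 + x2 + x3) - (1/2)x1x2x3, so each singleton coefficient is 1/2 and the top coefficient is -1/2; low-degree concentration of this kind is what makes Fourier-based consistency guarantees over neighborhoods tractable.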
Root Cause Analysis Using Graph Representation of Constraints
When several quantitative variables are related through constraints and objectives, it can be difficult to understand why a quantity changes in magnitude after certain events, or why a number seems larger or smaller than expected (as compared to a reference value). Large organizations that seek to optimize very large numbers of parameters to satisfy constraints such as supply-demand matching face such problems, where the relationships between variables are controlled by a mixture of human processes and software algorithms. This disclosure describes scalable, flexible frameworks and searching techniques that improve transparency, e.g., enable root-cause analysis, for large families of variables. The techniques enable the understanding of the difference between two data-generating flows (versions) with comparable inputs, intermediate variables, and output variables.
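As a hedged illustration of the general idea (the disclosure does not publish code; the variable names and flow below are hypothetical), a dependency graph over the variables can be walked backward from an output, pruning branches whose value is unchanged between the two versions, so that only the changed inputs survive as candidate root causes:

```python
def root_causes(graph, values_a, values_b, output, tol=1e-9):
    """Walk the dependency graph backward from `output`, keeping only
    branches whose value actually differs between versions a and b.
    `graph` maps each variable to the variables it depends on;
    inputs (leaves) map to an empty list."""
    causes = []

    def visit(var):
        if abs(values_a[var] - values_b[var]) <= tol:
            return  # an unchanged subtree cannot explain the difference
        deps = graph.get(var, [])
        if not deps:
            causes.append(var)  # a changed input: a candidate root cause
            return
        for d in deps:
            visit(d)

    visit(output)
    return causes

# Hypothetical flow: total = supply - demand; supply depends on inputs s1, s2.
graph = {"total": ["supply", "demand"], "supply": ["s1", "s2"], "demand": []}
values_a = {"s1": 10, "s2": 5, "supply": 15, "demand": 12, "total": 3}
values_b = {"s1": 10, "s2": 8, "supply": 18, "demand": 12, "total": 6}
rc = root_causes(graph, values_a, values_b, "total")
# Only s2 changed between the versions, so it is the reported root cause.
```

The pruning step is what makes such a search scale: whole subgraphs whose values agree across the two versions are never expanded.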
The consumer's demand functions defined to study contingent consumption plans. Summarized probability distributions: a mathematical application to contingent consumption choices
Given two probability distributions expressing returns on two single risky assets of a portfolio, we innovatively define two consumer's demand functions connected with two
contingent consumption plans. This is possible whenever we coherently summarize every probability distribution chosen by the consumer. Since prevision choices are consumption choices made by the consumer inside a metric space, we show that prevision choices can be studied by means of the standard economic model of consumer behavior. Such a model implies that we consider all coherent previsions of a joint distribution. They are decomposed inside a metric space that coincides with the consumer's consumption space. In this paper, we do not consider a joint distribution only. It follows that we innovatively define a stand-alone, doubly risky asset. Different summary measures of it characterizing the consumption choices made by the consumer can then be studied inside a linear space over ℝ. We show that it is possible to obtain different summary measures of probability distributions by using two different quadratic metrics.
In this paper, our results are based on a particular approach to the origin of the variability of probability distributions. We realize that it is not standardized, but always depends on the state of information and knowledge of the consumer.