Rational Shapley Values
Explaining the predictions of opaque machine learning algorithms is an
important and challenging task, especially as complex models are increasingly
used to assist in high-stakes decisions such as those arising in healthcare and
finance. Most popular tools for post-hoc explainable artificial intelligence
(XAI) are either insensitive to context (e.g., feature attributions) or
difficult to summarize (e.g., counterfactuals). In this paper, I introduce
\emph{rational Shapley values}, a novel XAI method that synthesizes and extends
these seemingly incompatible approaches in a rigorous, flexible manner. I
leverage tools from decision theory and causal modeling to formalize and
implement a pragmatic approach that resolves a number of known challenges in
XAI. By pairing the distribution of random variables with the appropriate
reference class for a given explanation task, I illustrate through theory and
experiments how user goals and knowledge can inform and constrain the solution
set in an iterative fashion. The method compares favorably to state-of-the-art
XAI tools in a range of quantitative and qualitative comparisons.
Comment: 20 pages, 3 figures, 7 tables
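As a rough illustration of the reference-class idea (and not the paper's full decision-theoretic procedure), the sketch below computes Shapley-style attributions for the same query point against two different background samples, using the third-party shap and scikit-learn packages on synthetic data; the subpopulation defined by X[:, 2] > 0 is a made-up stand-in for a user-chosen reference class.

```python
# Illustrative sketch only: attributions against a marginal background versus a
# hypothetical reference class. Synthetic data; shap's KernelExplainer stands in
# for the paper's own procedure.
import numpy as np
import shap
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

x_query = X[0]  # instance to be explained

def f(d):
    # Model output being attributed: probability of the positive class.
    return model.predict_proba(d)[:, 1]

backgrounds = {
    "marginal (full sample)": X,
    "reference class (X[:, 2] > 0)": X[X[:, 2] > 0],
}
for name, ref in backgrounds.items():
    explainer = shap.KernelExplainer(f, shap.sample(ref, 100))
    phi = explainer.shap_values(x_query, nsamples=200)
    print(f"{name}: {np.round(phi, 3)}")
```

Restricting the background sample to a decision-relevant subgroup typically shifts the attributions, which is the lever the abstract describes for letting user goals and knowledge constrain the solution set.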
Conceptual challenges for interpretable machine learning
As machine learning has gradually entered into ever more sectors of public and private life, there has been a growing demand for algorithmic explainability. How can we make the predictions of complex statistical models more intelligible to end users? A subdiscipline of computer science known as interpretable machine learning (IML) has emerged to address this urgent question. Numerous influential methods have been proposed, from local linear approximations to rule lists and counterfactuals. In this article, I highlight three conceptual challenges that are largely overlooked by authors in this area. I argue that the vast majority of IML algorithms are plagued by (1) ambiguity with respect to their true target; (2) a disregard for error rates and severe testing; and (3) an emphasis on product over process. Each point is developed at length, drawing on relevant debates in epistemology and philosophy of science. Examples and counterexamples from IML are considered, demonstrating how failure to acknowledge these problems can result in counterintuitive and potentially misleading explanations. Without greater care for the conceptual foundations of IML, future work in this area is doomed to repeat the same mistakes.
Organic geochemistry of the Boltysh impact crater, Ukraine
The Boltysh crater has been known for several decades and was originally drilled in the 1960s–1980s in a study of economic oil shale deposits. Unfortunately, the cores were not curated and have been lost. However, we have recently re-drilled the impact crater and recovered a near-continuous record of ~400 m of organic-rich sediments overlying the basement rocks, deposited in a deep, isolated lake over a period of ~10 Ma. The Boltysh impact crater, centred at 48°54′N and 32°15′E, is a complex impact structure formed on the basement rocks of the Ukrainian shield. The age of the impact is 65.17±0.64 Ma [1]. At 24 km in diameter, the impact is unlikely to have contributed substantially to the worldwide devastation at the end of the Cretaceous.
However, the precise age of the Boltysh impact relative to the Chicxulub impact, and its location on a stable, low-lying coastal plain that allowed formation of the post-impact crater lake, make it a particularly important locality. After the impact, the crater quickly filled with water, and the crater lake received sediment input from the surrounding land surface for a period of >10 Ma [2]. These strata contain a valuable record of Paleogene environmental change in central Europe, and one of very few terrestrial records of the K–T event. This preeminent record of the Paleogene of central Europe can help us to answer several related scientific questions.
What is the relative age of Boltysh compared with Chicxulub? How long did the hydrothermal system remain active after the impact event? How did the devastated area surrounding the crater recover, and how rapid was the recovery? The first sediments to be deposited in the crater lake were a series of relatively thin turbidites; the sediments then become organic-rich shales and oil shales. Within the core there is ~400 m of organic-rich shales/oil shales spanning a period of ~10 Ma, some of which contain macrofossils such as ostracods, fish and plant fossils. Preliminary palynological studies suggest initial sedimentation was slow after the impact, followed by more rapid sedimentation through the Late Paleocene. Hydrocarbons extracted from these samples are commonly dominated by terrestrial n-alkanes (Fig. 1). Hopanes (including 3-methylhopanes) and steranes are also abundant and indicate the immaturity of the samples. The immaturity of the samples is also evident from the abundance of hopenes, sterenes and oleanenes, especially in the upper section of the core. In some of the oil shales, the hopenes and sterenes are the most abundant hydrocarbons present. There is variation in the distribution of hydrocarbons/biomarkers and palynology throughout the core, caused by changing inputs and environmental conditions.
Potential of short wavelength laser ablation of organic materials
Although the literature contains several articles on UV laser ablation of synthetic polymers [1] and human tissue for surgical applications, to our knowledge there is no published record of organic geochemical applications of UV laser ablation–gas chromatography–mass spectrometry (LA-GC-MS). In this study we have demonstrated the use of a 213 nm UV laser beam for ablating kerogens and organic-rich rocks to liberate and analyse hydrocarbon signatures, and compared the results against IR laser pyrolysis and traditional Py-GC-MS. Laser wavelength can be converted to photon energy: 1064 nm (IR) corresponds to 1.2 eV and 213 nm (UV) to 5.8 eV. Most chemical bonds have energies between 2 and 4 eV, and C–C bonds are ~3.6 eV. Organic materials can absorb radiation from a UV laser, and chemical bonds can be cleaved cleanly by a single photon via complex photochemical pathways [2]. Ablation occurs with almost no heating of the sample, hence the term laser ablation instead of pyrolysis. Visible or IR lasers have insufficient energy to break bonds with a single photon; instead, the sample is heated by absorption of energy into the vibrational modes of the molecule, which can then result in pyrolysis. A solvent-extracted kerogen consisting mainly of higher plant material (Brownie Butte, Montana, ~70 Ma) was used for initial experiments. A number of other samples have also been analysed. Laser ablation work was performed off-line in a static helium cell, followed by solvent extraction of the laser cell. Separate analysis of the same samples using a more traditional flash pyrolysis approach was performed with a CDS pyroprobe and IR laser pyrolysis [3] for comparative purposes. As can be seen in Fig. 1, UV laser ablation is able to liberate relatively high-molecular-weight fragments with no alkenes or other pyrolysis artefacts detected. SEM images of ablation pits indicate there is no obvious thermal alteration of the sample. The results of the pyrolysis techniques (on-line and IR laser pyrolysis) are similar and display a number of artefacts related to the pyrolysis process. Laser ablation of a number of samples has also shown that the distributions of biomarkers are comparable with those in the solvent extracts. Product yields, although not quantified, appear to be much higher than with traditional pyrolysis techniques.
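The photon-energy figures quoted above follow directly from the standard photon-energy relation, with hc ≈ 1240 eV·nm:

```latex
E_{\text{photon}} = \frac{hc}{\lambda} \approx \frac{1240\ \text{eV nm}}{\lambda}
\;\Rightarrow\;
E_{1064\,\text{nm}} \approx 1.17\ \text{eV}, \qquad
E_{213\,\text{nm}} \approx 5.82\ \text{eV}.
```

Only the 213 nm photon therefore exceeds typical bond energies of 2–4 eV (C–C ≈ 3.6 eV), which is why single-photon bond cleavage is available in the UV but not at 1064 nm.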
Is UV laser ablation a suitable tool for geochemical analysis of organic rich source materials?
Abstract not available
Testing Conditional Independence in Supervised Learning Algorithms
We propose the conditional predictive impact (CPI), a consistent and unbiased
estimator of the association between one or several features and a given
outcome, conditional on a reduced feature set. Building on the knockoff
framework of Cand\`es et al. (2018), we develop a novel testing procedure that
works in conjunction with any valid knockoff sampler, supervised learning
algorithm, and loss function. The CPI can be efficiently computed for
high-dimensional data without any sparsity constraints. We demonstrate
convergence criteria for the CPI and develop statistical inference procedures
for evaluating its magnitude, significance, and precision. These tests aid in
feature and model selection, extending traditional frequentist and Bayesian
techniques to general supervised learning tasks. The CPI may also be applied in
causal discovery to identify underlying multivariate graph structures. We test
our method using various algorithms, including linear regression, neural
networks, random forests, and support vector machines. Empirical results show
that the CPI compares favorably to alternative variable importance measures and
other nonparametric tests of conditional independence on a diverse array of
real and simulated datasets. Simulations confirm that our inference procedures
successfully control Type I error and achieve nominal coverage probability. Our
method has been implemented in an R package, cpi, which can be downloaded from
https://github.com/dswatson/cpi
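A minimal sketch of the CPI idea as described above, not the reference implementation in the cpi R package: features are replaced one at a time by crude Gaussian-style knockoffs, the change in squared-error loss is recorded per observation, and a one-sided paired t-test assesses significance. Data, model, and the knockoff construction are all simplifying assumptions for illustration.

```python
# Hedged sketch of the conditional predictive impact (CPI), not the cpi package.
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 1000, 4
cov = 0.5 * np.ones((p, p)) + 0.5 * np.eye(p)        # correlated predictors
X = rng.multivariate_normal(np.zeros(p), cov, size=n)
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

train, test = np.arange(n // 2), np.arange(n // 2, n)
model = LinearRegression().fit(X[train], y[train])

def gaussian_knockoff(X, j, rng):
    """Crude knockoff for column j: regress it on the other columns and
    add permuted residuals (a rough stand-in for a valid knockoff sampler)."""
    others = np.delete(X, j, axis=1)
    beta = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
    resid = X[:, j] - others @ beta
    Xk = X.copy()
    Xk[:, j] = others @ beta + rng.permutation(resid)
    return Xk

for j in range(p):
    loss_orig = (y[test] - model.predict(X[test])) ** 2
    Xk = gaussian_knockoff(X[test], j, rng)
    loss_ko = (y[test] - model.predict(Xk)) ** 2
    delta = loss_ko - loss_orig                       # per-sample CPI contributions
    t, pval = stats.ttest_1samp(delta, 0.0, alternative="greater")
    print(f"feature {j}: CPI = {delta.mean():.3f}, p = {pval:.3g}")
```

In this toy setup the informative features (0 and 1) should show positive CPI with small p-values, while the null features should not; any learner and loss could be substituted, as the abstract notes.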
Parallel Deterministic and Stochastic Global Minimization of Functions with Very Many Minima
The optimization of three problems with high dimensionality and many local minima is investigated
using five different optimization algorithms: DIRECT, simulated annealing, Spall’s SPSA algorithm, the KNITRO
package, and QNSTOP, a new algorithm developed at Indiana University.
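For a concrete sense of this kind of comparison (not the paper's test problems, and without the KNITRO, SPSA, or QNSTOP codes), the sketch below runs two of the named algorithm families, DIRECT and simulated annealing, as implemented in SciPy, on the Griewank function, a standard benchmark with very many local minima.

```python
# Illustrative comparison of two global optimizers on a many-minima benchmark.
import numpy as np
from scipy.optimize import direct, dual_annealing

def griewank(x):
    # Griewank test function: global minimum 0 at the origin, many local minima.
    x = np.asarray(x)
    i = np.arange(1, x.size + 1)
    return 1.0 + np.sum(x**2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i)))

bounds = [(-600.0, 600.0)] * 10          # 10-dimensional search box

res_direct = direct(griewank, bounds, maxfun=20000)
res_anneal = dual_annealing(griewank, bounds, seed=0, maxiter=500)

print("DIRECT:              f* =", res_direct.fun)
print("simulated annealing: f* =", res_anneal.fun)
```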
Conditional Feature Importance for Mixed Data
Despite the popularity of feature importance (FI) measures in interpretable
machine learning, the statistical adequacy of these methods is rarely
discussed. From a statistical perspective, a major distinction is between
analyzing a variable's importance before and after adjusting for covariates -
i.e., between marginal and conditional measures. Our work
draws attention to this rarely acknowledged, yet crucial distinction and
showcases its implications. Further, we reveal that for testing conditional FI,
only a few methods are available, and practitioners have hitherto been severely
restricted in method application due to mismatching data requirements. Most
real-world data exhibits complex feature dependencies and incorporates both
continuous and categorical data (mixed data). Both properties are oftentimes
neglected by conditional FI measures. To fill this gap, we propose to combine
the conditional predictive impact (CPI) framework with sequential knockoff
sampling. The CPI enables conditional FI measurement that controls for any
feature dependencies by sampling valid knockoffs - hence, generating synthetic
data with similar statistical properties - for the data to be analyzed.
Sequential knockoffs were deliberately designed to handle mixed data and thus
allow us to extend the CPI approach to such datasets. We demonstrate through
numerous simulations and a real-world example that our proposed workflow
controls type I error, achieves high power and is in line with results given by
other conditional FI measures, whereas marginal FI metrics result in misleading
interpretations. Our findings highlight the necessity of developing
statistically adequate, specialized methods for mixed data.
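A hedged sketch of the sequential-knockoff construction described above, not the authors' implementation: each column's knockoff is drawn from a simple model of that column given the remaining original columns plus the knockoffs generated so far (linear/Gaussian for numeric columns, multinomial logistic for categorical ones). The data frame below is entirely synthetic.

```python
# Simplified sequential knockoffs for mixed (numeric + categorical) data.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

def sequential_knockoffs(df, rng):
    ko = pd.DataFrame(index=df.index)
    for col in df.columns:
        # Conditioning set: all other original columns plus knockoffs built so far.
        cond = pd.concat([df.drop(columns=[col]), ko], axis=1)
        cond = pd.get_dummies(cond, drop_first=True).to_numpy(dtype=float)
        if pd.api.types.is_numeric_dtype(df[col]):
            fit = LinearRegression().fit(cond, df[col])
            resid = df[col].to_numpy() - fit.predict(cond)
            ko[col + "_ko"] = fit.predict(cond) + rng.permutation(resid)
        else:
            fit = LogisticRegression(max_iter=1000).fit(cond, df[col])
            probs = fit.predict_proba(cond)
            ko[col + "_ko"] = [rng.choice(fit.classes_, p=p) for p in probs]
    return ko

# Toy mixed-data example (entirely synthetic).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.normal(50, 10, 300),
    "group": rng.choice(["a", "b", "c"], 300),
})
df["income"] = 2 * df["age"] + (df["group"] == "b") * 20 + rng.normal(0, 5, 300)
print(sequential_knockoffs(df, rng).head())
```

The resulting synthetic columns can then be plugged into a CPI-style comparison (as in the previous abstract) to test conditional importance on mixed data.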
Local Explanations via Necessity and Sufficiency: Unifying Theory and Practice
Necessity and sufficiency are the building blocks of all successful explanations. Yet despite their importance, these notions have been conceptually underdeveloped and inconsistently applied in explainable artificial intelligence (XAI), a fast-growing research area that is so far lacking in firm theoretical foundations. In this article, an expanded version of a paper originally presented at the 37th Conference on Uncertainty in Artificial Intelligence (Watson et al., 2021), we attempt to fill this gap. Building on work in logic, probability, and causality, we establish the central role of necessity and sufficiency in XAI, unifying seemingly disparate methods in a single formal framework. We propose a novel formulation of these concepts, and demonstrate its advantages over leading alternatives. We present a sound and complete algorithm for computing explanatory factors with respect to a given context and set of agentive preferences, allowing users to identify necessary and sufficient conditions for desired outcomes at minimal cost. Experiments on real and simulated data confirm our method’s competitive performance against state-of-the-art XAI tools on a diverse array of tasks.
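To make the two notions concrete, the simplified sketch below (not the paper's sound-and-complete algorithm) treats a "factor" as a feature subset held at the query's values and estimates sufficiency and necessity by swapping that subset between the query point and reference points drawn from a user-chosen context. The model, data, and context are all made up for illustration.

```python
# Simplified estimates of sufficiency and necessity of a feature subset.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = RandomForestClassifier(random_state=0).fit(X, y)

x_query = np.array([1.5, 1.0, -0.3])
target = model.predict(x_query[None])[0]
context = X[:200]                        # reference sample defining the context

def sufficiency(subset):
    # Impose the query's values on context points: how often is the target kept?
    z = context.copy()
    z[:, subset] = x_query[subset]
    return np.mean(model.predict(z) == target)

def necessity(subset):
    # Remove the factor from the query: how often does the prediction change?
    z = np.tile(x_query, (len(context), 1))
    z[:, subset] = context[:, subset]
    return np.mean(model.predict(z) != target)

for subset in ([0], [1], [2], [0, 1]):
    print(subset, "suff:", round(sufficiency(subset), 2),
          "nec:", round(necessity(subset), 2))
```

In this toy example the jointly informative subset {0, 1} should score high on both measures, while the irrelevant feature 2 should score near zero, mirroring the intuition the abstract formalizes.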