Neural Graphical Models
Probabilistic Graphical Models are often used to understand the dynamics of a
system. They can model relationships between features (nodes) and the
underlying distribution. Theoretically, these models can represent very complex
dependency functions, but in practice simplifying assumptions are often made
due to computational limitations associated with graph operations. In this work
we introduce Neural Graphical Models (NGMs) which attempt to represent complex
feature dependencies with reasonable computational costs. Given a graph of
feature relationships and corresponding samples, we capture the dependency
structure between the features along with their complex function
representations by using a neural network as a multi-task learning framework.
We provide efficient learning, inference and sampling algorithms. NGMs can fit
generic graph structures including directed, undirected and mixed-edge graphs
as well as support mixed input data types. We present empirical studies that
show NGMs' capability to represent Gaussian graphical models, perform inference
analysis on lung cancer data, and extract insights from real-world infant
mortality data provided by the Centers for Disease Control and Prevention.
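As a loose illustration of the multi-task idea above (a sketch, not the authors' implementation), the following Python snippet fits one shared network that regresses all features on all features and treats the product of absolute layer weights as a proxy for feature-to-feature dependence, penalizing entries where the supplied graph has no edge; the data, adjacency matrix, layer sizes, and penalty weight are placeholder assumptions.

# Hypothetical sketch of the multi-task idea: a single network reconstructs all
# features, and |W2| @ |W1| serves as a proxy for learned dependencies.
# Data, graph, sizes, and the penalty weight are illustrative assumptions.
import torch
import torch.nn as nn

d = 8                                     # number of features (assumed)
X = torch.randn(512, d)                   # stand-in for real samples
A = (torch.rand(d, d) > 0.7).float()      # stand-in adjacency of the given graph

net = nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d))
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

for _ in range(300):
    opt.zero_grad()
    recon = nn.functional.mse_loss(net(X), X)            # multi-task regression loss
    paths = net[2].weight.abs() @ net[0].weight.abs()    # d x d proxy dependency matrix
    loss = recon + 1e-3 * (paths * (1 - A)).sum()        # discourage paths the graph forbids
    loss.backward()
    opt.step()

dependency = (net[2].weight.abs() @ net[0].weight.abs()).detach()
print((dependency > 0.05).int())          # crude recovered structure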
Federated Learning with Neural Graphical Models
Federated Learning (FL) addresses the need to create models based on
proprietary data in such a way that multiple clients retain exclusive control
over their data, while all benefit from improved model accuracy due to pooled
resources. Recently proposed Neural Graphical Models (NGMs) are Probabilistic
Graphical models that utilize the expressive power of neural networks to learn
complex non-linear dependencies between the input features. They learn to
capture the underlying data distribution and have efficient algorithms for
inference and sampling. We develop a FL framework which maintains a global NGM
model that learns the averaged information from the local NGM models while
keeping the training data within the client's environment. Our design, FedNGMs,
avoids the pitfalls and shortcomings of neuron matching frameworks like
Federated Matched Averaging, which suffers from model parameter explosion. Our
global model size remains constant throughout the process. In the cases where
clients have local variables that are not part of the combined global
distribution, we propose a 'Stitching' algorithm, which personalizes the global
NGM models by merging the additional variables using the client's data. FedNGMs
is robust to data heterogeneity, a large number of participants, and limited
communication bandwidth.
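The constant global model size comes from averaging same-shaped local models coordinate-wise; the sketch below shows only that generic FedAvg-style step (not the FedNGMs protocol or its Stitching step), with the model shape, client sample counts, and number of rounds as assumptions.

# Generic FedAvg-style averaging of same-shaped local models; not the FedNGMs
# protocol itself. Shapes, sample counts, and the number of rounds are assumed.
import copy
import torch.nn as nn

def make_model(d=8):
    return nn.Sequential(nn.Linear(d, 32), nn.ReLU(), nn.Linear(32, d))

def average_states(states, sizes):
    total = sum(sizes)
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = sum(s[key] * (n / total) for s, n in zip(states, sizes))
    return avg

global_model = make_model()
client_sizes = [1000, 400, 250]                 # assumed local sample counts

for _ in range(5):                              # communication rounds
    local_states = []
    for n in client_sizes:
        local = copy.deepcopy(global_model)     # each client starts from the global weights
        # ... local training on the client's private data would happen here ...
        local_states.append(local.state_dict())
    global_model.load_state_dict(average_states(local_states, client_sizes))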
Axiomatic Interpretability for Multiclass Additive Models
Generalized additive models (GAMs) are favored in many regression and binary
classification problems because they are able to fit complex, nonlinear
functions while still remaining interpretable. In the first part of this paper,
we generalize a state-of-the-art GAM learning algorithm based on boosted trees
to the multiclass setting, and show that this multiclass algorithm outperforms
existing GAM learning algorithms and sometimes matches the performance of full
complexity models such as gradient boosted trees.
In the second part, we turn our attention to the interpretability of GAMs in
the multiclass setting. Surprisingly, the natural interpretability of GAMs
breaks down when there are more than two classes. Naive interpretation of
multiclass GAMs can lead to false conclusions. Inspired by binary GAMs, we
identify two axioms that any additive model must satisfy in order to not be
visually misleading. We then develop a technique called Additive
Post-Processing for Interpretability (API), which provably transforms a
pre-trained additive model to satisfy the interpretability axioms without
sacrificing accuracy. The technique works not just on models trained with our
learning algorithm, but on any multiclass additive model, including multiclass
linear and logistic regression. We demonstrate the effectiveness of API on a
12-class infant mortality dataset.
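A standard fact behind post-processing of this kind is that softmax predictions are invariant to adding the same quantity to every class's score, so per-feature shape functions can be re-centered across classes without changing any prediction; the snippet below checks that invariance numerically and is only an illustration of the idea, not the authors' API algorithm, with all shape-function values made up.

# Softmax shift-invariance check: centering each feature's shape functions
# across classes leaves predicted probabilities unchanged. Illustration only;
# not the API algorithm from the paper. All values are synthetic.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n_features, n_classes, n_bins = 4, 3, 5
# shape[f, k, x]: contribution of feature f to the score of class k at input bin x
shape = rng.normal(size=(n_features, n_classes, n_bins))

# Move the across-class mean out of every class (one offset per feature and bin).
centered = shape - shape.mean(axis=1, keepdims=True)

x_idx = rng.integers(n_bins, size=n_features)        # one observed bin per feature
scores = shape[np.arange(n_features), :, x_idx].sum(axis=0)
scores_c = centered[np.arange(n_features), :, x_idx].sum(axis=0)
assert np.allclose(softmax(scores), softmax(scores_c))   # predictions unchanged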
Defining explanation in probabilistic systems
As probabilistic systems gain popularity and come into wider use, the need for a mechanism that explains the system’s findings and recommendations becomes more critical. The system will also need a mechanism for ordering competing explanations. We examine two representative approaches to explanation in the literature, one due to Gärdenfors and one due to Pearl, and show that both suffer from significant problems. We propose an approach to defining a notion of “better explanation” that combines some of the features of both together with more recent work by Pearl and others on causality.
Making rational decisions using adaptive utility elicitation
Rational decision making requires full knowledge of the utility function of the person affected by the decisions. However, in many cases, the task of acquiring such knowledge is not feasible due to the size of the outcome space and the complexity of the utility elicitation process. Given that the amount of utility information we can acquire is limited, we need to make decisions with partial utility information and should carefully select which utility elicitation questions we ask. In this paper, we propose a new approach for this problem that utilizes a prior probability distribution over the person’s utility function, perhaps learned from a population of similar people. The relevance of a utility elicitation question for the current decision problem can then be measured using its value of information. We propose an algorithm that interleaves the analysis of the decision problem and utility elicitation to allow these two tasks to inform each other. At every step, it asks the utility elicitation question giving us the highest value of information and computes the best strategy based on the information acquired so far, stopping when the expected utility loss resulting from our recommendation falls below a pre-specified threshold. We show how the various steps of this algorithm can be implemented efficiently.
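A hedged sketch of this loop, under assumptions not in the paper: the prior over the utility function is a set of sampled utility vectors (particles), a question reveals the utility of one outcome, the question with the highest myopic Monte Carlo value of information is asked next, and elicitation stops once no question is worth more than a threshold; the particle prior, kernel reweighting, and all numbers are illustrative.

# Illustrative value-of-information elicitation loop; a sketch under an assumed
# particle representation of the utility prior, not the paper's implementation.
import numpy as np

rng = np.random.default_rng(1)
n_outcomes, n_decisions, n_particles = 6, 4, 2000
P = rng.dirichlet(np.ones(n_outcomes), size=n_decisions)   # P[a, o]: prob. of outcome o under decision a
U = rng.normal(size=(n_particles, n_outcomes))             # sampled utility vectors (the prior)
w = np.ones(n_particles) / n_particles                     # particle weights

def best_eu(weights):
    return (P @ (weights @ U)).max()          # EU of the best decision under the current belief

def update(weights, outcome, answer, bandwidth=0.3):
    post = weights * np.exp(-0.5 * ((U[:, outcome] - answer) / bandwidth) ** 2)
    return post / post.sum()

def voi(outcome, weights):
    # Myopic value of information: average gain over possible answers, with
    # sampled particles' values standing in for answers.
    answers = U[rng.choice(n_particles, size=100, p=weights), outcome]
    gains = [best_eu(update(weights, outcome, a)) for a in answers]
    return np.mean(gains) - best_eu(weights)

threshold, asked = 0.01, []
for _ in range(20):                           # cap on the number of questions
    vois = [voi(o, w) for o in range(n_outcomes)]
    o_star = int(np.argmax(vois))
    if vois[o_star] < threshold:              # remaining expected gain is small: stop
        break
    answer = U[0, o_star]                     # stand-in for the person's actual reply
    w = update(w, o_star, answer)
    asked.append(o_star)

print("questions asked:", asked, "| EU of recommendation:", round(best_eu(w), 3))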