1,026 research outputs found
A Survey on Neural Network Interpretability
Along with the great success of deep neural networks, there is also growing
concern about their black-box nature. The interpretability issue affects
people's trust in deep learning systems. It is also related to many ethical
problems, e.g., algorithmic discrimination. Moreover, interpretability is a
desired property for deep networks to become powerful tools in other research
fields, e.g., drug discovery and genomics. In this survey, we conduct a
comprehensive review of neural network interpretability research. We first
clarify the definition of interpretability as it has been used in many
different contexts. Then we elaborate on the importance of interpretability and
propose a novel taxonomy organized along three dimensions: type of engagement
(passive vs. active interpretation approaches), the type of explanation, and
the focus (from local to global interpretability). This taxonomy provides a
meaningful 3D view of the distribution of papers from the relevant literature as
two of the dimensions are not simply categorical but allow ordinal
subcategories. Finally, we summarize the existing interpretability evaluation
methods and suggest possible research directions inspired by our new taxonomy.
Comment: This work has been accepted by IEEE-TETC
A Survey of Neural Trees
Neural networks (NNs) and decision trees (DTs) are both popular models of
machine learning, yet they come with mutually exclusive advantages and
limitations. To bring the best of the two worlds, a variety of approaches have
been proposed to integrate NNs and DTs explicitly or implicitly. In this survey,
these approaches are organized in a school which we term neural trees (NTs).
This survey aims to present a comprehensive review of NTs and attempts to
identify how they enhance the model interpretability. We first propose a
thorough taxonomy of NTs that expresses the gradual integration and
co-evolution of NNs and DTs. Afterward, we analyze NTs in terms of their
interpretability and performance, and suggest possible solutions to the
remaining challenges. Finally, this survey concludes with a discussion about
other considerations like conditional computation and promising directions
towards this field. A list of papers reviewed in this survey, along with their
corresponding code, is available at:
https://github.com/zju-vipa/awesome-neural-trees
Comment: 35 pages, 7 figures and 1 table
Understanding deep neural networks from the perspective of piecewise linear property
In recent years, deep learning models have been widely used and are behind major breakthroughs across many fields. Deep learning models are usually considered black boxes because of their large model structures and complicated hierarchical nonlinear transformations. As deep learning technology continues to develop, understanding these models, including their training and prediction behaviors and their internal mechanisms, has become a growing concern. In this thesis, we study the model understanding problem of deep neural networks from the perspective of the piecewise linear property. First, we introduce the piecewise linear property and review the role and progress of deep learning understanding from this perspective. The piecewise linear property reveals that deep neural networks with piecewise linear activation functions generally divide the input space into a number of small disjoint regions, each of which corresponds to a local linear function. Next, we investigate two typical understanding problems, namely model interpretation and model complexity. In particular, we provide a series of derivations and analyses of the piecewise linear property of deep neural networks with piecewise linear activation functions, and we propose an approach for interpreting the predictions of such models based on this property. We also propose a method that provides local interpretations of a black-box deep model by learning a piecewise linear approximation that mimics the deep model. Then, we study deep neural networks with curve activation functions, with the aim of providing piecewise linear approximations that let these networks benefit from the piecewise linear property. After proposing a piecewise linear approximation framework, we investigate model complexity and model interpretation using the approximation.
The thesis concludes by discussing future directions for understanding deep neural networks from the perspective of the piecewise linear property.
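The piecewise linear property described in this abstract can be illustrated with a minimal sketch. It assumes a tiny one-hidden-layer ReLU network with hand-picked weights; the function names (`relu_net`, `local_linear_map`) and all numbers are hypothetical and are not taken from the thesis. The sketch extracts the affine function that the network computes inside the linear region containing a given input.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(W: Matrix, x: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu_net(x: Vector, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> Vector:
    """Forward pass of a one-hidden-layer ReLU network (illustrative only)."""
    hidden = [max(0.0, z + b) for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, hidden), b2)]


def local_linear_map(
    x: Vector, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector
) -> Tuple[Matrix, Vector, List[int]]:
    """Return (A, c, pattern) such that the network equals A @ x + c on the
    linear region containing x. `pattern` is the ReLU on/off pattern there."""
    pre = [z + b for z, b in zip(matvec(W1, x), b1)]
    pattern = [1 if p > 0 else 0 for p in pre]
    n_out, n_hid, n_in = len(W2), len(W1), len(x)
    # A = W2 * diag(pattern) * W1, c = W2 * (pattern * b1) + b2
    A = [
        [sum(W2[o][h] * pattern[h] * W1[h][i] for h in range(n_hid)) for i in range(n_in)]
        for o in range(n_out)
    ]
    c = [sum(W2[o][h] * pattern[h] * b1[h] for h in range(n_hid)) + b2[o] for o in range(n_out)]
    return A, c, pattern


if __name__ == "__main__":
    # Hypothetical weights chosen only for illustration.
    W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
    b1 = [0.0, -1.0, 0.5]
    W2 = [[1.0, -2.0, 0.5]]
    b2 = [0.25]
    x = [1.0, 0.5]
    A, c, pattern = local_linear_map(x, W1, b1, W2, b2)
    print("activation pattern:", pattern)
    print("network output:    ", relu_net(x, W1, b1, W2, b2))
    print("local affine output:", [v + ci for v, ci in zip(matvec(A, x), c)])
```

Any input that produces the same activation pattern lies in the same region and is mapped by the same `A` and `c`, which is what makes the coefficients in `A` usable as a local explanation of a prediction.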
Geoadditive Regression Modeling of Stream Biological Condition
Indices of biotic integrity (IBI) have become an established tool to quantify the condition of small non-tidal streams and their watersheds. To investigate the effects of watershed characteristics on stream biological condition, we present a new technique for regressing IBIs on watershed-specific explanatory variables. Since IBIs are typically evaluated on an ordinal scale, our method is based on the proportional odds model for ordinal outcomes. To avoid overfitting, we do not use classical maximum likelihood estimation but a component-wise functional gradient boosting approach. Because component-wise gradient boosting has an intrinsic mechanism for variable selection and model choice, determinants of biotic integrity can be identified. In addition, the method offers a relatively simple way to account for spatial correlation in ecological data. An analysis of the Maryland Biological Streams Survey shows that nonlinear effects of predictor variables on stream condition can be quantified while, in addition, accurate predictions of biological condition at unsurveyed locations are obtained.
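For readers unfamiliar with the proportional odds model mentioned in this abstract, its standard textbook form is sketched below. The notation is an assumption and is not taken from the paper: $Y$ is the ordinal IBI class with $K$ categories, $\theta_k$ are ordered thresholds, and $f(x)$ is the additive predictor built from the watershed variables. The sign convention in front of $f(x)$ varies between references.

```latex
\[
  P(Y \le k \mid x) \;=\; \frac{1}{1 + \exp\!\bigl(-(\theta_k - f(x))\bigr)},
  \qquad k = 1, \dots, K-1,
  \qquad \theta_1 < \theta_2 < \dots < \theta_{K-1}.
\]
% Equivalently, on the logit scale:
\[
  \log \frac{P(Y \le k \mid x)}{P(Y > k \mid x)} \;=\; \theta_k - f(x).
\]
```

Because $f(x)$ does not depend on $k$, the log odds ratio between two sites $x$ and $x'$ equals $f(x') - f(x)$ for every threshold, which is the "proportional odds" assumption.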
Global Explanations with Decision Rules: a Co-learning Approach
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence (UAI 2021)
Black-box machine learning models can be extremely accurate. Yet, in critical applications such as healthcare or justice, if models cannot be explained, domain experts will be reluctant to use them. A common way to explain a black-box model is to approximate it by a simpler model such as a decision tree. In this paper, we propose a co-learning framework to learn decision rules as explanations of black-box models through knowledge distillation and simultaneously constrain the black-box model by these explanations, all of this in a differentiable manner. To do so, we introduce the soft truncated Gaussian mixture analysis (STruGMA), a probabilistic model which encapsulates hyper-rectangle decision rules. With STruGMA, global explanations can be provided by any rule learner such as decision lists, sets or trees. We provide experimental evidence that our framework can globally explain black-box models such as neural networks. In particular, the explanation fidelity is increased, while the accuracy of the models is only marginally impacted.
Machine Learning Applications to Land and Structure Valuation
Acknowledgments: We thank Nicola Stalder and his IAZI team for preparing the dataset for the Swiss case study. The authors are grateful to the referees, whose feedback and comments have improved the quality of the paper.
Peer reviewed
Publisher PD