2,182 research outputs found

    Enhancing Decision Tree based Interpretation of Deep Neural Networks through L1-Orthogonal Regularization

    One obstacle that has so far prevented the introduction of machine learning models into critical areas is their lack of explainability. In this work, a practicable approach to gaining explainability of deep artificial neural networks (NNs) using an interpretable surrogate model based on decision trees is presented. Simply fitting a decision tree to a trained NN usually leads to unsatisfactory results in terms of accuracy and fidelity. Using L1-orthogonal regularization during training, however, preserves the accuracy of the NN while allowing it to be closely approximated by small decision trees. Tests with different data sets confirm that L1-orthogonal regularization yields models of lower complexity and, at the same time, higher fidelity compared to other regularizers.
    Comment: 8 pages, 18th IEEE International Conference on Machine Learning and Applications (ICMLA) 2019
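
    As an illustration of the approach this abstract describes, the sketch below trains a small PyTorch MLP with an L1-orthogonality penalty and then fits a shallow scikit-learn decision tree to the network's predictions. The penalty form (the L1 norm of W W^T - I), the toy data, and all hyper-parameters are assumptions made for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from sklearn.tree import DecisionTreeClassifier

torch.manual_seed(0)
X_train = torch.randn(512, 20)                        # toy data stand-in
y_train = (X_train[:, 0] + X_train[:, 1] > 0).long()

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 2))

def l1_orthogonal_penalty(model):
    # L1 norm of (W W^T - I) over all linear layers; the paper's exact
    # formulation may differ -- this is one plausible reading.
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.Linear):
            W = m.weight
            gram = W @ W.t()
            eye = torch.eye(gram.shape[0], device=W.device)
            penalty = penalty + (gram - eye).abs().sum()
    return penalty

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 1e-4                                            # strength needs tuning
for _ in range(200):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(model(X_train), y_train)
    loss = loss + lam * l1_orthogonal_penalty(model)
    loss.backward()
    opt.step()

# Distill: fit a shallow tree to the NN's *predictions* (the fidelity
# target), not to the ground-truth labels.
with torch.no_grad():
    teacher = model(X_train).argmax(dim=1).numpy()
tree = DecisionTreeClassifier(max_depth=5).fit(X_train.numpy(), teacher)
```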

    Deep Multitask Learning for Semantic Dependency Parsing

    We present a deep neural architecture that parses sentences into three semantic dependency graph formalisms. By using efficient, nearly arc-factored inference and a bidirectional LSTM composed with a multi-layer perceptron, our base system significantly improves the state of the art for semantic dependency parsing, without using hand-engineered features or syntax. We then explore two multitask learning approaches: one that shares parameters across formalisms, and one that uses higher-order structures to predict the graphs jointly. We find that both approaches improve performance across formalisms on average, achieving a new state of the art. Our code is open-source and available at https://github.com/Noahs-ARK/NeurboParser.
    Comment: Proceedings of ACL 2017
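
    A minimal sketch of what an arc-factored scorer of this kind might look like: a bidirectional LSTM encodes the sentence, two MLPs produce head and dependent representations, and a bilinear form scores every candidate arc independently (graphs, not trees, so each arc gets its own binary decision). The bilinear scorer, dimensions, and toy data are assumptions; the authors' exact architecture is in the linked repository.

```python
import torch
import torch.nn as nn

class ArcFactoredParser(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=200, arc_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.head_mlp = nn.Sequential(nn.Linear(2 * hidden, arc_dim), nn.ReLU())
        self.dep_mlp = nn.Sequential(nn.Linear(2 * hidden, arc_dim), nn.ReLU())
        self.bilinear = nn.Parameter(torch.randn(arc_dim, arc_dim) * 0.01)

    def forward(self, tokens):
        h, _ = self.lstm(self.embed(tokens))          # (B, T, 2*hidden)
        heads = self.head_mlp(h)                      # (B, T, arc_dim)
        deps = self.dep_mlp(h)                        # (B, T, arc_dim)
        # scores[b, i, j] = score of an arc from head i to dependent j
        return torch.einsum('bik,kl,bjl->bij', heads, self.bilinear, deps)

# Each arc is predicted independently, so training can use a binary
# cross-entropy over the full score matrix:
parser = ArcFactoredParser(vocab_size=1000)
tokens = torch.randint(0, 1000, (2, 7))               # toy batch
gold_arcs = torch.randint(0, 2, (2, 7, 7)).float()    # toy gold graphs
loss = nn.functional.binary_cross_entropy_with_logits(parser(tokens), gold_arcs)
```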

    How deep is deep enough? -- Quantifying class separability in the hidden layers of deep neural networks

    Deep neural networks typically outperform more traditional machine learning models in their ability to classify complex data, and yet it is not clear how the individual hidden layers of a deep network contribute to the overall classification performance. We thus introduce a Generalized Discrimination Value (GDV) that measures, in a non-invasive manner, how well different data classes separate in each given network layer. The GDV can be used for the automatic tuning of hyper-parameters, such as the width profile and the total depth of a network. Moreover, the layer-dependent GDV(L) provides new insights into the data transformations that self-organize during training: in the case of multi-layer perceptrons trained with error backpropagation, we find that classification of highly complex data sets requires a temporal reduction of class separability, marked by a characteristic 'energy barrier' in the initial part of the GDV(L) curve. Even more surprisingly, for a given data set, the GDV(L) runs through a fixed 'master curve', independent of the total number of network layers. Furthermore, applying the GDV to Deep Belief Networks reveals that unsupervised training with the Contrastive Divergence method can also systematically increase class separability over tens of layers, even though the system does not 'know' the desired class labels. These results indicate that the GDV may become a useful tool to open the black box of deep learning.
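
    A sketch of a GDV-style separability measure, following the abstract's description: z-score each feature of a layer's activations, then compare the mean intra-class pairwise distance against the mean inter-class distance, with more negative values meaning better separation. The exact scaling and normalization used in the paper may differ from this reading.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

def gdv(acts, labels):
    """acts: (N, D) layer activations; labels: (N,) integer class ids."""
    # Per-feature z-scoring (a 0.5 scale factor is assumed here).
    z = 0.5 * (acts - acts.mean(0)) / (acts.std(0) + 1e-8)
    classes = np.unique(labels)
    D = z.shape[1]
    # Mean pairwise distance within each class, averaged over classes.
    intra = np.mean([pdist(z[labels == c]).mean() for c in classes])
    # Mean distance between points of different classes, over class pairs.
    inter = np.mean([cdist(z[labels == a], z[labels == b]).mean()
                     for i, a in enumerate(classes)
                     for b in classes[i + 1:]])
    # More negative values indicate better class separation.
    return (intra - inter) / np.sqrt(D)

# Toy check: two well-separated Gaussian clusters yield a negative value.
rng = np.random.default_rng(0)
acts = np.vstack([rng.normal(0, 1, (50, 10)), rng.normal(3, 1, (50, 10))])
labels = np.array([0] * 50 + [1] * 50)
print(gdv(acts, labels))
```

    To trace GDV(L) across a trained network, one would collect each layer's activations (e.g. via forward hooks in PyTorch) and call this function once per layer.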

    Decentralization of Multiagent Policies by Learning What to Communicate

    Effective communication is required for teams of robots to solve sophisticated collaborative tasks. In practice, it is typical for both the encoding and the semantics of communication to be manually defined by an expert; this is true regardless of whether the behaviors themselves are bespoke, optimization-based, or learned. We present an agent architecture and training methodology that uses neural networks to learn task-oriented communication semantics from the example of a communication-unaware expert policy. A perimeter defense game illustrates the system's ability to handle dynamically changing numbers of agents and its graceful degradation in performance as communication constraints are tightened or the expert's observability assumptions are broken.
    Comment: 7 pages
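
    One way the described setup could be wired together, as a hedged sketch: each agent encodes its local observation into a fixed-size message, messages are mean-pooled (which tolerates dynamically changing team sizes), and the decentralized policy is trained by behavior cloning against the expert's actions. All dimensions, the pooling choice, and the imitation loss are assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

OBS_DIM, MSG_DIM, ACT_DIM = 8, 16, 4

# Message encoder and policy shared by every agent (decentralized execution).
encoder = nn.Sequential(nn.Linear(OBS_DIM, 32), nn.ReLU(),
                        nn.Linear(32, MSG_DIM))
policy = nn.Sequential(nn.Linear(OBS_DIM + MSG_DIM, 32), nn.ReLU(),
                       nn.Linear(32, ACT_DIM))

def act(observations):
    """observations: (n_agents, OBS_DIM); returns per-agent action logits."""
    msgs = encoder(observations)                   # (n_agents, MSG_DIM)
    pooled = msgs.mean(dim=0, keepdim=True)        # permutation-invariant
    pooled = pooled.expand(observations.shape[0], -1)
    return policy(torch.cat([observations, pooled], dim=-1))

# Behavior cloning against the communication-unaware expert's actions:
opt = torch.optim.Adam(list(encoder.parameters()) + list(policy.parameters()))
obs = torch.randn(5, OBS_DIM)                      # toy team of 5 agents
expert_actions = torch.randint(0, ACT_DIM, (5,))   # expert labels (assumed)
loss = nn.functional.cross_entropy(act(obs), expert_actions)
opt.zero_grad()
loss.backward()
opt.step()
```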