Beyond Sparsity: Tree Regularization of Deep Models for Interpretability
The lack of interpretability remains a key barrier to the adoption of deep
models in many applications. In this work, we explicitly regularize deep models
so human users might step through the process behind their predictions in
little time. Specifically, we train deep time-series models so their
class-probability predictions have high accuracy while being closely modeled by
decision trees with few nodes. Using intuitive toy examples as well as medical
tasks for treating sepsis and HIV, we demonstrate that this new tree
regularization yields models that are easier for humans to simulate than
simpler L1 or L2 penalties without sacrificing predictive power.
Comment: To appear in AAAI 2018. Contains a 9-page main paper and an appendix with supplementary material.
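To make the penalized quantity concrete, here is a minimal Python sketch (using scikit-learn) of the average-path-length measure that tree regularization targets. The predict_proba hook, the 0.5 threshold, and max_leaf_nodes=8 are illustrative assumptions, not the paper's settings; the paper itself trains a differentiable surrogate of this quantity so it can be penalized during gradient-based training, whereas this sketch only computes the non-differentiable target.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(tree, X):
    # Mean number of decision nodes an example traverses: a proxy for
    # how long a human needs to "simulate" the model on one input.
    node_counts = np.asarray(tree.decision_path(X).sum(axis=1)).ravel()
    return float(node_counts.mean())

def tree_complexity(predict_proba, X, max_leaf_nodes=8):
    # Fit a small decision tree to the deep model's hard labels and
    # score how compactly the tree mimics the model's behavior.
    # predict_proba is assumed to return class-1 probabilities of shape (n,).
    y_hat = (predict_proba(X) > 0.5).astype(int)
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes).fit(X, y_hat)
    return average_path_length(tree, X)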
Continuous-variable quantum neural networks
We introduce a general method for building neural networks on quantum
computers. The quantum neural network is a variational quantum circuit built in
the continuous-variable (CV) architecture, which encodes quantum information in
continuous degrees of freedom such as the amplitudes of the electromagnetic
field. This circuit contains a layered structure of continuously parameterized
gates which is universal for CV quantum computation. Affine transformations and
nonlinear activation functions, two key elements in neural networks, are
enacted in the quantum network using Gaussian and non-Gaussian gates,
respectively. The non-Gaussian gates provide both the nonlinearity and the
universality of the model. Due to the structure of the CV model, the CV quantum
neural network can encode highly nonlinear transformations while remaining
completely unitary. We show how a classical network can be embedded into the
quantum formalism and propose quantum versions of various specialized models
such as convolutional, recurrent, and residual networks. Finally, we present
numerous modeling experiments built with the Strawberry Fields software
library. These experiments, including a classifier for fraud detection, a
network which generates Tetris images, and a hybrid classical-quantum
autoencoder, demonstrate the capability and adaptability of CV quantum neural
networks.
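For orientation, below is a minimal sketch of one such layer in Strawberry Fields, following the structure the abstract describes: Gaussian gates (beamsplitters, squeezers, displacements) enact the affine transformation, and non-Gaussian Kerr gates supply the nonlinear activation. The parameter values and the two-mode layout are arbitrary placeholders rather than a trained model, and the exact gate ordering is one plausible layer layout.

import numpy as np
import strawberryfields as sf
from strawberryfields.ops import BSgate, Dgate, Kgate, Sgate

prog = sf.Program(2)                     # two optical modes
theta = np.random.uniform(-0.1, 0.1, 8)  # untrained toy parameters

with prog.context as q:
    # Affine transformation: interferometer, squeezing, interferometer,
    # displacement -- all Gaussian gates.
    BSgate(theta[0], theta[1]) | (q[0], q[1])
    Sgate(theta[2]) | q[0]
    Sgate(theta[3]) | q[1]
    BSgate(theta[4], theta[5]) | (q[0], q[1])
    Dgate(theta[6]) | q[0]
    Dgate(theta[7]) | q[1]
    # Nonlinear activation: non-Gaussian Kerr gates.
    Kgate(0.1) | q[0]
    Kgate(0.1) | q[1]

eng = sf.Engine("fock", backend_options={"cutoff_dim": 5})
state = eng.run(prog).state
print(state.mean_photon(0))  # (mean, variance) of photon number in mode 0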
Patterns versus Characters in Subword-aware Neural Language Modeling
Words in some natural languages can have a composite structure. Elements of
this structure include the root (that could also be composite), prefixes and
suffixes with which various nuances and relations to other words can be
expressed. Thus, in order to build a proper word representation one must take
into account its internal structure. From a corpus of texts we extract a set of
frequent subwords and from the latter set we select patterns, i.e. subwords
which encapsulate information on character n-gram regularities. The selection
is made using the pattern-based Conditional Random Field model with L1
regularization. Further, for every word we construct a new sequence over an
alphabet of patterns. The new alphabet's symbols capture a stronger local
statistical context than individual characters, and therefore yield better
representations and serve as better building blocks for word
representation. In the task of subword-aware language modeling, pattern-based
models outperform character-based analogues by 2-20 perplexity points. Also, a
recurrent neural network in which a word is represented as a sum of embeddings
of its patterns is on par with a competitive and significantly more
sophisticated character-based convolutional architecture.
Comment: 10 pages.
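A minimal sketch, assuming a PyTorch setup, of the word representation used by that recurrent model: a word vector is the sum of the embeddings of the word's patterns. The class name, the example pattern ids, and the use of index 0 as padding are illustrative choices, and the pattern extraction step itself is omitted.

import torch
import torch.nn as nn

class PatternSumEmbedding(nn.Module):
    # A word is represented as the sum of the embeddings of its patterns.
    def __init__(self, num_patterns, dim):
        super().__init__()
        # Index 0 is reserved as padding so words with different numbers
        # of patterns can share one fixed-width tensor; its embedding
        # stays zero and so does not affect the sum.
        self.embed = nn.Embedding(num_patterns + 1, dim, padding_idx=0)

    def forward(self, pattern_ids):
        # pattern_ids: (batch, max_patterns_per_word) integer tensor
        return self.embed(pattern_ids).sum(dim=1)

emb = PatternSumEmbedding(num_patterns=100, dim=8)
word = torch.tensor([[3, 17, 42, 0]])  # one word, three patterns + padding
print(emb(word).shape)                 # torch.Size([1, 8])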