Beyond Sparsity: Tree Regularization of Deep Models for Interpretability
The lack of interpretability remains a key barrier to the adoption of deep
models in many applications. In this work, we explicitly regularize deep models
so human users might step through the process behind their predictions in
little time. Specifically, we train deep time-series models so their
class-probability predictions have high accuracy while being closely modeled by
decision trees with few nodes. Using intuitive toy examples as well as medical
tasks for treating sepsis and HIV, we demonstrate that this new tree
regularization yields models that are easier for humans to simulate than
simpler L1 or L2 penalties without sacrificing predictive power.
Comment: To appear in AAAI 2018. Contains a 9-page main paper and an appendix with supplementary material.
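To make the penalized quantity concrete, here is a minimal Python sketch (using scikit-learn) of the average-path-length measure that tree regularization targets. The predict_proba hook, the 0.5 threshold, and max_leaf_nodes=8 are illustrative assumptions, not the paper's settings; the paper itself trains a differentiable surrogate of this quantity so it can be penalized during gradient-based training, whereas this sketch only computes the non-differentiable target.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def average_path_length(tree, X):
    # Mean number of decision nodes an example traverses: a proxy for
    # how long a human needs to "simulate" the model on one input.
    node_counts = np.asarray(tree.decision_path(X).sum(axis=1)).ravel()
    return float(node_counts.mean())

def tree_complexity(predict_proba, X, max_leaf_nodes=8):
    # Fit a small decision tree to the deep model's hard labels and
    # score how compactly the tree mimics the model's behavior.
    # predict_proba is assumed to return class-1 probabilities of shape (n,).
    y_hat = (predict_proba(X) > 0.5).astype(int)
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes).fit(X, y_hat)
    return average_path_length(tree, X)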
Continuous-variable quantum neural networks
We introduce a general method for building neural networks on quantum
computers. The quantum neural network is a variational quantum circuit built in
the continuous-variable (CV) architecture, which encodes quantum information in
continuous degrees of freedom such as the amplitudes of the electromagnetic
field. This circuit contains a layered structure of continuously parameterized
gates which is universal for CV quantum computation. Affine transformations and
nonlinear activation functions, two key elements in neural networks, are
enacted in the quantum network using Gaussian and non-Gaussian gates,
respectively. The non-Gaussian gates provide both the nonlinearity and the
universality of the model. Due to the structure of the CV model, the CV quantum
neural network can encode highly nonlinear transformations while remaining
completely unitary. We show how a classical network can be embedded into the
quantum formalism and propose quantum versions of various specialized models
such as convolutional, recurrent, and residual networks. Finally, we present
numerous modeling experiments built with the Strawberry Fields software
library. These experiments, including a classifier for fraud detection, a
network which generates Tetris images, and a hybrid classical-quantum
autoencoder, demonstrate the capability and adaptability of CV quantum neural
networks.
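For orientation, below is a minimal sketch of one such layer in Strawberry Fields, following the structure the abstract describes: Gaussian gates (beamsplitters, squeezers, displacements) enact the affine transformation, and non-Gaussian Kerr gates supply the nonlinear activation. The parameter values and the two-mode layout are arbitrary placeholders rather than a trained model, and the exact gate ordering is one plausible layer layout.

import numpy as np
import strawberryfields as sf
from strawberryfields.ops import BSgate, Dgate, Kgate, Sgate

prog = sf.Program(2)                     # two optical modes
theta = np.random.uniform(-0.1, 0.1, 8)  # untrained toy parameters

with prog.context as q:
    # Affine transformation: interferometer, squeezing, interferometer,
    # displacement -- all Gaussian gates.
    BSgate(theta[0], theta[1]) | (q[0], q[1])
    Sgate(theta[2]) | q[0]
    Sgate(theta[3]) | q[1]
    BSgate(theta[4], theta[5]) | (q[0], q[1])
    Dgate(theta[6]) | q[0]
    Dgate(theta[7]) | q[1]
    # Nonlinear activation: non-Gaussian Kerr gates.
    Kgate(0.1) | q[0]
    Kgate(0.1) | q[1]

eng = sf.Engine("fock", backend_options={"cutoff_dim": 5})
state = eng.run(prog).state
print(state.mean_photon(0))  # (mean, variance) of photon number in mode 0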
Patterns versus Characters in Subword-aware Neural Language Modeling
Words in some natural languages can have a composite structure. Elements of
this structure include the root (that could also be composite), prefixes and
suffixes with which various nuances and relations to other words can be
expressed. Thus, in order to build a proper word representation one must take
into account its internal structure. From a corpus of texts we extract a set of
frequent subwords and from the latter set we select patterns, i.e. subwords
which encapsulate information on character n-gram regularities. The selection
is made using the pattern-based Conditional Random Field model with L1
regularization. Further, for every word we construct a new sequence over an
alphabet of patterns. The new alphabet's symbols capture a stronger local
statistical context than individual characters, and therefore yield better
representations and serve as better building blocks for word
representation. In the task of subword-aware language modeling, pattern-based
models outperform character-based analogues by 2-20 perplexity points. Also, a
recurrent neural network in which a word is represented as a sum of embeddings
of its patterns is on par with a competitive and significantly more
sophisticated character-based convolutional architecture.
Comment: 10 pages.
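A minimal sketch, assuming a PyTorch setup, of the word representation used by that recurrent model: a word vector is the sum of the embeddings of the word's patterns. The class name, the example pattern ids, and the use of index 0 as padding are illustrative choices, and the pattern extraction step itself is omitted.

import torch
import torch.nn as nn

class PatternSumEmbedding(nn.Module):
    # A word is represented as the sum of the embeddings of its patterns.
    def __init__(self, num_patterns, dim):
        super().__init__()
        # Index 0 is reserved as padding so words with different numbers
        # of patterns can share one fixed-width tensor; its embedding
        # stays zero and so does not affect the sum.
        self.embed = nn.Embedding(num_patterns + 1, dim, padding_idx=0)

    def forward(self, pattern_ids):
        # pattern_ids: (batch, max_patterns_per_word) integer tensor
        return self.embed(pattern_ids).sum(dim=1)

emb = PatternSumEmbedding(num_patterns=100, dim=8)
word = torch.tensor([[3, 17, 42, 0]])  # one word, three patterns + padding
print(emb(word).shape)                 # torch.Size([1, 8])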