Bivariate Beta-LSTM
Long Short-Term Memory (LSTM) infers long-term dependencies through a cell
state maintained by the input and forget gate structures, each of which models
its gate output as a value in [0,1] through a sigmoid function. However, because
the sigmoid function changes gradually, the sigmoid gate is not flexible in
representing multi-modality or skewness. Moreover, previous models do not
model the correlation between the gates, which would be a new way to
introduce an inductive bias for the relationship between previous and current input.
This paper proposes a new gate structure with the bivariate Beta distribution.
The proposed gate structure enables probabilistic modeling on the gates within
the LSTM cell so that the modelers can customize the cell state flow with
priors and distributions. Moreover, we theoretically show a higher upper
bound on the gradient compared to the sigmoid function, and we empirically
observe that the bivariate Beta distribution gate structure provides higher
gradient values in training. We demonstrate the effectiveness of the bivariate Beta
gate structure on sentence classification, image classification, polyphonic
music modeling, and image caption generation.
Comment: AAAI 202
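For context on the gradient comparison in the abstract above, the following minimal sketch covers only the sigmoid baseline, not the proposed Beta gate. It relies on the standard fact that the sigmoid derivative is s(x)(1 - s(x)), which never exceeds 1/4. The function names and the evaluation grid are illustrative assumptions, not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic sigmoid, the usual LSTM gate activation."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: s(x) * (1 - s(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Illustrative grid over [-10, 10]; the range and step are arbitrary choices.
grid = [i / 100.0 for i in range(-1000, 1001)]

# The largest gradient on the grid is attained at x = 0, where s(0) = 0.5.
max_grad = max(sigmoid_grad(x) for x in grid)
print(f"max sigmoid gradient on grid: {max_grad}")
```

A gate whose gradient is capped at 1/4 shrinks the backpropagated signal each time it passes through that gate, which is one reason a higher bound is of interest.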
The Bregman Variational Dual-Tree Framework
Graph-based methods provide a powerful tool set for many non-parametric
frameworks in Machine Learning. In general, the memory and computational
complexity of these methods is quadratic in the number of examples in the data
which makes them quickly infeasible for moderate to large scale datasets. A
significant effort to find more efficient solutions to the problem has been
made in the literature. One of the state-of-the-art methods that has been
recently introduced is the Variational Dual-Tree (VDT) framework. Despite some
of its unique features, VDT is currently restricted only to Euclidean spaces
where the Euclidean distance quantifies the similarity. In this paper, we
extend the VDT framework beyond the Euclidean distance to more general Bregman
divergences that include the Euclidean distance as a special case. By
exploiting the properties of the general Bregman divergence, we show how the
new framework can maintain all the pivotal features of the VDT framework and
yet significantly improve its performance in non-Euclidean domains. We apply
the proposed framework to different text categorization problems and
demonstrate its benefits over the original VDT.
Comment: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty
in Artificial Intelligence (UAI2013)
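As a rough illustration of the divergence family named in the abstract above, the sketch below evaluates the textbook definition D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>. It is not the VDT algorithm. Choosing phi as the squared norm recovers the squared Euclidean distance, and choosing negative entropy gives the KL divergence for probability vectors. The helper names and the sample vectors are assumptions made for this example.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def bregman(
    phi: Callable[[Vector], float],
    grad_phi: Callable[[Vector], List[float]],
    x: Vector,
    y: Vector,
) -> float:
    """Bregman divergence D_phi(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_phi(y), x, y))
    return phi(x) - phi(y) - inner


# phi(v) = ||v||^2 yields the squared Euclidean distance.
def sq_norm(v: Vector) -> float:
    return sum(vi * vi for vi in v)


def sq_norm_grad(v: Vector) -> List[float]:
    return [2.0 * vi for vi in v]


# phi(v) = sum v_i log v_i (negative entropy) yields the generalized KL
# divergence, which equals KL when both vectors sum to 1.
def neg_entropy(v: Vector) -> float:
    return sum(vi * math.log(vi) for vi in v)


def neg_entropy_grad(v: Vector) -> List[float]:
    return [math.log(vi) + 1.0 for vi in v]


# Hypothetical probability vectors, chosen only for illustration.
x = [0.2, 0.3, 0.5]
y = [0.1, 0.6, 0.3]

d_euclid = bregman(sq_norm, sq_norm_grad, x, y)
d_kl = bregman(neg_entropy, neg_entropy_grad, x, y)
print(d_euclid, d_kl)
```

Note that a Bregman divergence is in general not symmetric, so swapping `x` and `y` in the KL case changes the value.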
Modelling and analyzing adaptive self-assembling strategies with Maude
Building adaptive systems with predictable emergent behavior is a challenging task, and it is becoming a critical need. The research community has taken up the challenge by introducing approaches of various kinds, from software architectures to programming paradigms to analysis techniques. We recently proposed a conceptual framework for adaptation centered around the role of control data. In this paper we show that it can be naturally realized in a reflective logical language like Maude by using the Reflective Russian Dolls model. Moreover, we exploit this model to specify, validate and analyse a prominent example of an adaptive system: robot swarms equipped with self-assembly strategies. The analysis exploits the statistical model checker PVeStA.