Search CORE

2,308 research outputs found

Integrated Inference and Learning of Neural Factors in Structural Support Vector Machines

Author: De Turck Filip
Houthooft Rein
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Tackling pattern recognition problems in areas such as computer vision, bioinformatics, speech or text recognition is often done best by taking into account task-specific statistical relations between output variables. In structured prediction, this internal structure is used to predict multiple outputs simultaneously, leading to more accurate and coherent predictions. Structural support vector machines (SSVMs) are nonprobabilistic models that optimize a joint input-output function through margin-based learning. Because SSVMs generally disregard the interplay between unary and interaction factors during the training phase, final parameters are suboptimal. Moreover, its factors are often restricted to linear combinations of input features, limiting its generalization power. To improve prediction accuracy, this paper proposes: (i) Joint inference and learning by integration of back-propagation and loss-augmented inference in SSVM subgradient descent; (ii) Extending SSVM factors to neural networks that form highly nonlinear functions of input features. Image segmentation benchmark results demonstrate improvements over conventional SSVM training methods in terms of accuracy, highlighting the feasibility of end-to-end SSVM training with neural factors

arXiv.org e-Print Archive

Crossref

Ghent University Academic Bibliography

Archivsystem Ask23

Deep Reinforcement Learning for Event-Triggered Control

Author: Baumann Dominik
Martius Georg
Trimpe Sebastian
Zhu Jia-Jie
Publication venue
Publication date: 01/01/2018
Field of study

Event-triggered control (ETC) methods can achieve high-performance control with a significantly lower number of samples compared to usual, time-triggered methods. These frameworks are often based on a mathematical model of the system and specific designs of controller and event trigger. In this paper, we show how deep reinforcement learning (DRL) algorithms can be leveraged to simultaneously learn control and communication behavior from scratch, and present a DRL approach that is particularly suitable for ETC. To our knowledge, this is the first work to apply DRL to ETC. We validate the approach on multiple control tasks and compare it to model-based event-triggering frameworks. In particular, we demonstrate that it can, other than many model-based ETC designs, be straightforwardly applied to nonlinear systems

arXiv.org e-Print Archive

Crossref

MPG.PuRe

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation

Author: Bengio Yoshua
Courville Aaron
Klinger Tim
Serban Iulian Vlad
Talamadupula Kartik
Tesauro Gerald
Zhou Bowen
Publication venue
Publication date: 13/06/2016
Field of study

We introduce the multiresolution recurrent neural network, which extends the sequence-to-sequence framework to model natural language generation as two parallel discrete stochastic processes: a sequence of high-level coarse tokens, and a sequence of natural language tokens. There are many ways to estimate or learn the high-level coarse tokens, but we argue that a simple extraction procedure is sufficient to capture a wealth of high-level discourse semantics. Such procedure allows training the multiresolution recurrent neural network by maximizing the exact joint log-likelihood over both sequences. In contrast to the standard log- likelihood objective w.r.t. natural language tokens (word perplexity), optimizing the joint log-likelihood biases the model towards modeling high-level abstractions. We apply the proposed model to the task of dialogue response generation in two challenging domains: the Ubuntu technical support domain, and Twitter conversations. On Ubuntu, the model outperforms competing approaches by a substantial margin, achieving state-of-the-art results according to both automatic evaluation metrics and a human evaluation study. On Twitter, the model appears to generate more relevant and on-topic responses according to automatic evaluation metrics. Finally, our experiments demonstrate that the proposed model is more adept at overcoming the sparsity of natural language and is better able to capture long-term structure.Comment: 21 pages, 2 figures, 10 table

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications