Adversarial Learning for Neural Dialogue Generation
In this paper, drawing intuition from the Turing test, we propose using
adversarial training for open-domain dialogue generation: the system is trained
to produce sequences that are indistinguishable from human-generated dialogue
utterances. We cast the task as a reinforcement learning (RL) problem where we
jointly train two systems, a generative model to produce response sequences,
and a discriminator (analogous to the human evaluator in the Turing test) to
distinguish between the human-generated dialogues and the machine-generated
ones. The outputs from the discriminator are then used as rewards for the
generative model, pushing the system to generate dialogues that mostly resemble
human dialogues.
In addition to adversarial training, we describe a model for adversarial
evaluation that uses success in fooling an adversary as a dialogue evaluation
metric, while avoiding a number of potential pitfalls. Experimental results on
several metrics, including adversarial evaluation, demonstrate that the
adversarially-trained system generates higher-quality responses than previous
baselines.
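The reward loop described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the generator here is reduced to a softmax policy over a handful of fixed candidate responses, the discriminator is a frozen list of hypothetical "probability of being human" scores (in the paper the two models are trained jointly), and the names `reinforce_step` and `train_generator` are invented for illustration. The update is the standard REINFORCE rule with a running-average baseline.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, action, reward, baseline, lr):
    """One REINFORCE update for a categorical policy.

    The gradient of log pi(action) w.r.t. logit i is 1[i == action] - p_i.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    return [
        logit + lr * advantage * ((1.0 if i == action else 0.0) - p)
        for i, (logit, p) in enumerate(zip(logits, probs))
    ]


def train_generator(discriminator_scores, steps=3000, lr=0.1, seed=0):
    """Train the toy generator against a frozen discriminator (an assumption).

    discriminator_scores[i] stands in for the discriminator's estimate that
    candidate response i is human-generated; it is used directly as the reward.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(discriminator_scores)
    baseline = 0.0
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(probs)), weights=probs)[0]
        reward = discriminator_scores[action]
        logits = reinforce_step(logits, action, reward, baseline, lr)
        # Running average of rewards reduces the variance of the update.
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    # Hypothetical scores: the third candidate looks most human-like.
    print(train_generator([0.1, 0.2, 0.9]))
```

In this sketch the policy shifts probability mass toward the candidate the discriminator scores highest, which is the mechanism the abstract refers to when it says discriminator outputs are used as rewards.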
End-to-end optimization of goal-driven and visually grounded dialogue systems
End-to-end design of dialogue systems has recently become a popular research
topic thanks to powerful tools such as encoder-decoder architectures for
sequence-to-sequence learning. Yet, most current approaches cast human-machine
dialogue management as a supervised learning problem, aiming at predicting the
next utterance of a participant given the full history of the dialogue. This
view is too simplistic to capture the planning problem inherent to
dialogue, as well as its grounded nature, which makes the context of a dialogue
larger than its history alone. This is why only chit-chat and question answering
tasks have been addressed so far using end-to-end architectures. In this paper,
we introduce a Deep Reinforcement Learning method to optimize visually grounded
task-oriented dialogues, based on the policy gradient algorithm. This approach
is tested on a dataset of 120k dialogues collected through Mechanical Turk and
provides encouraging results at solving both the problem of generating natural
dialogues and the task of discovering a specific object in a complex picture.
Adding fuel to the flames: how TTIP reinvigorated the politicization of trade
It is a truism to state that the Transatlantic Trade and Investment Partnership (TTIP) is a politicized issue, yet the explanations that account for this politicization are mostly singular in nature. In this paper I add to this understanding theoretically and empirically by presenting a broad analytic framework that puts TTIP at the intersection of two evolutions. There is, firstly, a longer-term trend of increasing political authority of (European) trade policy that is not considered legitimate, at least by several organizations and citizens. I argue that TTIP is an extension and an intensification of this perceived authority-without-legitimacy trend. Secondly, the particularly explosive situation that has occurred since 2013 is the result of a favorable political opportunity structure combined with pre-existing mobilization resources that have facilitated a large mobilization by civil society organizations. This explains the spike of politicization that is superimposed on this longer-term trend. Relying on several exploratory interviews, I try to uncover the determinants in the different categories.
What Role(s) for the European Union in National Dialogues? Lessons Learned from Yemen. EU Diplomacy Paper 05/2018
National dialogues aim to reconstruct a legitimate institutional framework after a
conflict through broader representativeness as well as a new social contract between
the state and the society. The European Union (EU) can play various roles in such
processes. However, the involvement of an external actor may undermine the
national ownership and credibility of national dialogues. Hence, the first aim of this
paper is to analyse how the EU can support national dialogues without undermining
their national ownership and legitimacy. To answer that question, this paper develops
a new analytical grid conceptualising the various roles that the EU can play in national
dialogues. This analytical grid, the Analysis of National Dialogues External Support
model (or ANDES model), shows that the EU has various entry points to support
national dialogues and that those vary from one national dialogue to the other. It is
then only through lessons learned that the EU may find the right balance between
pushing for liberal reforms and respecting the national ownership of the national
dialogue. In order to illustrate the ANDES model by a concrete example, the Yemeni
National Dialogue Conference (NDC) is analysed. While the Yemeni NDC was
considered at the beginning as highly promising, this paper’s second objective is to
analyse why this national dialogue failed. The paper finds that various elements in the
process-design, the decision-making management and the operationalisation of the
Yemeni national dialogue were not appropriate for the country’s situation, reflecting
a general lack of social cohesion of the Yemeni society during the national dialogue
and undermining the success of the process.
Deep reinforcement learning of dialogue policies with less weight updates
Deep reinforcement learning dialogue systems are attractive because they can jointly learn their feature representations and policies without manual feature engineering. But their application is challenging due to slow learning. We propose a two-stage method for accelerating the induction of single or multi-domain dialogue policies. While the first stage reduces the number of weight updates over time, the second stage uses very small minibatches (of as many as two learning experiences) sampled from experience replay memories. The former frequently updates the weights of the neural nets at early stages of training, and decreases the number of updates as training progresses by performing updates during exploration and by skipping updates during exploitation. The learning process is thus accelerated through fewer weight updates in both stages. An empirical evaluation in three domains (restaurants, hotels and TV guide) confirms that the proposed method trains policies 5 times faster than a baseline without the proposed method. Our findings are useful for training larger-scale neural-based spoken dialogue systems.
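One possible reading of the update rule in this abstract is sketched below. It is an interpretation, not the paper's algorithm: it assumes an epsilon-greedy agent whose exploration rate decays over time, performs a weight update only on steps where the agent explores, and draws a minibatch of two transitions from a replay memory. The function name `update_schedule`, the decay constants, and the placeholder transitions are all hypothetical.

```python
import random
from collections import deque


def update_schedule(num_steps, eps_start=1.0, eps_end=0.05, decay=0.995,
                    batch_size=2, memory_size=1000, seed=0, update_fn=None):
    """Return a list of booleans: True where a weight update was performed.

    Updates happen only on exploration steps (an assumption about the method),
    so they become rarer as epsilon decays and the agent exploits more.
    """
    rng = random.Random(seed)
    replay = deque(maxlen=memory_size)
    epsilon = eps_start
    did_update = []
    for step in range(num_steps):
        exploring = rng.random() < epsilon
        # Placeholder transition; a real agent would store
        # (state, action, reward, next_state) from the dialogue environment.
        replay.append((step, "action", 0.0, step + 1))
        updated = False
        if exploring and len(replay) >= batch_size:
            batch = rng.sample(list(replay), batch_size)
            if update_fn is not None:
                update_fn(batch)  # e.g. one gradient step on the tiny minibatch
            updated = True
        did_update.append(updated)
        epsilon = max(eps_end, epsilon * decay)
    return did_update


if __name__ == "__main__":
    flags = update_schedule(2000)
    print("updates in first 200 steps:", sum(flags[:200]))
    print("updates in last 200 steps:", sum(flags[-200:]))
```

Under these assumptions, most updates fall early in training, when exploration is frequent, and each update touches only two stored experiences, which mirrors the two sources of savings the abstract names.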