271 research outputs found

    Boosting Naturalness of Language in Task-oriented Dialogues via Adversarial Training

    Full text link
    The natural language generation (NLG) module in a task-oriented dialogue system produces user-facing utterances conveying required information. Thus, it is critical for the generated response to be natural and fluent. We propose to integrate adversarial training to produce more human-like responses. The model uses Straight-Through Gumbel-Softmax estimator for gradient computation. We also propose a two-stage training scheme to boost performance. Empirical results show that the adversarial training can effectively improve the quality of language generation in both automatic and human evaluations. For example, in the RNN-LG Restaurant dataset, our model AdvNLG outperforms the previous state-of-the-art result by 3.6% in BLEU.Comment: SIGDial 202

    Information Diffusion and External Influence in Networks

    Full text link
    Social networks play a fundamental role in the diffusion of information. However, there are two different ways of how information reaches a person in a network. Information reaches us through connections in our social networks, as well as through the influence of external out-of-network sources, like the mainstream media. While most present models of information adoption in networks assume information only passes from a node to node via the edges of the underlying network, the recent availability of massive online social media data allows us to study this process in more detail. We present a model in which information can reach a node via the links of the social network or through the influence of external sources. We then develop an efficient model parameter fitting technique and apply the model to the emergence of URL mentions in the Twitter network. Using a complete one month trace of Twitter we study how information reaches the nodes of the network. We quantify the external influences over time and describe how these influences affect the information adoption. We discover that the information tends to "jump" across the network, which can only be explained as an effect of an unobservable external influence on the network. We find that only about 71% of the information volume in Twitter can be attributed to network diffusion, and the remaining 29% is due to external events and factors outside the network

    Optomechanical Transductions in Single and Coupled Wheel Resonators

    Full text link
    In this report, the optomechanical transductions in both single and two side-coupled wheel resonators are investigated. In the single resonator, the optomechanical transduction sensitivity is determined by the optical and mechanical quality factors of the resonator. In the coupled resonators, the optomechanical transduction is related to the energy distribution in the two resonators, which is strongly dependent on the input detuning. Compared to a single resonator, the coupled resonators can still provide very sensitive optomechanical transduction even if the optical and mechanical quality factors of one resonator are degraded

    Learning to Attend On Essential Terms: An Enhanced Retriever-Reader Model for Open-domain Question Answering

    Full text link
    Open-domain question answering remains a challenging task as it requires models that are capable of understanding questions and answers, collecting useful information, and reasoning over evidence. Previous work typically formulates this task as a reading comprehension or entailment problem given evidence retrieved from search engines. However, existing techniques struggle to retrieve indirectly related evidence when no directly related evidence is provided, especially for complex questions where it is hard to parse precisely what the question asks. In this paper we propose a retriever-reader model that learns to attend on essential terms during the question answering process. We build (1) an essential term selector which first identifies the most important words in a question, then reformulates the query and searches for related evidence; and (2) an enhanced reader that distinguishes between essential terms and distracting words to predict the answer. We evaluate our model on multiple open-domain multiple-choice QA datasets, notably performing at the level of the state-of-the-art on the AI2 Reasoning Challenge (ARC) dataset

    Data Augmentation for Spoken Language Understanding via Pretrained Models

    Full text link
    The training of spoken language understanding (SLU) models often faces the problem of data scarcity. In this paper, we put forward a data augmentation method with pretrained language models to boost the variability and accuracy of generated utterances. Furthermore, we investigate and propose solutions to two previously overlooked scenarios of data scarcity in SLU: i) Rich-in-Ontology: ontology information with numerous valid dialogue acts are given; ii) Rich-in-Utterance: a large number of unlabelled utterances are available. Empirical results show that our method can produce synthetic training data that boosts the performance of language understanding models in various scenarios.Comment: 6 pages, 1 figur

    FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension

    Full text link
    This paper introduces a new neural structure called FusionNet, which extends existing attention approaches from three perspectives. First, it puts forward a novel concept of "history of word" to characterize attention information from the lowest word-level embedding up to the highest semantic-level representation. Second, it introduces an improved attention scoring function that better utilizes the "history of word" concept. Third, it proposes a fully-aware multi-level attention mechanism to capture the complete information in one text (such as a question) and exploit it in its counterpart (such as context or passage) layer by layer. We apply FusionNet to the Stanford Question Answering Dataset (SQuAD) and it achieves the first position for both single and ensemble model on the official SQuAD leaderboard at the time of writing (Oct. 4th, 2017). Meanwhile, we verify the generalization of FusionNet with two adversarial SQuAD datasets and it sets up the new state-of-the-art on both datasets: on AddSent, FusionNet increases the best F1 metric from 46.6% to 51.4%; on AddOneSent, FusionNet boosts the best F1 metric from 56.0% to 60.7%.Comment: Published in Sixth International Conference on Learning Representations (ICLR), 201

    Impossible Triangle: What's Next for Pre-trained Language Models?

    Full text link
    Recent development of large-scale pre-trained language models (PLM) have significantly improved the capability of models in various NLP tasks, in terms of performance after task-specific fine-tuning and zero-shot / few-shot learning. However, many of such models come with a dauntingly huge size that few institutions can afford to pre-train, fine-tune or even deploy, while moderate-sized models usually lack strong generalized few-shot learning capabilities. In this paper, we first elaborate the current obstacles of using PLM models in terms of the Impossible Triangle: 1) moderate model size, 2) state-of-the-art few-shot learning capability, and 3) state-of-the-art fine-tuning capability. We argue that all existing PLM models lack one or more properties from the Impossible Triangle. To remedy these missing properties of PLMs, various techniques have been proposed, such as knowledge distillation, data augmentation and prompt learning, which inevitably brings additional work to the application of PLMs in real scenarios. We then offer insights into future research directions of PLMs to achieve the Impossible Triangle, and break down the task into several key phases

    SDNet: Contextualized Attention-based Deep Network for Conversational Question Answering

    Full text link
    Conversational question answering (CQA) is a novel QA task that requires understanding of dialogue context. Different from traditional single-turn machine reading comprehension (MRC) tasks, CQA includes passage comprehension, coreference resolution, and contextual understanding. In this paper, we propose an innovated contextualized attention-based deep neural network, SDNet, to fuse context into traditional MRC models. Our model leverages both inter-attention and self-attention to comprehend conversation context and extract relevant information from passage. Furthermore, we demonstrated a novel method to integrate the latest BERT contextual model. Empirical results show the effectiveness of our model, which sets the new state of the art result in CoQA leaderboard, outperforming the previous best model by 1.6% F1. Our ensemble model further improves the result by 2.7% F1.Comment: 8 pages, 2 figure

    SIM: A Slot-Independent Neural Model for Dialogue State Tracking

    Full text link
    Dialogue state tracking is an important component in task-oriented dialogue systems to identify users' goals and requests as a dialogue proceeds. However, as most previous models are dependent on dialogue slots, the model complexity soars when the number of slots increases. In this paper, we put forward a slot-independent neural model (SIM) to track dialogue states while keeping the model complexity invariant to the number of dialogue slots. The model utilizes attention mechanisms between user utterance and system actions. SIM achieves state-of-the-art results on WoZ and DSTC2 tasks, with only 20% of the model size of previous models.Comment: 6 pages, 1 figur

    SPLAT: Speech-Language Joint Pre-Training for Spoken Language Understanding

    Full text link
    Spoken language understanding (SLU) requires a model to analyze input acoustic signal to understand its linguistic content and make predictions. To boost the models' performance, various pre-training methods have been proposed to learn rich representations from large-scale unannotated speech and text. However, the inherent disparities between the two modalities necessitate a mutual analysis. In this paper, we propose a novel semi-supervised learning framework, SPLAT, to jointly pre-train the speech and language modules. Besides conducting a self-supervised masked language modeling task on the two individual modules using unpaired speech and text, SPLAT aligns representations from the two modules in a shared latent space using a small amount of paired speech and text. Thus, during fine-tuning, the speech module alone can produce representations carrying both acoustic information and contextual semantic knowledge of an input acoustic signal. Experimental results verify the effectiveness of our approach on various SLU tasks. For example, SPLAT improves the previous state-of-the-art performance on the Spoken SQuAD dataset by more than 10%
    • …
    corecore