283 research outputs found

    Memory-augmented Neural Machine Translation

    Neural machine translation (NMT) has achieved notable success in recent times; however, it is also widely recognized that this approach has limitations in handling infrequent words and word pairs. This paper presents a novel memory-augmented NMT (M-NMT) architecture, which stores knowledge about how words (usually infrequently encountered ones) should be translated in a memory and then uses it to assist the neural model. We use this memory mechanism to combine the knowledge learned from a conventional statistical machine translation system with the rules learned by an NMT system, and also propose a solution for out-of-vocabulary (OOV) words based on this framework. Our experiments on two Chinese-English translation tasks demonstrated that the M-NMT architecture outperformed the NMT baseline by 9.0 and 2.7 BLEU points on the two tasks, respectively. Additionally, we found that this architecture resulted in a much more effective OOV treatment than competitive methods.

    Theoretical properties of quasi-stationary Monte Carlo methods

    This paper gives foundational results for the application of quasi-stationarity to Monte Carlo inference problems. We prove natural sufficient conditions for the quasi-limiting distribution of a killed diffusion to coincide with a target density of interest. We also quantify the rate of convergence to quasi-stationarity by relating the killed diffusion to an appropriate Langevin diffusion. As an example, we consider in detail a killed Ornstein–Uhlenbeck process with Gaussian quasi-stationary distribution. Comment: 27 pages, 1 figure. Final version of accepted paper. Minor typos corrected.
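
To make the Gaussian example concrete, the following is an illustrative calculation under assumptions that are not taken from the paper: a one-dimensional Ornstein–Uhlenbeck process with drift parameter \theta and a quadratic killing rate \kappa(x) = c x^2 (the symbols \theta, c, a and \lambda are all introduced here for illustration). Substituting a centered Gaussian density into the eigen-equation for a quasi-stationary density gives a variance of 1/a and a killing (decay) rate \lambda:

```latex
% Assumed setup (illustrative only):
%   dX_t = -\theta X_t\,dt + dW_t, killing rate \kappa(x) = c x^2, with \theta, c > 0.
\begin{aligned}
&\tfrac12 \pi''(x) + \theta\,\bigl(x\,\pi(x)\bigr)' - c\,x^2\,\pi(x) = -\lambda\,\pi(x),
  \qquad \pi(x) \propto e^{-a x^2/2}, \\
&\tfrac12 a^2 - \theta a - c = 0 \;\Longrightarrow\; a = \theta + \sqrt{\theta^2 + 2c}, \\
&\lambda = \tfrac12 a - \theta = \tfrac12\bigl(\sqrt{\theta^2 + 2c} - \theta\bigr) > 0 .
\end{aligned}
```

Under these assumptions the quasi-stationary density is that of N(0, 1/a), which is narrower than the stationary law N(0, 1/(2\theta)) of the unkilled process because paths that wander far from the origin are killed sooner.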

    Flexible and Creative Chinese Poetry Generation Using Neural Memory

    It has been shown that Chinese poems can be successfully generated by sequence-to-sequence neural models, particularly with the attention mechanism. A potential problem of this approach, however, is that neural models can only learn abstract rules, while poem generation is a highly creative process that involves not only rules but also innovations, for which purely statistical models are not appropriate in principle. This work proposes a memory-augmented neural model for Chinese poem generation, where the neural model and the augmented memory work together to balance the requirements of linguistic accordance and aesthetic innovation, leading to innovative generations that are still rule-compliant. In addition, it is found that the memory mechanism provides interesting flexibility that can be used to generate poems with different styles.

    Incidental Vocabulary Learning From Bilingual Subtitled Viewing: An Eye-Tracking Study

    This study examined the effectiveness of bilingual subtitles (relative to captions, subtitles, and no subtitles) for incidental vocabulary learning. Learners’ processing of novel words in the subtitles and its relationship with learning gains were also explored. One hundred and twelve intermediate to advanced Chinese learners of English watched a documentary in one of four conditions (bilingual subtitles, captions, subtitles, and no subtitles), while their eye movements were recorded. Pre- and post-viewing vocabulary tests (form recognition, meaning recall, and meaning recognition) assessed participants’ knowledge of the target vocabulary. Results suggested an advantage of bilingual subtitles over captions for meaning recognition and over subtitles for meaning recall. Bilingual subtitles were less effective than captions for form recognition. Participants in the bilingual subtitles group spent more time reading the Chinese translations of the target items than the English target words. The amount of attention to the English target words (but not to the translations) predicted learning gains.

    Ano-SuPs: Multi-size anomaly detection for manufactured products by identifying suspected patches

    Image-based systems have gained popularity owing to their capacity to provide rich manufacturing status information, low implementation costs and high acquisition rates. However, the complexity of the image background and the variety of anomaly patterns pose new challenges to existing matrix decomposition methods, which cannot adequately model such images. Moreover, the uncertainty of the anomaly can cause anomaly contamination problems, making the designed model and method highly susceptible to external disturbances. To address these challenges, we propose a two-stage anomaly detection method that detects anomalies by identifying suspected patches (Ano-SuPs). Specifically, we propose to detect the patches with anomalies by reconstructing the input image twice: the first step is to obtain a set of normal patches by removing the suspected patches, and the second step is to use those normal patches to refine the identification of the patches with anomalies. To demonstrate its effectiveness, we evaluate the proposed method systematically through simulation experiments and case studies. We further identify the key parameters and design steps that affect the model's performance and efficiency. Comment: accepted oral presentation at the 18th INFORMS DMDA Workshop.
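
The two-step idea described above (drop suspected patches, then re-reconstruct from the remaining normal ones) can be sketched roughly as follows. This is not the Ano-SuPs algorithm itself: the median-patch template in stage 1, the PCA basis in stage 2, the threshold rule, and every function and parameter name (`split_patches`, `two_stage_detect`, `suspect_frac`, `factor`) are assumptions made only for illustration.

```python
import numpy as np

def split_patches(image, size):
    """Cut a 2-D image into non-overlapping size x size patches, one per row."""
    h, w = image.shape
    rows, cols = h // size, w // size
    return (image[:rows * size, :cols * size]
            .reshape(rows, size, cols, size)
            .swapaxes(1, 2)
            .reshape(rows * cols, size * size))

def pca_residuals(patches, train, rank):
    """Reconstruction error of each patch under a rank-`rank` PCA basis fit on `train`."""
    mean = train.mean(axis=0)
    _, _, vt = np.linalg.svd(train - mean, full_matrices=False)
    basis = vt[:rank]
    centered = patches - mean
    recon = centered @ basis.T @ basis
    return np.linalg.norm(centered - recon, axis=1)

def two_stage_detect(image, size=8, rank=1, suspect_frac=0.2, factor=3.0):
    """Return indices of patches flagged as anomalous (hypothetical two-stage sketch)."""
    patches = split_patches(image, size)

    # Stage 1: compare every patch with a robust template (element-wise median
    # patch) and set aside the worst-fitting fraction as "suspected".
    template = np.median(patches, axis=0)
    err1 = np.linalg.norm(patches - template, axis=1)
    n_suspect = max(1, int(suspect_frac * len(patches)))
    normal = np.ones(len(patches), dtype=bool)
    normal[np.argsort(err1)[-n_suspect:]] = False

    # Stage 2: fit the reconstruction model on the remaining normal patches only,
    # then re-score all patches so the anomalies cannot contaminate the model.
    err2 = pca_residuals(patches, patches[normal], rank)
    threshold = factor * np.median(err2[normal])  # assumed heuristic threshold
    return np.flatnonzero(err2 > threshold)
```

For example, calling `two_stage_detect(image)` on a 64 x 64 array returns the row-major indices of the 8 x 8 patches whose stage-2 reconstruction error exceeds the threshold. Fitting the second model only on patches that survived stage 1 is what limits the anomaly contamination mentioned in the abstract.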

    Subgeometric hypocoercivity for piecewise-deterministic Markov process Monte Carlo methods

    We extend the hypocoercivity framework for piecewise-deterministic Markov process (PDMP) Monte Carlo established in [Andrieu et al. (2018)] to heavy-tailed target distributions, which exhibit subgeometric rates of convergence to equilibrium. We make use of weak Poincaré inequalities, as developed in the work of [Grothaus and Wang (2019)], the ideas of which we adapt to the PDMPs of interest. Along the way, we report largely potential-independent approaches to explicitly bounding solutions of the Poisson equation of the Langevin diffusion and their first and second derivatives, which are required here to control various terms arising in the application of the hypocoercivity result. Comment: 33 pages, 1 figure. Minor revisions made.

    Advanced Data Analytics for Data-rich Multistage Manufacturing Processes

    Nowadays, multistage manufacturing processes (MMPs) are usually equipped with complex sensing systems. They generate data with several unique characteristics: the output quality measurements from each stage are of different types, the comprehensive set of inputs (or process variables) has distinct degrees of influence over the process, the relationship between the inputs and outputs is sometimes ambiguous, and multiple types of faults occur repeatedly in the process during its operation. These characteristics of the data lead to new challenges in the data analytics of MMPs. In this thesis, we conduct three studies to tackle those challenges. In the first study, we propose a feature ranking scheme that ranks the process features based on their relationship with the final product quality. Our ranking scheme is called sparse distance correlation (SpaDC); it satisfies the important diversity criteria from the engineering perspective and prioritizes the features that uniquely characterize the manufacturing process. The theoretical properties of SpaDC are studied. Simulations, as well as two real-case studies, are conducted to validate the method. In the second study, we propose a holistic modeling approach for MMPs, aimed at understanding how intermediate quality measurements of mixed profile outputs relate to sparse effective inputs. This model can identify the effective inputs and the output variation patterns, and establish connections between them. Specifically, this objective is achieved by formulating and solving an optimization problem that involves the effects of process inputs on the outputs across the entire MMP. The ADMM algorithm that solves this problem is highly parallelizable and thus can handle a large amount of data of mixed types obtained from MMPs. In the third study, a retrospective analysis method is proposed for multiple functional signals. This method simultaneously identifies when multiple events occur in the system and characterizes how they affect the multiple sensing signals. A problem is formulated using the dictionary learning method, and the solution is obtained by iteratively updating the event signatures and sequences using ADMM algorithms. Finally, potential extensions to general interconnected systems are discussed. Ph.D.
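
As a rough illustration of the ranking idea in the first study, the sketch below scores each process feature by its plain sample distance correlation with the quality measurement and sorts the features by that score. It deliberately omits the sparsity and diversity components that distinguish SpaDC, whose details are not given in the abstract; the function names (`distance_correlation`, `rank_features`) are hypothetical.

```python
import numpy as np

def _centered_distances(v):
    """Pairwise Euclidean distance matrix of the samples, double-centered."""
    v = np.asarray(v, dtype=float).reshape(len(v), -1)
    d = np.linalg.norm(v[:, None, :] - v[None, :, :], axis=2)
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()

def distance_correlation(x, y):
    """Sample distance correlation between x and y (a value in [0, 1])."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = (a * b).mean()                             # squared distance covariance
    dvar2 = np.sqrt((a * a).mean() * (b * b).mean())   # product of distance std devs
    return 0.0 if dvar2 == 0 else float(np.sqrt(max(dcov2, 0.0) / dvar2))

def rank_features(features, quality):
    """Order the columns of `features` by distance correlation with `quality`."""
    scores = np.array([distance_correlation(features[:, j], quality)
                       for j in range(features.shape[1])])
    return np.argsort(-scores), scores
```

Here `features` is assumed to be an array with one row per product and one column per process variable, and `quality` a vector of final quality measurements. Unlike Pearson correlation, distance correlation also picks up nonlinear dependence, which is one plausible reason for building a ranking scheme on it.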