238 research outputs found

Deep Neural Machine Translation with Linear Associative Unit

Authors: Liu Qun, Lu Zhengdong, Wang Mingxuan, Zhou Jie
Publication date: 01/01/2017

Deep Neural Networks (DNNs) have provably enhanced the state-of-the-art Neural Machine Translation (NMT) with their capability in modeling complex functions and capturing complex linguistic structures. However, NMT systems with deep architecture in their encoder or decoder RNNs often suffer from severe gradient diffusion due to the non-linear recurrent activations, which often makes the optimization much more difficult. To address this problem, we propose novel linear associative units (LAU) to reduce the gradient propagation length inside the recurrent unit. Different from conventional approaches (LSTM unit and GRU), LAUs utilize linear associative connections between the input and output of the recurrent unit, which allow unimpeded information flow through both the space and time directions. The model is quite simple, but it is surprisingly effective. Our empirical study on Chinese-English translation shows that our model with proper configuration can improve by 11.7 BLEU upon Groundhog and the best reported results in the same setting. On the WMT14 English-German task and a larger WMT14 English-French task, our model achieves results comparable with the state of the art. Comment: 10 pages, ACL 201

arXiv.org e-Print Archive

Crossref

Memory-enhanced Decoder for Neural Machine Translation

Authors: Li Hang, Liu Qun, Lu Zhengdong, Wang Mingxuan
Publication date: 01/01/2016

We propose to enhance the RNN decoder in a neural machine translator (NMT) with external memory, as a natural but powerful extension to the state in the decoding RNN. This memory-enhanced RNN decoder is called MemDec. At each time step during decoding, MemDec will read from this memory and write to this memory once, both with content-based addressing. Unlike the unbounded memory in previous work (RNNsearch), which stores the representation of the source sentence, the memory in MemDec is a matrix with pre-determined size designed to better capture the information important for the decoding process at each time step. Our empirical study on Chinese-English translation shows that it can improve by 4.8 BLEU upon Groundhog and 5.3 BLEU upon Moses, yielding the best performance achieved with the same training set. Comment: 11 page

arXiv.org e-Print Archive

Crossref

KINEMATIC ANALYSIS OF SHOT PUT IN ELITE ATHLETES – A CASE STUDY

Authors: Liu Weimin, Wang Mingxuan
Publication venue: International Society of Biomechanics in Sports (ISBS)
Publication date: 18/02/2009

This paper presented the application of biomechanics in the shot put. Three elite shot-putters were video recorded. By planar analysis, the following kinematic data were discussed: (1) the loss of distance in performances, (2) the swinging span of the leg, (3) the height of the shot before the last effort, (4) the waving manner of the swinging arm, and (5) the influence of the differences between the velocity angle of the released shot and its optimum angle. The effects of the measured values of the above parameters on performances and their mechanical causes were analyzed. The results of this study provided information for improvement of performance in athletes

ISBS (International Society of Biomechanics in Sports): Conference Proceedings Archive

Bridging the Gap Between Variational Inference and Wasserstein Gradient Flows

Authors: Liu Song, Yi Mingxuan
Publication date: 30/10/2023

Variational inference is a technique that approximates a target distribution by optimizing within the parameter space of variational families. On the other hand, Wasserstein gradient flows describe optimization within the space of probability measures where they do not necessarily admit a parametric density function. In this paper, we bridge the gap between these two methods. We demonstrate that, under certain conditions, the Bures-Wasserstein gradient flow can be recast as the Euclidean gradient flow where its forward Euler scheme is the standard black-box variational inference algorithm. Specifically, the vector field of the gradient flow is generated via the path-derivative gradient estimator. We also offer an alternative perspective on the path-derivative gradient, framing it as a distillation procedure to the Wasserstein gradient flow. Distillations can be extended to encompass f-divergences and non-Gaussian variational families. This extension yields a new gradient estimator for f-divergences, readily implementable using contemporary machine learning libraries like PyTorch or TensorFlow

arXiv.org e-Print Archive

Task Transfer by Preference-Based Cost Learning

Authors: Huang Wenbing, Jing Mingxuan, Liu Huaping, Ma Xiaojian, Sun Fuchun
Publication date: 18/02/2019

The goal of task transfer in reinforcement learning is migrating the action policy of an agent from the source task to the target task. Given their successes on robotic action planning, current methods mostly rely on two requirements: exactly-relevant expert demonstrations or an explicitly-coded cost function on the target task, both of which, however, are inconvenient to obtain in practice. In this paper, we relax these two strong conditions by developing a novel task transfer framework where the expert preference is applied as guidance. In particular, we alternate the following two steps: first, experts apply pre-defined preference rules to select related expert demonstrations for the target task. Second, based on the selection result, we learn the target cost function and trajectory distribution simultaneously via enhanced Adversarial MaxEnt IRL and generate more trajectories from the learned target distribution for the next preference selection. A theoretical analysis of the distribution learning and the convergence of the proposed algorithm is provided. Extensive simulations on several benchmarks have been conducted to further verify the effectiveness of the proposed method. Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed equally to this wor

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Rayleigh-Taylor Unstable Flames: the Coupled Effect of Multiple Perturbations

Authors: Hicks Elizabeth P., Liu Mingxuan
Publication date: 26/09/2023

The Rayleigh-Taylor (RT) instability is important in the fields of aerospace engineering, nuclear physics, and astrophysical research, particularly in studies of Type Ia supernovae. In some applications, the RT instability is complicated by a reaction at the unstable interface. In this paper, we show how this reaction changes the behavior of the RT instability. Using 2D direct numerical simulations (DNS) of Boussinesq premixed flames with a model reaction rate, we show how the flame responds to three types of perturbation: a large amplitude single mode primary perturbation, a smaller amplitude single mode secondary perturbation, and a numerically generated system perturbation with both single mode and multimode components. Early on, the evolution of the flame is dominated by the primary perturbation and, unlike single mode nonreacting RT, the flame propagates as a metastable traveling wave in the form of bubbles separated by cusp-like spikes. However, the lifetime of this traveling wave depends on the properties of the secondary and system perturbations and on the strength of gravity. Once the traveling wave is destabilized, the flame front bubbles rapidly grow to large scales. We identify five distinct flame growth solution types, with the symmetry and properties of each depending on the balance and interactions between the three types of perturbation. In particular, we show that the primary and secondary modes can couple to generate a tertiary mode which ultimately dominates the flow. Depending on the wavenumber of the tertiary mode, the flame may stall, develop coherent pulsations, or even become a metastable traveling wave again, behaviors not seen in nonreacting RT. Comment: 25 pages, 15 figures; Submitted to: Physical Review Fluids; Code and Data Release: see https://doi.org/10.5281/zenodo.834691

arXiv.org e-Print Archive

CDF W mass anomaly from a dark sector with a Stueckelberg-Higgs portal

Authors: Du Mingxuan, Liu Zuowei, Nath Pran
Publication venue: Elsevier BV
Publication date: 29/04/2022

We propose an explanation for the new W mass measurement recently reported by the CDF collaboration, which is larger than the standard model expectation by about 7 standard deviations. To alleviate the tensions that are imposed on the electroweak sector by the new W mass measurement, we carry out an analysis in the Stueckelberg extended standard model. In this model, a new neutral gauge boson appears that mixes with the two neutral gauge bosons in the electroweak sector, both via the Stueckelberg mass terms and via the gauge invariant Stueckelberg-Higgs portal interaction. The mixing spoils the custodial symmetry at the tree level, so that the simple relation between the W boson mass and the Z boson mass does not hold. We find that such an extension increases the W boson mass if the new gauge boson mass is larger than the Z boson mass. We further show that there exists a significant part of the parameter space in the extended model which includes the CDF mass anomaly and is consistent with the various observables at the Z pole and with the ATLAS dilepton limits. The Stueckelberg Z'_{\rm St} boson, which resolves the CDF W mass anomaly, should be searchable in future LHC experiments. Comment: v1, 6 pages, 2 figures. v2, refs adde

arXiv.org e-Print Archive