Deep Neural Machine Translation with Linear Associative Unit
Deep Neural Networks (DNNs) have provably enhanced the state-of-the-art
Neural Machine Translation (NMT) with their capability in modeling complex
functions and capturing complex linguistic structures. However, NMT systems with
deep architectures in their encoder or decoder RNNs often suffer from severe
gradient diffusion due to the non-linear recurrent activations, which often
makes optimization much more difficult. To address this problem, we propose
novel linear associative units (LAU) to reduce the gradient propagation length
inside the recurrent unit. Unlike conventional approaches (LSTM and GRU
units), LAUs use linear associative connections between the input and output
of the recurrent unit, which allow unimpeded information flow in both the
space and time directions. The model is quite simple, but it is surprisingly
effective. Our empirical study on Chinese-English translation shows that our
model with proper configuration can improve by 11.7 BLEU upon Groundhog and the
best reported results in the same setting. On the WMT14 English-German task and a
larger WMT14 English-French task, our model achieves results comparable with
the state of the art. Comment: 10 pages, ACL 201
Memory-enhanced Decoder for Neural Machine Translation
We propose to enhance the RNN decoder in a neural machine translator (NMT)
with external memory, as a natural but powerful extension to the state in the
decoding RNN. This memory-enhanced RNN decoder is called MemDec. At
each time step during decoding, MemDec reads from this memory and writes
to this memory once, both with content-based addressing. Unlike the unbounded
memory in previous work (RNNsearch) used to store the representation of the source
sentence, the memory in MemDec is a matrix of pre-determined size
designed to better capture the information important for the decoding process
at each time step. Our empirical study on Chinese-English translation shows
that it can improve by BLEU upon Groundhog and BLEU upon Moses,
yielding the best performance achieved with the same training set. Comment: 11 pages
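The content-based read and write described in the MemDec abstract can be illustrated with a minimal sketch. The abstract does not give the addressing equations, so this uses a generic Neural-Turing-Machine-style scheme (cosine similarity, softmax weights, erase/add write) as an assumption; the function names and the `sharpness` parameter are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors (small epsilon avoids 0/0)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def address(memory, key, sharpness=10.0):
    """Content-based addressing: softmax over scaled similarities to each row."""
    scores = [sharpness * cosine(row, key) for row in memory]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read(memory, key, sharpness=10.0):
    """Return the attention-weighted sum of memory rows."""
    weights = address(memory, key, sharpness)
    width = len(memory[0])
    return [sum(w * row[j] for w, row in zip(weights, memory)) for j in range(width)]


def write(memory, key, erase, add, sharpness=10.0):
    """Return a new fixed-size memory after an erase-then-add update."""
    weights = address(memory, key, sharpness)
    return [
        [cell * (1.0 - w * e) + w * a for cell, e, a in zip(row, erase, add)]
        for w, row in zip(weights, memory)
    ]
```

The memory here is a plain list of rows whose shape never changes, which mirrors the fixed-size matrix the abstract contrasts with an unbounded source-side memory. In a real decoder the key, erase, and add vectors would be produced from the decoder state by learned layers.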
KINEMATIC ANALYSIS OF SHOT PUT IN ELITE ATHLETES – A CASE STUDY
This paper presents the application of biomechanics to the shot put. Three elite shot-putters were video recorded. Using planar analysis, the following kinematic data are discussed: (1) the loss of distance in performances, (2) the swinging span of the leg, (3) the height of the shot before the last effort, (4) the waving manner of the swinging arm, and (5) the influence of the difference between the velocity angle of the released shot and its optimum angle. The effects of the measured values of the above parameters on performance and their mechanical causes were analyzed. The results of this study provide information for improving athletes' performance.
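Item (5) above compares the release angle with its optimum. As a rough illustration, the sketch below uses the textbook drag-free projectile model with a release height; the function names and any input values are hypothetical and are not data from the study. The model also treats release speed as independent of angle, which real throwers may not satisfy.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def flight_distance(speed, angle_deg, height, g=G):
    """Horizontal distance of a drag-free projectile released at `height` metres."""
    theta = math.radians(angle_deg)
    vx = speed * math.cos(theta)
    vy = speed * math.sin(theta)
    # Positive root of: height + vy*t - g*t^2/2 = 0
    t = (vy + math.sqrt(vy * vy + 2.0 * g * height)) / g
    return vx * t


def optimum_angle_deg(speed, height, g=G):
    """Angle maximising drag-free distance: tan(theta) = v / sqrt(v^2 + 2*g*h)."""
    return math.degrees(math.atan(speed / math.sqrt(speed * speed + 2.0 * g * height)))


def distance_lost(speed, angle_deg, height, g=G):
    """Distance given up by releasing at `angle_deg` instead of the model optimum."""
    best = flight_distance(speed, optimum_angle_deg(speed, height, g), height, g)
    return best - flight_distance(speed, angle_deg, height, g)
```

With a positive release height the model optimum falls below 45 degrees, and `distance_lost` gives one simple way to quantify the effect of an angle deviation under these assumptions.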
Bridging the Gap Between Variational Inference and Wasserstein Gradient Flows
Variational inference is a technique that approximates a target distribution
by optimizing within the parameter space of variational families. On the other
hand, Wasserstein gradient flows describe optimization within the space of
probability measures where they do not necessarily admit a parametric density
function. In this paper, we bridge the gap between these two methods. We
demonstrate that, under certain conditions, the Bures-Wasserstein gradient flow
can be recast as the Euclidean gradient flow where its forward Euler scheme is
the standard black-box variational inference algorithm. Specifically, the
vector field of the gradient flow is generated via the path-derivative gradient
estimator. We also offer an alternative perspective on the path-derivative
gradient, framing it as a distillation procedure to the Wasserstein gradient
flow. Distillations can be extended to encompass -divergences and
non-Gaussian variational families. This extension yields a new gradient
estimator for -divergences, readily implementable using contemporary machine
learning libraries like PyTorch or TensorFlow.
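To make the forward-Euler reading of black-box variational inference concrete, here is a minimal sketch under strong simplifying assumptions: a one-dimensional Gaussian target, a one-dimensional Gaussian variational family parameterised by mean `m` and standard deviation `s`, and hand-written gradients instead of an autodiff library. It uses the path-derivative estimator (only the dependence through the sample is differentiated). The function name and defaults are hypothetical, and this is not the paper's implementation or its Bures-Wasserstein parameterisation.

```python
import random


def fit_gaussian_vi(mu, sigma, steps=2000, lr=0.05, n_samples=64, seed=0):
    """Fit q = N(m, s^2) to p = N(mu, sigma^2) by minimising KL(q || p).

    Each iteration is one forward-Euler step along the negative
    path-derivative gradient estimate.
    """
    rng = random.Random(seed)
    m, s = 0.0, 1.0  # arbitrary initial variational parameters
    for _step in range(steps):
        grad_m = 0.0
        grad_s = 0.0
        for _sample in range(n_samples):
            eps = rng.gauss(0.0, 1.0)
            x = m + s * eps  # reparameterised sample from q
            # d/dx [log q(x) - log p(x)], with q's parameters held fixed
            g = -(x - m) / (s * s) + (x - mu) / (sigma * sigma)
            grad_m += g        # chain rule: dx/dm = 1
            grad_s += g * eps  # chain rule: dx/ds = eps
        m -= lr * grad_m / n_samples
        s -= lr * grad_s / n_samples
    return m, s
```

In this toy setting the per-sample gradient `g` vanishes identically once `m == mu` and `s == sigma`, so the estimator's noise shrinks as the fit approaches the target.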
Task Transfer by Preference-Based Cost Learning
The goal of task transfer in reinforcement learning is to migrate the action
policy of an agent from the source task to the target task. Given their
successes on robotic action planning, current methods mostly rely on two
requirements: exactly-relevant expert demonstrations or an explicitly-coded
cost function on the target task, both of which, however, are inconvenient to
obtain in practice. In this paper, we relax these two strong conditions by
developing a novel task transfer framework where expert preference is
applied as guidance. In particular, we alternate between the following two steps.
First, experts apply pre-defined preference rules to select related
expert demonstrations for the target task. Second, based on the selection
result, we learn the target cost function and trajectory distribution
simultaneously via enhanced Adversarial MaxEnt IRL and generate more
trajectories by the learned target distribution for the next preference
selection. A theoretical analysis of the distribution learning and the
convergence of the proposed algorithm is provided. Extensive simulations on
several benchmarks have been conducted to further verify the effectiveness
of the proposed method. Comment: Accepted to AAAI 2019. Mingxuan Jing and Xiaojian Ma contributed
equally to this work
Rayleigh-Taylor Unstable Flames: the Coupled Effect of Multiple Perturbations
The Rayleigh-Taylor (RT) instability is important in the fields of aerospace
engineering, nuclear physics, and astrophysical research, particularly in
studies of Type Ia supernovae. In some applications, the RT instability is
complicated by a reaction at the unstable interface. In this paper, we show how
this reaction changes the behavior of the RT instability. Using 2D direct
numerical simulations (DNS) of Boussinesq premixed flames with a model reaction
rate, we show how the flame responds to three types of perturbation: a large
amplitude single mode primary perturbation, a smaller amplitude single mode
secondary perturbation, and a numerically generated system perturbation with
both single mode and multimode components. Early on, the evolution of the flame
is dominated by the primary perturbation and, unlike single mode
nonreacting RT, the flame propagates as a metastable traveling wave in the form
of bubbles separated by cusp-like spikes. However, the lifetime of this
traveling wave depends on the properties of the secondary and system
perturbations and on the strength of gravity. Once the traveling wave is
destabilized, the flame front bubbles rapidly grow to large scales. We identify
five distinct flame growth solution types, with the symmetry and properties of
each depending on the balance and interactions between the three types of
perturbation. In particular, we show that the primary and secondary modes can
couple to generate a tertiary mode which ultimately dominates the flow.
Depending on the wavenumber of the tertiary mode, the flame may stall, develop
coherent pulsations, or even become a metastable traveling wave again,
behaviors not seen in nonreacting RT. Comment: 25 pages, 15 figures; Submitted to: Physical Review Fluids; Code and
Data Release: see https://doi.org/10.5281/zenodo.834691
CDF W mass anomaly from a dark sector with a Stueckelberg-Higgs portal
We propose an explanation for the new W mass measurement recently reported by
the CDF collaboration, which is larger than the standard model expectation by
about 7 standard deviations. To alleviate the tensions that are imposed on the
electroweak sector by the new W mass measurement, we carry out an analysis in
the Stueckelberg-extended standard model, in which a new neutral gauge boson
appears. It mixes with the two neutral gauge bosons in the electroweak sector
via both the Stueckelberg mass terms and the gauge-invariant
Stueckelberg-Higgs portal interaction. This mixing spoils the custodial symmetry
at tree level, so that the simple relation between the W boson mass and the Z boson
mass does not hold. We find that such an extension increases the W boson mass
if the new gauge boson mass is larger than the Z boson mass. We further show
that there exists a significant part of the parameter space in the extended
model which includes the CDF mass anomaly and is consistent with the various
observables at the Z pole and consistent with the ATLAS dilepton limits. The
Stueckelberg boson, which resolves the CDF W mass anomaly, should
be searchable in future LHC experiments. Comment: v1, 6 pages, 2 figures. v2, refs added
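For context, the simple relation between the W and Z boson masses mentioned in this abstract is presumably the standard tree-level consequence of custodial symmetry in the standard model. It is written here in conventional notation; the symbols $\rho$ and $\theta_W$ (the weak mixing angle) are not named in the abstract.

```latex
\rho \equiv \frac{M_W^2}{M_Z^2 \cos^2\theta_W} = 1
\quad\Longleftrightarrow\quad
M_W = M_Z \cos\theta_W
```

Mixing of the Z boson with an additional neutral gauge boson shifts the Z mass eigenvalue relative to this relation, which is how a tree-level deviation from $\rho = 1$ can arise.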