BattRAE: Bidimensional Attention-Based Recursive Autoencoders for Learning Bilingual Phrase Embeddings
In this paper, we propose a bidimensional attention based recursive
autoencoder (BattRAE) to integrate clues and source-target interactions at
multiple levels of granularity into bilingual phrase representations. We employ
recursive autoencoders to generate tree structures of phrases with embeddings
at different levels of granularity (e.g., words, sub-phrases and phrases). Over
these embeddings on the source and target side, we introduce a bidimensional
attention network to learn their interactions encoded in a bidimensional
attention matrix, from which we extract two soft attention weight distributions
simultaneously. These weight distributions enable BattRAE to generate
compositive phrase representations via convolution. Based on the learned phrase
representations, we further use a bilinear neural model, trained via a
max-margin method, to measure bilingual semantic similarity. To evaluate the
effectiveness of BattRAE, we incorporate this semantic similarity as an
additional feature into a state-of-the-art SMT system. Extensive experiments on
NIST Chinese-English test sets show that our model achieves a substantial
improvement of up to 1.63 BLEU points on average over the baseline.
Comment: 7 pages, accepted by AAAI 201
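The attention and scoring steps described in the BattRAE abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes that source and target node embeddings already share one space of equal dimension, that attention scores are sigmoid-squashed dot products, and that a weighted sum stands in for the composition step. The names `attend`, `bilinear_score`, `max_margin_loss` and the matrix `S` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(src, tgt):
    """src, tgt: lists of node embeddings (words, sub-phrases, phrase).

    Assumption: both sides already live in a shared space of equal dimension.
    """
    # Attention matrix: one sigmoid score per (source node, target node) pair.
    matrix = [[1.0 / (1.0 + math.exp(-dot(s, t))) for t in tgt] for s in src]
    # Row sums give source-side weights, column sums give target-side weights.
    src_w = softmax([sum(row) for row in matrix])
    tgt_w = softmax([sum(col) for col in zip(*matrix)])
    # Weighted sums of node embeddings stand in for the phrase representations.
    p_src = [sum(w * s[k] for w, s in zip(src_w, src)) for k in range(len(src[0]))]
    p_tgt = [sum(w * t[k] for w, t in zip(tgt_w, tgt)) for k in range(len(tgt[0]))]
    return src_w, tgt_w, p_src, p_tgt


def bilinear_score(p_src, p_tgt, S):
    """Similarity p_src^T S p_tgt for a (hypothetical) learned matrix S."""
    return sum(p_src[i] * S[i][j] * p_tgt[j]
               for i in range(len(p_src)) for j in range(len(p_tgt)))


def max_margin_loss(score_pos, score_neg, margin=1.0):
    """Hinge loss pushing a correct pair above a corrupted one by `margin`."""
    return max(0.0, margin - score_pos + score_neg)


# Toy embeddings, invented for illustration only.
src_nodes = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
tgt_nodes = [[0.9, 0.1], [0.2, 0.8]]
src_w, tgt_w, p_src, p_tgt = attend(src_nodes, tgt_nodes)
identity = [[1.0, 0.0], [0.0, 1.0]]
score = bilinear_score(p_src, p_tgt, identity)
```

Both weight distributions are read off the same matrix, which is what "bidimensional" refers to in the abstract. In training, `max_margin_loss` would compare the score of a true phrase pair against that of a corrupted pair.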
Translating Phrases in Neural Machine Translation
Phrases play an important role in natural language understanding and machine
translation (Sag et al., 2002; Villavicencio et al., 2005). However, it is
difficult to integrate them into current neural machine translation (NMT), which
reads and generates sentences word by word. In this work, we propose a method
to translate phrases in NMT by integrating a phrase memory storing target
phrases from a phrase-based statistical machine translation (SMT) system into
the encoder-decoder architecture of NMT. At each decoding step, the phrase
memory is first re-written by the SMT model, which dynamically generates
relevant target phrases with contextual information provided by the NMT model.
Then the proposed model reads the phrase memory to make probability estimations
for all phrases in the phrase memory. If phrase generation is carried out, the
NMT decoder selects an appropriate phrase from the memory to perform phrase
translation and updates its decoding state by consuming the words in the
selected phrase. Otherwise, the NMT decoder generates a word from the
vocabulary as the general NMT decoder does. Experimental results on
Chinese-to-English translation show that the proposed model achieves significant
improvements over the baseline on various test sets.
Comment: Accepted by EMNLP 201
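The choice between phrase generation and word generation at each decoding step can be illustrated with a toy sketch. It is not the paper's model: the probabilities are precomputed here, whereas the abstract describes a memory that is re-written by the SMT model at every step and a decoder state that is updated after each selected phrase. The names `decode_step`, `greedy_decode`, `phrase_gate` and the 0.5 threshold are assumptions made for illustration.

```python
def decode_step(word_probs, phrase_probs, phrase_gate, threshold=0.5):
    """Return the tokens emitted at one decoding step.

    word_probs:   dict mapping a word to its probability
    phrase_probs: dict mapping a tuple of words (a memory phrase) to its probability
    phrase_gate:  hypothetical probability that a phrase should be generated
    """
    if phrase_probs and phrase_gate > threshold:
        # Phrase mode: pick the most probable phrase and consume all its words.
        best_phrase = max(phrase_probs, key=phrase_probs.get)
        return list(best_phrase)
    # Word mode: behave like an ordinary word-by-word decoder.
    return [max(word_probs, key=word_probs.get)]


def greedy_decode(steps):
    """steps: list of (word_probs, phrase_probs, phrase_gate), one per step."""
    output = []
    for word_probs, phrase_probs, phrase_gate in steps:
        output.extend(decode_step(word_probs, phrase_probs, phrase_gate))
    return output


# Toy numbers, invented for illustration only.
steps = [
    ({"the": 0.7, "a": 0.3}, {}, 0.1),
    ({"talks": 0.6, "meeting": 0.4},
     {("peace", "talks"): 0.8, ("talks",): 0.2}, 0.9),
    ({"ended": 0.9, "end": 0.1}, {("came", "to", "an", "end"): 0.6}, 0.2),
]
translation = greedy_decode(steps)
```

With these toy inputs the second step emits a two-word phrase from the memory, while the first and third steps fall back to single words.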
Uncertainty quantification and its properties for hidden Markov models with application to condition based maintenance
Condition-based maintenance (CBM) can be viewed as a transformation of data gathered from a piece of equipment into information about its condition, and further into decisions on what to do with the equipment. A hidden Markov model (HMM) is a useful framework for probabilistically modeling the condition of complex engineering systems whose underlying states are only partially observable. Condition monitoring and prediction for such a system require accurate knowledge of the HMM that describes its degradation, identified from data collected by the sensors mounted on it, as well as an understanding of the uncertainty of the HMMs identified from the available data. To that end, this thesis proposes a novel HMM estimation scheme based on Bayes' theorem. The proposed Bayesian approach for estimating HMM parameters naturally yields information about parametric model uncertainties via the posterior distributions of the HMM parameters that emerge from the estimation process. In addition, a novel condition monitoring scheme based on uncertain HMMs of the degradation process is proposed and demonstrated on a large dataset obtained from a semiconductor manufacturing facility. A portion of the data was used to build operating-mode-specific HMMs of machine degradation via the newly proposed Bayesian estimation process, while the remainder of the data was used for monitoring machine condition using the uncertain degradation HMMs yielded by Bayesian estimation. Comparison with a traditional signature-based statistical monitoring method showed that the newly proposed approach effectively utilizes the fact that its parameters are themselves uncertain, leading to orders of magnitude fewer false alarms. This methodology is further extended to address the practical issue that maintenance interventions are usually imperfect. We propose both a novel non-ergodic and non-homogeneous HMM that assumes imperfect maintenance and a novel process monitoring method capable of monitoring the hidden states while accounting for model uncertainty. Significant improvements in both the log-likelihood of the estimated HMM parameters and monitoring performance were observed, compared to those obtained using degradation HMMs that always assumed perfect maintenance.
Finally, the behavior of the posterior distribution of the parameters of the unidirectional non-ergodic HMMs used in this thesis to model degradation was analyzed theoretically in terms of its evolution as more data become available in the estimation process. The convergence problem is formulated as a Bernstein-von Mises theorem (BvMT), and under certain regularity conditions, the sequence of posterior distributions is proven to converge to a Gaussian distribution whose covariance matrix is the inverse of the Fisher information matrix. An example of a unidirectional HMM is presented for which the regularity conditions are verified, and the expected theoretical results are illustrated using simulation. Understanding this convergence of posterior distributions enables one to determine when Bayesian estimation of degradation HMMs is justified and converges toward the true model parameters, as well as how much data one then needs to achieve the desired accuracy of the resulting model. Understanding these issues is of the utmost importance if HMMs are to be used for degradation modeling and monitoring.
Operations Research and Industrial Engineerin
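For orientation, the Bernstein-von Mises convergence mentioned in the HMM thesis abstract is commonly written in the generic textbook form below. The notation is assumed here rather than taken from the thesis, whose exact regularity conditions, estimator and scaling for non-ergodic HMMs may differ.

```latex
% Assumed notation (not from the source): \theta_0 is the true parameter,
% \hat{\theta}_n a suitable estimator (e.g. the MLE) built from n observations,
% I(\theta_0) the Fisher information matrix, and
% \Pi(\cdot \mid y_{1:n}) the posterior distribution given the data.
\[
  \left\| \Pi(\cdot \mid y_{1:n})
  - \mathcal{N}\!\left(\hat{\theta}_n,\; \tfrac{1}{n}\, I(\theta_0)^{-1}\right)
  \right\|_{\mathrm{TV}}
  \xrightarrow{\;P\;} 0
  \quad \text{as } n \to \infty .
\]
```

Read this way, the posterior spread shrinks at rate $1/\sqrt{n}$, which is how such a result can indicate how much data is needed for a desired accuracy.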
Stochastic Modelling and Analysis for Bridges under Spatially Varying Ground Motions
Earthquakes are undoubtedly among the greatest natural disasters, capable of inducing serious structural damage and huge losses of property and life. These destructive consequences have not only made structural seismic analysis and design much more important but have also necessitated more realistic representations of ground motions, such as the inclusion of ground motion spatial variations in earthquake modelling and in the seismic analysis and design of structures.
Recorded seismic ground motions exhibit spatial variations in their amplitudes and phases, and these spatial variabilities have an important effect on the responses of structures extended in space, such as long span bridges. Because of the multi-parametric nature and complexity of the problem, the development of specific design provisions on spatial variabilities of ground motions in modern seismic codes has been impeded. Eurocode 8 is currently the only seismic standard worldwide that gives a set of detailed guidelines to explicitly tackle spatial variabilities of ground motions in bridge design, providing both a simplified design scheme and an analytical approach. However, there is a gap between the code-specified provisions in Eurocode 8 and the realistic representation of spatially varying ground motions (SVGM) and the corresponding stochastic vibration analysis (SVA) approaches. This study is devoted to bridging this gap through the modelling of SVGM and the development of SVA approaches for structures extended in space under SVGM.
A complete and realistic SVGM representation approach is developed by accounting for the incoherence effect, wave-passage effect, site-response effect, ground motion nonstationarity, tridirectionality, and spectra-compatibility. This effort brings together several aspects: rational determination of seismic scenarios, comprehensive methods of accounting for varying site effects, conditional modelling of SVGM nonstationarity, and code-specified ground motion spectra-compatibility.
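Two of the effects listed above, incoherence and wave passage, are often combined into a complex coherency function between two stations. The sketch below uses a Luco-Wong-type exponential decay for the incoherence term and a pure phase lag for the wave passage term. Both are common textbook forms chosen here as assumptions, and the abstract does not say which coherency model the thesis adopts. The site-response effect is left out, and the parameter values (`alpha`, the apparent velocity, the station distances) are illustrative only.

```python
import cmath
import math


def incoherence(omega, distance, alpha=2.5e-4):
    """Luco-Wong-type coherency loss exp(-(alpha * omega * d)^2).

    omega: circular frequency in rad/s; distance: station separation in m;
    alpha: incoherence parameter in s/m (illustrative value, an assumption).
    """
    return math.exp(-(alpha * omega * distance) ** 2)


def wave_passage(omega, distance_along_path, apparent_velocity):
    """Phase lag from the finite apparent wave velocity along the propagation path.

    The negative sign follows one common convention; others flip it.
    """
    return cmath.exp(-1j * omega * distance_along_path / apparent_velocity)


def coherency(omega, distance, distance_along_path, apparent_velocity, alpha=2.5e-4):
    """Complex coherency combining incoherence and wave-passage effects only."""
    return (incoherence(omega, distance, alpha)
            * wave_passage(omega, distance_along_path, apparent_velocity))


omega = 2.0 * math.pi * 2.0  # 2 Hz expressed in rad/s
near = coherency(omega, 100.0, 100.0, 1000.0)  # supports 100 m apart
far = coherency(omega, 500.0, 500.0, 1000.0)   # supports 500 m apart
```

The modulus of the result reflects only the incoherence term, while the wave-passage term contributes phase alone, so `abs(near)` exceeds `abs(far)` for these inputs.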
A comprehensive, systematic, and efficient SVA methodology is derived for long span structures under tridirectional nonstationary SVGM. An absolute-response-oriented scheme of the pseudo-excitation method and an improved high-precision direct integration method are proposed to reduce the enormous computational effort of conventional nonstationary SVA. A scheme accounting for the tridirectional varying site-response effect is systematically incorporated into the nonstationary SVA scheme.
The proposed highly efficient and accurate SVA approach is implemented and verified in a general finite element analysis platform to make it readily applicable to SVA of complex structures. Based on the proposed SVA approach, parametric studies of two practical long span bridges under SVGM are conducted.
To account for spatial randomness and variability of soil properties in soil-structure interaction analysis of structures under SVGM, a meshfree-Galerkin approach is proposed within the Karhunen-Loeve expansion scheme for representing spatial soil properties modelled as a random field. The meshfree shape functions are proposed as a set of basis functions in the Galerkin scheme to solve the integral equation of the Karhunen-Loeve expansion, with a proposed optimization scheme for treating the compatibility between the target and analytical covariance models. The accuracy and validity of the meshfree-Galerkin scheme are assessed and demonstrated by representing covariance models for various homogeneous and nonhomogeneous spatial fields.
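To make the Karhunen-Loeve integral equation concrete, the sketch below approximates its leading eigenpairs for a one-dimensional random field. It deliberately swaps in a much simpler technique than the one proposed in the thesis: a midpoint-rule (Nyström-type) discretization with power iteration, not a meshfree-Galerkin scheme. The exponential covariance model, the domain length, the correlation length and all function names are assumptions chosen for illustration.

```python
import math


def exp_cov(x, y, sigma=1.0, corr_len=1.0):
    """Exponential covariance model (an assumed target covariance)."""
    return sigma ** 2 * math.exp(-abs(x - y) / corr_len)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def kl_modes(length=1.0, n=40, n_modes=3, iters=200):
    """Approximate the leading KL eigenvalues and eigenvectors on [0, length]."""
    h = length / n
    xs = [(i + 0.5) * h for i in range(n)]  # midpoints of n equal cells
    # Midpoint rule turns the integral operator into the matrix C(x_i, x_j) * h.
    A = [[exp_cov(xi, xj) * h for xj in xs] for xi in xs]
    modes = []
    for mode_index in range(n_modes):
        v = [1.0 / (i + 1) for i in range(n)]  # generic starting vector
        lam = 0.0
        for step in range(iters):
            w = [dot(row, v) for row in A]
            # Deflation: project out the modes that were already found.
            for prev_lam, u in modes:
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            lam = math.sqrt(dot(w, w))  # converges to the next eigenvalue
            v = [wi / lam for wi in w]
        modes.append((lam, v))
    return xs, modes


xs, modes = kl_modes()
eigenvalues = [lam for lam, vec in modes]
```

The eigenvalues decrease with mode number, and their sum over all modes equals the trace of the discretized operator (the variance times the domain length). This is why a truncated expansion with a few modes can capture most of the field's variance.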
The developed modelling approaches of SVGM and the derived analytical SVA approaches can be applied to provide more refined solutions for quantitatively assessing code-specified design provisions and developing new design provisions. The proposed meshfree-Galerkin approach can be used to account for spatial randomness and variability of soil properties in soil-structure interaction analysis.
4 month