A non-Gaussian continuous state space model for asset degradation
The degradation model plays an essential role in asset life prediction and condition-based maintenance. Various degradation models have been proposed. Among them, the state space model has the ability to combine degradation data and failure event data, and is also an effective approach to handling multiple observations and missing data. In the state space degradation model, the deterioration process of an asset is represented by a system state process which is revealed through a sequence of observations. Current research largely assumes that the underlying system development process is discrete in time or states. Although some models have been developed to consider continuous time and space, these state space models are based on the Wiener process with its Gaussian assumption. This paper proposes a Gamma-based state space degradation model in order to remove the Gaussian assumption. Both condition monitoring observations and failure events are considered in the model so as to improve the accuracy of asset life prediction. A simulation study is carried out to illustrate the application procedure of the proposed model.
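The gamma process behind such a model produces monotone, non-Gaussian degradation increments, unlike a Wiener process. A minimal simulation sketch, with illustrative parameter values rather than the paper's calibrated model:

```python
import random

def simulate_gamma_degradation(shape_rate=0.5, scale=2.0, dt=1.0,
                               threshold=50.0, max_steps=1000, seed=42):
    """Simulate one gamma-process degradation path.

    Each increment over an interval dt is Gamma(shape_rate * dt, scale),
    so the path is non-decreasing -- a key contrast with Wiener-process
    models, whose Gaussian increments can go negative. All parameter
    values here are assumptions for illustration only.
    """
    rng = random.Random(seed)
    level, t = 0.0, 0
    path = [level]
    while level < threshold and t < max_steps:
        level += rng.gammavariate(shape_rate * dt, scale)
        path.append(level)
        t += 1
    return path, t  # t approximates the first-passage (failure) time

path, failure_time = simulate_gamma_degradation()
print(failure_time)
```

Repeating the simulation over many seeds yields an empirical first-passage-time distribution, which is the quantity of interest for asset life prediction.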
Bridging SMT and TM with translation recommendation
We propose a translation recommendation framework to integrate Statistical Machine Translation (SMT) output with Translation Memory (TM) systems. The framework recommends SMT outputs to a TM user when it predicts that SMT outputs are more suitable for post-editing than the hits provided by the TM. We describe an implementation of this framework using an SVM binary classifier. We exploit methods to fine-tune the classifier and investigate a variety of features of different types. We rely on automatic MT evaluation
metrics to approximate human judgements in our experiments. Experimental results show that our system can achieve 0.85 precision at 0.89 recall, excluding exact matches. Furthermore, it is possible for the end-user to achieve a desired balance between precision and recall by adjusting confidence levels.
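The recommendation step can be sketched with a linear SVM over hypothetical per-segment features; the feature set, training values, and threshold below are illustrative stand-ins, not the paper's engineered features:

```python
# Hypothetical features per segment: [SMT confidence, source LM score,
# TM fuzzy match score]. Label 1 = recommend the SMT output for
# post-editing, 0 = keep the TM hit. All values are illustrative.
from sklearn.svm import SVC

X_train = [
    [0.90, 0.80, 0.30],  # confident SMT, weak fuzzy match -> prefer SMT
    [0.20, 0.30, 0.95],  # weak SMT, near-exact TM match   -> prefer TM
    [0.85, 0.70, 0.40],
    [0.30, 0.25, 0.90],
]
y_train = [1, 0, 1, 0]

clf = SVC(kernel="linear").fit(X_train, y_train)

def recommend_smt(features, threshold=0.0):
    # Raising the threshold trades recall for precision, mirroring the
    # confidence-level tuning described in the abstract.
    return clf.decision_function([features])[0] >= threshold

print(recommend_smt([0.88, 0.75, 0.35]))
```

Using the margin (`decision_function`) rather than a hard class prediction is what makes the precision/recall balance tunable by the end-user.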
Integrating N-best SMT outputs into a TM system
In this paper, we propose a novel framework to enrich Translation Memory (TM) systems with Statistical Machine Translation (SMT) outputs using ranking. In order to offer human translators multiple choices, instead of using only the top SMT output and the top TM hit, we merge the N-best output from the SMT system and the k-best hits with the highest fuzzy match scores from the TM system. The merged list is then ranked according to prospective post-editing effort and provided to the translators to aid their work. Experiments show that our ranked output achieves 0.8747 precision at top 1 and 0.8134 precision at top 5. Our framework facilitates a tight integration between SMT and TM, where full advantage is taken of the TM while high-quality SMT output is availed of to improve the productivity of human translators.
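The merge-and-rank step can be sketched as follows; the candidate strings and effort scores are hypothetical stand-ins for what a trained ranking model would predict:

```python
# Merge SMT N-best and TM k-best candidates, then rank by predicted
# post-editing effort (lower = less editing). The effort scores here are
# hypothetical; in the paper a trained ranker supplies them.
def merge_and_rank(smt_nbest, tm_kbest, effort):
    candidates = [("SMT", h) for h in smt_nbest] + [("TM", h) for h in tm_kbest]
    return sorted(candidates, key=lambda c: effort[c[1]])

smt_nbest = ["the contract takes effect today",
             "the contract take effect today"]
tm_kbest = ["the agreement takes effect immediately"]
effort = {
    "the contract takes effect today": 0.10,
    "the contract take effect today": 0.45,
    "the agreement takes effect immediately": 0.30,
}

ranked = merge_and_rank(smt_nbest, tm_kbest, effort)
print(ranked[0])  # the candidate predicted to need the least editing
```

Because candidates from both systems compete in one list, a strong TM hit can outrank a weak SMT hypothesis and vice versa, which is the point of the tight integration.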
Improving the post-editing experience using translation recommendation: a user study
We report findings from a user study with professional post-editors using a translation recommendation framework (He et al., 2010) to integrate Statistical Machine Translation (SMT) output with Translation Memory (TM) systems. The framework recommends SMT outputs to a TM user when it predicts that SMT outputs are more suitable for post-editing than the hits provided by the TM. We analyze the effectiveness of the model as well as the reaction of potential users. Based on the performance statistics and the users' comments, we find that translation recommendation can reduce the workload of professional post-editors and improve the acceptance of MT in the localization industry.
Transferring Procedural Knowledge across Commonsense Tasks
Stories about everyday situations are an essential part of human
communication, motivating the need to develop AI agents that can reliably
understand these stories. Despite the long list of supervised methods for story
completion and procedural understanding, current AI has no mechanisms to
automatically track and explain procedures in unseen stories. To bridge this
gap, we study the ability of AI models to transfer procedural knowledge to
novel narrative tasks in a transparent manner. We design LEAP: a comprehensive
framework that integrates state-of-the-art modeling architectures, training
regimes, and augmentation strategies based on both natural and synthetic
stories. To address the lack of densely annotated training data, we devise a
robust automatic labeler based on few-shot prompting to enhance the augmented
data. Our experiments with in- and out-of-domain tasks reveal insights into the
interplay of different architectures, training regimes, and augmentation
strategies. LEAP's labeler has a clear positive impact on out-of-domain
datasets, while the resulting dense annotation provides native explainability.
LIDER: An Efficient High-dimensional Learned Index for Large-scale Dense Passage Retrieval
Many recent approaches to passage retrieval use dense embeddings
generated by deep neural models, a paradigm called "dense passage retrieval". The
state-of-the-art end-to-end dense passage retrieval systems normally deploy a
deep neural model followed by an approximate nearest neighbor (ANN) search
module. The model generates embeddings of the corpus and queries, which are
then indexed and searched by the high-performance ANN module. With the
increasing data scale, the ANN module unavoidably becomes the bottleneck on
efficiency. An alternative is the learned index, which achieves significantly
high search efficiency by learning the data distribution and predicting the
target data location. However, most existing learned indexes are designed for
low-dimensional data and are not suitable for dense passage retrieval with
high-dimensional dense embeddings. In this paper, we propose LIDER, an
efficient high-dimensional Learned Index for large-scale DEnse passage
Retrieval. LIDER has a clustering-based hierarchical architecture formed by two
layers of core models. As the basic unit of LIDER to index and search data, a
core model includes an adapted recursive model index (RMI) and a dimension
reduction component which consists of an extended SortingKeys-LSH (SK-LSH) and
a key re-scaling module. The dimension reduction component reduces the
high-dimensional dense embeddings to one-dimensional keys and sorts them in a
specific order, which the RMI then uses to make fast predictions.
Experiments show that LIDER achieves higher search speed with high retrieval
quality compared to state-of-the-art ANN indexes on passage retrieval
tasks; e.g., on large-scale data it achieves 1.2x the search speed and
significantly higher retrieval quality than the fastest baseline in our
evaluation. Furthermore, LIDER offers a better speed-quality trade-off.
Comment: Accepted by VLDB 202
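The two core ideas, reducing embeddings to sortable one-dimensional keys and predicting a key's sorted position with a learned model, can be sketched as below. This is a simplified stand-in under stated assumptions: a single random projection instead of the extended SK-LSH component, and linear least squares instead of a recursive model index (RMI):

```python
import random

random.seed(0)
DIM, N = 16, 1000

# Toy "embeddings" plus one random projection direction; the projection
# stands in for the dimension reduction component that maps vectors to
# sortable 1-D keys.
data = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
proj = [random.gauss(0, 1) for _ in range(DIM)]

def key(vec):
    return sum(a * b for a, b in zip(vec, proj))

keyed = sorted((key(v), i) for i, v in enumerate(data))
keys = [k for k, _ in keyed]

# Learned-index stand-in: fit position ~ a*key + b by least squares over
# the sorted (key, position) pairs, instead of an RMI.
mean_k = sum(keys) / N
mean_p = (N - 1) / 2
a = (sum((k - mean_k) * (p - mean_p) for p, k in enumerate(keys))
     / sum((k - mean_k) ** 2 for k in keys))
b = mean_p - a * mean_k

def lookup(k, window=150):
    # Predict the sorted position, then scan a small local window around
    # the prediction to find the nearest stored key.
    center = min(max(int(a * k + b), 0), N - 1)
    lo, hi = max(0, center - window), min(N, center + window)
    return min(range(lo, hi), key=lambda i: abs(keys[i] - k))

pos = lookup(key(data[123]))
print(keyed[pos][1])  # index of the vector whose key is nearest the query
```

The speed-quality trade-off shows up directly in the `window` parameter: a smaller window means less scanning per query but risks missing the true neighbor when the model's position prediction is off.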