Neural Reranking for Named Entity Recognition
We propose a neural reranking system for named entity recognition (NER). The
basic idea is to leverage recurrent neural network models to learn
sentence-level patterns that involve named entity mentions. In particular,
given an output sentence produced by a baseline NER model, we replace all
entity mentions, such as \textit{Barack Obama}, with their entity types, such
as \textit{PER}. The resulting sentence patterns contain direct output
information, yet are less sparse without specific named entities. For example,
"PER was born in LOC" can be such a pattern. LSTM and CNN structures are
utilised for learning deep representations of such sentences for reranking.
Results show that our system can significantly improve the NER accuracies over
two different baselines, giving the best reported results on a standard
benchmark.
Comment: Accepted as regular paper by RANLP 201
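The pattern construction described in this abstract can be illustrated with a minimal sketch. It only shows how entity mentions might be collapsed into their type labels before a sentence is scored; the function name `to_pattern` and the `(start, end, type)` span format are assumptions made for illustration, not the authors' implementation, and the LSTM/CNN reranker itself is not shown.

```python
from typing import List, Tuple

# Hypothetical span format: (start index, exclusive end index, entity type).
Span = Tuple[int, int, str]


def to_pattern(tokens: List[str], spans: List[Span]) -> List[str]:
    """Collapse each entity mention into its type label.

    Assumes spans are non-empty and do not overlap.
    """
    by_start = {start: (end, label) for start, end, label in spans}
    pattern: List[str] = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            end, label = by_start[i]
            pattern.append(label)  # one label stands in for the whole mention
            i = end
        else:
            pattern.append(tokens[i])
            i += 1
    return pattern


tokens = "Barack Obama was born in Hawaii".split()
spans = [(0, 2, "PER"), (5, 6, "LOC")]
print(" ".join(to_pattern(tokens, spans)))  # PER was born in LOC
```

In a reranking setup, each candidate output of the baseline tagger would be turned into such a pattern and scored, and the highest-scoring candidate kept.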
Dynamic Entity Representations in Neural Language Models
Understanding a long document requires tracking how entities are introduced
and evolve over time. We present a new type of language model, EntityNLM, that
can explicitly model entities, dynamically update their representations, and
contextually generate their mentions. Our model is generative and flexible; it
can model an arbitrary number of entities in context while generating each
entity mention at an arbitrary length. In addition, it can be used for several
different tasks such as language modeling, coreference resolution, and entity
prediction. Experimental results with all these tasks demonstrate that our
model consistently outperforms strong baselines and prior work.
Comment: EMNLP 2017 camera-ready version
Introduction to the special issue on deep learning approaches for machine translation
Deep learning is revolutionizing speech and natural language technologies because it offers an effective way to train systems and obtain significant improvements. The main advantage of deep learning is that, given the right architecture, the system automatically learns features from data without the need to design them explicitly. This machine learning perspective is conceptually changing how speech and natural language technologies are addressed. In the case of Machine Translation (MT), deep learning was first introduced in standard statistical systems. By now, end-to-end neural MT systems have reached competitive results. This special issue introductory paper addresses how deep learning has been gradually introduced in MT. This introduction covers all topics contained in the papers included in this special issue, namely: integration of deep learning in statistical MT; development of the end-to-end neural MT system; and introduction of deep learning in interactive MT and MT evaluation. Finally, this introduction sketches some research directions that MT is taking guided by deep learning.
Peer Reviewed
Postprint (published version)
Aligning Neural Machine Translation Models: Human Feedback in Training and Inference
Reinforcement learning from human feedback (RLHF) is a recent technique to
improve the quality of the text generated by a language model, making it closer
to what humans would generate. A core ingredient in RLHF's success in aligning
and improving large language models (LLMs) is its reward model, trained using
human feedback on model outputs. In machine translation (MT), where metrics
trained from human annotations can readily be used as reward models, recent
methods using minimum Bayes risk decoding and reranking have succeeded in
improving the final quality of translation. In this study, we comprehensively
explore and compare techniques for integrating quality metrics as reward models
into the MT pipeline. This includes using the reward model for data filtering,
during the training phase through RL, and at inference time by employing
reranking techniques, and we assess the effects of combining these in a unified
approach. Our experimental results, conducted across multiple translation
tasks, underscore the crucial role of effective data filtering, based on
estimated quality, in harnessing the full potential of RL in enhancing MT
quality. Furthermore, our findings demonstrate the effectiveness of combining
RL training with reranking techniques, showcasing substantial improvements in
translation quality.
Comment: 14 pages, work-in-progress
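As a rough illustration of the minimum Bayes risk (MBR) decoding mentioned in this abstract, the sketch below picks the candidate translation with the highest average utility against all candidates, which serve as pseudo-references. The `overlap` utility is a toy token-overlap score standing in for a learned quality metric; it and the function names are assumptions made for illustration, not the paper's setup.

```python
from typing import Callable, List


def overlap(hyp: str, ref: str) -> float:
    """Toy utility: Jaccard overlap of token sets (stand-in for a learned metric)."""
    a, b = set(hyp.split()), set(ref.split())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mbr_decode(candidates: List[str], utility: Callable[[str, str], float]) -> str:
    """Return the candidate with the highest expected utility.

    Every candidate (including the hypothesis itself) is used as a
    pseudo-reference, each weighted equally.
    """
    def expected_utility(hyp: str) -> float:
        return sum(utility(hyp, ref) for ref in candidates) / len(candidates)

    return max(candidates, key=expected_utility)


candidates = [
    "the cat sat on the mat",
    "the cat sat on a mat",
    "a dog stood here",
]
print(mbr_decode(candidates, overlap))  # the cat sat on a mat
```

Reranking with a reference-free reward model follows the same shape, except that each candidate is scored on its own rather than against the other candidates.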