Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques
specifically developed for analyzing and understanding the inner workings and
representations acquired by neural models of language. Approaches included:
systematic manipulation of input to neural networks and investigating the
impact on their performance, testing whether interpretable knowledge can be
decoded from intermediate representations acquired by neural networks,
proposing modifications to neural network architectures to make their knowledge
state or generated output more explainable, and examining the performance of
networks on simplified or formal languages. Here we review a number of
representative studies in each category.
Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization
Existing methods for Chinese word segmentation (CWS) usually rely on a large
number of labeled sentences, which are expensive and time-consuming to
annotate, to train segmentation models. Fortunately, unlabeled data is usually
easy to collect and many high-quality Chinese lexicons are available off the
shelf, both of which can provide
useful information for CWS. In this paper, we propose a neural approach for
Chinese word segmentation which can exploit both lexicon and unlabeled data.
Our approach is based on a variant of the posterior regularization algorithm, and
the unlabeled data and lexicon are incorporated into model training as indirect
supervision by regularizing the prediction space of CWS models. Extensive
experiments on multiple benchmark datasets in both in-domain and cross-domain
scenarios validate the effectiveness of our approach.
Comment: 7 pages, 11 figures, accepted by the 2019 World Wide Web Conference (WWW '19)
Improving the translation environment for professional translators
When computer-aided translation systems are used in a typical professional translation workflow, there is room for improvement at several stages. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological one.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.