14,787 research outputs found
Literary machine translation under the magnifying glass : assessing the quality of an NMT-translated detective novel on document level
Several studies (covering many language pairs and translation tasks) have demonstrated that translation quality has improved enormously since the emergence of neural machine translation systems. This raises the question whether such systems are able to produce high-quality translations for more creative text types such as literature and whether they are able to generate coherent translations on document level. Our study aimed to investigate these two questions by carrying out a document-level evaluation of the raw NMT output of an entire novel. We translated Agatha Christie's novel The Mysterious Affair at Styles with Google's NMT system from English into Dutch and annotated it in two steps: first all fluency errors, then all accuracy errors. We report on the overall quality, determine the remaining issues, compare the most frequent error types to those in general-domain MT, and investigate whether any accuracy and fluency errors co-occur regularly. Additionally, we assess the inter-annotator agreement on the first chapter of the novel
Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective
This paper presents a Lisp architecture for a portable NLP system, termed
LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard,
customized and in-house developed NLP tools. Our system facilitates portability
across different institutions and data systems by incorporating an enriched
Common Data Model (CDM) to standardize necessary data elements. It utilizes
UMLS to perform domain adaptation when integrating generic domain NLP tools. It
also features stand-off annotations that are specified by positional reference
to the original document. We built an interval tree based search engine to
efficiently query and retrieve the stand-off annotations by specifying
positional requirements. We also developed a utility to convert an inline
annotation format to stand-off annotations to enable the reuse of clinical text
datasets with inline annotations. We experimented with our system on several
NLP facilitated tasks including computational phenotyping for lymphoma patients
and semantic relation extraction for clinical notes. These experiments
showcased the broader applicability and utility of LAPNLP.Comment: 6 pages, accepted by IEEE BIBM 2018 as regular pape
- …