Language engineering - a champion for European culture
Language is key to culture. It is a direct cultural medium as well as a means of recording and providing access to non-lingual elements of culture. Language is also fundamental to a sense of cultural identity. For this reason, it is vital, in a changing Europe, that we preserve the multi-lingual character of our society in order to move successfully towards closer co-operation at a political, economic, and social level.
Language engineering is the application of knowledge of language to the development of computer software which can recognise, understand, interpret, and generate human language in all its forms.
The paper provides a high level view of the ‘state of the art’ in language engineering and indicates ways in which it will have a profound impact on our culture in the future. It shows how advances in language engineering are an important aid in maintaining cultural diversity in a multi-lingual European society, while enabling the development of social cohesion across cultural and national divides. It addresses issues raised by the prospect of the Multi-lingual Information Society, including education, human communication with technology and information management, as well as aspects of digital cities such as tele-presence in digital libraries, virtual art galleries and electronic museums. The paper raises the issue of language as a factor in cultural domination, showing the contribution that language engineering can make towards countering it.
The paper also raises a number of controversial issues concerning the benefits likely to arise from the ways in which language will influence the culture of Europe.
Improving the translation environment for professional translators
When using computer-aided translation systems in a typical professional translation workflow, there are several stages at which there is room for improvement. The SCATE (Smart Computer-Aided Translation Environment) project investigated several of these aspects, both from a human-computer interaction point of view and from a purely technological one.
This paper describes the SCATE research with respect to improved fuzzy matching, parallel treebanks, the integration of translation memories with machine translation, quality estimation, terminology extraction from comparable texts, the use of speech recognition in the translation process, and human-computer interaction and interface design for the professional translation environment. For each of these topics, we describe the experiments we performed and the conclusions drawn, providing an overview of the highlights of the entire SCATE project.
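Fuzzy matching, the first topic listed above, scores how closely a new source segment resembles segments already stored in a translation memory. Below is a minimal, illustrative sketch based on an edit-distance ratio; it is not the SCATE implementation, and the segment pairs are invented examples:

```python
# Minimal sketch of fuzzy matching against a translation memory.
# Illustrative only; real systems use more sophisticated match metrics.
from difflib import SequenceMatcher

def fuzzy_match(segment, memory, threshold=0.7):
    """Return the (source, target) pair whose source segment is most
    similar to `segment`, together with the score, if the score clears
    `threshold`; otherwise return (None, best_score)."""
    best, best_score = None, 0.0
    for source, target in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best, best_score = (source, target), score
    return (best, best_score) if best_score >= threshold else (None, best_score)

# Hypothetical English-Dutch translation memory entries.
memory = {
    "Press the red button to stop the machine.":
        "Druk op de rode knop om de machine te stoppen.",
    "The patient should take the tablet daily.":
        "De patiënt moet de tablet dagelijks innemen.",
}
match, score = fuzzy_match("Press the red button to start the machine.", memory)
```

A match above the threshold lets the translator start from the stored target sentence and post-edit only the differing words.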
Speech Recognition Technology: Improving Speed and Accuracy of Emergency Medical Services Documentation to Protect Patients
Because hospital errors, such as mistakes in documentation, cause one in six deaths each year in the United States, the accuracy of health records in the emergency medical services (EMS) must be improved. One possible solution is to incorporate speech recognition (SR) software into current tools used by EMS first responders. The purpose of this research was to determine whether SR software could increase the efficiency and accuracy of EMS documentation to improve the safety of EMS patients. An initial review of the literature on the performance of current SR software demonstrated that this software was not 99% accurate, and therefore errors in the medical documentation produced by the software could harm patients. The literature review also identified weaknesses of SR software that could be overcome so that the software would be accurate enough for use in EMS settings. These weaknesses included the inability to differentiate between similar phrases and the inability to filter out background noise. To find a solution, an analysis of natural language processing algorithms showed that the bag-of-words post-processing algorithm has the ability to differentiate between similar phrases. This algorithm is well suited to SR applications because it is simple yet effective compared to machine learning algorithms that require a large amount of training data. The findings suggested that if these weaknesses of current SR software are solved, the software could increase the efficiency and accuracy of EMS documentation. Further studies should integrate the bag-of-words post-processing method into SR software and field-test its accuracy in EMS settings.
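The bag-of-words post-processing idea can be sketched briefly: rank known domain phrases by word overlap with the raw transcript and snap the output to the best-scoring phrase. The helper names and the example phrases below are hypothetical illustrations, not material from the study:

```python
# Illustrative bag-of-words post-processing for speech recognition output.
# A hypothetical sketch of the idea the abstract proposes.
from collections import Counter

def bow_similarity(a, b):
    """Word-overlap score between two phrases: shared word count
    (with multiplicity) divided by the longer phrase's word count."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    shared = sum((ca & cb).values())   # multiset intersection
    total = max(sum(ca.values()), sum(cb.values()))
    return shared / total if total else 0.0

def correct_transcript(transcript, known_phrases):
    """Snap a noisy transcript to the closest known EMS phrase."""
    return max(known_phrases, key=lambda p: bow_similarity(transcript, p))

# Two similar-sounding phrases a recogniser might confuse.
phrases = ["administered 0.3 mg epinephrine", "administered 3 mg epinephrine"]
```

Because the two candidate phrases differ in at least one token, the overlap score separates them even when the transcript contains extra filler words.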
Terminology Extraction for and from Communications in Multi-disciplinary Domains
Terminology extraction generally refers to methods and systems for identifying term candidates in a uni-disciplinary and uni-lingual environment such as engineering, the medical, physical and geological sciences, or administration, business and leisure. However, as human enterprises become more and more complex, it has become increasingly important for teams in one discipline to collaborate with others who are not only from a non-cognate discipline but also speak a different language. Disaster mitigation and recovery, and conflict resolution, are amongst the areas where there is a requirement to use standardised multilingual terminology for communication. This paper presents a feasibility study conducted to build terminology (and an ontology) in the domain of disaster management; it is part of the broader work conducted for the EU project Slándáil (FP7 607691). We have evaluated CiCui (from the Chinese name 词萃, which translates to "words gathered"), a corpus-based text analytic system that combines frequency, collocation and linguistic analyses to extract candidate terms from corpora comprising domain texts from diverse sources. CiCui was assessed against four terminology extraction systems and the initial results show that it has above-average precision in extracting terms.
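The frequency component of such a system can be illustrated with a simple "weirdness" score: the ratio of a word's relative frequency in the domain corpus to its relative frequency in a general reference corpus. This is a common baseline for term candidate ranking, sketched here under that assumption; it is not CiCui's actual algorithm, which also uses collocation and linguistic analyses:

```python
# Hedged sketch of frequency-based candidate term ranking ("weirdness"),
# a common baseline in terminology extraction. Not CiCui's algorithm.
from collections import Counter

def candidate_terms(domain_tokens, reference_tokens, top_n=5):
    """Rank words by relative frequency in the domain corpus versus a
    reference corpus; add-one smoothing avoids division by zero for
    words absent from the reference corpus."""
    dom, ref = Counter(domain_tokens), Counter(reference_tokens)
    n_dom, n_ref = sum(dom.values()), sum(ref.values())

    def weirdness(word):
        return (dom[word] / n_dom) / ((ref[word] + 1) / (n_ref + 1))

    return sorted(dom, key=weirdness, reverse=True)[:top_n]

# Toy disaster-management corpus versus a general-language sample.
domain = "flood evacuation flood shelter evacuation".split()
reference = "the a shelter of the".split()
```

Words that are frequent in the domain texts but rare in general language float to the top of the ranking as term candidates.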
Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters
OBJECTIVES:
The secondary use of medical data contained in electronic medical records, such as hospital discharge letters, is a valuable resource for the improvement of clinical care (e.g. in terms of medication safety) or for research purposes. However, the automated processing and analysis of medical free text still poses a huge challenge to available natural language processing (NLP) systems. The aim of this study was to implement a knowledge-based best-of-breed approach, combining a terminology server with an integrated ontology, an NLP pipeline and a rules engine.
METHODS:
We tested the performance of this approach in a use case. The clinical event of interest was the particular drug-disease interaction "proton-pump inhibitor [PPI] use and osteoporosis". Cases were to be identified based on free text digital discharge letters as source of information. Automated detection was validated against a gold standard.
RESULTS:
Precision of recognition of osteoporosis was 94.19%, and recall was 97.45%. PPIs were detected with 100% precision and 97.97% recall. The F-score for the detection of the given drug-disease interaction was 96.13%.
CONCLUSION:
We could show that our approach of combining an NLP pipeline, a terminology server, and a rules engine for the automated detection of clinical events such as drug-disease interactions from free-text digital hospital discharge letters was effective. There is huge potential for implementation in clinical and research contexts, as this approach enables the analysis of very large numbers of medical free-text documents within a short time period.
The REVERE project: Experiments with the application of probabilistic NLP to systems engineering
Despite natural language’s well-documented shortcomings as a medium for precise technical description, its use in software-intensive systems engineering remains inescapable. This poses many problems for engineers who must derive problem understanding and synthesise precise solution descriptions from free text. This is true both for the largely unstructured textual descriptions from which system requirements are derived, and for more formal documents, such as standards, which impose requirements on system development processes. This paper describes experiments that we have carried out in the REVERE project to investigate the use of probabilistic natural language processing techniques to provide systems engineering support.
Technology for large-scale translation of clinical practice guidelines: a pilot study of the performance of a hybrid human and computer-assisted approach
Background: The construction of EBMPracticeNet, a national electronic point-of-care information platform in Belgium, was initiated in 2011 to optimize quality of care by promoting evidence-based decision-making. The project involved, among other tasks, the translation of 940 EBM Guidelines of Duodecim Medical Publications from English into Dutch and French. Considering the scale of the translation process, it was decided to make use of computer-aided translation performed by certificated translators with limited expertise in medical translation. Our consortium used a hybrid approach, involving a human translator supported by a translation memory (using SDL Trados Studio), terminology recognition (using SDL Multiterm termbases) from medical termbases and support from online machine translation. This has resulted in a validated translation memory which is now in use for the translation of new and updated guidelines.
Objective: The objective of this study was to evaluate the performance of the hybrid human and computer-assisted approach in comparison with translation unsupported by translation memory and terminology recognition. A comparison was also made with the translation efficiency of an expert medical translator.
Methods: We conducted a pilot trial in which two sets of 30 new and 30 updated guidelines were randomized to one of three groups. Comparable guidelines were translated (a) by certificated junior translators without medical specialization using the hybrid method, (b) by an experienced medical translator without this support, and (c) by the same junior translators without the support of the validated translation memory. A medical proofreader, who was blinded to the translation procedure, evaluated the translated guidelines for acceptability and adequacy. Translation speed was measured by recording translation and post-editing time. The Human Translation Edit Rate was calculated as a metric to evaluate the quality of the translation. A further evaluation was made of translation acceptability and adequacy.
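The Human Translation Edit Rate used as the quality metric here is, in essence, the word-level edit distance between the translator's output and the human-corrected (proofread) version, normalised by the length of the corrected text. A simplified sketch of that computation, not the exact tooling used in the study:

```python
# Simplified Human-Translation-Edit-Rate-style metric: word-level edit
# distance between raw translation and proofread reference, divided by
# the reference length. Illustrative; not the study's exact tooling.

def edit_rate(hypothesis, reference):
    h, r = hypothesis.split(), reference.split()
    # Standard dynamic-programming word edit distance (Levenshtein).
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, 1):
        cur = [i]
        for j, rw in enumerate(r, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (hw != rw)))  # substitution
        prev = cur
    return prev[-1] / len(r)
```

A lower rate means the proofreader had to change fewer words, i.e. a better raw translation.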
Results: The average number of words per guideline was 1,195 and the mean total translation time was 100.2 min/1,000 words. No meaningful differences were found in the translation speed for new guidelines. The translation of updated guidelines was 59 min/1,000 words faster (95% CI 2-115; P=.044) in the computer-aided group. Revisions due to terminology accounted for one third of the overall revisions by the medical proofreader.
Conclusions: Use of the hybrid human and computer-aided translation by a non-expert translator makes the translation of updates of clinical practice guidelines faster and cheaper because of the benefits of translation memory. For the translation of new guidelines there was no apparent benefit in comparison with the efficiency of translation unsupported by translation memory (whether by an expert or a non-expert translator).
Seven Dimensions of Portability for Language Documentation and Description
The process of documenting and describing the world's languages is undergoing radical transformation with the rapid uptake of new digital technologies for capture, storage, annotation and dissemination. However, uncritical adoption of new tools and technologies is leading to resources that are difficult to reuse and which are less portable than the conventional printed resources they replace. We begin by reviewing current uses of software tools and digital technologies for language documentation and description. This sheds light on how digital language documentation and description are created and managed, leading to an analysis of seven portability problems under the following headings: content, format, discovery, access, citation, preservation and rights. After characterizing each problem we provide a series of value statements, and this provides the framework for a broad range of best practice recommendations.