
    Enhanced Integrated Scoring for Cleaning Dirty Texts

    An increasing number of approaches to ontology engineering from text are geared towards the use of online sources such as company intranets and the World Wide Web. Despite this rise, little work exists on preprocessing and cleaning dirty texts from online sources. This paper presents an enhancement of an Integrated Scoring for Spelling error correction, Abbreviation expansion and Case restoration (ISSAC). ISSAC is implemented as part of a text preprocessing phase in an ontology engineering system. New evaluations of the enhanced ISSAC on 700 chat records show an improved accuracy of 98%, compared with 96.5% for basic ISSAC alone and 71% for Aspell alone. Comment: More information is available at http://explorer.csse.uwa.edu.au/reference

    TRANSLATING MEDICAL TEXTS FOR LEGAL PURPOSES: A GROWING CHALLENGE FOR COURT TRANSLATORS AND INTERPRETERS

    Medical translation is an area of growing demand for translation services. It covers an extensive variety of genres, ranging from hospital discharge reports, epicrises and specialist articles in medical journals to patient information leaflets (PILs) and instructions for use (IFU). It has also entered the area of activity of court translators owing to, for example, migration or Poland's membership in the EU and the resultant EU-law implementation procedures (i.e., implementation of the Medical Devices Directive 93/42/EEC) and the commercialisation of medical devices. This generates the need to deal with an array of texts from various fields of medicine and related disciplines (pharmacy, pharmacology, biology, etc.). Court translators therefore face difficulties that are at the same time challenges, the most important being a lack of medical knowledge and unfamiliarity with medical terminology (including acronyms and abbreviations) and with medical phraseology in general. This entails the development of a new professional approach to such tasks and requires constant improvement of skills and knowledge, as well as special competencies that may help translators (for this reason the notions of professionalism and translation competence are briefly explained). The article focuses on the translation of medical texts from the point of view of translators and the purpose of translation, not from the perspective of users; the approach is thus translator-centred.

    Development of nanocrystalline Fe80Cr20 alloy using combination technique of ball milling and ultrasonic treatment for fuel cell interconnector

    A Solid Oxide Fuel Cell (SOFC) system consists of an anode, a cathode, an electrolyte and an interconnect. This research focuses on the interconnect material. The objective of the study is to explore high energy ball milling (milled) combined with ultrasonic treatment (UT) as a way to obtain smaller crystallite size, finer surface morphology, higher thermal stability and more homogeneous nanocrystalline Fe80Cr20 alloys. The study was motivated by previous research in which some grain growth was observed at high temperature. The samples were first processed by high energy ball milling for 60 h and then subjected to ultrasonic treatment at a frequency of 35 kHz for periods of 3 h, 3.5 h, 4 h, 4.5 h and 5 h. No prior work on this combined treatment (milling and UT) was found. All samples were characterized and analysed using X-Ray Diffraction (XRD), Scanning Electron Microscopy (SEM), Energy Dispersive X-ray Spectroscopy (EDS), Thermo Gravimetric Analysis (TGA) and a Particle Size Analyzer (PSA). The results showed that the combination treatment effectively increased the solid solubility of Cr in Fe up to 62.1% and decreased the crystallite size to 2.71 nm in the milled and UT 4.5 h sample, which produced a finer surface structure. The EDS results showed that the combination treatment samples had a suitable composition of 20.05 wt% Cr and 79.95 wt% Fe compared with the other samples. Higher thermal stability was observed for the combination treatment sample at 1100 °C, up to 12.7 mg, or 63 wt%, 62 wt% and 25 wt% relative to the raw material, the UT samples and the milled 60 h sample, respectively. The particle size decreased to 5.23 µm, and the particle size distribution of the combination treatment increased up to 89.57%. It can be concluded that the combination treatment of milling and UT for 4.5 h is appropriate for achieving high solid solubility, nanoscale crystallite size, fine surface morphology, high thermal stability and homogeneous Fe80Cr20 alloys.

    The Poznań album civium: graphic and linguistic characteristics of the document

    This article, though far from exhaustive, supports the view expressed by many researchers that there is a growing need to include elements of paleography in studies on the history of the Polish language. It examines the graphical and linguistic properties of the text under scrutiny (the libri iuris civilis or alba civilia of the city of Poznań from the years 1575–1793), which constitutes onomastic material excerpted from historical sources. The article presents the typical linguistic features of the document. It highlights the diversity of the texts and discusses the writing ductus as it appears in the flow of the text, the individual styles of handwriting, and the tendency of the city's scribes to differentiate letters and signs graphically (multifunctionality of signs, the influence of non-Polish handwriting styles, abbreviations).

    The TXM Portal Software giving access to Old French Manuscripts Online

    Full text online: http://www.lrec-conf.org/proceedings/lrec2012/workshops/13.ProceedingsCultHeritage.pdf This paper presents the new TXM software platform, which gives online access to images and tagged transcriptions of Old French text manuscripts for concordancing and text mining. The platform can import medieval sources encoded in XML according to the TEI Guidelines for linking manuscript images to transcriptions, and can encode several diplomatic levels of transcription, including abbreviations and word-level corrections. It includes a sophisticated tokenizer able to deal with TEI tags at different levels of the linguistic hierarchy. Words are tagged on the fly during the import process using the IMS TreeTagger tool with a specific language model. Synoptic editions displaying manuscript images and text transcriptions side by side are produced automatically during the import process. Texts are organized in a corpus with their own metadata (title, author, date, genre, etc.), and several word-property indexes are produced for the CQP search engine to allow efficient word-pattern searches for building different types of frequency lists or concordances. For syntactically annotated texts, special indexes are produced for the Tiger Search engine to allow efficient building of syntactic concordances. The platform has also been tested on classical Latin, ancient Greek, Old Slavonic and Old Hieroglyphic Egyptian corpora (including various types of encoding and annotations).