5 research outputs found

    Applying semantic web technologies to knowledge sharing in aerospace engineering

    This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy documents via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale.

    Acronym extraction from texts written in Estonian

    Drawing on the literature, the thesis presents several earlier approaches to the problem of finding acronym expansions: manually compiled databases, rule- and pattern-based approaches, and the use of a support vector machine. It explains the metrics used to compare extractors and points out the problems associated with them. It describes the problems that arise when extracting acronym expansions from Estonian texts. A prototype extractor of acronyms and their expansions from Estonian texts was built, and its goals, the algorithm used, and the results of testing the program are presented. The main patterns for acronyms and their expansions were derived from data that included both purely Estonian texts and translated texts (generally translated from English and in places containing English words). Although the patterns were compiled from examples, they at least cover several typical cases. The prototype achieved a precision of 84.2% and a recall of 66.6%. These figures are not fully reliable, because there is no reason to expect them to stay as high on a larger and more random sample. The thesis also outlines possibilities for further development of the program.
    The aim of this paper was to give an overview of acronym extraction in general and to apply that knowledge to texts written in Estonian. "Acronym" is a vague term, as there is no universal agreement on its definition. An acronym is an abbreviation formed from the initial components of a phrase [2]. Examples are USA, meaning "United States of America", and Benelux, meaning "Belgium-Netherlands-Luxembourg". Acronyms therefore come with expansions: "United States of America" is the expansion of USA. These two acronyms are well known, and searching for their expansions is unnecessary, but long scientific texts often contain more specialised acronyms. In that case, it would be helpful to get an instant list of possible expansion candidates. The simplest way to get expansion candidates is to search manually compiled databases. Beyond that there are automated extraction solutions: pattern- and rule-based approaches and support vector machines (SVM). The general solution for automated acronym extraction is to identify the acronyms and recognize their expansions in the surrounding text. The problem gets more difficult with text written in another language (here, Estonian). The increased difficulty comes from the fact that many texts are translated from English, and some acronym expansions are translated while the acronyms themselves are not. The problem gets worse because the Estonian translation of a regular English acronym expansion might be a compound noun. Luckily, not all cases are so extreme, and most acronyms are closely preceded or followed by their expansions. Two metrics are used to describe acronym extractors: precision and recall. Precision is the share of extracted expansions that are correct. Recall is the share of the expansions that could have been identified that actually were identified. Lastly, the paper attempts to create a prototype extractor for Estonian that uses simple regular expressions to match and extract acronyms and their expansions from Estonian texts. The prototype was tested on about 30 short articles that contain acronyms. Because the main idea was to match expansions without making too many mistakes, the patterns were compiled to achieve as high a precision as possible (the prototype scored 84.2%), leaving questionable expansions out. That is why the prototype's recall was 66.6% (compared to the SVM's, which was 84.1%/83.4%).
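As a rough illustration of the regular-expression approach and the two metrics described above, the following Python sketch extracts pairs of the form "expansion (ACRONYM)" and scores them against a hand-made gold list. It is not the prototype from the thesis: the pattern, the function names `extract_pairs` and `precision_recall`, the backward initial-matching heuristic, and the sample sentence are all assumptions made for this example.

```python
import re

# Assumed pattern: an upper-case acronym of two or more letters in parentheses.
# The character class includes a few Estonian capitals; it is illustrative only.
ACRONYM_IN_PARENS = re.compile(r"\(([A-ZÄÖÜÕ]{2,})\)")


def extract_pairs(text):
    """Return {acronym: expansion} for 'expansion (ACRONYM)' occurrences."""
    pairs = {}
    for match in ACRONYM_IN_PARENS.finditer(text):
        acronym = match.group(1)
        words = text[:match.start()].split()
        # Look back over at most twice as many words as the acronym has letters.
        limit = max(0, len(words) - 2 * len(acronym))
        i = len(acronym) - 1   # index of the acronym letter still to be matched
        start = len(words)     # index of the first word of the candidate expansion
        while start > limit and i >= 0:
            start -= 1
            # Words whose initial does not match (e.g. "of") stay in the
            # expansion but do not consume an acronym letter.
            if words[start][0].upper() == acronym[i]:
                i -= 1
        if i < 0:  # every acronym letter was matched to a word initial
            pairs[acronym] = " ".join(words[start:])
    return pairs


def precision_recall(extracted, gold):
    """Precision: correct / extracted. Recall: correct / all gold pairs."""
    correct = sum(1 for acr, exp in extracted.items() if gold.get(acr) == exp)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sample = ("The United States of America (USA) and the European Union (EU) "
              "met; see also the note (NB).")
    gold = {"USA": "United States of America",
            "EU": "European Union",
            "NB": "nota bene"}
    found = extract_pairs(sample)
    print(found)
    print(precision_recall(found, gold))
```

In this sample the sketch leaves out "NB" because no nearby words supply matching initials. That mirrors the trade-off reported in the abstract: skipping questionable expansions keeps precision high at the cost of recall.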

    DAIRSACC - Do Acronyms Influence Reading Speed and Content Comprehension?

    Acronyms, initialisms and other types of abbreviations are frequently used in scientific, academic, governmental and administrative settings to shorten lengthy terminology and nomenclature. While they can make a text easier to read for people familiar with the abbreviations, they can add to the text's inherent difficulty and impede comprehension for those who are not familiar with their meaning. The phenomenon of acronym polynymy (multiple definitions associated with the same acronym) can create confusion and add to the cognitive load associated with understanding the text. The current practice of defining acronyms only once, when they are introduced, can result in readers scrolling back and forth in the text looking for acronym definitions, which increases the cognitive load and negatively affects reading speed and content comprehension. The purpose of this research was to study whether the presence of a large number of acronyms in a text impedes reading performance. The study also investigated whether providing easy access to acronym definitions via hover text would alleviate comprehension problems caused by unknown acronyms in the text. The hypothesis was that by enabling fast acronym disambiguation and eliminating the need to scroll for acronym definitions, the hover functionality would enhance reading speed and content comprehension. The results of the experiment are analyzed and recommendations for future investigations of the acronym problem are formulated.