51 research outputs found
The use of partially automated methods for identifying direct anglicisms in Finnish-language corpus data
The goal of this thesis is to investigate methods that could help with harvesting neologisms, and more specifically anglicisms (i.e. English-sourced borrowings), in the Finnish language. The work is partially motivated by the Global Anglicism Database project, which gathers anglicisms from various languages and can serve both as an anglicism dictionary and as a source of information for researchers studying language contact and borrowing, either in depth for a specific language or cross-linguistically.
A systematic way of harvesting anglicisms in current Finnish from a suitable corpus is devised. The research examines three questions: what kinds of data sources suitable for this goal are available, and what the criteria for a useful data source would be; how to use such a data source to prepare a good list of anglicism candidates that contains as little irrelevant material as possible without losing any anglicisms in the process; and how the candidates could be scored so that the more probable anglicisms appear closer to the top of the candidate list.
Several of the Language Bank's monolingual Finnish corpora are considered. The most important criteria are identified as the size and genre of the corpus and its annotation. These criteria are assessed from the descriptions of the corpora on the Language Bank's website, from the available literature, and by hands-on examination of the data. Other important measures of corpus suitability are the amount of unannotated foreign-language material, the amount of noise, and the proportion of potential anglicisms in the corpora. This information is obtained through meticulous exploration of random samples of the corpora's neologism candidate lists and through evaluation against a previously collected set of anglicisms. A combination of two corpora with good coverage of known anglicisms and a relatively low amount of noise is chosen as the dataset for the next phase of the anglicism identification process.
Anglicism candidate lists are prepared by removing tokens that are irrelevant to anglicism harvesting. These include the identifiable part of the foreign-language material in the corpus, formally recognizable noise, and known lemmas of words that were present in Finnish just before the major influx of English borrowings began, together with their inflected forms.
Several methods of scoring candidates are devised that assign better scores to tokens with a higher probability of being anglicisms. The score is based on a token's frequency in the corpus and on the relative frequency of its character-level n-grams in representative purely English and purely Finnish corpora. The tokens in the candidate list are scored and ordered, and the resulting list is evaluated based on the ranking of a set of previously identified anglicisms. The method proves somewhat effective: the resulting average ranking of known anglicisms is better than it would be in a randomly sorted candidate list.
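The scoring idea in this abstract can be illustrated with a minimal Python sketch. It is not the thesis's actual implementation: the trigram size, the add-one smoothing, the additive combination with log frequency, the `freq_weight` parameter, and all function names are assumptions made for illustration, and the tiny word lists stand in for the representative English and Finnish corpora.

```python
import math
from collections import Counter


def char_ngrams(token, n=3):
    """Character n-grams of a lowercased token, with ^ and $ as boundary marks."""
    padded = f"^{token.lower()}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_model(words, n=3):
    """Count character n-grams over a word list; returns (counts, total)."""
    counts = Counter(g for w in words for g in char_ngrams(w, n))
    return counts, sum(counts.values())


def mean_log_prob(token, model, n=3):
    """Average add-one-smoothed log probability of the token's n-grams."""
    counts, total = model
    grams = char_ngrams(token, n)
    if not grams:
        return 0.0
    vocab = len(counts) + 1  # one extra slot for unseen n-grams
    return sum(math.log((counts[g] + 1) / (total + vocab)) for g in grams) / len(grams)


def anglicism_score(token, freq, en_model, fi_model, n=3, freq_weight=0.1):
    """Higher when the token looks more English than Finnish.

    The additive frequency term and freq_weight are hypothetical choices.
    """
    englishness = mean_log_prob(token, en_model, n) - mean_log_prob(token, fi_model, n)
    return englishness + freq_weight * math.log(1 + freq)


def rank_candidates(candidates, en_model, fi_model):
    """Sort (token, score) pairs from most to least anglicism-like."""
    scored = [(t, anglicism_score(t, f, en_model, fi_model)) for t, f in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def mean_rank(ranked, known):
    """Average 1-based position of known anglicisms in the ranked list."""
    ranks = [i for i, (t, _) in enumerate(ranked, start=1) if t in known]
    return sum(ranks) / len(ranks)


# Toy stand-ins for the English and Finnish reference corpora.
en_model = ngram_model(["shopping", "meeting", "cool", "street", "weekend"])
fi_model = ngram_model(["kauppa", "kokous", "katu", "viikonloppu", "hyvä", "talo"])

# Hypothetical candidate tokens with their corpus frequencies.
candidates = {"streetwear": 5, "katukauppa": 5}
ranked = rank_candidates(candidates, en_model, fi_model)
```

Comparing `mean_rank(ranked, known)` with the expected mean rank of a randomly ordered list, `(len(ranked) + 1) / 2`, mirrors the kind of evaluation the abstract describes.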
A comparison of Czech and Austrian regulation of unfair competition with special regard to the regulation of false advertisement
"substantially" relates to individual persons. In the case of the second part of the statutory definition, which governs the B2C area, neither a competitive relationship nor an appreciable shift in demand is required. The consumer is protected against unfair commercial practices entirely. Conclusion: Learning about foreign legal systems can be inspiring not only for legislators but can also benefit the work of lawyers in practice. Czech and Austrian law have much in common. Both legal systems rest on Roman-law foundations and are part of the continental legal culture. Similar features may also stem from a shared legal history. In both Austria and the Czech Republic, the legal regulation of unfair competition is based on international treaties, in particular the Paris Convention. Apart from obligations under international law, Community law has the greatest influence on the regulation of unfair competition and on the approximation of legislation in this area. Two directives are of key importance: the directive on misleading and comparative advertising and the directive on misleading commercial practices. An examination of current and past legal regulations of unfair competition leads to the conclusion that the Austrian regulation has much in common with the pre-war regulation in the ZPNS. The reason for the similarities is apparently a common model, namely the German Act against Unfair Competition of 1909. In Austria, the law of unfair competition is regulated, just as..
A comparative analysis of the Hungarian and Finnish national core curricula from the perspective of self-regulated language learning
The role of Technetium-99m-Ethambutol scintigraphy in the management of spinal tuberculosis
A Case of Acute Kidney Injury in a Patient with Pulmonary Tuberculosis Receiving Ethambutol Therapy
A rapid method for estimation of the efficacy of potential antimicrobials in humans and animals by agar diffusion assay
- …