149 research outputs found

    Maritime Terminology of the Saudi Arabian Red Sea Coast

    This thesis will analyse a sample of maritime terminology used along the Saudi Red Sea coast and attempt to understand why lexica are lacking in such terms, an issue which can be linked to the language change that was a consequence of the interaction between Arabs and other ethnic communities since the advent of Islam. This change raised alarm among lexicographers and linguists at the time of documenting the terminology, who set off on long journeys to collect the pure language. In their word collecting they documented the language selectively, ignoring a huge number of spoken registers, because their aim was to collect the classical form of Arabic in order to help Muslims gain a deeper understanding of the Qurʾān and Ḥadīth. This created gaps in Arabic lexicography, which lacks terminology for material culture. The information about maritime material cultural terminology in the mainstream lexica is disappointing. Although a few terms are listed, lexicographers have failed to provide unambiguous definitions. This study demonstrates why a great number of such terms have not been listed in the available lexica since the classical period, and which factors led to this situation. Hence, this study is based on maritime terms extracted from informal meetings I had with mariners and fishermen on the Red Sea Saudi coast about their life at sea before the introduction of the engine to vessels. The collected terms are investigated for their presence in lexica both synchronically and diachronically. Understanding the meanings of such ignored terms is one of the most important puzzles, and this study attempts to solve it by investigating the semantic links between words and the conceptual meanings of their roots, following a hypothesis based on Ibn Fāris (d. 395/1004), which assumes that all terms derived from Arabic roots should share a general conceptual meaning.
Where maritime terms are absent from the lexica, a hypothesis derived from Agius’s theoretical framework was applied to search for such terms in literary and non-literary works, which are assumed to be an alternative source to lexica, and to examine their occurrence in text and context by reconstructing their origin, function and use. The Royal Embassy of Saudi Arabia Cultural Bureau Londo

    Lexicographical Explorations of Neologisms in the Digital Age. Tracking New Words Online and Comparing Wiktionary Entries with ‘Traditional’ Dictionary Representations

    This thesis explores neologisms in two distinct but related contexts: dictionaries and newspapers. Both present neologisms to the world, the former through information and elucidation of meaning, the latter through exemplification of real-world use and behaviour. The thesis first explores the representation of new words in a range of different dictionary types and formats, comparing entries from the collaborative dictionary Wiktionary with those in expert-produced dictionaries, both those categorised here as ‘corpus-based’ and those termed ‘corpus-informed’. The former represent the most current of the expert-produced dictionary models, drawing on corpora for almost all of the data they include in an entry, while the latter draw on a mixture of old-style citations and Reading Programmes for much of their data, although this is supplemented with corpus information in some areas. The purpose of this part of the study was to compare degrees of comprehensiveness between the expert and collaborative dictionaries, as demonstrated by the level and quality of detail included in new-word entries and by the dictionaries’ responsiveness to new words. This is done by comparing the number and quality of components that appear in a dictionary entry, from the standardised elements found in all of the dictionary types, such as the ‘headword’ at the top of the entry, to the non-standardised elements, such as Discussion Forums, found almost exclusively in Wiktionary. Wiktionary is found to provide more detailed entries on new words than the expert dictionaries, and to be generally more flexible, responding more quickly and effectively to neologisms. This is due in no small part to the fact that every time an entry or discussion is saved, the entire site updates, something which occurs for expert-produced online dictionaries once a quarter at best.
The thesis further explores the way in which the same neologisms are used in four UK national newspapers across the course of their neologic life-cycle. In order to do this, a new methodology is devised for the collection of web-based data for context-rich, genre-specific corpus studies. This produced highly detailed, contextualised data showing that certain newspapers are more likely to use less well-established neologisms (the Independent), while others have an overall stronger record of neologism usage across the 14 years of the study (The Guardian). As well as generating findings on the use and behaviour of neologisms in these newspapers, the manual methodology devised here is compared with a similar automated system, to assess which approach is more appropriate for use in this kind of context-rich database/corpus. The ability to accurately date each article in the study, using information which only the manual methods could accurately access, coupled with the more targeted approach it can offer by excluding unwanted texts from the outset, made it the more appropriate approach.

    An investigation into lemmatization in Southern Sotho

    Lemmatization refers to the process whereby a lexicographer assigns a specific place in a dictionary to a word which he regards as the most basic form amongst other related forms. The fact that in Bantu languages formative elements can be added to one another in an often seemingly interminable series until quite long words are produced evokes curiosity as far as lemmatization is concerned. Given the productive nature of Southern Sotho, it is interesting to observe how lexicographers handle the morphological complexities they are normally faced with in the process of arranging lexical items. This study has shown that some difficulties are encountered in adhering to the traditional method of alphabetization. It does not aim at proposing solutions but does point out some considerations which should be borne in mind in the process of lemmatization. African Languages. M.A. (African Languages

    Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources

    The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not correctly translated. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each of them corresponds to a morphological variation of the source phrase, the target phrase, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translations, and the results showed improved performance in terms of automatic scores (BLEU and Meteor) and a reduction in out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model. JRC.G.2-Global security and crisis managemen
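
    The expansion idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy lexicons `EN_MORPH` and `FR_MORPH`, the single-word "phrases", the Jaccard tag overlap used as the similarity score, and the function names are all assumptions made for illustration, since the abstract does not specify the actual score or data structures.

```python
from itertools import product

# Hypothetical morphological resources: surface form -> (lemma, morphosyntactic tags).
EN_MORPH = {
    "cat": ("cat", {"NOUN", "SG"}),
    "cats": ("cat", {"NOUN", "PL"}),
}
FR_MORPH = {
    "chat": ("chat", {"NOUN", "SG"}),
    "chats": ("chat", {"NOUN", "PL"}),
}


def variants(word, morph):
    """Return every surface form sharing the lemma of `word` (itself included)."""
    if word not in morph:
        return [word]
    lemma = morph[word][0]
    return sorted(w for w, (lem, _) in morph.items() if lem == lemma)


def tag_similarity(src, tgt, src_morph, tgt_morph):
    """Jaccard overlap of tag sets; an assumed stand-in for the paper's score."""
    a = src_morph.get(src, (None, set()))[1]
    b = tgt_morph.get(tgt, (None, set()))[1]
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def expand(table, src_morph, tgt_morph, threshold=1.0):
    """Propose new (source, target) associations from morphological variants."""
    new = {}
    for (src, tgt), prob in table.items():
        candidates = product(variants(src, src_morph), variants(tgt, tgt_morph))
        for s, t in candidates:
            if (s, t) in table or (s, t) in new:
                continue  # keep existing associations untouched
            score = tag_similarity(s, t, src_morph, tgt_morph)
            if score >= threshold:
                # Assumed weighting: scale the original probability by the score.
                new[(s, t)] = prob * score
    return new


table = {("cat", "chat"): 0.9}
expanded = expand(table, EN_MORPH, FR_MORPH)
```

    With the default threshold, only variant pairs whose tags agree fully are proposed, so the singular pair yields a plural pair but no singular/plural mismatches. A real system would work on multi-word phrases and would also have to update the reordering model, which this sketch leaves out.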

    Multimedia Sotho-English E-dictionary for Undergraduate Students in Design and Studio Art

    Published Thesis. When students study at tertiary institutions they are often confronted with disciplines that are unfamiliar to them. Many of these disciplines are rich in terminology and concepts that students have never encountered before. In most South African tertiary institutions the language of instruction is English, making it difficult for second-language-speaking students to grasp the meaning of these terms and concepts. Research has shown that e-dictionaries with multimedia enhancements have greatly facilitated the comprehension of difficult terms and concepts. The inclusion of pictures, videos, animations, cartoons and audio clips in e-dictionaries has been proven to aid students in learning and comprehending new terms and concepts. Aim: Undergraduate students at the Department of Design and Studio Art, CUT-FS could greatly benefit from the development of a multimedia enhanced Sotho-English e-dictionary. Therefore, the aim of this study was to develop a multimedia enhanced Sotho-English e-dictionary that can be used by undergraduate students from the Department of Design and Studio Art, Central University of Technology, Free State. Methods: The study was divided into five phases in order to meet the aims and objectives. Firstly, English art and design terms and concepts were sourced from the relevant literature. These terms and concepts were aimed at first-year students at the Department of Design and Studio Art at the Central University of Technology, Free State (CUT-FS). Secondly, Sotho equivalents of the sourced English art and design terms and concepts were devised. Thirdly, the instructional multimedia aids for the multimedia e-dictionary were designed. Fourthly, the user interface of the e-dictionary was developed. Lastly, the multimedia e-dictionary was tested by undergraduate students at the Department of Design and Studio Art at the CUT-FS. The students were randomly divided into two groups.
The control group, Group A, did not have access to a multimedia enhanced e-dictionary while studying art and design terms and concepts. The multimedia group, Group B, had access to a multimedia e-dictionary while studying art and design terms and concepts. Furthermore, purposeful semi-structured interviews were conducted with five Sotho-speaking participants of the multimedia group to gather qualitative data about their experience with the multimedia e-dictionary application. Results: The results of the online comprehension test revealed that the multimedia e-dictionary application successfully facilitated learning amongst the multimedia group students. The group of students that had access to the multimedia e-dictionary application significantly outperformed the group that did not (p = 0.0007). The semi-structured interviews conducted with the Sotho-speaking students who had access to the application also supported the success of the SEADD application.


    The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure (CLARIN) for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium.

    CLARIN. The infrastructure for language resources

    CLARIN, the "Common Language Resources and Technology Infrastructure", has established itself as a major player in the field of research infrastructures for the humanities. This volume provides a comprehensive overview of the organization, its members, its goals and its functioning, as well as of the tools and resources hosted by the infrastructure. The many contributors representing various fields, from computer science to law to psychology, analyse a wide range of topics, such as the technology behind the CLARIN infrastructure, the use of CLARIN resources in diverse research projects, the achievements of selected national CLARIN consortia, and the challenges that CLARIN has faced and will face in the future. The book will be published in 2022, 10 years after the establishment of CLARIN as a European Research Infrastructure Consortium by the European Commission (Decision 2012/136/EU)

    A Comparative Corpus Analysis of English and Polish Specialist Equestrian Vocabulary in the Field of Dressage and Horse Training

    This doctoral dissertation is an analysis of English and Polish equestrian vocabulary in the field of dressage and horse training, carried out using a comparable corpus of texts. It belongs to the line of research on specialist languages, which have so far received little attention in linguistics and related fields. Specialist equestrian language, whose core is the vocabulary under study, requires both theoretical (research) and practical (lexicography) linguistic work, especially in Poland, where so far only individual articles and a now outdated dictionary have been devoted to it. This is highly insufficient given the growing popularity of horse riding as a sport and leisure activity in Poland and worldwide. The present work is intended as a contribution towards improving this situation. Given this situation, the initial expectations are general in nature: the aim is a formal and semantic characterisation of two sets of vocabulary (English and Polish) obtained from reliable sources, in order to uncover the linguistic picture of the field that they contain. Next, the occurrence of the terms is examined in a corpus composed of English and Polish texts, respectively, on dressage and horse training. Each of the two subcorpora is further divided into two parts according to a distinction important for the given language: the English subcorpus contains a part on classical (English) riding and a part on western-style (American) riding, while the Polish subcorpus contains an original part and a part composed of translations. This makes it possible to compare the occurrence of vocabulary across linguistic and cultural areas and to verify the quality and currency of the vocabulary sources in the context of a planned lexicographic project, to which this work is to serve as a theoretical introduction. The dissertation consists of four theoretical chapters and two research chapters. Chapter 1 presents the history of research on specialist languages, and Chapter 2 describes the contemporary functions of these languages.
Chapter 3 introduces the concept of specialist language, discussing the names used for it in linguistics, its relationship to general language, the related concepts of knowledge and the specialist, and typologies of specialist languages. Chapter 4 presents research on specialist languages undertaken by a number of related disciplines: linguistics, terminology, didactics, lexicography, translation studies and language planning. Chapter 5 serves as a direct introduction to the research, describing the development of the specialist field in question, i.e. horse riding, and its present state, with emphasis on the user group. Chapter 6 contains the study proper, divided into stages: delimiting the thematic scope and creating the two vocabulary sets, their preliminary formal and semantic characterisation, forming the corpus, examining the occurrence of terms in the corpus using the WordSmith 5.0 software, and a frequency, formal and semantic analysis of the results. The study shows that the occurrence of vocabulary depends considerably on cultural area, represented by riding styles and national languages. The English classical riding subcorpus contains more terms than the western riding subcorpus, which reflects extralinguistic knowledge about the tradition and character of the two styles. In turn, a considerable proportion of the Polish terms are absent from the texts, confirming the anticipated shortcomings of one of the vocabulary sources used and the justification for the planned lexicographic project. The subcorpus of Polish translations shows a higher density of vocabulary than the original subcorpus, suggesting a qualitative difference between foreign and Polish equestrian writing, together with a high quality of the translations.
This dissertation thus provides further evidence of the link between specialist languages and their fields, and hence of the usefulness, on the one hand, of descriptive linguistic research preceding practical terminological and lexicographic work and, on the other, of taking extralinguistic knowledge and the description of concepts into account in linguistic research.