656 research outputs found

Two Uummarmiutun modals – including a brief comparison with Utkuhikšalingmiutut cognates

Author: Berthelin Signe Rix
Publication date: 01/01/2017

The paper is concerned with the meaning of two modal postbases in Uummarmiutun, hungnaq ‘probably’ and ȓukȓau ‘should’. Uummarmiutun is an Inuktut dialect spoken in the Western Arctic. The analyses are founded on knowledge shared by native speakers of Uummarmiutun. Their statements and elaborations are quoted throughout the paper to show how they have explained the meaning nuances of modal expressions in their language. The paper also includes a comparison with cognates in Utkuhikšalingmiutut, which belongs to the eastern part of the Western Canadian dialect group (Dorais, 2010). Using categories from Cognitive Functional Linguistics (Boye, 2005, 2012), the paper shows which meanings are covered by hungnaq and ȓukȓau. This makes it possible to identify subtle differences between the meanings of Uummarmiutun hungnaq and ȓukȓau and those of their respective Utkuhikšalingmiutut cognates.

PhilPapers

University of Toronto: Journal Publishing Services

The Role of Antonymy on Semantic Change

Author: Kentner Ashley M
Publication venue: Purdue University (bepress)
Publication date: 10/04/2015

The role of antonymy in semantic change is investigated via the etymology of sets of English antonyms. The results show a developmental pattern wherein two words sharing an antonym tend to exhibit similar trajectories of semantic development. Metaphorical extension is proposed as the primary mechanism that produces this regularity, with antonymy playing a secondary role. These results further support the view that semantic change is regular, even in contexts not involving grammaticalization, and that metaphor is not peripheral to language use (see Lakoff & Johnson, 1980; Traugott & Dasher, 2002; Hopper & Traugott, 2003). There are also implications for formal and cognitive representations that rely on antonymous relationships for modeling aspects of gradable predicates (such as Paradis, 2001; Kennedy & McNally, 2005).

Purdue E-Pubs

An Algorithm For Building Language Superfamilies Using Swadesh Lists

Author: Mutabazi Bill
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 23/04/2020

The main contributions of this thesis are the following: i. Developing an algorithm to generate language families and superfamilies, given for each input language a Swadesh list represented in International Phonetic Alphabet (IPA) notation. ii. The algorithm is novel in using the Levenshtein distance metric on the IPA representation and in the way it measures overall distance between pairs of Swadesh lists. iii. Building a Swadesh list for the author's native Kinyarwanda language, because no such list could be found even after an extensive search. Adviser: Peter Reves

DigitalCommons@University of Nebraska
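
The Swadesh-list abstract above mentions applying the Levenshtein distance to IPA transcriptions and combining the per-word distances into an overall distance between two lists. The abstract does not say how the thesis does the combining, so the following is only a minimal sketch under stated assumptions: the averaging of length-normalized distances, the function names `levenshtein` and `list_distance`, and the treatment of each Unicode code point as one symbol are all illustrative choices, not the author's method.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance (insert, delete, substitute, each cost 1)."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def list_distance(list_a: Sequence[str], list_b: Sequence[str]) -> float:
    """Assumed aggregate: mean of length-normalized word distances.

    Both lists must give the same concepts in the same order.
    The result lies between 0.0 (identical) and 1.0 (nothing shared).
    """
    if len(list_a) != len(list_b) or not list_a:
        raise ValueError("lists must be non-empty and aligned")
    total = 0.0
    for word_a, word_b in zip(list_a, list_b):
        longest = max(len(word_a), len(word_b))
        if longest:
            total += levenshtein(word_a, word_b) / longest
    return total / len(list_a)


# Hypothetical toy entries, not real Swadesh data.
toy_a = ["mata", "kulo"]
toy_b = ["mato", "kulo"]
score = list_distance(toy_a, toy_b)
```

Real IPA strings often contain combining diacritics, so a fuller version would probably segment each word into phonetic symbols before comparing them. A pairwise matrix of such scores could then feed any standard clustering step to group languages.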

Identifying and Modeling Code-Switched Language

Author: Soto Martinez Victor
Publication venue: Columbia University Libraries/Information Services
Publication date: 01/01/2020

Code-switching is the phenomenon by which bilingual speakers switch between multiple languages during written or spoken communication. The importance of developing language technologies that are able to process code-switched language is immense, given the large populations that routinely code-switch. Current NLP and Speech models break down when used on code-switched data, interrupting the language processing pipeline in back-end systems and forcing users to communicate in ways which for them are unnatural. There are four main challenges that arise in building code-switched models: lack of code-switched data on which to train generative language models; lack of multilingual language annotations on code-switched examples which are needed to train supervised models; little understanding of how to leverage monolingual and parallel resources to build better code-switched models; and finally, how to use these models to learn why and when code-switching happens across language pairs. In this thesis, I look into different aspects of these four challenges. The first part of this thesis focuses on how to obtain reliable corpora of code-switched language. We collected a large corpus of code-switched language from social media using a combination of sets of anchor words that exist in one language and sentence-level language taggers. The newly obtained corpus is superior to other corpora collected via different strategies when it comes to the amount and type of bilingualism in it. It also helps train better language tagging models. We also have proposed a new annotation scheme to obtain part-of-speech tags for code-switched English-Spanish language. The annotation scheme is composed of three different subtasks including automatic labeling, word-specific questions labeling and question-tree word labeling. The part-of-speech labels obtained for the Miami Bangor corpus of English-Spanish conversational speech show very high agreement and accuracy. 
The second section of this thesis focuses on the tasks of part-of-speech tagging and language modeling. For the first task, we proposed a state-of-the-art approach to part-of-speech tagging of code-switched English-Spanish data based on recurrent neural networks. Our models were tested on the Miami Bangor corpus on the task of POS tagging alone, for which we achieved 96.34% accuracy, and on joint part-of-speech and language ID tagging, which achieved similar POS tagging accuracy (96.39%) and very high language ID accuracy (98.78%). For the task of language modeling, we first conducted an exhaustive analysis of the relationship between cognate words and code-switching. We then proposed a set of cognate-based features that helped improve language modeling performance by 12% relative points. Furthermore, we showed that these features can also be used across language pairs and still yield performance improvements. Finally, we tackled the question of how to use monolingual resources for code-switching models by pre-training state-of-the-art cross-lingual language models on large monolingual corpora and fine-tuning them on the tasks of language modeling and word-level language tagging on code-switched data. We obtained state-of-the-art results on both tasks.

Columbia University Academic Commons

Language control in bilingual production: Insights from error rate and error type in sentence production

Authors: Martin Clara D.; Nozari Nazbanou
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2021

First published online: 16 October 2020. Most research showing that cognates are named faster than non-cognates has focused on isolated word production, which might not realistically reflect cognitive demands in sentence production. Here, we explored whether cognates elicit interference by examining error rates during sentence production, and how this interference is resolved by language control mechanisms. Twenty highly proficient Spanish–English bilinguals described visual scenes with sentence structures ‘NP1-verb-NP2’ (NP = noun-phrase). Half the nouns and half the verbs were cognates, and two manipulations created high control demands. Both situations that demanded higher inhibitory control pushed the cognate effect from facilitation towards interference. These findings suggest that cognates, similar to phonologically similar words within a language, can induce not only facilitation but robust interference. We thank Michael Freund and Nicholas McCloskey for their help with data collection. This work was supported in part by the Therapeutic Cognitive Neuroscience Fund endowed to the Cognitive Neurology division of the Neurology Department at Johns Hopkins University. C.D. Martin was supported by the Spanish Ministry of Economy and Competitiveness (SEV-2015-490; PSI2017-82941-P; Europa-Excelencia ERC2018-092833), the Basque Government (PIBA18-29), and the European Research Council (ERC-2018-COG-819093). N. Nozari was also supported by an NSF grant (NSF BCS-1949631).

Archivo Digital para la Docencia y la Investigación

The effects of using cognates to teach English vocabulary to Spanish speakers

Author: Robles Sánchez Lizbeth Mercedes
Publication venue: Universidad de Cuenca
Publication date: 13/07/2022

This research synthesis aimed to examine the effects of the use of Spanish-English cognates on the learning of English vocabulary by Spanish speakers. A total of 21 empirical studies were collected to answer and support questions about the effects of using cognates to teach English vocabulary to Spanish speakers, the most effective category for teaching English vocabulary, the advantages and disadvantages of using cognates, and teacher and student perspectives on the use of cognates as a way of acquiring lexicon in English. The results of this analysis revealed that through the use of cognates, specifically identical and similar cognates, the comprehension and development of vocabulary in English were effective for Spanish speakers. Similarly, cognates were found to help not only in learning and expanding lexicon but also in speech processing, meaning inference, word recognition, word processing, lexicon acquisition, and student confidence. Therefore, both teachers and students agree that the use of cognates in the classroom is essential. Recommendations for future research on the effects of using Spanish-English cognates to teach English vocabulary to Spanish speakers and some practical implications are provided. Recommendations for future research on false cognates were also proposed, due to the ongoing debate among authors about their possible efficacy in vocabulary acquisition. Licenciado en Pedagogía del Idioma Inglés. Cuenc

Repositorio de la Universidad de Cuenca

Bilingual access of homonym meanings: individual differences in bilingual access of homonym meanings

Authors: Fontes Ana Beatriz Arêas da Luz; Schwartz Ana Isabel
Publication date: 01/01/2015

The goal of the present study was to identify the cognitive processes that underlie lexical ambiguity resolution in a second language (L2). We examined which cognitive factors predict the efficiency in accessing subordinate meanings of L2 homonyms in a sample of highly proficient Spanish–English bilinguals. The predictive ability of individual differences in (1) homonym processing in the L1, (2) working memory capacity and (3) sensitivity to cross-language form overlap was examined. In two experiments, participants were presented with cognate and noncognate homonyms either as a prime in a lexical decision task (Experiment 1) or embedded in a sentence (Experiment 2). In both experiments, speed and accuracy in accessing subordinate meanings in the L1 were the strongest predictor of speed and accuracy in accessing subordinate meanings in the L2. Sensitivity to cross-language form overlap predicted performance in lexical decision, while working memory capacity predicted processing in sentence comprehension.

Lume 5.8

Automatic Identification of False Friends in Parallel Corpora: Statistical and Semantic Approach

Author: Nakov Svetlin
Publication venue: Institute of Mathematics and Informatics Bulgarian Academy of Sciences
Publication date: 01/01/2009

False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-level aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with an entirely semantic measure of cross-lingual similarity between words, which uses the Web as a corpus by analyzing the words’ local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are further combined into an improved algorithm for identification of false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed for identifying cognates between Bulgarian and Russian, but the proposed methods could be adopted for other language pairs for which parallel corpora and bilingual glossaries are available.

Bulgarian Digital Mathematics Library at IMI-BAS

Foundation, Implementation and Evaluation of the MorphoSaurus System: Subword Indexing, Lexical Learning and Word Sense Disambiguation for Medical Cross-Language Information Retrieval

Author: Markó Kornél Géza
Publication date: 05/03/2009

In everyday medical practice, which involves a great deal of documentation and search work, most textually encoded information is now available electronically. The development of powerful methods for efficient retrieval is therefore of primary importance. Judged from the perspective of medical sublanguage, common text retrieval systems lack morphological functionality (inflection, derivation and compounding), lexical-semantic functionality, and the ability to analyze large document collections across languages. This doctoral thesis treats the theoretical foundations of the MorphoSaurus system (an acronym for morpheme thesaurus). Its methodological core is a thesaurus organized around morphemes of medical expert and lay language, whose entries are linked across languages by semantic relations. Building on this, a procedure is presented that segments (complex) words into morphemes, which are then replaced by language-independent, concept-class-like symbols. The resulting representation is the basis for cross-language, morpheme-oriented text retrieval. In addition to the core technology, a method for the automatic acquisition of lexicon entries is presented, by which existing morpheme lexicons are extended to further languages. Taking cross-language phenomena into account then leads to a novel procedure for resolving semantic ambiguities. The performance of morpheme-oriented text retrieval is tested empirically in extensive, standardized evaluations and compared with common approaches.

Digitale Bibliothek Thüringen