4,178 research outputs found

    Natural language processing

    Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

    Multilinguality in Temporal Annotation: A Case of Korean

    PACLIC 20 / Wuhan, China / 1-3 November, 200

    Semantic Types, Lexical Sorts and Classifiers

    We propose a cognitively and linguistically motivated set of sorts for lexical semantics in a compositional setting: the classifiers in languages that do have such pronouns. These sorts are needed to include lexical considerations in a semantic analyser such as Boxer or Grail. Indeed, all proposed lexical extensions of usual Montague semantics to model restriction of selection and felicitous and infelicitous copredication require a rich and refined type system whose base types are the lexical sorts, the basis of the many-sorted logic in which semantic representations of sentences are stated. However, none of those approaches defines precisely the actual base types or sorts to be used in the lexicon. In this article, we discuss some of the options commonly adopted by researchers in formal lexical semantics, and defend the view that classifiers in the languages which have such pronouns are an appealing solution, both linguistically and cognitively motivated.

    Proceedings of the First Workshop on Computing News Storylines (CNewsStory 2015)

    This volume contains the proceedings of the 1st Workshop on Computing News Storylines (CNewsStory 2015), held in conjunction with the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2015) at the China National Convention Center in Beijing on July 31st, 2015. Narratives are at the heart of information sharing. Ever since people began to share their experiences, they have connected them to form narratives. The study of storytelling and the field of literary theory called narratology have developed complex frameworks and models related to various aspects of narrative such as plot structures, narrative embeddings, characters' perspectives, reader response, point of view, narrative voice, narrative goals, and many others. These notions from narratology have been applied mainly in Artificial Intelligence and to model formal semantic approaches to narratives (e.g. Plot Units developed by Lehnert (1981)). In recent years, computational narratology has qualified as an autonomous field of study and research. Narrative has been the focus of a number of workshops and conferences (AAAI Symposia, Interactive Storytelling Conference (ICIDS), Computational Models of Narrative). Furthermore, reference annotation schemes for narratives have been proposed (NarrativeML by Mani (2013)). The workshop aimed at bringing together researchers from different communities working on representing and extracting narrative structures in news, a text genre which is widely used in NLP but which has received little attention with respect to narrative structure, representation and analysis. Advances in NLP technology have now made it feasible to look beyond scenario-driven, atomic extraction of events from single documents and work towards extracting story structures from multiple documents that are published over time as news streams. Policy makers, NGOs, information specialists (such as journalists and librarians) and others are increasingly in need of tools that support them in finding salient stories in large amounts of information, so that they can implement policies more effectively, monitor the actions of "big players" in society, and check facts. Their tasks often revolve around reconstructing cases either with respect to specific entities (e.g. persons or organizations) or events (e.g. Hurricane Katrina). Storylines represent explanatory schemas that enable us to make better selections of relevant information and also projections to the future. They offer valuable potential for exploiting news data in an innovative way.

    Semantic Analysis of English Loanwords in Japanese and Korean Using Word Embeddings (단어임베딩을 이용한 일본어와 한국어에서의 영어 외래어 의미분석)

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Humanities, Department of Linguistics, 2021. 2. Hyopil Shin.

    As cultural exchange grows worldwide, loanwords have come into common and frequent use, and various linguistic phenomena arise as they are adopted. With the adoption of loanwords, words that originally existed in the recipient language may disappear, foreign suffixes and words may combine with recipient-language words to form new words, and foreign prepositions may be used as they are. Loanwords themselves also undergo morphological, phonological, and semantic change as they settle, because of the linguistic constraints of the recipient language. Because adoption brings such varied changes to both the recipient language and the borrowed words, loanwords are an important research topic in several areas of historical linguistics, including morphology, phonology, and semantics. Loanwords are mainly used to denote entirely new foreign product names or concepts that recipient-language words cannot express. They are also used, however, to give a more luxurious or academic image to words that already exist natively, and this sociolinguistic role has recently drawn particular attention. Most previous loanword studies collected many examples and summarized patterns of language change. Recent corpus-based quantitative studies have often tested statistically whether linguistic factors such as word length affect how successfully a loanword settles in the recipient language. Such frequency-based studies, however, have difficulty quantifying the complex semantic information of words, so quantitative analysis of loanword semantic phenomena has not yet been carried out. This study proposes word embedding-based methods for quantitatively analyzing semantic phenomena related to loanwords. Word embedding methods use deep learning and large-scale language data to convert the semantic and contextual information of words effectively into vector values. Using these methods, the study addresses three topics: lexical competition, semantic adaptation, and social semantic function and cultural trend change. The first study concerns lexical competition between loanwords and their synonyms in the recipient language. Frequency-based methods cannot distinguish the type of lexical competition (word replacement or semantic differentiation). Judging the type requires knowing how far a loanword and its native synonym share contexts. To model this context sharing quantitatively, the study applies a geometric concept. The proposed geometric, word embedding-based model was confirmed to judge quantitatively the lexical competition that arises between loanwords and recipient-language synonyms. The second study concerns the semantic adaptation of English loanwords in Japanese and Korean. English loanwords undergo semantic adaptation as they settle in the recipient language. To compare the meanings of loanwords with those of the original English words, the study applies a transformation matrix method and analyzes how the semantic adaptation of English loanwords differs between Japanese and Korean. It also analyzes statistically how the polysemy of English words affects semantic adaptation. The third study concerns the social semantic role of loanwords that reflect current cultural trends in Japan and Korea. Because Japanese and Korean media frequently use loanwords when new cultural trends or issues arise, loanwords are expected to reflect cultural trends in both countries. The study hypothesizes that such loanwords serve as indicators of cultural trend change. To test this hypothesis, it proposes a method that uses a pre-trained contextual embedding model (BERT) to track changes in the contexts of loanwords over time. In the experiments, tracking the contextual change of loanwords with the proposed method made it possible to detect changes in cultural trends. The study uses Japanese and Korean data throughout, which shows the potential for computational multilingual contrastive linguistics. These word embedding-based semantic analysis methods are expected to contribute substantially to multilingual computational semantics and computational sociolinguistics.

    Through cultural exchange, many foreign words enter a language along with the foreign culture they come from. These foreign words, or loanwords, are widespread in languages all over the world. Historical linguistics has actively studied loanwords because they can trigger linguistic change within the recipient language. Loanwords affect existing words and grammar: native words become obsolete, foreign suffixes and words combine with native words to coin new words and phrases in the recipient language, and foreign prepositions come into use in the recipient language.
Loanwords themselves also undergo morphological, phonological, and semantic change as they are integrated and adapted, because of the linguistic constraints of the recipient language. Several fields of linguistics (morphology, phonology, and semantics) have studied the changes caused by the influx of loanwords. Loanwords mainly introduce to the recipient language a completely new foreign product or concept that its own words cannot express. However, people also often use loanwords to convey a prestigious, luxurious, or academic image. These sociolinguistic roles of loanwords have recently received particular attention in sociolinguistics and pragmatics. Most previous work on loanwords has gathered many examples and summarized the patterns of linguistic change. Recently, corpus-based quantitative studies have begun to show statistically that linguistic factors such as word length influence the successful integration and adaptation of loanwords in the recipient language. However, these frequency-based studies have difficulty quantifying complex semantic information, so the quantitative analysis of loanword semantic phenomena has remained undeveloped. This research sheds light on the quantitative analysis of the semantic phenomena of loanwords using word embeddings. Word embedding can effectively convert the semantic and contextual information of words into vector values using deep learning methods and large-scale language data. This study suggests several quantitative methods for analyzing the semantic phenomena related to loanwords. The dissertation focuses on three such topics: lexical competition, semantic adaptation, and social semantic function and cultural trend change. The first study focuses on the lexical competition between a loanword and its native synonym. Frequency cannot distinguish between the two types of lexical competition, word replacement and semantic differentiation. Judging the type of lexical competition requires knowing how far loanwords and their native synonyms share contexts. We apply a geometrical concept to model this context sharing. The resulting geometrical, word embedding-based model judges quantitatively which kind of lexical competition occurs between loanwords and native synonyms. The second study focuses on the semantic adaptation of English loanwords in Japanese and Korean. English loanwords undergo semantic change (semantic adaptation) as they are integrated and adapted in the recipient language. This study applies the transformation matrix method to compare the meanings of the loanwords with those of the original English words, and extends the method to a contrastive study of the semantic adaptation of English loanwords in Japanese and Korean. The third study focuses on the social semantic role of loanwords that reflect current cultural trends in Japanese and Korean. Japanese and Korean society frequently use loanwords when new trends or issues arise, so loanwords seem to work as signals of cultural trends. We therefore hypothesize that loanwords serve as indicators of cultural trend change. To test this hypothesis, the study proposes a method for tracking the contextual change of loanwords over time with a pre-trained contextual embedding model (BERT). This word embedding-based method can detect cultural trend change through the contextual change of loanwords. Throughout these studies, we applied our methods to Japanese and Korean data, which shows the potential for computational multilingual contrastive linguistics.
These word embedding-based semantic analysis methods will contribute substantially to the development of computational semantics and computational sociolinguistics in various languages.

    Contents:
    1 Introduction
        1.1 Overview of Loanword Study
        1.2 Research Topics in this Dissertation
            1.2.1 Lexical Competition between Loanword and Native Synonym
            1.2.2 Semantic Adaptation of Loanwords
            1.2.3 Social Semantic Function and the Cultural Trend Change
        1.3 Methodological Background
            1.3.1 The Vector Space Model
            1.3.2 The Bag of Words Model
            1.3.3 Neural Network and Neural Probabilistic Language Model
            1.3.4 Distributional Model and Word2vec
            1.3.5 The Contextual Word Embedding and BERT
        1.4 Summary of this Chapter
    2 Word Embeddings for Lexical Changes Caused by Lexical Competition between Loanwords and Native Words
        2.1 Overview
        2.2 Related Works
            2.2.1 Lexical Competition in Loanword
            2.2.2 Word Embedding Model and Semantic Change
        2.3 Selection of Loanword and Korean Synonym Pairs
            2.3.1 Viable Loanwords
            2.3.2 Previous Approach: The Relative Frequency
            2.3.3 New Approach: The Proportion Test
            2.3.4 Technical Challenges for Performing the Proportion Test
            2.3.5 Filtering Procedures
            2.3.6 Handling Errors
            2.3.7 Proportion Test and Questionnaire Survey
        2.4 Analysis of Lexical Competition
            2.4.1 The Geometrical Model for Analyzing the Lexical Competition
            2.4.2 Word Embedding Model for Analyzing Lexical Competition
            2.4.3 Result and Discussion
        2.5 Conclusion and Future Work
    3 Applying Word Embeddings to Measure the Semantic Adaptation of English Loanwords in Japanese and Korean
        3.1 Overview
        3.2 Methodology
        3.3 Data and Experiment
        3.4 Result and Discussion
            3.4.1 Japanese
            3.4.2 Korean
            3.4.3 Comparison of Cosine Similarities of English Loanwords in Japanese and Korean
            3.4.4 The Relationship Between the Number of Meanings and Cosine Similarities
        3.5 Conclusion and Future Works
    4 Detection of the Contextual Change of Loanwords and the Cultural Trend Change in Japanese and Korean through Pre-trained BERT Language Models
        4.1 Overview
        4.2 Related Work
            4.2.1 Loanwords and Cultural Trend Change
            4.2.2 Word Embeddings and Semantic Change
            4.2.3 Contextualized Embedding and Diachronic Semantic Representation
        4.3 The Framework
            4.3.1 Sense Representation
            4.3.2 Tracking the Contextual Changes
            4.3.3 Evaluation of Framework
            4.3.4 Discussion for Framework
        4.4 The Cultural Trend Change Analysis through Loanword Contextual Change Detection
            4.4.1 Methodology
            4.4.2 Result and Discussion
        4.5 Conclusion and Future Work
    5 Conclusion and Future Works
        5.1 Summary
        5.2 Future Works
            5.2.1 Revealing Statistical Law
            5.2.2 Computational Contrastive Linguistic Study
            5.2.3 Application to Other Semantics Tasks
    A List of Loanword Having One Synset and One Definition in Korean CoreNet in Chapter 2
    Abstract (In Korean)
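
The transformation-matrix comparison mentioned in the abstract above could look roughly like the following sketch. It assumes the matrix is a least-squares linear map between two embedding spaces, fitted on anchor word pairs, and that semantic difference is read off as cosine similarity. The listing does not give the dissertation's exact procedure, so the function names (`fit_transformation`, `cosine`), the 4-dimensional toy vectors, and the random data are all hypothetical.

```python
import numpy as np


def fit_transformation(src, tgt):
    """Return W minimizing ||src @ W - tgt|| (ordinary least squares)."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Toy data only: random vectors stand in for trained embeddings.
rng = np.random.default_rng(0)
dim = 4
true_map = rng.normal(size=(dim, dim))    # hidden relation between the toy spaces
anchors_src = rng.normal(size=(50, dim))  # anchor words in the "recipient language" space
anchors_tgt = anchors_src @ true_map      # the same anchors in the "English" space

# Fit the transformation matrix on the anchor pairs.
W = fit_transformation(anchors_src, anchors_tgt)

# A hypothetical loanword vector and two candidate English vectors:
# one that keeps the mapped meaning, one that is unrelated (drifted).
loanword_vec = rng.normal(size=dim)
faithful_english = loanword_vec @ true_map
shifted_english = rng.normal(size=dim)

# Map the loanword into the English space and compare.
sim_faithful = cosine(loanword_vec @ W, faithful_english)
sim_shifted = cosine(loanword_vec @ W, shifted_english)
print(f"faithful: {sim_faithful:.3f}, shifted: {sim_shifted:.3f}")
```

With real data, the anchor rows would come from trained Japanese or Korean and English embeddings; under these assumptions, a lower cosine similarity for a loanword would suggest greater semantic drift from its English source word.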