18,511 research outputs found

    An Approach toward Register Classification of Book Samples in the Balanced Corpus of Contemporary Written Japanese

    A Tentative Proposal on the Grammaticalization Process of the Compound Verb ‘-kiru’

    The purpose of this study is to investigate the Japanese compound verb ‘-kiru’ in terms of its frequency of occurrence and its syntactic features. First, I extract ‘-kiru’ from the BCCWJ (Balanced Corpus of Contemporary Written Japanese). Next, I classify the meanings of ‘-kiru’ into (1) CUT, (2) END, (3) COMPLETION, (4) LIMIT, (3/4) COMPLETION/LIMIT (intermediate between 3 and 4), and (5) SINGLE WORD. As the verb moves away from its original meaning, the variety of preceding verbs increases, and the degree of co-occurrence with grammatical forms also rises. This suggests that the grammaticalization of ‘-kiru’ proceeds through its semantic derivation.

    Usage Rate of the Noun Predicate Sentence "A ga B da": From the Perspectives of Meaning and Structure

    Noun predicate sentences with the structure "A ga B da" have certain distinctive features when compared with sentences with the structure "A wa B da". Data from the Japanese corpus BCCWJ (The Balanced Corpus of Contemporary Written Japanese) make it clear that when a noun predicate indicates quantity or degree, the form "A ga B da" is used more frequently than the form "A wa B da". When the relation between a subject and its predicate is one of framework and content, "A ga B da" sentences are again used more frequently. From the viewpoint of structure, "A ga B da" is used more frequently in subordinate clauses. "A ga B da" can be analyzed as the structure formed when the topic is omitted from a double-subject construction.

    The Actual Usage of nai-tameni Expressing Purpose

    This paper describes the use of the expression nai-tameni in Japanese, using the “Balanced Corpus of Contemporary Written Japanese (BCCWJ)” (the 2008 monitor version). The tameni clause in Japanese has a “purpose” interpretation, which is determined by the controllability of the predicate that precedes tameni. In this sense, nai-tameni is a non-canonical expression, because the negative nai is a sort of stative predicate, and not a controllable one. This paper shows the variations of the nai-tameni construction in the BCCWJ, and analyzes this phenomenon from the following four viewpoints: (1) the frequency in the BCCWJ; (2) the occurrence of the following particles; (3) the properties of the verbs preceding nai-tameni; and (4) the mismatch of the agents of the main and subordinate clauses.

    Genre Attribute-related Annotations on Fiction Samples in the Balanced Corpus of Contemporary Written Japanese

    Mejiro University; Research Department, NINJAL

    We categorized genres and settings (e.g., "Mystery," "Science Fiction," "Adventure," "Romance," and "Historical") for all fiction works in book samples from the Balanced Corpus of Contemporary Written Japanese. To design the descriptive genre attributes, we examined the classification items used by bookshops and publishers for the book from which each fiction sample was taken. We also defined new classification items by widely collecting characteristic words and phrases that multiple annotators judged to represent the content of the fiction. We then annotated the fiction samples with the resulting genre attributes in a multi-label classification setting. This work assigned classification information to fiction samples in the Balanced Corpus of Contemporary Written Japanese that had previously been unclassified. The genre attributes make it possible to examine the distribution of vocabulary and stylistic features by category. We report the annotation procedures and the results of the annotation.

    Categorization of Japanese Vocabulary Relating to Family

    The purpose of this study is to clarify the mental process of categorization at work when Japanese speakers use terms relating to a family or family members. This paper presents Japanese expressions composed of two parts, Hontoh-no (real / true) and a family term, namely Kazoku (family), Haha-oya (mother), Chichi-oya (father), Kodomo (child), Musume (daughter), or Musuko (son), and examines these expressions in the contexts in which they appear. Japanese Hontoh-no means that the topic is a real or true member of the category named by the following word, so categorization can be said to be taking place here. The source of the data is the Balanced Corpus of Contemporary Written Japanese (BCCWJ), which was developed by the National Institute for Japanese Language and Linguistics and contains more than one hundred million words of contemporary written Japanese. Close examination of the examples of Hontoh-no and family terms in their contexts shows that the term Hontoh-no represents two kinds of forces in categorization: a “distinguishing force”, which separates category members from non-members by some standard, and a “unifying force”, which aims at some salient feature of the category. The “unifying force” applies more often to Kazoku (family) and Haha-oya (mother) than to the other family terms, Chichi-oya (father), Kodomo (child), Musume (daughter), and Musuko (son). This fact indicates that “family” and “mother” have clearer ideal images than other family members such as “father” or “children” in Japanese society.

    Social stereotypes in communicative formulae: Sociometric approach

    The article examines society through the prism of communication. Modern data extraction and information retrieval methods make it possible to build a new view of the communicative process. The article draws on representations of language idioms and an omnibus survey, which help to identify the most stable forms of expression in human society. It also attempts a comparative analysis of social features of the East and the West with the help of online national language corpora. The omnibus survey results indicate that low-income people are reluctant to admit the influence of idioms on their day-to-day communicative practices, while rich people stress the significance of stable communicative formulae in their lives. Societies are described through their attitude to labor, as expressed in idioms with a 'hand' component. Using electronic linguistic corpora, namely the Corpus of the Internet and business Chinese, KOTONOHA (Balanced Corpus of Contemporary Written Japanese), BNC (British National Corpus), and COCA (Corpus of Contemporary American English), the research analyzes labor stereotypes on the basis of idiom frequency indexes. In practice, the results of this study can be implemented in a special socio-cultural dictionary, where the most frequent idioms are given as social stereotypes and as the most powerful symbolic tools of influence and manipulation. The findings are relevant to multicultural societies, migration adaptation practices, and global business development. The research results have been processed into a database registered under Rospatent Certificate No 2013620397, dated 03/13/2013.