37 research outputs found

    Construction of the corpus of senmyō: one of the oldest materials of Japanese language

    Of the oldest surviving texts written in native Japanese, waka (poems) and senmyō (imperial edicts) from the 8th century make up the largest part. In this period, texts were usually written in Classical Chinese, but waka and senmyō were written in native Japanese using kanji (Chinese characters), which makes them valuable Old Japanese material for linguists. We constructed a corpus of senmyō, mainly for the purpose of language research. The corpus adheres to the writing style of the original text and follows a unified design as part of the diachronic Corpus of Historical Japanese (CHJ), which covers the eighth century to the present.

    Word identification in Early Middle Japanese using collocation strength

    Author affiliation: Adjunct Researcher, Center for Corpus Development, National Institute for Japanese Language and Linguistics
    It has long been a serious problem for researchers of Early Middle Japanese to determine whether a set phrase like kai-nashi should be classified as one word or as a combination of separate words. There is no definite criterion: some phrases are listed in dictionaries as words while others are omitted, depending on the subjective judgment of the editor. This paper introduces the Dice coefficient as a solution. The Dice coefficient is an index of collocation strength, i.e., a concrete numerical measure of how strongly two words are connected with each other. Taking combinations of a noun with ari, nashi, yoshi, or ashi as examples, and working in combination with a morphological analysis dictionary (Chuko-Wabun UniDic), the paper argues that the Dice coefficient is effective as one objective criterion for word identification.
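
For readers unfamiliar with the measure, the Dice coefficient for a word pair is commonly defined as twice the pair's co-occurrence frequency divided by the sum of the two words' individual frequencies. The sketch below computes it over adjacent token pairs. It is only an illustration under stated assumptions: the function names, the adjacent-bigram counting scheme, and the toy romanized tokens are hypothetical, and the abstract does not say how the paper itself counts frequencies.

```python
from collections import Counter


def dice_coefficient(pair_count, left_count, right_count):
    """Dice coefficient: 2 * f(x, y) / (f(x) + f(y))."""
    total = left_count + right_count
    if total == 0:
        return 0.0
    return 2 * pair_count / total


def dice_for_bigrams(tokens):
    """Score every adjacent token pair (assumed counting scheme)."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        pair: dice_coefficient(n, unigrams[pair[0]], unigrams[pair[1]])
        for pair, n in bigrams.items()
    }


# Toy, made-up token sequence (not corpus data).
tokens = ["kai", "nashi", "hito", "nashi", "kai", "nashi"]
scores = dice_for_bigrams(tokens)
print(scores[("kai", "nashi")])
```

A score near 1 means the two tokens almost always appear together, which is the kind of evidence the abstract proposes as a criterion for treating a phrase as a single word. Any threshold for that decision would have to come from the paper, not from this sketch.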

    Digitizing printed-book type for the Corpus of Historical Japanese: a case study of the Shogakukan (SNKBZ) edition of the Konjaku Monogatarishu

    Author affiliations: [former] Adjunct Researcher, Center for Corpus Development, NINJAL; Doctoral Student, Tokyo University of Agriculture and Technology
    Digitizing characters not included in the standard set is an urgent problem for electronic corpora of historical documents. Such non-standard characters have hitherto been replaced with the symbol "〓" in digital corpora, which is quite inconvenient for users. The Corpus of Historical Japanese planned at NINJAL is based on typeset book editions of classical texts, and the current Japanese standard for character codes, JIS X0213, will be adopted for digitizing them. This paper first examines the efficacy of JIS X0213 for typeset versions of old texts. A thorough investigation of the kanji type in the Shogakukan (SNKBZ) edition of the Konjaku Monogatarishu found that JIS X0213 covers 99.86% of the total character tokens. The paper then proposes a substitution system for the remaining 0.14% of the characters not covered by JIS X0213. The idea is to replace these non-standard characters with similar characters that are included in JIS X0213, rather than displaying "〓", while retaining information about the original character forms for reference. Character-level substitution resolves roughly 90% of the non-standard characters, but eliminating "〓" entirely would require substitution at the level of word spellings rather than individual characters. All the non-standard characters in the Shogakukan (SNKBZ) edition of the Konjaku Monogatarishu that are subject to substitution are listed at the end of the paper along with their replacements.
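
A token-coverage figure of the kind reported above can be approximated as follows. This is a rough sketch, not the paper's procedure: it assumes Python's built-in `euc_jis_2004` codec is an acceptable proxy for membership in JIS X0213, it counts every non-whitespace character rather than kanji only, and the helper names and sample string are hypothetical.

```python
from collections import Counter


def in_jis_x0213(ch, codec="euc_jis_2004"):
    """True if the character can be encoded with the given codec.

    Assumption: the codec's repertoire stands in for JIS X0213.
    """
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def token_coverage(text):
    """Return (share of character tokens covered, counts of uncovered chars)."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    uncovered = {ch: n for ch, n in counts.items() if not in_jis_x0213(ch)}
    covered = total - sum(uncovered.values())
    ratio = covered / total if total else 1.0
    return ratio, uncovered


# Sample string for illustration only; the emoji stands in for a
# character outside the standard set.
ratio, uncovered = token_coverage("今昔物語集😀")
print(f"{ratio:.2%}", uncovered)
```

The `uncovered` mapping is the natural starting point for a substitution table that pairs each non-standard character with a replacement while keeping a record of the original form.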

    Phase II trial of aflibercept with FOLFIRI as a second‐line treatment for Japanese patients with metastatic colorectal cancer

    Aflibercept targets vascular endothelial growth factor. The present study assessed the efficacy, safety and pharmacokinetics of aflibercept plus 5‐fluorouracil/levofolinate/irinotecan (FOLFIRI) as a second‐line treatment for metastatic colorectal cancer (mCRC) in Japanese patients. Aflibercept (4 mg/kg) plus FOLFIRI was administered every 2 weeks in 62 patients with mCRC until disease progression, unacceptable toxicity or patient withdrawal. Tumors were imaged every 6 weeks. The primary endpoint was objective response rate (ORR); secondary endpoints were progression‐free survival, overall survival, safety, and pharmacokinetics of aflibercept, irinotecan and 5‐fluorouracil. A total of 60 patients were evaluated for ORR; 50 had received prior bevacizumab. The ORR was 8.3% (95% confidence interval [CI]: 1.3%‐15.3%), and the disease control rate (DCR) was 80.0% (69.9%‐90.1%). The median progression‐free survival was 5.42 months (4.14‐6.70 months) and the median overall survival was 15.59 months (11.20‐19.81 months). No treatment‐related deaths were observed, and no significant drug‐drug interactions were found. The most common treatment‐emergent adverse events were neutropenia and decreased appetite. Free aflibercept had a mean maximum concentration (coefficient of variation) of 73.2 μg/mL (15%), clearance of 0.805 L/d (22%) and volume of distribution of 6.2 L (18%); aflibercept bound with vascular endothelial growth factor had a clearance of 0.162 L/d (9%) (N = 62). Aflibercept did not significantly affect the pharmacokinetics of irinotecan or 5‐fluorouracil: the clearance was 11.1 L/h/m² (28%) for irinotecan and, at steady state, 72.6 L/h/m² (56%) for 5‐fluorouracil (N = 10). Adding aflibercept to FOLFIRI was shown to be beneficial and well‐tolerated in Japanese patients with mCRC. ClinicalTrials.gov Identifier: NCT01882868.
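
As a reading aid for the response-rate figures, the sketch below computes a normal-approximation (Wald) 95% confidence interval for a proportion. Two things are assumptions rather than statements from the abstract: the responder counts (5 of 60 for ORR and 48 of 60 for DCR, back-calculated from the reported percentages) and the choice of the Wald method, which the abstract does not name. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, n, level=0.95):
    """Normal-approximation confidence interval for a binomial proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Assumed counts, inferred from the reported 8.3% and 80.0% of 60 patients.
orr_low, orr_high = wald_interval(5, 60)
dcr_low, dcr_high = wald_interval(48, 60)
print(f"ORR: {orr_low:.1%} to {orr_high:.1%}")
print(f"DCR: {dcr_low:.1%} to {dcr_high:.1%}")
```

Under these assumptions the computed intervals agree with the reported ones to one decimal place, which suggests, but does not confirm, that a normal approximation was used.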