5 research outputs found

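コーパス日本語学のための言語資源 : 形態素解析用電子化辞書の開発とその応用 [Language resources for Japanese corpus linguistics: development and application of an electronic dictionary for morphological analysis]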

Author: Atsushi YAMADA
Hanae KOISO
Hideki OGURA
Kiyotaka UCHIMOTO
Nobuaki MINEMATSU
Toshinobu OGISO
Yasuharu DEN
伝康晴
内元清貴
小木曽智信
小椋秀樹
小磯花絵
山田篤
峯松信明
Publication venue: 国書刊行会
Publication date: 25/10/2007

Affiliations: Chiba University; The National Institute for Japanese Language; The National Institute for Japanese Language; ASTEM; The University of Tokyo; National Institute of Information and Communications Technology; The National Institute for Japanese Language

In this paper, we describe the design and implementation of UniDic, an electronic dictionary for morphological analysis aimed particularly at applications in Japanese corpus linguistics. Annotating a large-scale corpus with morphological information requires an automatic morphological analyzer. Existing dictionaries for morphological analyzers, however, pose many problems when used in corpus linguistics: word units are delimited unevenly, long in some cases and short in others, and allomorphs and orthographic variants are not assigned a common index. Our dictionary, in contrast, addresses the uniformity of units and the identity of indexes, both of which are important requirements for the linguistic analysis of corpora. We adopt a multi-level definition of word units, consisting of short-, middle-, and long-unit words, and a structured representation of indexes, composed of lemma, word form, orthography, and pronunciation. We develop a database system that straightforwardly implements this design, together with a user-friendly interface that lets dictionary builders search and register entries while keeping track of the complex structure of the indexes. We also show how this structured representation benefits the analysis of morphologically annotated corpora, presenting case studies that investigate variation in word form in a spoken-language corpus and variation in orthography in a written-language corpus.

Academic Repository of the National Institute for Japanese Language and Linguistics / 国立国語研究所学術情報リポジトリ

Improvement of Methanol Synthesis Catalyst by Increasing the Molecular Pore Size

Author: Higuchi K.
Inoue M.
Kazumasa Ogura
Masahiko Tatsumi
Toshinobu Yasutake
Wakatsuki T.
Publication venue: Society of Chemical Engineers, Japan
Publication date: 01/01/2009

Crossref

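コーパス日本語学のための言語資源 : 形態素解析用電子化辞書の開発とその応用 [Language resources for Japanese corpus linguistics: development and application of an electronic dictionary for morphological analysis]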

Author: Atsushi YAMADA
Hanae KOISO
Hideki OGURA
Kiyotaka UCHIMOTO
Nobuaki MINEMATSU
Toshinobu OGISO
Yasuharu DEN
伝康晴
内元清貴
小木曽智信
小椋秀樹
小磯花絵
山田篤
峯松信明
Publication venue: 国書刊行会
Publication date: 25/03/2019

Institutional Repositories DataBase (IRDB)

Silica and titanium dioxide nanoparticles cause pregnancy complications in mice

Author: A Nel
AC Enders
B Fadeel
C Albrecht
C Lam
CA Poland
D Knopp
DJ Barker
DM Bowman
DT Wigle
F Tian
FA Hills
G Girardi
G Konstantatos
G Koren
H Nabeshi
Haruhiko Kamada
Hiromi Nabeshi
Hisae Aoshima
I Cetin
Itaru Yanagihara
J Rossant
JL Mills
K Donaldson
K Takeda
Kazuma Higashisaka
Kazuya Mimura
Kazuya Nagano
Kiyoshi Shishido
KM Godfrey
Kohei Yamashita
KR Martin
KS Hougaard
L Li
L Myatt
M Chu
M Gasperowicz
M Hirashima
M Kibschull
M Lundqvist
M Saunders
M Shimizu
MA Augustin
Masatoshi Nozaki
N Hossain
N Sadrieh
Norio Itoh
P Filipe
P Redecha
P Wick
RA Petros
RE Watson
RG Tardiff
RH Derksen
S Hussain
Shigeru Saito
Shin-ichi Tsunoda
Tadanori Mayumi
Takayoshi Imazawa
Tokuyuki Yoshida
Tomoaki Yoshikawa
Toshinobu Ogura
VE Kagan
X He
X Liu
Y Li
Yasuhiro Abe
Yasuo Tsutsumi
Yasuo Yoshioka
Youko Monobe
Yuichi Kawai
Yuki Morishita
Publication venue: Springer Science and Business Media LLC
Publication date

Crossref