17 research outputs found

BCCWJ-TimeBank: Temporal and Event Information Annotation on Japanese Text

Author: Asahara Masayuki
Imada Mizuho
Konishi Hikari
Maekawa Kikuo
Yasuda Sachi
Publication venue: Department of English, National Chengchi University
Publication date: 01/01/2013

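Development of an Ultra-Large-Scale Corpus with the Web as Its Population: Collection and Organization (Webを母集団とした超大規模コーパスの開発 : 収集と組織化)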

Author: Hikari KONISHI
Kikuo MAEKAWA
Masayuki ASAHARA
Mizuho IMADA
Sachi YASUDA
今田水穂
保田祥
前川喜久雄
小西光
浅原正幸
Publication venue: National Institute for Japanese Language and Linguistics (国立国語研究所)
Publication date: 01/05/2014

Affiliations: Center for Corpus Development, NINJAL; Postdoctoral Research Fellow, Center for Corpus Development, NINJAL; Postdoctoral Research Fellow, Center for Corpus Development, NINJAL; Adjunct Researcher, Center for Corpus Development, NINJAL; Department of Corpus Studies, NINJAL

In 2011, the National Institute for Japanese Language and Linguistics launched a corpus compilation project with the aim of constructing a ten-billion-word Web corpus. The project was split into four sub-projects: page collection, linguistic annotation, release, and preservation. This paper reports on the first two. In the page collection stage, crawling began in the fourth quarter of 2012, and 100 million URLs have been crawled every three months as fixed-point observations. In the linguistic annotation stage, normalization (HTML tag removal and character encoding conversion), Japanese morphological analysis (word segmentation and part-of-speech tagging), and Japanese dependency analysis were performed on the data crawled over one year, from the fourth quarter of 2012 to the third quarter of 2013. We address issues encountered during the page collection and linguistic annotation stages and offer tentative solutions. We then present the basic statistics of the corpus data as constructed at the end of 2013 and discuss the theoretical and practical research that the language resources may make possible.

Academic Repository of the National Institute for Japanese Language and Linguistics / 国立国語研究所学術情報リポジトリ

The combined effect of the T2DM susceptibility genes is an important risk factor for T2DM in non-obese Japanese: a population based case-control study

Author: AC Janssens
BF Voight
C Hu
D Yach
E Zeggini
FJ Tsai
GV Dedoussis
H Unoki
JC Chan
JT Tan
K Miyake
K Yasuda
Kimiko Yamakawa-Kobayashi
KM Waters
LJ Scott
LK Billings
M Imamura
M Stumvoll
M Xu
MA Lazar
Maki Natsume
MG Wolfs
Nobuhiko Kasezawa
Q Qi
R Saxena
R Sladek
S Cauchi
S O'Rahilly
S Omori
S Wild
Sachi Nakano
SE Kahn
Shingo Aoki
SM Willems
T Yamauchi
TM Frayling
Tomoko Inamori
Toshinao Goda
X Sim
XO Shu
Y Yazaki
Publication venue: BioMed Central
Publication date: 01/01/2012

Abstract

Background: Type 2 diabetes mellitus (T2DM) is a complex endocrine and metabolic disorder. Recently, several genome-wide association studies (GWAS) have identified many novel susceptibility loci for T2DM, and indicated that there are common genetic causes contributing to the susceptibility to T2DM in multiple populations worldwide. In addition, clinical and epidemiological studies have indicated that obesity is a major risk factor for T2DM. However, the prevalence of obesity varies among ethnic groups. We aimed to determine the combined effects of these susceptibility loci and obesity/overweight on the development of T2DM in the Japanese.

Methods: Single nucleotide polymorphisms (SNPs) in or near 17 susceptibility loci for T2DM, identified through GWAS in Caucasian and Asian populations, were genotyped in 333 cases with T2DM and 417 control subjects.

Results: We confirmed that the cumulative number of risk alleles based on 17 susceptibility loci for T2DM was an important risk factor in the development of T2DM in the Japanese population (P < 0.0001), although the effect of each risk allele was relatively small. In addition, a significant association between an increased number of risk alleles and an increased risk of T2DM was observed in the non-obese group (P < 0.0001 for trend), but not in the obese/overweight group (P = 0.88 for trend).

Conclusions: Our findings indicate that there is an etiological heterogeneity of T2DM between obese/overweight and non-obese subjects.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Sexual system of a symbiotic pedunculate barnacle Poecilasma kaempferi (Cirripedia : Thoracica)

Author: KANEKO Atsushi
SAWADA Kota
YAMAGUCHI Sachi
YASUDA Keiko
YOSHIDA Sachi
YUSA Yoichi
Publication venue: Informa UK Limited
Publication date: 03/03/2014

Graduate University for Advanced Studies [SOKENDAI] Institutional Repository

Sexual system of a symbiotic pedunculate barnacle Poecilasma kaempferi (Cirripedia : Thoracica)

Author: KANEKO Atsushi
SAWADA Kota
YAMAGUCHI Sachi
YASUDA Keiko
YOSHIDA Sachi
YUSA Yoichi
Publication venue: Informa UK Limited
Publication date: 11/08/2015

Institutional Repositories DataBase (IRDB)

BCCWJ-TimeBank: Temporal and Event Information Annotation on Japanese Text

Author: Asahara Masayuki
Imada Mizuho
Konishi Hikari
Maekawa Kikuo
Yasuda Sachi
Publication venue: Department of English, National Chengchi University
Publication date: 26/08/2015

Temporal information extraction can be split into the following three tasks: temporal expression extraction, time normalisation, and temporal ordering relation resolution. This paper describes a time expression and temporal ordering annotation schema for Japanese, employing the Balanced Corpus of Contemporary Written Japanese, or BCCWJ. The annotation is aimed at allowing the development of better Japanese temporal ordering relation resolution tools. The annotation schema is based on an ISO annotation standard, TimeML. We extract verbal and adjectival event expressions as ⟨EVENT⟩ in a subset of BCCWJ. We then annotate temporal ordering relations ⟨TLINK⟩ on pairs of these events and the time expressions annotated in previous work. We identify several issues in the annotation.

CiteSeerX

Institutional Repositories DataBase (IRDB)

Dwarf males in the epizoic barnacle Octolasmis unguisiformis and their implication for sexual system evolution.

Author: SAWADA Kota
YAMAGUCHI Sachi
YASUDA Keiko
YOSHIDA Ryuta
YUSA Yoichi
Publication venue: Wiley
Publication date: 11/08/2015

Institutional Repositories DataBase (IRDB)

Dwarf males in the epizoic barnacle Octolasmis unguisiformis and their implication for sexual system evolution.

Author: SAWADA Kota
YAMAGUCHI Sachi
YASUDA Keiko
YOSHIDA Ryuta
YUSA Yoichi
Publication venue: Wiley
Publication date: 01/06/2015

Graduate University for Advanced Studies [SOKENDAI] Institutional Repository

Rapid construction of Drosophila RNAi transgenes using pRISE, a P-element-mediated transformation vector exploiting an in vitro recombination system

Author: Kunio Yasuda
MIKI D
Sachi Inagaki
SONTHEIMER E J
Takefumi Kondo
Yuji Kageyama
Publication venue: Genetics Society of Japan
Publication date: 01/01/2006

Crossref

Novel recombinant feline interferon carrying N-glycans with reduced allergy risk produced by a transgenic silkworm system

Author: Hideyo Yasuda
Masahiro Tomita
Sachi Minagawa
Satoshi Sekiguchi
Takenori Igarashi
Yoshio Miura
Yuzuru Nakaso
Publication venue: Springer Science and Business Media LLC
Publication date: 01/08/2018

Abstract

Background: The generation of recombinant proteins for commercialisation must be cost-effective. Despite the cost-effective production of recombinant feline interferon (rFeIFN) by a baculovirus expression system, this rFeIFN carries insect-type N-glycans, with core α 1,3 fucosyl residues that act as potential allergens. An alternative method of production may yield recombinant glycoproteins with reduced antigenicity.

Results: A cDNA clone encoding the fifteenth subtype of FeIFN-α (FeIFN-α15) was isolated from a Japanese domestic cat. This clone encoded a protein of 189 amino acids with a molecular mass of 21.1 kDa. The rFeIFN-α15 was expressed using a transgenic silkworm system, which was expected to yield an N-glycan structure with reduced antigenicity compared with the protein produced by the baculovirus system. The resulting rFeIFN-α15 accumulated in the sericin layer of silk fibres and was easily extracted and purified by column chromatography. The N-terminal amino acid sequence of purified rFeIFN-α15 was identical to the mature form of the natural sequence. Moreover, its N-glycans did not include detectable core α 1,3 fucosyl residues. Its anti-vesicular stomatitis virus activity (2.6 × 10⁸ units/mg protein) was comparable to that of the baculovirus-expressed rFeIFN.

Conclusions: The lower allergy risk of rFeIFN produced by the transgenic silkworm system than by the baculovirus expression system is due to the former lacking core α 1,3 fucosyl residues in its N-glycans. The rFeIFN-α15 produced by the transgenic silkworm system may be a prospective candidate for the next generation of rFeIFN in veterinary medicine.

Directory of Open Access Journals