
    Collocation Translation Acquisition Using Monolingual Corpora (Association for Computational Linguistics, 2004)

    Collocation translation is important for machine translation and many other NLP tasks. Unlike previous methods that use bilingual parallel corpora, this paper presents a new method for acquiring collocation translations by making use of monolingual corpora and linguistic knowledge. First, dependency triples are extracted from Chinese and English corpora with dependency parsers. Then, a dependency triple translation model is estimated using the EM algorithm based on a dependency correspondence assumption. The generated triple translation model is used to extract collocation translations from the two monolingual corpora. Experiments show that our approach outperforms existing monolingual-corpus-based methods in dependency triple translation and achieves promising results in collocation translation extraction.
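The EM estimation described above can be illustrated with a toy sketch: each Chinese dependency triple is expanded into candidate English triples via a seed dictionary (the relation is kept fixed, per the dependency correspondence assumption), candidates are scored by their frequency in the English monolingual corpus, and EM re-estimates word translation probabilities from the expected counts. All data, names, and the exact scoring are made up for illustration, not the paper's actual model.

```python
from collections import defaultdict

# Toy Chinese dependency triples with corpus counts (hypothetical data)
chinese_counts = {
    ("c_eat", "VO", "c_apple"): 10,
    ("c_eat", "VO", "c_rice"): 3,
}
# Toy English triple counts from the English monolingual corpus
english_counts = {
    ("eat", "VO", "apple"): 50, ("have", "VO", "apple"): 2,
    ("eat", "VO", "rice"): 20, ("have", "VO", "rice"): 1,
}
# Seed translation dictionary (assumed given)
dictionary = {
    "c_eat": ["eat", "have"],
    "c_apple": ["apple"],
    "c_rice": ["rice"],
}

# Word translation probabilities p(e|c), initialized uniformly
p = {c: {e: 1.0 / len(es) for e in es} for c, es in dictionary.items()}

for _ in range(10):  # EM iterations
    expected = defaultdict(lambda: defaultdict(float))
    for (head, rel, dep), n in chinese_counts.items():
        # Candidate English triples: cross product of word translations,
        # with the dependency relation carried over unchanged
        cands = [(eh, ed) for eh in dictionary[head] for ed in dictionary[dep]]
        # E-step: distribute the observed count over candidates in
        # proportion to English-corpus evidence times the current model
        z = sum(english_counts.get((eh, rel, ed), 0) * p[head][eh] * p[dep][ed]
                for eh, ed in cands)
        for eh, ed in cands:
            w = (n * english_counts.get((eh, rel, ed), 0)
                 * p[head][eh] * p[dep][ed] / z)
            expected[head][eh] += w
            expected[dep][ed] += w
    # M-step: renormalize per Chinese word
    for c in p:
        tot = sum(expected[c].values())
        p[c] = {e: expected[c][e] / tot for e in expected[c]}
```

On this toy data, the English-corpus counts pull the model toward "eat" as the translation of "c_eat", even though the dictionary alone cannot disambiguate it from "have".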

    Improving Tree-to-Tree Translation with Packed Forests

    Current tree-to-tree models suffer from parsing errors because they usually use only 1-best parses for rule extraction and decoding. We instead propose a forest-based tree-to-tree model that uses packed forests. The model is based on a probabilistic synchronous tree substitution grammar (STSG), which can be learned automatically from aligned forest pairs. The decoder finds ways of decomposing trees in the source forest into elementary trees using the source projection of the STSG while building a target forest in parallel. The resulting system is comparable to the state-of-the-art phrase-based system Moses, and using packed forests in tree-to-tree translation yields a significant absolute improvement of 3.6 BLEU points over using 1-best trees.
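A packed forest is a hypergraph in which each node stores alternative incoming hyperedges (competing rule applications over the same span), so the decoder searches over many parses at once instead of committing to a 1-best tree. The core search can be sketched as a Viterbi pass over such a hypergraph; the node names, rules, and scores below are invented for illustration and are not the paper's grammar.

```python
import math
from functools import lru_cache

# Packed forest as a hypergraph: node -> list of (rule_log_prob, children).
# Nodes with several entries pack alternative analyses of the same span.
forest = {
    "NP[0,2]": [(math.log(0.6), []), (math.log(0.4), [])],  # two packed parses
    "VP[2,5]": [(math.log(0.7), [])],
    "S[0,5]":  [(math.log(0.9), ["NP[0,2]", "VP[2,5]"]),
                (math.log(0.2), ["NP[0,2]", "VP[2,5]"])],
}

@lru_cache(maxsize=None)
def viterbi(node):
    """Best (max log-prob) derivation score reachable from a forest node."""
    return max(score + sum(viterbi(c) for c in children)
               for score, children in forest[node])
```

Here `viterbi("S[0,5]")` picks the better packed alternative at every node, so the best derivation has probability 0.9 * 0.6 * 0.7 = 0.378; with a 1-best tree, the cheaper NP analysis would never even be considered.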

    Improving Statistical Machine Translation Performance by Training Data Selection and Optimization

    A parallel corpus is an indispensable resource for translation model training in statistical machine translation (SMT). Instead of collecting more and more parallel training corpora, this paper aims to improve SMT performance by exploiting the full potential of the existing parallel corpora. Two kinds of methods are proposed: offline data optimization and online model optimization. The offline method adapts the training data by redistributing the weight of each training sentence pair. The online method adapts the translation model by redistributing the weight of each predefined submodel. An information retrieval model is used for the weighting scheme in both methods. Experimental results show that without using any additional resource, both methods can improve SMT performance significantly.
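The offline weighting idea can be sketched with a standard IR setup: score each training sentence by TF-IDF cosine similarity to text representative of the target domain, then normalize the scores into redistributed weights. The data and the particular TF-IDF formulation below are illustrative assumptions, not the paper's exact retrieval model.

```python
import math
from collections import Counter

# Toy training sentences (source side) and a test-domain query text
train = ["the bank approved the loan",
         "the river bank flooded",
         "interest rates rose sharply"]
query = "bank loan interest rates"  # assumed test-domain sample

# Document frequencies and IDF over the training corpus
df = Counter(w for s in train for w in set(s.split()))
idf = {w: math.log(len(train) / df[w]) for w in df}

def vec(text):
    """TF-IDF vector of a text (unknown words get zero weight)."""
    tf = Counter(text.split())
    return {w: tf[w] * idf.get(w, 0.0) for w in tf}

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    norm = lambda x: math.sqrt(sum(t * t for t in x.values()))
    return dot / (norm(u) * norm(v))

# Redistribute sentence weights in proportion to domain similarity
sims = [cosine(vec(query), vec(s)) for s in train]
weights = [s / sum(sims) for s in sims]
```

On this toy data, the off-domain sentence about the river receives far less weight than the two finance sentences, which is the intended redistribution effect.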

    Learning Chinese Bracketing Knowledge Based on a Bilingual Language Model

    This paper proposes a new method for automatic acquisition of Chinese bracketing knowledge from English-Chinese sentence-aligned bilingual corpora. Bilingual sentence pairs are first aligned in syntactic structure by combining English parse trees with a statistical bilingual language model. Chinese bracketing knowledge is then extracted automatically. Preliminary experiments show that the automatically learned knowledge accords well with manually annotated brackets. The proposed method is particularly useful for acquiring bracketing knowledge for a less studied language that lacks the tools and resources available for a more studied second language. Although this paper discusses experiments with Chinese and English, the method is also applicable to other language pairs.
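The extraction step can be illustrated by the classic span-projection idea: map each English constituent span onto Chinese word positions through a word alignment and keep the covering spans as candidate brackets. (The paper's actual method scores structure alignments with a statistical bilingual language model; this sketch, with made-up alignment data, shows only the projection.)

```python
# Toy word alignment: English word index -> Chinese word index
align = {0: 2, 1: 3, 2: 0, 3: 1}

# English constituent spans (start, end), inclusive, read off the parse tree
eng_spans = [(0, 3), (0, 1), (2, 3)]

def project(span):
    """Map an English span to the Chinese span covering its aligned words."""
    start, end = span
    pts = [align[i] for i in range(start, end + 1) if i in align]
    return (min(pts), max(pts)) if pts else None

# Candidate Chinese brackets induced by the English constituents
chinese_brackets = [project(s) for s in eng_spans]
```

For example, the English span (0, 1) projects to Chinese words 2..3, yielding the candidate bracket (2, 3).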