Estimating Performance of Pipelined Spoken Language Translation Systems
Most spoken language translation systems developed to date rely on a
pipelined architecture, in which the main stages are speech recognition,
linguistic analysis, transfer, generation and speech synthesis. When making
projections of error rates for systems of this kind, it is natural to assume
that the error rates for the individual components are independent, making the
system accuracy the product of the component accuracies.
The paper reports experiments carried out using the SRI-SICS-Telia Research
Spoken Language Translator and a 1000-utterance sample of unseen data. The
results suggest that the naive performance model leads to serious overestimates
of system error rates, since there are in fact strong dependencies between the
components. Predicting the system error rate on the independence assumption by
simple multiplication resulted in a 16\% proportional overestimate for all
utterances, and a 19\% overestimate when only utterances of length 1-10 words
were considered. Comment: 10 pages, LaTeX source. To appear in Proc. ICSLP '9
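The naive independence model described in this abstract can be sketched in a few lines. This is only an illustration: the stage accuracies are invented placeholders rather than figures from the paper, the helper names are hypothetical, and the definition of "proportional overestimate" as (predicted - observed) / observed is an assumption about how the reported percentages were computed.

```python
from math import prod

def naive_system_accuracy(component_accuracies):
    """Product of per-stage accuracies; valid only if stage errors are independent."""
    return prod(component_accuracies)

def proportional_overestimate(predicted_error, observed_error):
    """Relative amount by which the predicted error rate exceeds the observed one
    (assumed definition, not taken from the paper)."""
    return (predicted_error - observed_error) / observed_error

# Invented accuracies for the five pipeline stages: recognition,
# analysis, transfer, generation, synthesis.
stages = [0.90, 0.95, 0.95, 0.97, 0.99]
predicted_error = 1.0 - naive_system_accuracy(stages)
```

If the stages tend to fail on the same utterances, the observed end-to-end error rate will be lower than `predicted_error`, which is the effect the abstract reports.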
Capturing translational divergences with a statistical tree-to-tree aligner
Parallel treebanks, which comprise paired source-target parse trees aligned at sub-sentential level, could be useful
for many applications, particularly data-driven machine translation. In this paper, we focus on how translational
divergences are captured within a parallel treebank using a fully automatic statistical tree-to-tree aligner. We
observe that while the algorithm performs well at the phrase level, performance on lexical-level alignments
is compromised by an inappropriate bias towards coverage rather than precision. The need for high precision
rather than broad coverage when expressing translational divergences through tree-alignment stands in
direct opposition to the situation for SMT word-alignment models. We suggest that this has implications not only
for tree-alignment itself but also for the broader area of induction of syntax-aware models for SMT.
Acquiring and Applying Knowledge in Transnational Teams: The Roles of Cosmopolitans and Locals
This paper examines the roles of cosmopolitans and locals in transnational teams that work on knowledge-intensive projects. I propose that cosmopolitan and local team members can help their teams to acquire and apply knowledge more effectively, by bringing both internal and external knowledge to their teams and enabling them to more successfully transform this knowledge into improved project performance. Findings from a study of 96 project teams at an international development agency reveal that the roles of cosmopolitans and locals were complex and sometimes valuable, but cosmopolitans offered greater benefits than locals and too many of each could hurt. Implications for theory and research on international management, virtual teams, exploration and exploitation, and organizational knowledge are discussed.
Successful features: Verb raising and adverbs in L2 acquisition under an Organic Grammar approach
Large-scale Hierarchical Alignment for Data-driven Text Rewriting
We propose a simple unsupervised method for extracting pseudo-parallel
monolingual sentence pairs from comparable corpora representative of two
different text styles, such as news articles and scientific papers. Our
approach does not require a seed parallel corpus, but instead relies solely on
hierarchical search over pre-trained embeddings of documents and sentences. We
demonstrate the effectiveness of our method through automatic and extrinsic
evaluation on text simplification from the normal to the Simple Wikipedia. We
show that pseudo-parallel sentences extracted with our method not only
supplement existing parallel data, but can even lead to competitive performance
on their own. Comment: RANLP 201
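The idea of a hierarchical search over document and sentence embeddings can be sketched roughly as follows. This is not the authors' implementation: the function names, the cosine-similarity criterion, the single-best-document choice, and the `threshold` value are all assumptions made for illustration, and the embeddings are expected to be supplied by some pre-trained model.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_match(query, candidates):
    """Return (index, similarity) of the candidate vector closest to query."""
    scores = [cosine(query, c) for c in candidates]
    idx = max(range(len(scores)), key=scores.__getitem__)
    return idx, scores[idx]

def hierarchical_pairs(src_docs, tgt_docs, threshold=0.9):
    """src_docs and tgt_docs are lists of (doc_vector, [sentence_vectors]).
    Returns (src_doc, src_sent, tgt_doc, tgt_sent) index tuples.
    Assumes tgt_docs is non-empty."""
    pairs = []
    tgt_doc_vecs = [doc_vec for doc_vec, _ in tgt_docs]
    for i, (doc_vec, sents) in enumerate(src_docs):
        # Level 1: restrict the search to the most similar target document.
        j, _ = best_match(doc_vec, tgt_doc_vecs)
        tgt_sents = tgt_docs[j][1]
        if not tgt_sents:
            continue
        # Level 2: pair each source sentence with its nearest target sentence.
        for s, sent_vec in enumerate(sents):
            t, sim = best_match(sent_vec, tgt_sents)
            if sim >= threshold:
                pairs.append((i, s, j, t))
    return pairs
```

Searching documents first keeps the sentence-level comparison small, since each source sentence is compared only with sentences from one candidate document instead of the whole target corpus.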