
    Estimating Performance of Pipelined Spoken Language Translation Systems

    Most spoken language translation systems developed to date rely on a pipelined architecture, in which the main stages are speech recognition, linguistic analysis, transfer, generation and speech synthesis. When making projections of error rates for systems of this kind, it is natural to assume that the error rates for the individual components are independent, making the system accuracy the product of the component accuracies. The paper reports experiments carried out using the SRI-SICS-Telia Research Spoken Language Translator and a 1000-utterance sample of unseen data. The results suggest that the naive performance model leads to serious overestimates of system error rates, since there are in fact strong dependencies between the components. Predicting the system error rate on the independence assumption by simple multiplication resulted in a 16% proportional overestimate for all utterances, and a 19% overestimate when only utterances of length 1-10 words were considered. Comment: 10 pages, LaTeX source. To appear in Proc. ICSLP '9
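The naive performance model described in this abstract can be sketched in a few lines. This is a minimal illustration only: the function names, the component accuracies, and the observed error rate below are hypothetical and are not taken from the paper, and the definition of "proportional overestimate" as (predicted - observed) / observed is an assumption about how the abstract uses the term.

```python
from math import prod


def predicted_system_error(component_accuracies):
    """Error rate predicted by the independence assumption:
    system accuracy is taken to be the product of component accuracies."""
    return 1.0 - prod(component_accuracies)


def proportional_overestimate(predicted_error, observed_error):
    """Relative amount by which the prediction exceeds the observed error
    (assumed definition, see the lead-in above)."""
    return (predicted_error - observed_error) / observed_error


# Hypothetical accuracies for three pipeline stages (not the paper's data).
accuracies = [0.9, 0.95, 0.9]
predicted = predicted_system_error(accuracies)

# Hypothetical measured end-to-end error rate (not the paper's data).
observed = 0.2
overestimate = proportional_overestimate(predicted, observed)
```

If component errors tend to occur on the same utterances, the measured end-to-end error rate falls below the product-based prediction, which is the kind of gap the abstract reports.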

    Capturing translational divergences with a statistical tree-to-tree aligner

    Parallel treebanks, which comprise paired source-target parse trees aligned at sub-sentential level, could be useful for many applications, particularly data-driven machine translation. In this paper, we focus on how translational divergences are captured within a parallel treebank using a fully automatic statistical tree-to-tree aligner. We observe that while the algorithm performs well at the phrase level, performance on lexical-level alignments is compromised by an inappropriate bias towards coverage rather than precision. This preference for high precision rather than broad coverage in terms of expressing translational divergences through tree-alignment stands in direct opposition to the situation for SMT word-alignment models. We suggest that this has implications not only for tree-alignment itself but also for the broader area of induction of syntax-aware models for SMT.

    Acquiring and Applying Knowledge in Transnational Teams: The Roles of Cosmopolitans and Locals

    This paper examines the roles of cosmopolitans and locals in transnational teams that work on knowledge-intensive projects. I propose that cosmopolitan and local team members can help their teams to acquire and apply knowledge more effectively, by bringing both internal and external knowledge to their teams and enabling them to more successfully transform this knowledge into improved project performance. Findings from a study of 96 project teams at an international development agency reveal that the roles of cosmopolitans and locals were complex and sometimes valuable, but cosmopolitans offered greater benefits than locals and too many of each could hurt. Implications for theory and research on international management, virtual teams, exploration and exploitation, and organizational knowledge are discussed.

    Large-scale Hierarchical Alignment for Data-driven Text Rewriting

    We propose a simple unsupervised method for extracting pseudo-parallel monolingual sentence pairs from comparable corpora representative of two different text styles, such as news articles and scientific papers. Our approach does not require a seed parallel corpus, but instead relies solely on hierarchical search over pre-trained embeddings of documents and sentences. We demonstrate the effectiveness of our method through automatic and extrinsic evaluation on text simplification from the normal to the Simple Wikipedia. We show that pseudo-parallel sentences extracted with our method not only supplement existing parallel data, but can even lead to competitive performance on their own. Comment: RANLP 201