3,580 research outputs found

    An open source rule induction tool for transfer-based SMT

    Get PDF
    In this paper we describe an open source tool for automatic induction of transfer rules. Transfer rule induction is carried out on pairs of dependency structures and their node alignment to produce all rules consistent with the node alignment. We describe an efficient algorithm for rule induction and give a detailed description of how to use the tool

    A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena

    Get PDF
    Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor of its quality and efficiency. Despite the vast amount of research published to date, the interest of the community in this problem has not decreased, and no single method appears to be strongly dominant across language pairs. Instead, the choice of the optimal approach for a new translation task still seems to be mostly driven by empirical trials. To orientate the reader in this vast and complex research area, we present a comprehensive survey of word reordering viewed as a statistical modeling challenge and as a natural language phenomenon. The survey describes in detail how word reordering is modeled within different string-based and tree-based SMT frameworks and as a stand-alone task, including systematic overviews of the literature in advanced reordering modeling. We then question why some approaches are more successful than others in different language pairs. We argue that, besides measuring the amount of reordering, it is important to understand which kinds of reordering occur in a given language pair. To this end, we conduct a qualitative analysis of word reordering phenomena in a diverse sample of language pairs, based on a large collection of linguistic knowledge. Empirical results in the SMT literature are shown to support the hypothesis that a few linguistic facts can be very useful to anticipate the reordering characteristics of a language pair and to select the SMT framework that best suits them.Comment: 44 pages, to appear in Computational Linguistic

    Getting Past the Language Gap: Innovations in Machine Translation

    Get PDF
    In this chapter, we will be reviewing state of the art machine translation systems, and will discuss innovative methods for machine translation, highlighting the most promising techniques and applications. Machine translation (MT) has benefited from a revitalization in the last 10 years or so, after a period of relatively slow activity. In 2005 the field received a jumpstart when a powerful complete experimental package for building MT systems from scratch became freely available as a result of the unified efforts of the MOSES international consortium. Around the same time, hierarchical methods had been introduced by Chinese researchers, which allowed the introduction and use of syntactic information in translation modeling. Furthermore, the advances in the related field of computational linguistics, making off-the-shelf taggers and parsers readily available, helped give MT an additional boost. Yet there is still more progress to be made. For example, MT will be enhanced greatly when both syntax and semantics are on board: this still presents a major challenge though many advanced research groups are currently pursuing ways to meet this challenge head-on. The next generation of MT will consist of a collection of hybrid systems. It also augurs well for the mobile environment, as we look forward to more advanced and improved technologies that enable the working of Speech-To-Speech machine translation on hand-held devices, i.e. speech recognition and speech synthesis. We review all of these developments and point out in the final section some of the most promising research avenues for the future of MT

    Spoken dialog systems based on online generated stochastic finite-state transducers

    Full text link
    This is the author’s version of a work that was accepted for publication in Speech Communication. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Speech Communication 83 (2016) 81–93. DOI 10.1016/j.specom.2016.07.011.In this paper, we present an approach for the development of spoken dialog systems based on the statistical modelization of the dialog manager. This work focuses on three points: the modelization of the dialog manager using Stochastic Finite-State Transducers, an unsupervised way to generate training corpora, and a mechanism to address the problem of coverage that is based on the online generation of synthetic dialogs. Our proposal has been developed and applied to a sport facilities booking task at the university. We present experimentation evaluating the system behavior on a set of dialogs that was acquired using the Wizard of Oz technique as well as experimentation with real users. The experimentation shows that the method proposed to increase the coverage of the Dialog System was useful to find new valid paths in the model to achieve the user goals, providing good results with real users. © 2016 Elsevier B.V. All rights reserved.This work is partially supported by the project ASLP-MULAN: Audio, Speech and Language Processing for Multimedia Analytics (MINECO TIN2014-54288-C4-3-R).Hurtado Oliver, LF.; Planells Lerma, J.; Segarra Soriano, E.; Sanchís Arnal, E. (2016). Spoken dialog systems based on online generated stochastic finite-state transducers. Speech Communication. 83:81-93. https://doi.org/10.1016/j.specom.2016.07.011S81938

    Cross-language Ontology Learning: Incorporating and Exploiting Cross-language Data in the Ontology Learning Process

    Get PDF
    Hans Hjelm. Cross-language Ontology Learning: Incorporating and Exploiting Cross-language Data in the Ontology Learning Process. NEALT Monograph Series, Vol. 1 (2009), 159 pages. © 2009 Hans Hjelm. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/10126

    Advances in computational modelling for personalised medicine after myocardial infarction

    Get PDF
    Myocardial infarction (MI) is a leading cause of premature morbidity and mortality worldwide. Determining which patients will experience heart failure and sudden cardiac death after an acute MI is notoriously difficult for clinicians. The extent of heart damage after an acute MI is informed by cardiac imaging, typically using echocardiography or sometimes, cardiac magnetic resonance (CMR). These scans provide complex data sets that are only partially exploited by clinicians in daily practice, implying potential for improved risk assessment. Computational modelling of left ventricular (LV) function can bridge the gap towards personalised medicine using cardiac imaging in patients with post-MI. Several novel biomechanical parameters have theoretical prognostic value and may be useful to reflect the biomechanical effects of novel preventive therapy for adverse remodelling post-MI. These parameters include myocardial contractility (regional and global), stiffness and stress. Further, the parameters can be delineated spatially to correspond with infarct pathology and the remote zone. While these parameters hold promise, there are challenges for translating MI modelling into clinical practice, including model uncertainty, validation and verification, as well as time-efficient processing. More research is needed to (1) simplify imaging with CMR in patients with post-MI, while preserving diagnostic accuracy and patient tolerance (2) to assess and validate novel biomechanical parameters against established prognostic biomarkers, such as LV ejection fraction and infarct size. Accessible software packages with minimal user interaction are also needed. Translating benefits to patients will be achieved through a multidisciplinary approach including clinicians, mathematicians, statisticians and industry partners
    corecore