110 research outputs found

    Developing Intelligent MultiMedia applications


    Idioms in example-based machine translation

    Machine Translation (MT) has progressed in parallel with idiom research throughout the years, since they are both interdisciplinary fields. However, most researchers and MT systems regard idioms as a thorn in MT's flesh. Idiom translation is a difficult task for human translators, let alone for MT systems. The construction of an idiom database is complex and time-consuming, since idiom corpora are not widely available and must either be constructed manually or consist of real examples that have been carefully filtered. We incorporated both cases into our data sets and proved that idiom processing based on syntactic patterns of the topological field model is thoroughly feasible.

    JTEC panel report on machine translation in Japan

    The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to provide a comparison between Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese and for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.

    The EAGLES/ISLE initiative for setting standards: the Computational Lexicon Working Group for Multilingual Lexicons

    ISLE (International Standards for Language Engineering), a transatlantic standards oriented initiative under the Human Language Technology (HLT) programme, is a continuation of the long standing EAGLES (Expert Advisory Group for Language Engineering Standards) initiative, carried out by European and American groups within the EU-US International Research Co-operation, supported by NSF and EC. The objective is to support HLT R&D international and national projects, and HLT industry, by developing and promoting widely agreed and urgently demanded HLT standards and guidelines for infrastructural language resources, tools, and HLT products. ISLE targets the areas of multilingual computational lexicons (MCL), natural interaction and multimodality (NIMM), and evaluation. For MCL, ISLE is working to: extend EAGLES work on lexical semantics, necessary to establish inter-language links; design standards for multilingual lexicons; develop a prototype tool to implement lexicon guidelines; create EAGLES-conformant sample lexicons and tag corpora for validation purposes; develop standardised evaluation procedures for lexicons. For NIMM, a rapidly innovating domain urgently requiring early standardisation, ISLE work is targeted to develop guidelines for: creation of NIMM data resources; interpretative annotation of NIMM data, including spoken dialogue; annotation of discourse phenomena. For evaluation, ISLE is working on: quality models for machine translation systems; maintenance of previous guidelines - in an ISO based framework. We concentrate in the paper on the Computational Lexicon Working Group, describing in detail the proposals of guidelines for the "Multilingual ISLE Lexical Entry" (MILE). We highlight some methodological principles applied in previous EAGLES, and followed in defining MILE. We also provide a description of the EU SIMPLE semantic lexicons built on the basis of previous EAGLES recommendations. 
Their importance lies in the fact that these lexicons are now being enlarged to real-size lexicons within National Projects in 8 EU countries, thus building a really large infrastructural platform of harmonised lexicons in Europe. We will also stress the relevance of standardised language resources for humanities applications. Numerous theories, approaches, and systems are taken into account in ISLE, as any recommendation for harmonisation must build on the major contemporary approaches. Results will be widely disseminated, after validation in collaboration with EU and US HLT R&D projects, and industry. EAGLES work towards de facto standards has already allowed the field of Language Resources to establish broad consensus on key issues for some well-established areas, and will allow similar consensus to be achieved for other important areas through the ISLE project, thus providing a key opportunity for further consolidation and a basis for technological advance. Previous EAGLES results in many areas have in fact already become widely adopted de facto standards, and EAGLES itself is a well-known trademark and a point of reference for HLT projects. Hosted by the Scholarly Text and Imaging Service (SETIS), the University of Sydney Library, and the Research Institute for Humanities and Social Sciences (RIHSS), the University of Sydney.

    Investigating 'Aspect' in NMT and SMT: translating the English simple past and present perfect

    One of the important differences between English and French grammar is related to how their verbal systems handle aspectual information. While the English simple past tense is aspectually neutral, the French and Spanish past tenses are linked with a particular imperfective/perfective aspect. This study examines what Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) learn about 'aspect' and how this is reflected in the translations they produce. We use their main knowledge sources, phrase-tables (SMT) and encoding vectors (NMT), to examine what kind of aspectual information they encode. Furthermore, we examine whether this encoded 'knowledge' is actually transferred during decoding and thus reflected in the actual translations. Our study is based on the translations of the English simple past and present perfect tenses into French and Spanish.

    Transfer and architecture : views from chart parsing

    The objective of this report is to describe the embedding of a transfer module within an alternative architectural approach for machine translation of spontaneous spoken language. The approach is cognitively oriented, i.e. it adapts some of the assumed properties of human language comprehension and production. The aspects to be modeled will include incrementality and robustness with respect to disturbances caused by the environment and performance phenomena of speech. Interaction between software modules is used to reduce ambiguity. The transfer stage of a translation system clearly has to obey these requirements to be an integral part of such a system. This paper outlines the kind of demands to be placed on the transfer module. Relations between the basic formalisms representing linguistic knowledge on the one hand and transfer on the other hand are demonstrated as well as the consequences for algorithms and data structures

    Probing the adult initial state of non-native Greek: a case study

    This is a case study on the initial state of Greek as a second language within the Universal Grammar framework. We administered three oral and four written tasks to an adult Italian-English bilingual with little exposure to Greek. The results showed above chance-level performance on subject-verb agreement and on articles across tasks, indicating the presence of the functional categories Inflection and Determiner. These results support Schwartz and Sprouse’s (1994) ‘Full Transfer/Full Access’ hypothesis and disprove theories which suggest that the mental grammar of the L2 initial state contains lexical categories only (Vainikka & Young-Scholten 1994). However, the findings revealed low scores in nominal agreement, suggesting that this is a problematic area in L2 Greek.

    From feature to paradigm: deep learning in machine translation

    In recent years, deep learning algorithms have revolutionized several areas, including speech, image, and natural language processing. The specific field of Machine Translation (MT) has not remained unaffected. Integration of deep learning in MT ranges from re-modeling existing features within standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks, and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This manuscript focuses on describing how these neural networks have been integrated to enhance different aspects and models of statistical MT, including language modeling, word alignment, translation, reordering, and rescoring. Then, we report the new neural MT approach together with a description of the foundational related works and recent approaches using subwords, characters, and multilingual training, among others. Finally, we include an analysis of the corresponding challenges and future work in using deep learning in MT. Postprint (author's final draft).