10 research outputs found

    Proceedings of the 17th Annual Conference of the European Association for Machine Translation


    An empirical examination of interdisciplinary collaboration within the practice of localisation and development of international software

    Acceptance on international markets is an important selling proposition for software products and a key to new markets. The adaptation of software products for specific markets is called software localisation. Practitioner reports and research suggest that the activities of developers and translators do not mesh seamlessly, leading to problems such as disproportionate cost, lack of quality, and delayed product release. Yet there is little research on localisation as a comprehensive activity and on its human factors. This thesis examines how software localisation is handled in practice, how the localisation process is integrated into development, and how software developers and localisers work individually and collaboratively on international software. The research aims to understand what causes localisation issues in the above-mentioned categories of cost, quality and time. Qualitative and quantitative data were gathered through semi-structured interviews and an online survey. The interviews focused on the individual experiences of localisation and development professionals in a range of relevant roles. The online survey measured cultural competence, attitude towards and self-efficacy in localisation, and properties of localisation projects. Interviews were conducted and analysed following Straussian Grounded Theory. The survey was statistically analysed to test a number of hypotheses regarding differences between localisers and developers, as well as relationships between project properties and software quality. Results suggest gaps in knowledge, procedure and motivation between developers and translators, as well as a lack of cross-disciplinary knowledge and coordination. Further, a grounded theory of interdisciplinary collaboration in software localisation explains how collaboration strategies and conflicts reciprocally affect each other and are affected by external influences. A number of statistically significant differences between developers and localisers were confirmed, as was the relevance of certain project properties to localisation. The findings give new insights into interdisciplinary issues in the development of international software and suggest new ways to handle interdisciplinary collaboration in general.

    I. Magyar Számítógépes Nyelvészeti Konferencia


    Tune your brown clustering, please

    Brown clustering, an unsupervised hierarchical clustering technique based on n-gram mutual information, has proven useful in many NLP applications. However, most uses of Brown clustering employ the same default configuration, and the appropriateness of this configuration has gone largely unexplored. Accordingly, we present information for practitioners on the behaviour of Brown clustering to assist hyperparameter tuning, in the form of a theoretical model of Brown clustering utility. This model is then evaluated empirically in two sequence labelling tasks over two text types. We explore the dynamic between the input corpus size, the chosen number of classes, and the quality of the resulting clusters, which has implications for any approach using Brown clustering. In every scenario that we examine, our results reveal that the values most commonly used for the clustering are sub-optimal.

    Statistical machine translation system and computational domain adaptation

    Phrase-based statistical machine translation is one of the possible approaches to automatic machine translation. This work proposes methods for increasing the quality of machine translation by adapting certain parameters in the statistical machine translation model. The idea was to build phrase-based statistical machine translation systems for the Croatian and English languages. The systems were trained in two directions, on two domains, on parallel corpora of different sizes and characteristics for the Croatian-English and English-Croatian language pairs, after which the tuning procedure was conducted. Afterwards, hybrid systems which combine features of both domains were investigated. The direct impact of domain adaptation on the quality of automatic machine translation of Croatian was thereby explored, and the new findings can be used in building new systems. Automatic and human evaluation of the machine translations was carried out, and the results were compared with those obtained from existing statistical machine translation web services.

    An automatic evaluation method for localization oriented lexicalised EBMT system
