11 research outputs found

    Improving Polish to English Neural Machine Translation with Transfer Learning: Effects of Data Volume and Language Similarity

    This paper investigates the impact of data volume and the use of similar languages on transfer learning in a machine translation task. We find that more data generally leads to better performance, as it allows the model to learn more patterns and generalizations. However, related languages can be particularly effective when limited data is available for a specific language pair, as the model can leverage the similarities between the languages to improve performance. To demonstrate this, we fine-tune the mBART model for a Polish-English translation task using the OPUS-100 dataset. We evaluate the model under various transfer learning configurations, including different transfer source languages and different shot levels for Polish, and report the results. Our experiments show that combining related languages with larger amounts of data outperforms a model trained on related languages or on larger amounts of data alone. Additionally, we show the importance of related languages in zero-shot and few-shot configurations.

    EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

    The ability to perform zero-shot translation emerges when we train a multilingual model on certain translation directions; the model can then translate directly in unseen directions. Alternatively, zero-shot translation can be accomplished by pivoting through a third language (e.g., English). In our work, we observe that both direct and pivot translations are noisy and achieve less satisfactory performance. We propose EBBS, an ensemble method with a novel bi-level beam search algorithm, in which each ensemble component explores its own prediction step by step at the lower level, while the components are synchronized by a "soft voting" mechanism at the upper level. Results on two popular multilingual translation datasets show that EBBS consistently outperforms direct and pivot translations as well as existing ensemble techniques. Further, we can distill the ensemble's knowledge back into the multilingual model to improve inference efficiency; notably, our EBBS-based distillation does not sacrifice translation quality and can even improve it.

    La traduction multilingue : analyse d'une prouesse technologique [Multilingual translation: analysis of a technological feat]

    Neural machine translation (NMT) systems have made tangible progress in recent years, making them usable for an increasing number of domains and language pairs. The development of neural systems is based on machine learning algorithms and requires large electronic corpora of parallel texts, aligned at the sentence level. Such resources, however, exist only for a small number of language pairs and domains. To overcome this problem, a recent proposal is to develop so-called “multilingual” translation systems. These developments have been driven in particular by major Internet players, who need to develop automatic language processing tools for as many languages as possible. The main characteristic of these systems is that they process multiple languages, on both the source and target sides, with a single translation engine. In this paper, we present the general principles underlying these systems and the innovations that have made them possible, before discussing their main strengths and weaknesses.

    One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era

    OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is demonstrated to be one small step for generative AI (GAI), but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. Such unprecedented attention has also motivated numerous researchers to investigate ChatGPT from various aspects. According to Google Scholar, there are more than 500 articles with ChatGPT in their titles or mentioning it in their abstracts. Considering this, a review is urgently needed, and our work fills this gap. Overall, this work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges. Moreover, we present an outlook on how ChatGPT might evolve to realize general-purpose AIGC (a.k.a. AI-generated content), which will be a significant milestone for the development of AGI. Comment: A Survey on ChatGPT and GPT-4, 29 pages. Feedback is appreciated.

    Застосування нейронних мереж в задачах розпізнавання об’єктів [Application of neural networks in object recognition tasks]

    Relevance of the research. Artificial intelligence is one of the most important directions of development in information technology today. Artificial neural networks are a branch of the broader field of artificial intelligence. Neural networks can be used for a wide range of tasks, including image processing. Image processing technology for pattern recognition can recognize the position of the human body with sufficiently high accuracy, which makes it possible to track a person's movements and count the amount of their activity. The aim of the research is to create a web application that uses a neural network to determine quantitative and qualitative parameters of human activity. Object of the research: a web application for assessing the quality and quantity of physical exercise performed, based on artificial neural network technologies. Subject of the research: a neural network for recognizing the position of the human body. Research methods: development of a web application based on a neural network and analysis of the components for implementing the system. Scientific novelty of the results: a system is proposed that uses neural-network object recognition technology to monitor the quality and quantity of exercise performance. Practical significance of the results: as a result of the diploma thesis, a web application for assessing the performance of physical exercise was developed. The technologies for implementing the resulting solution were analyzed. The results can be used to implement a project for self-monitoring of the amount and correctness of a person's physical activity. Approbation of the dissertation results: Savka M. S., Filipova N. Yu. Rationale for choosing graphics cards over central processors for training neural networks: proceedings of the III All-Ukrainian Scientific and Technical Conference «Technologies of Cinema and Audiovisual Systems» (9-10 December 2019). Kyiv, 2019, pp. 23-25.
    This paper describes the development of a web application that uses a neural network to determine the quantitative and qualitative parameters of human activity. The first part is an analytical review of the prerequisites for the use of neural networks. The first section analyzes the historical preconditions for the creation of artificial intelligence systems. The principles of construction of neural network systems are considered. A common feature of many types of supervised and unsupervised deep learning models is that they have many layers of hidden neurons that are trained with backpropagation of error gradients and stochastic gradient descent. A fundamental advantage of neural networks over standard approaches to computer algorithms is the ability of the network to classify data very accurately by adjusting the strength of the connections between its neurons, i.e., by changing the values of the weights. The second section discusses the tools for developing and training neural networks. It is advisable to use the Python programming language to develop a solution for object recognition. The TensorFlow library for Python is considered. TensorFlow provides a set of working tools for developing and training models using Python. The expediency of using JavaScript together with Flask to build the interface of the project website is considered. The use of the BlazePose pose detection model is considered. BlazePose outputs 33 key points in a fixed order. The use of graphics processors for training neural networks is justified. The third section develops a web application for estimating the quality and quantity of physical activity based on the use of artificial neural networks. The capabilities of the Flask framework were used to implement the server part of the web application. The BlazePose model was used to recognize the position of the human body. The user interface of the web application shows the number of repetitions performed, as well as the status of the position of the hands during the exercise. A debug mode was added that displays the recognized parts of the human body for better visibility and for tracking recognition errors.

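    Tradurre la sicurezza alimentare come elemento competitivo per l’internazionalizzazione. Traduzione del Manuale HACCP di Olitalia S.r.l. [Translating food safety as a competitive element for internationalisation. Translation of the HACCP Manual of Olitalia S.r.l.]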

    This dissertation was written as a result of the project “Language Toolkit”, promoted by the Department of Interpretation and Translation of Forlì in collaboration with the Romagna Chamber of Commerce. The aim of the project is to act as a bridge between local companies and Specialized Translation students, fostering business internationalisation and creating new potential job opportunities. This dissertation describes the 300-hour internship carried out by the author at the Forlì-based company Olitalia S.r.l., which processes and markets a wide range of vegetable oils. The task assigned to the trainee was the translation into British English of the HACCP Plan of Olitalia. The thesis consists of five chapters that describe and analyse in depth the work carried out during the internship at Olitalia. The first chapter focuses on the description of the company, the food safety discipline and the HACCP method. The second chapter outlines the level of internationalisation and digitalisation of small and medium-sized enterprises located in Emilia-Romagna and the role of translation in multilingual business communication. The third chapter provides a theoretical background on the main translation technologies that can support the internationalisation process. The fourth chapter shows the preparatory work performed before starting the translation, with a particular focus on Nord’s text analysis and all the resources created and used to translate the HACCP Plan. Finally, the fifth chapter describes all the macro-strategies and micro-strategies employed to carry out the translation.