1,640 research outputs found

Translation error annotation: building an annotation module for East Asian languages

Author: Silva Beatriz Barrote
Publication date: 12/09/2022

This thesis proposes an annotation module to be applied in the context of Machine Translation (MT) concerning the East Asian languages Japanese, Korean and Mandarin, for the purpose of assessing MT output quality through annotation. The annotation module was created based on a data-driven analysis of Customer Support content in these languages previously annotated with the Unbabel Error Typology, which is a general typology in the sense that it was not conceived for any specific group of languages. As such, this work also explores how applying translation error typologies that are inadequate for certain languages or content types can affect how well annotation reflects the quality of a translation. To test the effectiveness of the proposed annotation module, an annotation experiment was conducted for the languages under analysis. For each language, the same content was annotated using three different error typologies: the Unbabel Error Typology, the MQM-compliant error taxonomy for the English-to-Chinese translation direction proposed by Ye and Toral (2020), and the annotation module proposed in this thesis. Furthermore, each dataset was annotated by two annotators. This allowed a comparison of Inter-annotator Agreement (IAA) scores, which are an important metric for evaluating the effectiveness of an error typology. In light of this, each of the tested typologies was analyzed based on the obtained IAA scores and on a further in-depth analysis of concrete annotations, which led to an understanding of their strengths and limitations.
With this work it was possible to demonstrate that, while using error typologies inadequate for the annotated content has a negative impact on the quality of the annotations, applying an error typology specific to the content being annotated can result in more consistent annotations.

The main objective of the work developed in this thesis was to create an annotation module for translation errors in the context of Machine Translation (MT) that would be applicable to Japanese, Korean and Mandarin and compatible with the Multidimensional Quality Metrics (MQM) framework (Lommel et al., 2014). The module was created on the basis of an analysis of real data from translations previously annotated at the company Unbabel with a general typology designed for annotating multiple language pairs, with no focus on specific language groups. Besides making it possible to verify the consequences of annotating errors with a typology poorly suited to the language or to the translated content, this analysis was an important starting point for creating the annotation module proposed in this thesis. Section 2 of the thesis presents Unbabel as an institution and the quality processes in place at the company. Section 3 presents the state of the art in MT and quality processes, with special attention to the languages under analysis, as well as the translation error annotation typologies used for comparing results. The analysis of the available data, described in Section 4, was carried out in two main phases. In the first phase, a set of 342 segments for the English-Chinese (Simplified) language pair was analysed; these had previously been annotated with the Unbabel Error Typology, the translation error annotation typology used for all language pairs until June 2022.
This analysis showed that a significant percentage of the errors made during the annotation process could be attributed not only to the annotation guidelines' lack of clarity regarding features specific to this language pair, but also to the absence of some error types from the typology. The second phase of data analysis confirmed and substantiated the existence of these problems. In this phase a broader data sample was analysed, covering four language pairs: English-Japanese, English-Korean, English-Chinese (Simplified) and English-Chinese (Traditional). For each language pair, a total of roughly 570 to 1900 segments was analysed and, with the exception of English-Korean, all data corresponded to the annotations of more than one annotator. This analysis led to the conclusion that the annotators for all of the language pairs mentioned made several errors, especially in choosing the right category for each translation error, but also in selecting errors and assigning the right severity to each. From the data analysed it was possible to determine which error types would need to be included in a translation error annotation typology adapted to the languages mentioned, and which instructions should be clarified in the annotation guidelines. Thus, once the second phase of data analysis was complete, work could begin on the annotation module proposed in this thesis, named the East Asian Languages Annotation Module for the Unbabel Quality Framework. The module was modelled on the Unbabel Error Typology and adapted to the characteristics of the new version that came into effect at the company in June 2022.
However, because the annotation module is adapted to the Asian languages mentioned above, several error categories in the Unbabel Error Typology were removed, since they correspond to linguistic components that are not part of the languages in question. Likewise, a total of five new error types were added to the module, based on what was judged necessary during the data analysis phase. The final version of the East Asian Languages Annotation Module for the Unbabel Quality Framework has a total of 39 error types, in contrast to the 47 that make up the Unbabel Error Typology. To complement the annotation module, specific guidelines were also written which, in addition to defining each error type with examples, include a section dedicated to difficult cases (Tricky Cases) and diagrams (Decision Trees) to help choose the appropriate severity and error type for each case. After the annotation module was created, it was necessary to test whether it could be applied successfully. To this end, a comparison study between the East Asian Languages Annotation Module for the Unbabel Quality Framework and two other typologies was carried out, as described in Section 5. Three annotation phases were conducted, about one month apart. For each typology, two annotators per language pair annotated between 1100 and 4900 words each and, to obtain an accurate comparison, the content annotated with each typology was kept the same within each language pair. The first annotation phase used the Unbabel Error Typology. Because the annotators were already familiar with this typology and already had its annotation guidelines, no additional support was needed in this phase.
The second annotation round was carried out with the translation error annotation typology for the English-Mandarin language pair proposed by Ye and Toral (2020). For this phase, specific guidelines for the typology were created on the basis of the work of Ye and Toral (2020), in order to facilitate the annotation process. It is important to note that, although this typology was created for annotating translation errors in the English-Mandarin language pair, during the typology testing phase it was used to annotate all four language pairs under analysis. In addition, because it was a new typology, communication with the annotators was maintained during this phase to answer their questions. It should be stressed that this typology was also important in creating the East Asian Languages Annotation Module, because it contains error types specific to the language pair for which it was created, which served as the basis for new error types proposed in the annotation module. The third and final annotation phase used the East Asian Languages Annotation Module for the Unbabel Quality Framework proposed in this thesis. In this phase the annotators were given the guidelines created to complement the module and, as in the second phase, were given the opportunity to raise their questions. The results of the three annotation phases described above were analysed from the perspective of the level of agreement between annotators, measured using the Inter-annotator Agreement (IAA) methodology, in contrast with the equivalent values of the manual quality metric MQM (Lommel et al., 2014), as well as through a detailed analysis of both annotators' annotations for all language pairs.
In the context of testing translation error annotation typologies, an analysis of the IAA values obtained is important, since a high level of agreement between annotators reflects the clarity of a typology. In addition, a detailed analysis of the annotations in conjunction with the IAA values makes it possible to assess which factors influence the fluctuation of those values. The feedback that the annotators provided on each typology was also considered in contrast with the results obtained. Thus, by combining all these data it was possible to determine the strengths and weaknesses of each typology, and to understand what direction future work on refining the East Asian Languages Annotation Module for the Unbabel Quality Framework should take. With this work it was possible to demonstrate the negative impact of using an error typology poorly suited to the content being annotated, and to prove that, conversely, a typology created for annotating a specific group of languages can improve the consistency of annotations relating to linguistic components particular to the languages the typology targets.
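
The abstract does not say which IAA coefficient was used. As an illustration only, the sketch below assumes Cohen's kappa, a common agreement measure for two annotators assigning one category per item. The function name and the sample labels are hypothetical and are not taken from the thesis.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical error-category labels for six annotated errors (assumption).
annotator_1 = ["accuracy", "fluency", "accuracy", "style", "fluency", "accuracy"]
annotator_2 = ["accuracy", "fluency", "style", "style", "fluency", "fluency"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Here the annotators agree on 4 of 6 items while chance agreement is 11/36, so kappa is 13/25 = 0.52. Raw percentage agreement alone would overstate consistency when a few categories dominate.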

Universidade de Lisboa: Repositório.UL

Scope Management of Non-Functional Requirements

Author: Daneva M.
Kassab M.
Ormandjieva O.
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2007

In order to meet commitments in software projects, a realistic assessment must be made of project scope. Such an assessment relies on the availability of knowledge on the user-defined project requirements and their effort estimates and priorities, as well as their risk. This knowledge enables analysts, managers and software engineers to identify the most significant requirements from the list of requirements initially defined by the user. In practice, this scope assessment is applied to the Functional Requirements (FRs) provided by users who are unaware of, or ignore, the Non-Functional Requirements (NFRs). This paper presents ongoing research which aims at managing NFRs during the software development process. Establishing the relative priority of each NFR, and obtaining a rough estimate of the effort and risk associated with it, is integral to the software development process and to resource management. Our work extends the taxonomy of the NFR framework by integrating the concept of the "hardgoal". A functional size measure of NFRs is applied to facilitate the effort estimation process. The functional size measurement method we have chosen is COSMIC-FFP, which is theoretically sound and the de facto standard in the software industry

CiteSeerX

Crossref

University of Twente Research Information

The Czech broadcast conversation corpus

Author: Kolář Jáchym
Švec Jan
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2009

This paper presents the final version of the Czech Broadcast Conversation Corpus that will shortly be released at the Linguistic Data Consortium (LDC). The corpus contains 72 recordings of a radio discussion program, which yields about 33 hours of transcribed conversational speech from 128 speakers. The release includes not only verbatim transcripts and speaker information but also structural metadata (MDE) annotation, which involves labeling sentence-like unit boundaries, marking non-content words such as filled pauses and discourse markers, and annotating speech disfluencies. The MDE annotation is based on the LDC's annotation standard for English, with changes applied to accommodate phenomena that are specific to Czech. In addition to its importance to speech recognition, speaker diarization, and structural metadata extraction research, the corpus is also useful for linguistic analysis of conversational Czech

DSpace at University of West Bohemia

Key stage 2, level 6, 2014 mathematics tests: mathematics mark schemes: paper 1 and paper 2

Publication venue: Standards and Testing Agency
Publication date: 01/01/2014

Digital Education Resource Archive

"Beware of deception": Detecting Half-Truth and Debunking it through Controlled Claim Editing

Author: Bhatnagar Varad
Bhattacharyya Pushpak
Madaan Nishtha
Mehta Sameep
Singamsetty Sandeep
Publication date: 15/08/2023

The prevalence of half-truths, which are statements containing some truth but that are ultimately deceptive, has risen with the increasing use of the internet. To help combat this problem, we have created a comprehensive pipeline consisting of a half-truth detection model and a claim editing model. Our approach utilizes the T5 model for controlled claim editing; "controlled" here means precise adjustments to select parts of a claim. Our methodology achieves an average BLEU score of 0.88 (on a scale of 0-1) and a disinfo-debunk score of 85% on edited claims. Significantly, our T5-based approach outperforms other Language Models such as GPT2, RoBERTa, PEGASUS, and Tailor, with average improvements of 82%, 57%, 42%, and 23% in disinfo-debunk scores, respectively. By extending the LIAR PLUS dataset, we achieve an F1 score of 82% for the half-truth detection model, setting a new benchmark in the field. While previous attempts have been made at half-truth detection, our approach is, to the best of our knowledge, the first to attempt to debunk half-truths

arXiv.org e-Print Archive

Exploiting Transformer-based Multitask Learning for the Detection of Media Bias in News Articles

Author: Aizawa Akiko
Gipp Bela
Götz-Hahn Franz
Krieger Jan-David
Mitrović Jelena
Ruas Terry
Spinde Timo
Publication venue: Springer Science and Business Media LLC
Publication date: 07/11/2022

Media has a substantial impact on the public perception of events. A one-sided or polarizing perspective on any topic is usually described as media bias. One way bias can be introduced into news articles is by altering word choice. Biased word choices are not always obvious, nor do they exhibit high context-dependency. Hence, detecting bias is often difficult. We propose a Transformer-based deep learning architecture trained via Multi-Task Learning using six bias-related data sets to tackle the media bias detection problem. Our best-performing implementation achieves a macro $F_{1}$ of 0.776, a performance boost of 3% compared to our baseline, outperforming existing methods. Our results indicate Multi-Task Learning as a promising alternative to improve existing baseline models in identifying slanted reporting
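
For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so every class counts equally regardless of its frequency. The sketch below is a plain-Python illustration; the function name, the labels "biased" and "neutral", and the sample predictions are assumptions, not the authors' evaluation code or data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 if the denominator is 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and predictions for six sentences (assumption).
y_true = ["biased", "biased", "neutral", "neutral", "neutral", "biased"]
y_pred = ["biased", "neutral", "neutral", "neutral", "biased", "biased"]
score = macro_f1(y_true, y_pred)
```

In this toy case each class has 2 true positives, 1 false positive and 1 false negative, so both per-class scores and the macro average equal 2/3.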

arXiv.org e-Print Archive

Pattern based fact extraction from Estonian free-texts

Author: Petmanson Timo
Publication venue: Tartu Ülikool
Publication date: 01/01/2012

Free-text processing is one of the most complex problems in computer science. Precise analysis of texts is often difficult or impossible for computers because of ambiguity. Nevertheless, it is possible to extract certain facts. In this work we study pattern-based methods for deriving facts from Estonian-language texts. We apply our methodology to real texts and analyse the results. We briefly describe the active learning methodology, which allows large corpora to be annotated more quickly. In addition, we have implemented a prototype solution for annotating corpora and carrying out pattern-based fact extraction.

Natural language processing is one of the most difficult problems, since words and language constructions often have ambiguous meanings that cannot be resolved without extensive cultural background. However, some facts are easier to deduce than others. In this work, we consider unary, binary and ternary relations between words that can be deduced from a single sentence. The relations, represented by sets of patterns, are combined with basic machine learning methods that are used to train and deploy patterns for fact extraction. We also describe the process of active learning, which helps to speed up annotating relations in large corpora. Other contributions include a prototype implementation with a plain-text preprocessor, corpus annotator, pattern miner and fact extractor. Additionally, we provide an empirical study of the efficiency of the prototype implementation with several relations and corpora
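
To make the idea of a pattern for a binary relation concrete, here is a minimal sketch using a single hand-written regular expression. It is not the thesis's method, which mines patterns and combines them with machine learning; the relation name, the pattern and the English example sentence are all assumptions chosen for illustration.

```python
import re

# Hypothetical surface pattern for a binary relation born_in(person, place).
# Names are approximated as runs of capitalised words (an assumption).
BORN_IN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) was born in (?P<place>[A-Z][a-z]+)"
)


def extract_born_in(sentence):
    """Return (person, place) pairs matched by the pattern in one sentence."""
    return [(m.group("person"), m.group("place")) for m in BORN_IN.finditer(sentence)]


facts = extract_born_in("Lennart Meri was born in Tallinn.")
no_facts = extract_born_in("It rained in Tartu yesterday.")
```

A fixed pattern like this is precise but brittle, which is one reason to learn sets of patterns from annotated corpora rather than write them by hand.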

DSpace at Tartu University Library

Generic bidirectional typing for dependent type theories

Author: Felicissimo Thiago
Publication date: 23/10/2023

Bidirectional typing is a discipline in which the typing judgment is decomposed explicitly into inference and checking modes, which makes it possible to control the flow of type information in typing rules and to specify algorithmically how they should be used. Bidirectional typing has been fruitfully studied and bidirectional systems have been developed for many type theories. However, the formal development of bidirectional typing has until now been confined to specific theories, with general guidelines remaining informal. In this work, we give a generic account of bidirectional typing for a general class of dependent type theories. This is done by first giving a general definition of type theories (or equivalently, a logical framework), for which we define declarative and bidirectional type systems. We then show, in a theory-independent fashion, that the two systems are equivalent. This equivalence is then explored to establish the decidability of typing for weakly normalizing theories, yielding a generic type-checking algorithm that has been implemented in a prototype and used in practice with many theories
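
The split into inference and checking modes can be illustrated on a much smaller system than the one the paper treats. The sketch below is a bidirectional checker for the simply typed lambda calculus with annotations, not the paper's generic algorithm for dependent type theories; all class and function names are assumptions. In a dependent setting the equality test in the mode switch would be replaced by a conversion check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Base:   # base type, e.g. nat
    name: str

@dataclass(frozen=True)
class Arrow:  # function type dom -> cod
    dom: object
    cod: object

@dataclass(frozen=True)
class Var:    # variable
    name: str

@dataclass(frozen=True)
class Lam:    # unannotated lambda abstraction
    param: str
    body: object

@dataclass(frozen=True)
class App:    # application
    fn: object
    arg: object

@dataclass(frozen=True)
class Ann:    # term annotated with a type
    term: object
    ty: object


class TypingError(Exception):
    pass


def infer(ctx, term):
    """Inference mode: the type is an output of the judgment."""
    if isinstance(term, Var):
        if term.name not in ctx:
            raise TypingError(f"unbound variable {term.name}")
        return ctx[term.name]
    if isinstance(term, Ann):
        check(ctx, term.term, term.ty)
        return term.ty
    if isinstance(term, App):
        fn_type = infer(ctx, term.fn)
        if not isinstance(fn_type, Arrow):
            raise TypingError("applying a non-function")
        check(ctx, term.arg, fn_type.dom)
        return fn_type.cod
    raise TypingError("cannot infer a type here; add an annotation")


def check(ctx, term, expected):
    """Checking mode: the type is an input of the judgment."""
    if isinstance(term, Lam):
        if not isinstance(expected, Arrow):
            raise TypingError("lambda checked against a non-function type")
        check({**ctx, term.param: expected.dom}, term.body, expected.cod)
        return
    # Mode switch: infer a type, then compare it with the expected one.
    actual = infer(ctx, term)
    if actual != expected:
        raise TypingError(f"expected {expected}, got {actual}")


nat = Base("nat")
identity = Ann(Lam("x", Var("x")), Arrow(nat, nat))
result = infer({"zero": nat}, App(identity, Var("zero")))
```

Lambdas are only checked and applications are only inferred, so a bare `Lam` needs an annotation (or an expected type from its context) before the checker can proceed.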

arXiv.org e-Print Archive

A Computational Model of Non-Cooperation in Natural Language Dialogue

Author: Plüss Brian
Publication date: 20/05/2014

A common assumption in the study of conversation is that participants fully cooperate in order to maximise the effectiveness of the exchange and ensure communication flow. This assumption persists even in situations in which the private goals of the participants are at odds: they may act strategically in pursuing their agendas, but will still adhere to a number of linguistic norms or conventions which are implicitly accepted by a community of language users. However, in naturally occurring dialogue participants often depart from such norms, for instance, by asking inappropriate questions, by failing to provide adequate answers or by volunteering information that is not relevant to the conversation. These are examples of what we call linguistic non-cooperation. This thesis presents a systematic investigation of linguistic non-cooperation in dialogue. Given a specific activity, in a specific cultural context and time, the method proceeds by making explicit which linguistic behaviours are appropriate. This results in a set of rules: the global dialogue game. Non-cooperation is then measured as instances in which the actions of the participants are not in accordance with these rules. The dialogue game is formally defined in terms of discourse obligations. These are actions that participants are expected to perform at a given point in the dialogue based on the dialogue history. In this context, non-cooperation amounts to participants failing to act according to their obligations. We propose a general definition of linguistic non-cooperation and give a specific instance for political interview dialogues. Based on the latter, we present an empirical method which involves a coding scheme for the manual annotation of interview transcripts. The degree to which each participant cooperates is automatically determined by contrasting the annotated transcripts with the rules in the dialogue game for political interviews. 
The approach is evaluated on a corpus of broadcast political interviews and tested for correlation with human judgement on the same corpus. Further, we describe a model of conversational agents that incorporates the concepts and mechanisms above as part of their dialogue manager. This allows for the generation of conversations in which the agents exhibit varying degrees of cooperation by controlling how often they favour their private goals instead of discharging their discourse obligations

Open Research Online (The Open University)