    Segmentation et indexation d'objets complexes dans les images de bandes dessinées

    In this thesis, we review, highlight and illustrate the challenges of comic book image analysis in order to give the reader a good overview of the latest research progress in this field and of the current open issues. We propose three approaches to comic book image analysis, each composed of several processing steps. The first approach is called "sequential" because the image content is described in an intuitive way, from simple to complex elements, using previously extracted elements to guide further processing. Simple elements such as panels, text and balloons are extracted first, followed by the balloon tail and then the position of the comic characters in the panel. The second approach extracts each kind of information independently in order to overcome the main drawback of the first approach: error propagation. This second method is called "independent" because it is composed of several specific extractors, one for each element of the image, without any dependence between them. Extra processing such as balloon type classification and text recognition is also covered. The third approach introduces a knowledge-driven and scalable system for comics image understanding. This system, called the "expert system", is composed of an inference engine and two models, one for the comics domain and another for image processing, stored in an ontology. It combines the benefits of the first two approaches and enables a high-level semantic description that can include the reading order of panels and text, the semantics of balloons, the relations between speech balloons and their speakers, comic character identification, and the interactions between characters.

    Extracting speech text from comics

    Overall, it has been challenging to find solutions that correctly extract distinct types of text balloons from any sort of comics, particularly from complex comic books. The challenge comes from the fact that the literature offers no general extraction algorithm capable of handling any text balloon without making assumptions about the color depth of the image or the orientation or language of the text. Worse still, comics art evolves over time, so there is some degree of unpredictability associated with comics. This means that an algorithm may work well for comic books released twenty years ago but not so well for current ones, even when they belong to the same category or series. This dissertation presents a possible solution to this problem by introducing an algorithm capable of extracting text balloons from comic book pages. The presented algorithm, here called CCD (components and corners detection), relies on the concept of corner detection to identify text snippets inside balloon candidates. After discarding a significant number of regions that are not considered tentative text balloons for one reason or another, we look at the shape of the holes of the remaining regions to check whether they still hold enough corners for a candidate to be classified as a text balloon.
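
    The final CCD step described in this abstract (counting corners on the holes of a candidate region) can be illustrated with a small sketch. It is not the dissertation's implementation: the polygon representation of holes, the turning-angle corner test, the function names `count_corners` and `looks_like_text_balloon`, and both thresholds are assumptions made only for illustration.

```python
import math

def count_corners(contour, min_turn_deg=30.0):
    """Count vertices of a closed polygon whose turning angle is at least min_turn_deg.

    `contour` is a list of (x, y) points; this polygon form and the
    threshold are illustrative assumptions, not values from the abstract.
    """
    n = len(contour)
    if n < 3:
        return 0
    corners = 0
    for i in range(n):
        x0, y0 = contour[i - 1]
        x1, y1 = contour[i]
        x2, y2 = contour[(i + 1) % n]
        incoming = math.atan2(y1 - y0, x1 - x0)
        outgoing = math.atan2(y2 - y1, x2 - x1)
        turn = abs(math.degrees(outgoing - incoming))
        turn = min(turn, 360.0 - turn)  # fold the angle into [0, 180]
        if turn >= min_turn_deg:
            corners += 1
    return corners

def looks_like_text_balloon(hole_contours, min_corners=12):
    """Accept a candidate region when its holes carry enough corners in total.

    Holes left by letter shapes tend to be angular, so many corners hint
    at text; `min_corners` is a hypothetical cut-off.
    """
    total = sum(count_corners(contour) for contour in hole_contours)
    return total >= min_corners

# Toy data: square holes stand in for glyph-shaped holes.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(count_corners(square))
print(looks_like_text_balloon([square, square, square]))
```

    A real pipeline would obtain the hole contours from connected-component analysis of the page image and would tune both thresholds on annotated pages.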

    Advanced document data extraction techniques to improve supply chain performance

    In this thesis, a novel machine learning technique for extracting text-based information from scanned images has been developed. The extraction is performed on scanned invoices and bills used in financial transactions. These documents contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis, and converting this data into a digital format is often time-consuming. Automation and data optimisation show promise as methods for reducing the time and cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and supply chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews performed on selected companies. The expert system developed in this thesis focuses on two distinct areas of research: Text/Object Detection and Text Extraction. For Text/Object Detection, the Faster R-CNN model was analysed. While this model yields outstanding object detection results, its performance is poor when image quality is low. A Generative Adversarial Network (GAN) model is proposed in response to this limitation. The GAN model consists of a generator network implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes. For text extraction from the bounding boxes, a novel data extraction framework was designed, consisting of XML processing (when an existing OCR engine is used), bounding box pre-processing, text clean-up, OCR error correction, spell check, type check, pattern-based matching and, finally, a learning mechanism for automating future data extraction. Whichever fields the system extracts successfully are provided in key-value format. The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and a rule-based engine is later used to extract relevant data. While this existing methodology is robust, the companies surveyed were not satisfied with its accuracy and sought new, optimised solutions. To confirm the results, the engines were used to return XML-based files with text and metadata identified. The output XML data was then fed into the new system for information extraction. This system uses the existing OCR engine and a novel, self-adaptive, learning-based OCR engine, which is based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data was provided by another company in London that specialises in reducing its clients' procurement costs. This data was fed into our system to obtain a deeper level of spend classification and categorisation. This helped the company reduce its reliance on human effort and allowed greater efficiency than performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was twofold: first, to test and develop a novel solution that does not depend on any specific OCR technology; second, to increase information extraction accuracy over that of existing methodologies. The thesis also evaluates the real-world need for the system and the impact it would have on SCM. The newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimising SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.
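
    Three of the text-extraction stages this abstract names (text clean-up, pattern-based matching and type check, ending in key-value output) can be sketched as follows. This is a minimal illustration, not the thesis framework: the field names, the regular expressions, the day/month/year date format and the helper names `clean` and `extract_fields` are all assumptions.

```python
import re
from datetime import datetime

# Hypothetical field patterns; a real system would need a larger, tuned set.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no|number|#)\.?\s*[:\-]?\s*([A-Z0-9\-]+)", re.I),
    "date": re.compile(r"\bdate\s*[:\-]?\s*(\d{2}/\d{2}/\d{4})", re.I),
    "total": re.compile(r"\btotal\s*[:\-]?\s*\$?\s*([0-9][0-9,]*\.\d{2})", re.I),
}

# Type checks: each converter raises ValueError when the value is malformed.
CONVERTERS = {
    "invoice_number": str,
    "date": lambda s: datetime.strptime(s, "%d/%m/%Y").date().isoformat(),
    "total": lambda s: float(s.replace(",", "")),
}

def clean(text):
    """Text clean-up stage: collapse runs of spaces and tabs."""
    return re.sub(r"[ \t]+", " ", text).strip()

def extract_fields(ocr_text):
    """Return the fields that match a pattern and pass the type check."""
    text = clean(ocr_text)
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match is None:
            continue  # field absent: leave it out of the key-value output
        try:
            fields[name] = CONVERTERS[name](match.group(1))
        except ValueError:
            continue  # value failed the type check: drop it
    return fields

# Made-up OCR output for illustration only.
sample = "ACME Ltd\nInvoice No: INV-1042\nDate: 03/11/2021\nSubtotal: 1,000.00\nTotal: 1,234.50\n"
print(extract_fields(sample))
```

    Only fields that both match and convert cleanly appear in the result, mirroring the abstract's statement that successfully extracted fields are returned in key-value format.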

    A galaxy of wor(l)ds: the translation of fictive vernacular in the Star Wars transmedia narrative in Brazil

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Inglês: Estudos Linguísticos e Literários, Florianópolis, 2020. Abstract: With the recent change in the publication scenario of materials from the Star Wars saga in Brazil (which began with the change of intellectual property holder in 2012), the franchise has become a transmedia narrative in the country. In view of this context, the present research aims to describe the translation practices adopted to deal with Star Wars materials. Considering that a transmedia narrative is a composite whole formed by narrative expansion across multiple instalments on different media platforms, the research ultimately aims to investigate the adopted translation practices and their impact on the wholeness of the transmedia narrative in Brazil. The investigation of translation practices focuses on the language-based narrative device called Fictive Vernacular, a concept developed in this thesis. Descriptive Translation Studies offer the theoretical foundations for analysing the selected pairs of source and translated texts. Corpus-based Translation Studies provide the theoretical and methodological procedures and tools for the data analysis, for which a computerised parallel corpus was created. The parallel corpus is composed of aligned pairs of source and target books, comics and films (only the verbal components of the last two are included). It comprises two pairs per medium, adding up to six instalments and twelve texts in total. The analysis reveals two main tendencies in the practices of translating the Fictive Vernacular in the corpus. The first involves imprinting the makeup of source fictive items onto the target texts. The second concerns drawing on the resources of the target language to render fictive items, even at the expense of occasionally cancelling out their world-building function.

    Saudi-Arab Emerging Video Game Cultures, Archetypes, Narratives, and User Experiences

    Arab representation in media has been a major focus of many works of renowned scholars, such as Edward Said (1978), Shaheen (2000), Karim (2005) and others. Journalism, film, television, and ancient literature have all been studied in these works. A recent addition to the study of Arab representation is the medium of video games. This was first examined by Reichmuth and Werning (2006) and Machin and Suleiman (2006) and extended by many works that are discussed in this thesis. The vast majority of the literature on Arab representation in video games focuses on Western video games and the reaction of Arab developers to these representations. Lack of specificity is another characteristic of this field. Both characteristics manifest in repeated comparative studies, where scholars select one local culture as an archetype, then embark on a comparative study of the global gaming community. In so doing, there is an unfair generalisation of Arab identity across broad and diverse regions, in terms of ethnic, ideological, national, historical, and even linguistic components. The present investigation critiques the shortcomings of this previous literature, while testing some alternative methods and approaches needed to re-examine the lack of access, language barriers and the aforementioned generalisations that have limited this field until now. Rather than assuming a single archetype for Saudi culture, this thesis departs from previous scholarship by examining the various aspects of the transformation process leading to what could be called an emergent “Saudiness”. Specifically, this study examines the construction and depiction of Saudi-Arab identity through the narratives and audiovisual content of video games, paying close attention to recent developments in Saudi cultural and media policy and the mandates set forth by the Vision 2030 development plan (SCEDA, 2016). 
Using theories on participatory culture (Jenkins, 2009) and spreadable media (Jenkins, Ford, and Green, 2013) as well as a content analysis of previously understudied material shared by a cohort of Saudi gamers, this research investigates the particular markers and strategies used to distinguish the spectrum of cultural aspects and elements with which Saudi gamers identify. To achieve this, the analysis focuses on three distinct archetypes of Saudi Arabs in video games: (a) the Saudis in Western video games, as suggested by previous works; (b) the Saudi citizen archetype, as recommended by state policy; and (c) the Saudi culture, as represented by Saudi gamers and Saudi game producers, who in many cases reject the idea of a single archetype. In sum, this research sheds new light on the interactions between centralised and decentralised media in Saudi Arabia, as well as the Saudi gamers' sense of agency, demonstrating how Saudis perceive Saudi representations in video games as part of a complex spectrum of interactions within a larger global gaming community

    The Adaptive Contexts of Videogame Adaptations and Franchises across Media

    Videogame adaptations have been a staple of cinema and television since the 1980s and have had a consistent presence despite receiving overwhelmingly negative reactions. Recognising the perseverance of videogame adaptations, I examine some of the key issues and debates surrounding the genre with in-depth analysis of the source material, the machinations of the film and videogame industries, and the films themselves, specifically relating to three prominent onscreen videogame adaptations. Following an introduction to the various theories and areas of study already performed in this field, all of which I incorporate into an intricate, blended methodology, I explore issues of fidelity, localisation, and evolution that occur when adapting Sonic the Hedgehog out of the confines of its limited narrative. In examining adaptations of Mortal Kombat and Street Fighter, I explore how cinematic genres (such as the Hong Kong martial arts and American action movies) have influenced the creation of videogames and the production of their film and television adaptations. Finally, I delve into the history of zombie horror films, which influenced the Resident Evil franchise. As this became the longest-running (and, by extension, most successful) live-action videogame franchise, I explore the complex production of videogame adaptations, their critical and financial reception, and their ability to evolve into multimedia franchises. Overall, my work is designed to take videogame adaptations seriously by examining them through in-depth analysis, exploring how they convey the gameplay mechanics of their source material, analysing why they remain so popular despite their negative reputation, and by establishing an academic framework by which to discuss them with the same reverence afforded to literary adaptations

    The mapping of localized contents in the videogame inFamous 2: a multimodal corpus-based analysis

    Thesis (doctorate) - Universidade Federal de Santa Catarina, Centro de Comunicação e Expressão, Programa de Pós-Graduação em Estudos da Tradução, Florianópolis, 2016. Abstract: This dissertation analyzes how localization practices operate in video games from the perspective of Translation Studies and multimodal corpus-based research. It investigates how the video game inFamous 2 (PS3) and its version localized into Brazilian Portuguese are shaped by cultural specificities of a linguistic nature and by specificities of the medium itself. As a secondary but no less important objective, the study develops a systematic representation of localized content in the interactions between characters in the game, providing a computer-assisted annotation framework for written and audio-visual data. The analysis was based on data annotated with the software ELAN. The methodological and analytical framework also serves to observe and describe terminological consistency, cultural awareness, narrative and semantic prosody across the language pair En-US and Pt-Br in the dialogue lines of the game's characters. As for the results, the annotation framework revealed slight changes in the evaluative language that accompanied the dialogue lines of certain characters who are part of the game's plot. The lexical profile observed in the dialogue lines associated with these characters displayed an attenuation of semantic prosody in terms of collocation profile and language register. Finally, the discussions and interpretations expanded from the collected data are used to systematically map the practices involved in digital game localization, drawing on the practical aspects of this field of expertise within translation studies.

    Digital tools in media studies: analysis and research. An overview

    Digital tools are increasingly used in media studies, opening up new perspectives for research and analysis, while creating new problems at the same time. In this volume, international media scholars and computer scientists present their projects, varying from powerful film-historical databases to automatic video analysis software, discussing their application of digital tools and reporting on their results. This book is the first publication of its kind and a helpful guide to both media scholars and computer scientists who intend to use digital tools in their research, providing information on applications, standards, and problems

    The Economy Of Typography (the Arrangement or Mode of Operation of Typography)

    The thesis will show that current research into the legibility and readability of certain aspects or characters of type is incomplete, and will demonstrate what further research is necessary to complete the analysis of these aspects or characters in the economy of typography in continuous text. Chapter 1 will show that the development of reading depends on the legibility of the typography and characters (‘recognizing patterns, planning strategy, and feeling’); in other words, reading and writing are interdependent, and all depend in some part on the construction of the characters and their relationship to each other. It will also show that readable writing is desirable and important for the reader’s sake. Chapter 2 will deal with the practical presentation of the characters in what the reading public reads, and with the role played by the legibility and readability of typography in conveying their message. Printers and designers also have working knowledge and experience of legibility and readability, which they incorporate into typographic presentations, and this is also taken into account in chapter 2. Chapter 3 reviews the criteria and methods used in research on typographic readability and legibility. The research will show that readability is the ease with which the eye can absorb the message and move along the line, and that legibility is the ease with which one letter can be distinguished from another. Chapter 4, entitled Analysis and Recommendations, concludes the thesis with a summary of chapters 1, 2 and 3 before presenting a comparative analysis of current research into legibility, with particular emphasis on the misreading or misrecognition of characters, and illustrates the conclusions reached with bar charts and tables. Appendix One contains a comprehensive list of the research into legibility and readability. Appendix Two contains the graphics of Benjamin Sherbow showing typographic layouts that support the type spacing matters discussed in chapter 2. The thesis has an extensive bibliography of the works referred to throughout.