519 research outputs found

    A survey of comics research in computer science

    Graphic novels such as comics and manga are well known all over the world. The digital transition has started to change the way people read comics: more and more on smartphones and tablets, less and less on paper. In recent years, a wide variety of research about comics has been proposed that might change the way comics are created, distributed and read in the coming years. Early work focused on low-level document image analysis: comic books are complex documents that contain text, drawings, balloons, panels, onomatopoeia, etc. Different fields of computer science, such as multimedia, artificial intelligence and human-computer interaction, have covered research on user interaction and content generation, each with different sets of values. In this paper we review previous research about comics in computer science, state what has been done, and give some insights into the main outlooks

    Using EPUB 3 and the open web platform for enhanced presentation and machine-understandable metadata for digital comics

    Various methods are needed to extract information from current (digital) comics. Furthermore, the use of different (proprietary) formats by comic distribution platforms causes an overhead for authors. To overcome these issues, we propose a solution that makes use of the EPUB 3 specification, additionally leveraging the Open Web Platform to support animations, reading assistance, audio and multiple languages in a single format, by using our JavaScript library comicreader.js. We also provide administrative and descriptive metadata in the same format by introducing a new ontology: Dicera. Our solution is complementary to the current extraction methods, on the one hand because they can help with metadata creation, and on the other hand because the machine-understandable metadata alleviates their use. While the reading system support for our solution is currently limited, it can offer all features needed by current comic distribution platforms. When comparing comics generated by our solution to EPUB 3 textbooks, we observed an increase in file size, mainly due to the use of images. In future work, our solution can be further improved by extending the presentation features, investigating different types of comics, studying the use of new EPUB 3 extensions, and by incorporating it in digital book authoring environments

    Extracting speech text from comics

    Overall, it has been challenging to find solutions able to correctly extract distinct types of text balloons from any sort of comics, particularly from complex comic books. The challenge comes from the fact that no general extraction algorithm in the literature can handle arbitrary text balloons without making assumptions about the color depth of the image or the orientation or language of the text. Even worse, comics art evolves over time, so there is some degree of unpredictability associated with comics. This means that an algorithm may work well for comic books released twenty years ago, but not so well for current comic books, even if they belong to the same category or series. This dissertation presents a possible solution to this problem by introducing an algorithm capable of extracting text balloons from comic book pages. The presented algorithm, here called CCD (components and corners detection), relies on corner detection to identify text snippets inside balloon candidates. After discarding a significant number of regions that are not considered tentative text balloons for one reason or another, we look at the shape of the holes of the remaining regions to check whether they still hold enough corners for a candidate to be classified as a text balloon

    Segmentation and indexation of complex objects in comic book

    Born in the 19th century, comics is a visual medium used to express ideas via images, often combined with text or other visual information. It is an art form that uses images deployed in sequence for graphic storytelling (sequential art), and it initially spread worldwide through newspapers, books and magazines. Nowadays, the development of new technologies and the World Wide Web is giving birth to a new form of paperless comics that takes advantage of the freedom of the virtual world. However, traditional comics still represent an important cultural heritage in many countries. Their adaptation to the digital format has not yet received the same level of attention as that of music, cinema or literature. Using information technologies with digitized comic books would facilitate the exploration of digital libraries, accelerate translation, and allow augmented reading, speech playback for the visually impaired, etc. Heritage museums such as the CIBDI (French acronym for International City of Comic Books and Images), the Kyoto International Manga Museum and The Digital Comic Museum have already digitized several thousand comic albums, some of which are now in the public domain. Despite the growing marketplace for digital comics, little research has been carried out to take advantage of the added value provided by these new media. A particularity of document analysis is its dependence on the type of document, which often requires specific processing. The challenge for document analysis systems is to propose generic solutions for specific problems. The design process of comics is so specific that their automated analysis may be seen as a niche research field within document analysis, at the intersection of complex background, semi-structured and mixed content documents. Being at the intersection of several fields combines their difficulties. In this thesis, we review, highlight and illustrate the challenges related to comic book image analysis in order to provide a good overview of the latest research progress in this field and the current issues. In order to cover the widest possible scope of study, we propose three different approaches for comic book image analysis. The three approaches aim to provide an automatic description of the image content. Different levels of description are discussed, from spatial positions (low level) to semantic information (high level). The first approach describes the image in an intuitive way, from simple to complex elements, using previously extracted elements to guide further processing. Simple elements such as panel, text and balloon regions are extracted first, followed by balloon tails and then comic character positions, located from the direction indicated by the tails. The second approach addresses independent information extraction to overcome the main drawback of the first approach: error propagation. This second method is composed of several specific extractors, one for each type of content, that are independent of each other. Those extractors can be used in parallel without needing previous information, which cancels the error propagation effect. Extra processing such as balloon type classification and text recognition is also covered. The third approach introduces a knowledge-driven system that combines low- and high-level processing to build a scalable system for comics image understanding. This approach is intended to improve the overall precision of content extraction methods. We built an expert system composed of an inference engine and two models, one for the comics domain and another for image processing, stored in an ontology. The first model embeds the knowledge about comic books and the second models the image processing part. These two models allow consistency analysis of the extracted information and inference of the relationships between all the extracted elements, such as the reading order, the type of text (e.g. spoken, onomatopoeic, illustrative) and the relations between speech balloons and speaking characters. The expert system combines the benefits of the first two approaches and enables high-level semantic description such as the reading order, the semantics of the balloon shapes, the relations between the speech balloons and their speakers, and the interaction between the comic characters. In addition, in this thesis we provide the first public comic book image dataset and ground truth to the community, along with an overall experimental comparison of all the proposed methods and some state-of-the-art methods

    An ontology-based framework for the automated analysis and interpretation of comic books' images

    Since the beginning of the twenty-first century, the cultural industry has been through a massive and historic transformation induced by the rise of digital technologies. The comic book industry keeps looking for the right solution and has not yet produced anything as convincing as the music or movie industries have. A lot of energy has been spent so far on transferring printed material to digital media. The specific capabilities of those media are not always fully exploited, although they could potentially be used to create new reading conventions. In spite of the needs induced by the large amount of data created since the beginning of comics history, content indexing has been left behind. It is indeed quite a challenge to index such a composition of textual and visual information. While a growing number of researchers are working on comic book image analysis from a low-level point of view, only a few are tackling the issue of representing the content at a high semantic level. We propose in this article a framework to handle the content of a comic book, to support the automatic extraction of its visual components and to formalize the semantics of the domain's codes. We tested our framework on two applications: 1) the unsupervised content discovery of comic book images, 2) its capability to handle complex layouts and to produce a respectful browsing experience for the digital comics reader

    Confidence criterion for speech balloon segmentation

    This short paper investigates how to improve the confidence of speech balloon segmentation algorithms for comic book images. It stems from the need for precise indications about the quality of automatic processing in order to accept or reject each segmented region as a valid result, according to the application and without requiring any ground truth. We discuss several applications, such as result quality assessment for companies and automatic ground truth creation from high-confidence results to train machine-learning-based systems. We present some ideas for combining several kinds of domain knowledge (e.g. shape, text, etc.) to produce an improved confidence criterion

    Graphic Novel Subtitles: Requirement Elicitation and System Implementation

    Exploring digital comics as an edutainment tool: An overview

    This paper aims to explore the growing potential of digital comics and graphic novels as an edutainment tool. Initially, the evolution of the comics medium, along with academic and commercial initiatives in designing comicware systems, is briefly discussed. Central to this study, the methods and impact of using this visual medium with embedded instructional content, and of student-generated comics in classroom settings, are outlined. By recognizing the emerging technologies available for supporting and accelerating educational comic development, this article addresses the diverse research challenges and opportunities in devising effective strategies to enhance comics-integrated learning across disciplines

    Segmentation et indexation d'objets complexes dans les images de bandes dessinées

    In this thesis, we review, highlight and illustrate the challenges related to comic book image analysis in order to give the reader a good overview of the latest research progress in this field and the current issues. We propose three different approaches for comic book image analysis, each composed of several processing steps. The first approach is called "sequential" because the image content is described in an intuitive way, from simple to complex elements, using previously extracted elements to guide further processing. Simple elements such as panels, text and balloons are extracted first, followed by the balloon tail and then the comic character position in the panel. The second approach addresses independent information extraction to overcome the main drawback of the first approach: error propagation. This second method is called "independent" because it is composed of several specific extractors, one for each element of the image, without any dependence between them. Extra processing such as balloon type classification and text recognition is also covered. The third approach introduces a knowledge-driven and scalable system for comics image understanding. This system, called the "expert system", is composed of an inference engine and two models, one for the comics domain and another for image processing, stored in an ontology. The expert system combines the benefits of the first two approaches and enables high-level semantic description such as the reading order of panels and text, the relations between the speech balloons and their speakers, and comic character identification