
18 research outputs found

Segmentation et indexation d'objets complexes dans les images de bandes dessinées

Author: Rigaud Christophe
Publication venue: HAL CCSD
Publication date: 11/12/2014

In this thesis, we review, highlight and illustrate the challenges of comic book image analysis, to give the reader a good overview of recent research progress in this field and of the current open issues. We propose three approaches to comic book image analysis, each composed of several processing steps. The first approach is called "sequential" because the image content is described in an intuitive way, from simple to complex elements, using previously extracted elements to guide further processing. Simple elements such as panels, text and balloons are extracted first, followed by the balloon tails and then the position of the comic characters in each panel. The second approach addresses independent information extraction to overcome the main drawback of the first approach: error propagation. This second method is called "independent" because it is composed of several specific extractors, one for each element of the image, without any dependence between them. Extra processing such as balloon type classification and text recognition is also covered. The third approach introduces a knowledge-driven and scalable system for comics image understanding. This system, called the "expert system", is composed of an inference engine and two models, one for the comics domain and another for image processing, stored in an ontology. It combines the benefits of the first two approaches and enables high-level semantic description such as the reading order of panels and text, the semantics of the balloons, the relations between the speech balloons and their speakers, comic character identification, and the interactions between characters.

Thèses en Ligne

Theses.fr

Deep Learning for Free-Hand Sketch: A Survey

Author: Hospedales Timothy M.
Song Yi-Zhe
Wang Liang
Xiang Tao
Xu Peng
Yin Qiyue
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022

Free-hand sketches are highly illustrative, and have been widely used by humans to depict objects or stories from ancient times to the present. The recent prevalence of touchscreen devices has made sketch creation a much easier task than ever and consequently made sketch-oriented applications increasingly popular. The progress of deep learning has immensely benefited free-hand sketch research and applications. This paper presents a comprehensive survey of the deep learning techniques oriented at free-hand sketch data, and the applications that they enable. The main contents of this survey include: (i) A discussion of the intrinsic traits and unique challenges of free-hand sketch, to highlight the essential differences between sketch data and other data modalities, e.g., natural photos. (ii) A review of the developments of free-hand sketch research in the deep learning era, by surveying existing datasets, research topics, and the state-of-the-art methods through a detailed taxonomy and experimental evaluation. (iii) Promotion of future work via a discussion of bottlenecks, open problems, and potential research directions for the community.
Comment: This paper is accepted by IEEE TPAM

arXiv.org e-Print Archive

Edinburgh Research Explorer

DR-NTU (Digital Repository of NTU)

An ontology-based framework for the automated analysis and interpretation of comic books' images

Author: Bertet Karell
Guérin Clément
Revel Arnaud
Rigaud Christophe
Publication venue: 'Elsevier BV'
Publication date: 01/02/2017

Since the beginning of the twenty-first century, the cultural industry has been through a massive and historic transformation induced by the rise of digital technologies. The comic book industry keeps looking for the right solution and has not yet produced anything as convincing as the music or movie industries have. A lot of energy has been spent so far on transferring printed material to digital media. The specific features of those media are not always exploited to the best of their capabilities, although they could potentially be used to create new reading conventions. Despite the needs created by the large amount of data produced since the beginning of comics history, content indexing has been left behind. It is indeed quite a challenge to index such a composition of textual and visual information. While a growing number of researchers are working on comic book image analysis from a low-level point of view, only a few are tackling the issue of representing the content at a high semantic level. We propose in this article a framework to handle the content of a comic book, to support the automatic extraction of its visual components and to formalize the semantics of the domain's codes. We tested our framework on two applications: 1) the unsupervised content discovery of comic book images, and 2) its ability to handle complex layouts and to produce a respectful browsing experience for the digital comics reader.

A video summarisation system for post-production

Author: Wills Ciaran
Publication venue
Publication date: 01/01/2003

Post-production facilities deal with large amounts of digital video, which presents difficulties when tracking, managing and searching this material. Recent research work in image and video analysis promises to offer help in these tasks, but there is a gap between what these systems can provide and what users actually need. In particular the popular research models for indexing and retrieving visual data do not fit well with how users actually work. In this thesis we explore how image and video analysis can be applied to an online video collection to assist users in reviewing and searching for material faster, rather than purporting to do it for them. We introduce a framework for automatically generating static 2-dimensional storyboards from video sequences. The storyboard consists of a series of frames, one for each shot in the sequence, showing the principal objects and motions of the shot. The storyboards are rendered as vector images in a familiar comic book style, allowing them to be quickly viewed and understood. The process consists of three distinct steps: shot-change detection, object segmentation, and presentation. The nature of the video material encountered in a post-production facility is quite different from other material such as television programmes. Video sequences such as commercials and music videos are highly dynamic with very short shots, rapid transitions and ambiguous edits. Video is often heavily manipulated, causing difficulties for many video processing techniques. We study the performance of a variety of published shot-change detection algorithms on the type of highly dynamic video typically encountered in post-production work. Finding their performance disappointing, we develop a novel algorithm for detecting cuts and fades that operates directly on Motion-JPEG compressed video, exploiting the DCT coefficients to save computation. The algorithm shows superior performance on highly dynamic material while performing comparably to previous algorithms on other material.

Glasgow Theses Service

Artistic Content Representation and Modelling based on Visual Style Features

Author: El-Magd Essam A.
Freres P.
Publication venue: University of Canterbury. Computer Science and Software Engineering
Publication date: 01/01/1995

This thesis aims to understand visual style in the context of computer science, using traditionally intangible artistic properties to enhance existing content manipulation algorithms and develop new content creation methods. The developed algorithms can be used to apply extracted properties to other drawings automatically; transfer a selected style; categorise images based upon perceived style; build 3D models using style features from concept artwork; and perform other style-based actions that change our perception of an object without changing our ability to recognise it. The research in this thesis aims to provide the style manipulation abilities that are missing from modern digital art creation pipelines.

UC Research Repository

Visual Analysis of Large, Time-Dependent, Multi-Dimensional Smart Sensor Tracking Data

Author: James Walker
Publication venue: 'Swansea University'
Publication date: 01/01/2017

Technological advancements over the past decade have increased our ability to collect data to previously unimaginable volumes [Kei02]. Understanding temporal patterns is key to gaining knowledge and insight. However, our capacity to store data now far exceeds the rate at which we are able to understand it [KKEM10]. This phenomenon has led to a growing need for advanced solutions to make sense and use of an ever-increasing data space. Abstract temporal data provides additional challenges in its representation, size and scalability, high dimensionality, and unique structure. One instance of such temporal data is acquired from smart sensor tags attached to freely roaming animals, recording multiple parameters at infra-second rates. Such tags are becoming commonplace and are transforming biologists' understanding of the way wild animals behave. The excitement at the potential inherent in sophisticated tracking devices has, however, been limited by a lack of available software to advance research in the field. This thesis introduces methodologies to deal with the problem of analysing the large, multi-dimensional, time-dependent data acquired. Interpretation of such data is complex and currently limits the ability of biologists to realise the value of their recorded information. We present several contributions to the field of time-series visualisation, that is, the visualisation of ordered collections of real-valued data attributes at successive points in time sampled at uniform time intervals. Traditionally, time-series graphs have been used for temporal data. However, screen resolution is small in comparison to the large information space commonplace today. In such cases, we can only render a proportion of the data. It is widely accepted that the effective interpretation of large temporal data sets requires advanced methods and interaction techniques. In this thesis, we address these issues to enhance the exploration, analysis, and presentation of time-series data for movement ecologists in their smart sensor data analysis.

Crossref

Cronfa at Swansea University

Multi-Sensory Interaction for Blind and Visually Impaired People

Author
Publication venue: 'MDPI AG'
Publication date: 21/03/2022

This book conveyed the visual elements of artwork to the visually impaired through various sensory elements to open a new perspective for appreciating visual artwork. In addition, the technique of expressing a color code by integrating patterns, temperatures, scents, music, and vibrations was explored, and future research topics were presented. A holistic experience using multi-sensory interaction was provided to people with visual impairment to convey the meaning and contents of the work through rich multi-sensory appreciation. A method that allows people with visual impairments to engage with artwork using a variety of senses, including touch, temperature, tactile pattern, and sound, helps them to appreciate artwork at a deeper level than can be achieved with hearing or touch alone. The development of such art appreciation aids for the visually impaired will ultimately improve their cultural enjoyment and strengthen their access to culture and the arts. The development of these new-concept aids ultimately expands opportunities for the non-visually impaired as well as the visually impaired to enjoy works of art, and breaks down the boundaries between the disabled and the non-disabled in the field of culture and arts through continuous efforts to enhance accessibility. In addition, the developed multi-sensory expression and delivery tool can be used as an educational tool to increase product and artwork accessibility and usability through multi-modal interaction. Training with the multi-sensory experiences introduced in this book may lead to more vivid visual imagery, or seeing with the mind's eye.

Directory of Open Access Books (DOAB)

Proceedings of the 19th Sound and Music Computing Conference

Author: Michon Romain
Orlarey Yann
Pottier Laurent
Publication venue: SMC Network
Publication date: 12/07/2022

INRIA a CCSD electronic archive server