34 research outputs found

    Segmentation et indexation d'objets complexes dans les images de bandes dessinées

    Get PDF
    In this thesis, we review, highlight and illustrate the challenges related to comic book image analysis in order to give to the reader a good overview about the last research progress in this field and the current issues. We propose three different approaches for comic book image analysis that are composed by several processing. The first approach is called "sequential'' because the image content is described in an intuitive way, from simple to complex elements using previously extracted elements to guide further processing. Simple elements such as panel text and balloon are extracted first, followed by the balloon tail and then the comic character position in the panel. The second approach addresses independent information extraction to recover the main drawback of the first approach : error propagation. This second method is called “independent” because it is composed by several specific extractors for each elements of the image without any dependence between them. Extra processing such as balloon type classification and text recognition are also covered. The third approach introduces a knowledge-driven and scalable system of comics image understanding. This system called “expert system” is composed by an inference engine and two models, one for comics domain and another one for image processing, stored in an ontology. This expert system combines the benefits of the two first approaches and enables high level semantic description such as the reading order of panels and text, the relations between the speech balloons and their speakers and the comic character identification.Dans ce manuscrit de thèse, nous détaillons et illustrons les différents défis scientifiques liés à l'analyse automatique d'images de bandes dessinées, de manière à donner au lecteur tous les éléments concernant les dernières avancées scientifiques en la matière ainsi que les verrous scientifiques actuels. Nous proposons trois approches pour l'analyse d'image de bandes dessinées. La première approche est dite "séquentielle'' car le contenu de l'image est décrit progressivement et de manière intuitive. Dans cette approche, les extractions se succèdent, en commençant par les plus simples comme les cases, le texte et les bulles qui servent ensuite à guider l'extraction d'éléments plus complexes tels que la queue des bulles et les personnages au sein des cases. La seconde approche propose des extractions indépendantes les unes des autres de manière à éviter la propagation d'erreur due aux traitements successifs. D'autres éléments tels que la classification du type de bulle et la reconnaissance de texte y sont aussi abordés. La troisième approche introduit un système fondé sur une base de connaissance a priori du contenu des images de bandes dessinées. Ce système permet de construire une description sémantique de l'image, dirigée par les modèles de connaissances. Il combine les avantages des deux approches précédentes et permet une description sémantique de haut niveau pouvant inclure des informations telles que l'ordre de lecture, la sémantique des bulles, les relations entre les bulles et leurs locuteurs ainsi que les interactions entre les personnages

    Performance Challenges with Data Visualizations in Browser Environment

    Get PDF
    Information exists in many forms, from text, to equations, videos, audio, and graphical mediums. With graphical or visual mediums, it is becoming easier to absorb information where the alternatives are textual descriptions. Graphs are important vehicles of transporting information. In order to create a good graph, certain attributes need to be taken into account, such as which variables are being displayed over which axis, visual elements, and their sizes are also important to consider. In modern times with the internet and the amount of data being generated, how can all this data be fitted into a single graph? That question is the motivation for this thesis. Presenting large data in visualizations involves a great deal of thought, effort, and ingenuity on how to proceed with what information to convey. There are times when obtaining data for such visualization come with their own challenges. This thesis investigates the obstacles facing an internal tool within a company in regard to their data retrieval method. As well as the objective to research an efficient and easy-to-use method for presenting large data on a webpage

    Investigating User Experiences Through Animation-based Sketching

    Get PDF

    Design and training of deep reinforcement learning agents

    Get PDF
    Deep reinforcement learning is a field of research at the intersection of reinforcement learning and deep learning. On one side, the problem that researchers address is the one of reinforcement learning: to act efficiently. A large number of algorithms were developed decades ago in this field to update value functions and policies, explore, and plan. On the other side, deep learning methods provide powerful function approximators to address the problem of representing functions such as policies, value functions, and models. The combination of ideas from these two fields offers exciting new perspectives. However, building successful deep reinforcement learning experiments is particularly difficult due to the large number of elements that must be combined and adjusted appropriately. This thesis proposes a broad overview of the organization of these elements around three main axes: agent design, environment design, and infrastructure design. Arguably, the success of deep reinforcement learning research is due to the tremendous amount of effort that went into each of them, both from a scientific and engineering perspective, and their diffusion via open source repositories. For each of these three axes, a dedicated part of the thesis describes a number of related works that were carried out during the doctoral research. The first part, devoted to the design of agents, presents two works. The first one addresses the problem of applying discrete action methods to large multidimensional action spaces. A general method called action branching is proposed, and its effectiveness is demonstrated with a novel agent, named BDQ, applied to discretized continuous action spaces. The second work deals with the problem of maximizing the utility of a single transition when learning to achieve a large number of goals. In particular, it focuses on learning to reach spatial locations in games and proposes a new method called Q-map to do so efficiently. An exploration mechanism based on this method is then used to demonstrate the effectiveness of goal-directed exploration. Elements of these works cover some of the main building blocks of agents: update methods, neural architectures, exploration strategies, replays, and hierarchy. The second part, devoted to the design of environments, also presents two works. The first one shows how various tasks and demonstrations can be combined to learn complex skill spaces that can then be reused to solve even more challenging tasks. The proposed method, called CoMic, extends previous work on motor primitives by using a single multi-clip motion capture tracking task in conjunction with complementary tasks targeting out-of-distribution movements. The second work addresses a particular type of control method vastly neglected in traditional environments but essential for animals: muscle control. An open source codebase called OstrichRL is proposed, containing a musculoskeletal model of an ostrich, an ensemble of tasks, and motion capture data. The results obtained by training a state-of-the-art agent on the proposed tasks show that controlling such a complex system is very difficult and illustrate the importance of using motion capture data. Elements of these works demonstrate the meticulous work that must go into designing environment parts such as: models, observations, rewards, terminations, resets, steps, and demonstrations. The third part, on the design of infrastructures, presents three works. The first one explains the difference between the types of time limits commonly used in reinforcement learning and why they are often treated inappropriately. In one case, tasks are time-limited by nature and a notion of time should be available to agents to maintain the Markov property of the underlying decision process. In the other case, tasks are not time-limited by nature, but time limits are used for convenience to diversify experiences. This is the most common case. It requires a distinction between time limits and environmental terminations, and bootstrapping should be performed at the end of partial episodes. The second work proposes to unify the most popular deep learning frameworks using a single library called Ivy, and provides new differentiable and framework-agnostic libraries built with it. Four such code bases are provided for gradient-based robot motion planning, mechanics, 3D vision, and differentiable continuous control environments. Finally, the third paper proposes a novel deep reinforcement learning library, called Tonic, built with simplicity and modularity in mind, to accelerate prototyping and evaluation. In particular, it contains implementations of several continuous control agents and a large-scale benchmark. Elements of these works illustrate the different components to consider when building the infrastructure for an experiment: deep learning framework, schedules, and distributed training. Added to these are the various ways to perform evaluations and analyze results for meaningful, interpretable, and reproducible deep reinforcement learning research.Open Acces

    TV in the Age of the Internet: Information Quality of Science Fiction TV Fansites

    Get PDF
    Thesis (Ph.D.) - Indiana University, Information Science, 2011Communally created Web 2.0 content on the Internet has begun to compete with information provided by traditional gatekeeper institutions, such as academic journals, medical professionals, and large corporations. On the one hand, such gatekeepers need to understand the nature of this competition, as well as to try to ensure that the general public are not endangered by poor quality information. On the other hand, advocates of free and universal access to basic social services have argued that communal efforts can provide as good or better-quality versions of commonly needed resources. This dissertation arises from these needs to understand the nature and quality of information being produced on such websites. Website-oriented information quality (IQ) literature spans at least 15 different academic fields, a survey of which identified two types of IQ: perceptual and artifactual fitness-related, and representational accuracy and completeness-related. The current project studied websites in terms of all of these, except perceptual fitness. This study may be the only of its kind to have targeted fansites: websites made by fans of a mass media franchise. Despite the Internet's becoming a primary means by which millions of people consume and co-produce their entertainment, little academic attention has been paid to the IQ of sites about the mass media. For this study, the four central non-studio-affiliated sites about a highly popular and fan-engaging science fiction television franchise, Stargate, were chosen, and their IQ examined across sites having different sizes as well as editorial and business models. As exhaustive of samples as possible were collected from each site. Based on 21 relevant variables from the IQ literature, four qualitative and 17 exploratory statistical analyses were conducted. Key findings include: five possibly new IQ criteria; smaller sites concerned more with pleasing connoisseuring fans than the general public; larger sites being targeted towards older users; professional editors serving their own interests more than users'; wikis' greater user freedom attracting more invested and balanced writers; for-profit sites being more imposing upon, and less protecting of, users than non-profit sites; and the emergence of common writing styles, themes, data fields, advertisement types, linking strategies, and page types

    Graph-level operations: A high-level interface for graph visualization technique specification

    Get PDF
    More and more the world is being described as graphs---as connections between people, places, and ideas---since they provide a richer model than simply understanding each item in isolation. In order to help analysts understand these graphs, researchers have developed and studied a large number of graph visualization techniques. This variety of techniques presents solutions to a breadth of graph analysis tasks, but it introduces a new issue: complexity. The variety introduces both the complexity of comparing techniques in an objective way and the engineering complexity of implementing so many techniques. In this thesis, I present graph-level operations models (or GLO models) as an elegant solution to these challenges. A GLO model consists of a model of visual elements and a set of functions (GLOs) that manipulate those elements. I introduce GLOv1 and GLOv2, GLO models derived from six hand-picked graph visualization techniques and twenty-nine techniques derived from a review of 430 graph visualization publications, respectively. I show how to use GLOs to define graph visualization techniques, including a model's original seed techniques as well as novel techniques. I demonstrate the analysis potential of the GLO model by clustering the twenty-nine seed techniques using two different GLO-based schemes. Finally, I demonstrate the practical engineering potential of the model through an open-source Javascript implementation (GLO.js) and two applications built atop the implementation for exploring a graph and discovering novel techniques using GLOs (GLO-STIX and GLO-CLI).Ph.D

    Drawing from calculators.

    Get PDF

    Grammar and Corpora 2016

    Get PDF
    In recent years, the availability of large annotated corpora, together with a new interest in the empirical foundation and validation of linguistic theory and description, has sparked a surge of novel work using corpus methods to study the grammar of natural languages. This volume presents recent developments and advances, firstly, in corpus-oriented grammar research with a special focus on Germanic, Slavic, and Romance languages and, secondly, in corpus linguistic methodology as well as the application of corpus methods to grammar-related fields. The volume results from the sixth international conference Grammar and Corpora (GaC 2016), which took place at the Institute for the German Language (IDS) in Mannheim, Germany, in November 2016
    corecore