
    Photo2ClipArt: Image Abstraction and Vectorization Using Layered Linear Gradients

    We present a method to create vector cliparts from photographs. Our approach aims to reproduce two key properties of cliparts: they should be easily editable, and they should represent image content in a clean, simplified way. We observe that vector artists satisfy both of these properties by modeling cliparts with linear color gradients, which have a small number of parameters and approximate smooth color variations well. In addition, skilled artists produce intricate yet editable artworks by stacking multiple gradients using opaque and semi-transparent layers. Motivated by these observations, our goal is to decompose a bitmap photograph into a stack of layers, each containing a vector path filled with a linear color gradient. We cast this problem as an optimization that jointly assigns each pixel to one or more layers and finds the gradient parameters of each layer that best reproduce the input. Since a trivial solution would consist of assigning each pixel to a different, opaque layer, we complement our objective with a simplicity term that favors decompositions made of few, semi-transparent layers. However, this formulation results in a complex combinatorial problem combining discrete unknowns (the pixel assignments) and continuous unknowns (the layer parameters). We propose a Monte Carlo Tree Search algorithm that efficiently explores this solution space by leveraging layering cues at image junctions. We demonstrate the effectiveness of our method by reverse-engineering existing cliparts and by creating original cliparts from studio photographs.
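    The per-layer fit at the heart of this decomposition is easy to illustrate. The sketch below is my own minimal version in Python, not the paper's implementation; `fit_linear_gradient` and `over` are hypothetical names. It fits a linear color gradient to a candidate layer region by least squares and composites a semi-transparent gradient layer over a background with the standard "over" operator; the paper's actual method wraps such fits in a joint optimization explored by Monte Carlo Tree Search.

```python
import numpy as np

def fit_linear_gradient(coords, colors):
    """coords: (N, 2) pixel positions; colors: (N, 3) RGB values in [0, 1].
    Returns G of shape (3, 3): channel k is G[k, 0]*x + G[k, 1]*y + G[k, 2]."""
    A = np.hstack([coords, np.ones((coords.shape[0], 1))])  # design matrix [x, y, 1]
    G, *_ = np.linalg.lstsq(A, colors, rcond=None)          # least-squares fit per channel
    return G.T

def over(front_rgb, front_alpha, back_rgb):
    """Standard 'over' compositing of a semi-transparent layer onto a background."""
    return front_alpha * front_rgb + (1.0 - front_alpha) * back_rgb
```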

    Perceptual Real-Time 2D-to-3D Conversion Using Cue Fusion

    We propose a system to infer binocular disparity from a monocular video stream in real time. Unlike classic reconstruction of physical depth in computer vision, we compute perceptually plausible disparity that is numerically inaccurate but produces a very similar overall depth impression, with plausible layout, sharp edges, fine details, and agreement between luminance and disparity. We use several simple monocular cues to estimate disparity maps and confidence maps of low spatial and temporal resolution in real time. These are complemented by spatially varying, appearance-dependent, and class-specific disparity prior maps, learned from example stereo images. Scene classification selects this prior at runtime. Prior and cues are fused by robust MAP inference on a dense spatio-temporal conditional random field with high spatial and temporal resolution. Using normal distributions allows this to run as constant-time, parallel per-pixel work. We compare our approach to previous 2D-to-3D conversion systems in terms of several metrics as well as a user study, and validate our notion of perceptually plausible disparity.
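    The constant-time fusion the abstract alludes to follows from a closed-form property of normal distributions: the product of two Gaussians is, up to normalization, another Gaussian whose precision is the sum of the input precisions. Below is a minimal per-pixel sketch with hypothetical names, assuming each cue and prior is given as (mean, variance) maps; it is an illustration of the general principle, not the paper's system.

```python
import numpy as np

def fuse_gaussian_maps(mu_cue, var_cue, mu_prior, var_prior):
    """All arguments are HxW maps; each pixel carries a N(mu, var) estimate.
    The product of two Gaussians has precision equal to the sum of precisions,
    which gives the MAP fusion in closed form, per pixel and in constant time."""
    precision = 1.0 / var_cue + 1.0 / var_prior
    mu = (mu_cue / var_cue + mu_prior / var_prior) / precision
    return mu, 1.0 / precision  # fused disparity map and its variance
```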

    Lifting Freehand Concept Sketches into 3D

    We present the first algorithm capable of automatically lifting real-world, vector-format, industrial design sketches into 3D. Targeting real-world sketches raises numerous challenges due to inaccuracies, overdrawn strokes, and construction lines. In particular, while construction lines convey important 3D information, they add significant clutter and introduce multiple accidental 2D intersections. Our algorithm exploits the geometric cues provided by the construction lines and lifts them to 3D by computing their intended 3D intersections and depths. Once lifted to 3D, these lines provide valuable geometric constraints that we leverage to infer the 3D shape of the other, artist-drawn strokes. The core challenge we address is inferring the 3D connectivity of construction and other lines from their 2D projections by separating 2D intersections into 3D intersections and accidental occlusions. We efficiently address this complex combinatorial problem using a dedicated search algorithm that leverages observations about designer drawing preferences and uses those to explore only the most likely solutions of the 3D intersection detection problem. We demonstrate that our separator outputs are of comparable quality to human annotations, and that the 3D structures we recover enable a range of design editing and visualization applications, including novel view synthesis and 3D-aware scaling of the depicted shape.
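    To make the combinatorial core concrete, here is a toy sketch (my own simplification, not the paper's dedicated search): each 2D intersection is labeled as a true 3D intersection or an accidental occlusion, and labelings are scored by a caller-supplied geometric cost. The exhaustive enumeration below is only tractable for a handful of intersections; the paper instead prunes the space using observations about designer drawing preferences.

```python
from itertools import product

def best_labeling(intersections, depth_residual):
    """intersections: list of intersection ids. depth_residual: hypothetical
    callable returning a geometric cost (e.g. depth disagreement along strokes)
    for a dict mapping id -> 'intersect' or 'occlude'. Lower cost is better."""
    best, best_cost = None, float('inf')
    for choice in product(('intersect', 'occlude'), repeat=len(intersections)):
        labels = dict(zip(intersections, choice))
        cost = depth_residual(labels)
        if cost < best_cost:
            best, best_cost = labels, cost
    return best, best_cost
```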

    Review of 2D Animation Restoration in Visual Domain Based on Deep Learning

    Traditional 2D animation is a distinct visual style whose production process and image characteristics differ significantly from real-life scenes. It usually requires drawing pictures frame by frame and saving them as bitmaps. During storage, transmission, and playback, 2D animation may suffer from problems such as degraded picture quality, insufficient resolution, and temporal discontinuity. With the development of deep learning technology, it has been widely applied to animation restoration. This paper provides a comprehensive summary of deep-learning-based 2D animation restoration. First, exploring existing animation datasets helps identify the available data support and the bottlenecks in building animation datasets. Second, investigating and testing deep-learning-based algorithms for animation image quality restoration and frame interpolation helps identify the key points and challenges in animation restoration. Additionally, introducing methods designed to ensure consistency between animation frames can provide insights for future animation video restoration. Analyzing the effectiveness of existing image quality assessment (IQA) methods on animation images helps identify practical IQA methods to guide restoration results. Finally, based on the above analysis, this paper clarifies the challenges in animation restoration tasks and presents future directions for deep learning in the animation restoration field.
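    As a concrete reference point for the IQA discussion, the sketch below computes PSNR, a standard full-reference metric; the review's point is precisely that such generic metrics may not reflect perceived quality on animation imagery, so this is a baseline rather than a recommendation.

```python
import numpy as np

def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame and its restoration."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```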

    Automatic Generation of Relative Depth Maps Using Dynamic Occlusions

    The scarcity of 3D content is a major obstacle to the expansion of 3D television. Automatically generating 3D content from ordinary 2D content is one possible solution to this problem. Indeed, several depth cues are present in 2D images and videos, which makes automatic 2D-to-3D conversion possible. Among these cues, dynamic occlusions, which make it possible to assign a relative order to adjacent objects, have the advantages of being reliable and present in all types of scenes. The 2D-to-3D conversion approach proposed in this thesis relies on this cue to generate relative depth maps. Analyzing forward and backward motion between two consecutive frames allows dynamic occlusions to be computed. The motion is estimated with a modified version of the Epic-Flow optical flow method proposed by Revaud et al. in 2015. The modifications to this optical flow computation make it forward-backward consistent without degrading its performance. Thanks to this new property, occlusions are computed more simply than in existing approaches. Indeed, unlike the approach of Salembier and Palou in 2014, the proposed occlusion computation does not require the costly step of region-based motion estimation under a quadratic model. Once the occlusion relations are obtained, they are used to deduce the depth order of the objects in the image. These objects are obtained by a segmentation that considers both color and motion. The proposed method automatically generates relative depth maps in the presence of object motion in the scene, and achieves results comparable to those of Salembier and Palou without requiring region-based motion estimation.
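    The forward-backward consistency property described above lends itself to a short sketch. This is my own illustration, with assumed array conventions (flow[..., 0] is the horizontal component), not the thesis' implementation: a pixel is flagged as occluded when following the forward flow and then the backward flow does not return close to its starting point.

```python
import numpy as np

def occlusion_mask(flow_fw, flow_bw, tol=1.0):
    """flow_fw, flow_bw: HxWx2 flows (frame t -> t+1 and t+1 -> t). Returns a
    boolean HxW mask of pixels whose forward flow has no consistent backward match."""
    H, W = flow_fw.shape[:2]
    ys, xs = np.mgrid[0:H, 0:W]
    # Destination of each pixel under the forward flow (rounded to the grid).
    xd = np.clip(np.rint(xs + flow_fw[..., 0]).astype(int), 0, W - 1)
    yd = np.clip(np.rint(ys + flow_fw[..., 1]).astype(int), 0, H - 1)
    # Round trip: forward flow plus the backward flow sampled at the destination.
    round_trip = flow_fw + flow_bw[yd, xd]
    return np.linalg.norm(round_trip, axis=-1) > tol
```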