2,256 research outputs found

    gMotion: A spatio-temporal grammar for the procedural generation of motion graphics

    Creating compelling 2D animations that choreograph several groups of shapes by hand requires a large number of manual edits. We present a method to procedurally generate motion graphics with timeslice grammars. Timeslice grammars are to time what split grammars are to space. We use this grammar to formally model motion graphics, manipulating them in both their temporal and spatial components. We combine both aspects by representing animations as sets of affine transformations sampled uniformly in both space and time. Rules and operators in the grammar manipulate all spatio-temporal matrices as a whole, allowing us to expressively construct animations with few rules. The grammar animates shapes, represented as highly tessellated polygons, by applying the affine transforms to each shape vertex given the vertex position and the animation time. We introduce a small set of operators and show how we can produce 2D animations of geometric objects by combining the expressive power of the grammar model, the composability of the operators with themselves, and the capabilities that derive from using a unified spatio-temporal representation for animation data. Throughout the paper, we show how timeslice grammars can produce a wide variety of animations that would take artists hours of tedious and time-consuming work. In particular, in cases where changes of shape are very common, our grammar can add motion detail to large collections of shapes, offering greater control over per-shape animations along with a compact rule structure.
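    As an illustration of the unified spatio-temporal representation described above, the following minimal Python/NumPy sketch (not the authors' implementation; names and sizes are illustrative) stores an animation as one affine matrix per uniformly sampled frame and applies it to every vertex of a tessellated polygon.

        import numpy as np

        def make_rotation_track(n_frames, total_angle):
            # One 2x3 affine matrix per uniformly sampled frame.
            track = np.zeros((n_frames, 2, 3))
            for t in range(n_frames):
                a = total_angle * t / max(n_frames - 1, 1)
                c, s = np.cos(a), np.sin(a)
                track[t] = [[c, -s, 0.0],
                            [s,  c, 0.0]]
            return track

        def apply_track(track, vertices, frame):
            # Transform an (N, 2) array of polygon vertices at a given frame.
            m = track[frame]
            homo = np.hstack([vertices, np.ones((len(vertices), 1))])  # (N, 3)
            return homo @ m.T                                          # (N, 2)

    Operators in the grammar would then act on whole tracks (all frames at once) rather than on individual keyframes, which is what keeps rule composition compact.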

    PV3D: A 3D Generative Model for Portrait Video Generation

    Recent advances in generative adversarial networks (GANs) have demonstrated the capability of generating stunning photo-realistic portrait images. While some prior works have applied such image GANs to unconditional 2D portrait video generation and static 3D portrait synthesis, few works have successfully extended GANs to generate 3D-aware portrait videos. In this work, we propose PV3D, the first generative framework that can synthesize multi-view consistent portrait videos. Specifically, our method extends a recent static 3D-aware image GAN to the video domain by generalizing the 3D implicit neural representation to model the spatio-temporal space. To introduce motion dynamics into the generation process, we develop a motion generator that stacks multiple motion layers to generate motion features via modulated convolution. To alleviate motion ambiguities caused by camera/human motions, we propose a simple yet effective camera condition strategy for PV3D, enabling both temporally and multi-view consistent video generation. Moreover, PV3D introduces two discriminators that regularize the spatial and temporal domains to ensure the plausibility of the generated portrait videos. These designs enable PV3D to generate 3D-aware, motion-plausible portrait videos with high-quality appearance and geometry, significantly outperforming prior works. As a result, PV3D is able to support many downstream applications such as animating static portraits and view-consistent video motion editing. Code and models will be released at https://showlab.github.io/pv3d
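    The motion generator described above is a stack of motion layers built on modulated convolution. The sketch below is a simplified, hypothetical stand-in (it scales activations instead of convolution weights, and the layer names and sizes are assumptions, not the released PV3D code) showing how a latent motion code can modulate per-frame features.

        import torch
        import torch.nn as nn

        class ModulatedMotionLayer(nn.Module):
            """Simplified stand-in for a modulated-convolution motion layer."""
            def __init__(self, channels, style_dim):
                super().__init__()
                self.conv = nn.Conv1d(channels, channels, kernel_size=3, padding=1)
                self.affine = nn.Linear(style_dim, channels)  # motion code -> per-channel scale

            def forward(self, feat, motion_code):
                # feat: (B, C, T) per-frame features; motion_code: (B, style_dim)
                scale = self.affine(motion_code).unsqueeze(-1)
                return torch.relu(self.conv(feat * scale))

        layers = nn.ModuleList([ModulatedMotionLayer(64, 512) for _ in range(4)])
        feat = torch.randn(2, 64, 16)   # batch of 2 clips, 64 channels, 16 frames
        code = torch.randn(2, 512)      # latent motion code
        for layer in layers:
            feat = layer(feat, code)    # stacked layers produce motion features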

    High-fidelity rendering on shared computational resources

    The generation of high-fidelity imagery is a computationally expensive process, and parallel computing has traditionally been employed to alleviate this cost. However, traditional parallel rendering has been restricted to expensive shared-memory or dedicated distributed processors. In contrast, parallel computing on shared resources, such as a computational or a desktop grid, offers a low-cost alternative. Prevalent rendering systems, however, are currently incapable of seamlessly handling such shared resources, as they suffer from high latencies, restricted bandwidth and volatility. The conventional approach of rescheduling failed jobs in a volatile environment inhibits performance through redundant computation. Instead, clever task subdivision along with image reconstruction techniques provides an unrestrictive fault-tolerance mechanism that is highly suitable for high-fidelity rendering. This thesis presents novel fault-tolerant parallel rendering algorithms for effectively tapping the enormous, inexpensive computational power provided by shared resources. A first-of-its-kind system for fully dynamic, high-fidelity interactive rendering on idle resources is presented, which is key to providing immediate feedback on changes made by a user. The system achieves interactivity by monitoring and adapting computations according to run-time variations in computational power, and employs a spatio-temporal image reconstruction technique to enhance visual fidelity. Furthermore, the algorithms described for time-constrained offline rendering of still images and animation sequences make it possible to deliver results within a user-defined time limit. These novel methods enable the employment of variable resources in deadline-driven environments.
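    A minimal sketch of the fault-tolerance idea described above, under the assumption of a simple tile-based work division (function and variable names are hypothetical, not the thesis code): tiles lost to volatile workers are reconstructed from neighbouring tiles instead of being rescheduled.

        import numpy as np

        def reconstruct_tile(tile_results, r, c, tile_px):
            # Crude spatial reconstruction: average whichever neighbouring tiles arrived.
            neighbours = [tile_results.get(p) for p in ((r-1, c), (r+1, c), (r, c-1), (r, c+1))]
            neighbours = [t for t in neighbours if t is not None]
            if not neighbours:
                return np.zeros((tile_px, tile_px, 3))
            return np.mean(neighbours, axis=0)

        def assemble_frame(tile_results, grid_h, grid_w, tile_px):
            # tile_results: {(row, col): tile array, or None if the worker failed}.
            frame = np.zeros((grid_h * tile_px, grid_w * tile_px, 3))
            for r in range(grid_h):
                for c in range(grid_w):
                    tile = tile_results.get((r, c))
                    if tile is None:
                        tile = reconstruct_tile(tile_results, r, c, tile_px)
                    frame[r*tile_px:(r+1)*tile_px, c*tile_px:(c+1)*tile_px] = tile
            return frame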

    Adaptive multi-view path tracing (Path tracing multivue adaptatif)

    Rendering photo-realistic image sequences using path tracing and Monte Carlo integration often requires sampling a large number of paths to obtain converged results. In the context of rendering multiple views or animated sequences, such sampling can be highly redundant. Several methods have been developed to share sampled paths between spatially or temporally similar views; however, such sharing is challenging since it can introduce bias in the final images. Our contribution is a Monte Carlo sampling technique that generates paths while taking several cameras into account. First, we sample the scene from all the cameras to generate hit points. Then, an importance sampling technique generates bounce directions that are shared by a subset of cameras. This set of hit points and bounce directions is then used within a regular path tracing solution. For animated scenes, paths remain valid at a fixed time only, but sharing can still occur between cameras as long as their exposure time intervals overlap. We show that our technique generates less noise than regular path tracing at equal computation time and does not introduce noticeable bias.
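    As a rough illustration of the sharing rule described above (illustrative code, not the paper's implementation): an importance-sampled bounce direction generated at a given path time may be reused by every camera whose exposure interval contains that time.

        import math, random

        def cosine_sample_hemisphere(rng=random):
            # Standard cosine-weighted hemisphere sample in local shading coordinates.
            u1, u2 = rng.random(), rng.random()
            r, phi = math.sqrt(u1), 2.0 * math.pi * u2
            return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))

        def overlapping_cameras(cameras, path_time):
            # Cameras whose exposure interval [t_open, t_close] contains the path time.
            return [cam["id"] for cam in cameras
                    if cam["t_open"] <= path_time <= cam["t_close"]]

        cameras = [{"id": 0, "t_open": 0.0, "t_close": 0.5},
                   {"id": 1, "t_open": 0.3, "t_close": 0.8}]
        direction = cosine_sample_hemisphere()
        sharers = overlapping_cameras(cameras, path_time=0.4)  # both cameras reuse this sample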

    Optimization techniques for computationally expensive rendering algorithms

    Realistic rendering in computer graphics simulates the interactions of light and surfaces. While many accurate models for surface reflection and lighting, covering both solid surfaces and participating media, have been described, most of them rely on intensive computation. Common practices such as adding constraints and assumptions can increase performance; however, they may compromise the quality of the resulting images or the variety of phenomena that can be accurately represented. In this thesis, we will focus on rendering methods that require large amounts of computational resources. Our intention is to consider several conceptually different approaches capable of reducing these requirements with only a limited impact on the quality of the results. The first part of this work will study the rendering of time-varying participating media. Examples of this type of matter are smoke, optically thick gases and any material that, unlike a vacuum, scatters and absorbs the light that travels through it. We will focus on a subset of algorithms that approximate realistic illumination using images of real-world scenes. Starting from the traditional ray marching algorithm, we will suggest and implement different optimizations that allow the computation to be performed at interactive frame rates. This thesis will also analyze two different aspects of the generation of anti-aliased images. The first is targeted at the rendering of screen-space anti-aliased images and the reduction of the artifacts generated in rasterized lines and edges. We expect to describe an implementation that, working as a post-process, is efficient enough to be added to existing rendering pipelines with reduced performance impact. A third method will take advantage of the limitations of the human visual system (HVS) to reduce the resources required to render temporally anti-aliased images. While film and digital cameras naturally produce motion blur, rendering pipelines need to simulate it explicitly. This process is known to be one of the heaviest burdens on any rendering pipeline. Motivated by this, we plan to run a series of psychophysical experiments targeted at identifying groups of motion-blurred images that are perceptually equivalent. A possible outcome is the proposal of criteria that may lead to reductions of the rendering budget.
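    The starting point mentioned above, the traditional ray marching algorithm for participating media, can be summarised by the following minimal sketch (illustrative only, not the optimized implementation from the thesis): transmittance is accumulated with Beer-Lambert attenuation while in-scattered radiance is integrated along the ray.

        import math

        def ray_march(density_at, light_at, t_near, t_far, steps, sigma_t=1.0):
            dt = (t_far - t_near) / steps
            transmittance, radiance = 1.0, 0.0
            for i in range(steps):
                t = t_near + (i + 0.5) * dt
                sigma = sigma_t * density_at(t)
                radiance += transmittance * sigma * light_at(t) * dt  # in-scattering term
                transmittance *= math.exp(-sigma * dt)                # Beer-Lambert attenuation
            return radiance

        # Example: a homogeneous, uniformly lit slab of unit depth.
        value = ray_march(density_at=lambda t: 1.0, light_at=lambda t: 1.0,
                          t_near=0.0, t_far=1.0, steps=64)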

    Bridging the gap between reconstruction and synthesis

    3D reconstruction and image synthesis are two of the main pillars of computer vision. Early works focused on simple tasks such as multi-view reconstruction and texture synthesis. Spurred by Deep Learning, the field has progressed rapidly, making it possible to achieve more complex and higher-level tasks. For example, 3D reconstruction results that traditionally required multi-view approaches can now be obtained with single-view methods. Similarly, early pattern-based texture synthesis works have given way to techniques that can generate novel high-resolution images. In this thesis we have developed a hierarchy of tools that covers this whole range of problems, lying at the intersection of computer vision, graphics and machine learning. We tackle the problem of 3D reconstruction and synthesis in the wild. Importantly, we advocate for a paradigm in which not everything should be learned. Instead of applying Deep Learning naively, we propose novel representations, layers and architectures that directly embed prior 3D geometric knowledge for the tasks of 3D reconstruction and synthesis. We apply these techniques to problems including scene/person reconstruction and photo-realistic rendering. We first address methods to reconstruct a scene and the clothed people in it while estimating the camera position. Then, we tackle image and video synthesis for clothed people in the wild. Finally, we bridge the gap between reconstruction and synthesis under the umbrella of a unique novel formulation. Extensive experiments conducted throughout this thesis show that the proposed techniques improve the performance of Deep Learning models in terms of the quality of the reconstructed 3D shapes and synthesised images, while reducing the amount of supervision and training data required to train them. In summary, we provide a variety of low-, mid- and high-level algorithms that can be used to incorporate prior knowledge into different stages of the Deep Learning pipeline and improve performance in tasks of 3D reconstruction and image synthesis.
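    One common way to embed prior 3D geometric knowledge into a network, in the spirit described above (a generic example, not the specific layers proposed in the thesis), is to implement fixed, differentiable geometric operations such as pinhole camera projection directly as layers, so that only the remaining quantities need to be learned.

        import torch

        def project_points(points_3d, K, R, t):
            # Pinhole projection: world-space (B, N, 3) points -> (B, N, 2) pixel coordinates.
            cam = points_3d @ R.transpose(-1, -2) + t.unsqueeze(1)  # world -> camera frame
            pix = cam @ K.transpose(-1, -2)                         # apply intrinsics
            return pix[..., :2] / pix[..., 2:3]                     # perspective divide

        # Because the operation is differentiable, a reconstruction network can be
        # supervised with 2D reprojection error instead of full 3D ground truth.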

    A Tool for the 3D Spatio-Temporal Structuring of Historic Building Reconstructions

    The difficulty in describing, analysing and comprehending cultural heritage often stems from the fact that buildings undergo numerous changes over time. Three factors condition our knowledge of historical heritage. Firstly, 3D reconstructions of heritage buildings normally focus on existing states rather than on the management of historical evolution. Secondly, while iconographic sources are generally used as a visual memory of a building's temporal state to be restored graphically, few works today focus on exploiting all of the metric and visual information contained in those sources. Lastly, the iconographic documentation concerning a building's past states is sometimes contradictory, dubious and incomplete. As a consequence, uncertainties, contradictions and gaps in information should be highlighted in 3D reconstructions. We present a methodological approach, based on the existing iconographic corpus, for the analysis and 3D management of building transformations. This approach joins three main aspects in a complete workflow. Firstly, it concerns the spatial and temporal referencing of 2D iconographic sources for the 3D reconstruction of disappeared building states. Secondly, it allows the analysis of building transformations by means of a temporal state distribution. Lastly, it uses the spatial relations established between 2D iconography and 3D representation for visually browsing information based on spatio-temporal criteria. In particular, in this paper we detail the interface developed to accomplish the multiple related tasks involved in the spatio-temporal structuring of the morphology to be reconstructed.
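    The temporal state distribution mentioned above can be pictured with a small, hypothetical data structure (names are illustrative, not the tool's actual schema): each morphological state carries its validity interval and the iconographic sources referenced to it, so geometry and sources can be filtered by date.

        from dataclasses import dataclass, field

        @dataclass
        class TemporalState:
            label: str
            start_year: int
            end_year: int
            sources: list = field(default_factory=list)  # spatially referenced iconography

            def active_in(self, year):
                return self.start_year <= year <= self.end_year

        states = [TemporalState("original nave", 1150, 1520, ["engraving_1498"]),
                  TemporalState("baroque facade", 1521, 1870, ["painting_1702"])]

        def states_at(year):
            return [s for s in states if s.active_in(year)]

        print([s.label for s in states_at(1600)])  # -> ['baroque facade']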