26 research outputs found

    Synthesizing and Editing Photo-realistic Visual Objects

    Get PDF
    In this thesis we investigate novel methods of synthesizing new images of a deformable visual object using a collection of images of the object. We investigate both parametric and non-parametric methods as well as a combination of the two methods for the problem of image synthesis. Our main focus are complex visual objects, specifically deformable objects and objects with varying numbers of visible parts. We first introduce sketch-driven image synthesis system, which allows the user to draw ellipses and outlines in order to sketch a rough shape of animals as a constraint to the synthesized image. This system interactively provides feedback in the form of ellipse and contour suggestions to the partial sketch of the user. The user's sketch guides the non-parametric synthesis algorithm that blends patches from two exemplar images in a coarse-to-fine fashion to create a final image. We evaluate the method and synthesized images through two user studies. Instead of non-parametric blending of patches, a parametric model of the appearance is more desirable as its appearance representation is shared between all images of the dataset. Hence, we propose Context-Conditioned Component Analysis, a probabilistic generative parametric model, which described images with a linear combination of basis functions. The basis functions are evaluated for each pixel using a context vector computed from the local shape information. We evaluate C-CCA qualitatively and quantitatively on inpainting, appearance transfer and reconstruction tasks. Drawing samples of C-CCA generates novel, globally-coherent images, which, unfortunately, lack high-frequency details due to dimensionality reduction and misalignment. We develop a non-parametric model that enhances the samples of C-CCA with locally-coherent, high-frequency details. The non-parametric model efficiently finds patches from the dataset that match the C-CCA sample and blends the patches together. We analyze the results of the combined method on the datasets of horse and elephant images

    Proceedings, MSVSCC 2016

    Get PDF
    Proceedings of the 10th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 14, 2016 at VMASC in Suffolk, Virginia

    Neural Radiance Fields: Past, Present, and Future

    Full text link
    The various aspects like modeling and interpreting 3D environments and surroundings have enticed humans to progress their research in 3D Computer Vision, Computer Graphics, and Machine Learning. An attempt made by Mildenhall et al in their paper about NeRFs (Neural Radiance Fields) led to a boom in Computer Graphics, Robotics, Computer Vision, and the possible scope of High-Resolution Low Storage Augmented Reality and Virtual Reality-based 3D models have gained traction from res with more than 1000 preprints related to NeRFs published. This paper serves as a bridge for people starting to study these fields by building on the basics of Mathematics, Geometry, Computer Vision, and Computer Graphics to the difficulties encountered in Implicit Representations at the intersection of all these disciplines. This survey provides the history of rendering, Implicit Learning, and NeRFs, the progression of research on NeRFs, and the potential applications and implications of NeRFs in today's world. In doing so, this survey categorizes all the NeRF-related research in terms of the datasets used, objective functions, applications solved, and evaluation criteria for these applications.Comment: 413 pages, 9 figures, 277 citation

    Learning Inference Models for Computer Vision

    Get PDF
    Computer vision can be understood as the ability to perform 'inference' on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques, as even the model design is often dictated by the complexity of inference in them. This thesis proposes learning based inference schemes and demonstrates applications in computer vision. We propose techniques for inference in both generative and discriminative computer vision models. Despite their intuitive appeal, the use of generative models in vision is hampered by the difficulty of posterior inference, which is often too complex or too slow to be practical. We propose techniques for improving inference in two widely used techniques: Markov Chain Monte Carlo (MCMC) sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative vision models show that the proposed techniques accelerate the inference process and/or converge to better solutions. A main complication in the design of discriminative models is the inclusion of prior knowledge in a principled way. For better inference in discriminative models, we propose techniques that modify the original model itself, as inference is simple evaluation of the model. We concentrate on convolutional neural network (CNN) models and propose a generalization of standard spatial convolutions, which are the basic building blocks of CNN architectures, to bilateral convolutions. First, we generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call `Bilateral Neural Networks'. We show how the bilateral filtering modules can be used for modifying existing CNN architectures for better image segmentation and propose a neural network approach for temporal information propagation in videos. Experiments demonstrate the potential of the proposed bilateral networks on a wide range of vision tasks and datasets. In summary, we propose learning based techniques for better inference in several computer vision models ranging from inverse graphics to freely parameterized neural networks. In generative vision models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid in the design of new neural network architectures that can handle sparse high-dimensional data as well as provide a way for incorporating prior knowledge into CNNs

    Proceedings, MSVSCC 2018

    Get PDF
    Proceedings of the 12th Annual Modeling, Simulation & Visualization Student Capstone Conference held on April 19, 2018 at VMASC in Suffolk, Virginia. 155 pp

    Don’t forget to save! User experience principles for video game narrative authoring tools.

    Get PDF
    Interactive Digital Narratives (IDNs) are a natural evolution of traditional storytelling melded with technological improvements brought about by the rapidly increasing digital revolution. This has and continues to enhance the complexities and functionality of the stories that we can tell. Video game narratives, both old and new, are considered close relatives of IDN, and due to their enhanced interactivity and presentational methods, further complicate the creation process. Authoring tool software aims to alleviate the complexities of this by abstracting underlying data models into accessible user interfaces that creatives, even those with limited technical experience, can use to author their stories. Unfortunately, despite the vast array of authoring tools in this space, user experience is often overlooked even though it is arguably one of the most vital components. This has resulted in a focus on the audience within IDN research rather than the authors, and consequently our knowledge and understanding of the impacts of user experience design decisions in authoring tools are limited. This thesis tackles the modeling of complex video game narrative structures and investigates how user experience design decisions within IDN authoring tools may impact the authoring process. I first introduce my concept of Discoverable Narrative which establishes a vocabulary for the analysis, categorization, and comparison of aspects of video game narrative that are discovered, observed, or experienced by players — something that existing models struggle to detail. I also develop and present my Novella Narrative Model which provides support for video game narrative elements and makes several novel innovations that set it apart from existing narrative models. This thesis then builds upon these models by presenting two bespoke user studies that examine the user experience of the state-of-the-art in IDN authoring tool design, together building a listing of seven general Themes and five principles (Metaphor Testing, Fast Track Testing, Structure, Experimentation, Branching) that highlight evidenced behavioral trends of authors based on different user experience design factors within IDN authoring tools. This represents some of the first work in this space that investigates the relationships between the user experience design of IDN authoring tools and the impacts that they can have on authors. Additionally, a generalized multi-stage pipeline for the design and development of IDN authoring tools is introduced, informed by professional industry- standard design techniques, in an effort to both ensure quality user experience within my own work and to raise awareness of the importance of following proper design processes when creating authoring tools, also serving as a template for doing so

    Patterns and Pattern Languages for Mobile Augmented Reality

    Get PDF
    Mixed Reality is a relatively new field in computer science which uses technology as a medium to provide modified or enhanced views of reality or to virtually generate a new reality. Augmented Reality is a branch of Mixed Reality which blends the real-world as viewed through a computer interface with virtual objects generated by a computer. The 21st century commodification of mobile devices with multi-core Central Processing Units, Graphics Processing Units, high definition displays and multiple sensors controlled by capable Operating Systems such as Android and iOS means that Mobile Augmented Reality applications have become increasingly feasible. Mobile Augmented Reality is a multi-disciplinary field requiring a synthesis of many technologies such as computer graphics, computer vision, machine learning and mobile device programming while also requiring theoretical knowledge of diverse fields such as Linear Algebra, Projective and Differential Geometry, Probability and Optimisation. This multi-disciplinary nature has led to a fragmentation of knowledge into various specialisations, making it difficult to integrate different solution components into a coherent architecture. Software design patterns provide a solution space of tried and tested best practices for a specified problem within a given context. The solution space is non-prescriptive and is described in terms of relationships between roles that can be assigned to software components. Architectural patterns are used to specify high level designs of complete systems, as opposed to domain or tactical level patterns that address specific lower level problem areas. Pattern Languages comprise multiple software patterns combining in multiple possible sequences to form a language with the individual patterns forming the language vocabulary while the valid sequences through the patterns define the grammar. Pattern Languages provide flexible generalised solutions within a particular domain that can be customised to solve problems of differing characteristics and levels of iii complexity within the domain. The specification of one or more Pattern Languages tailored to the Mobile Augmented Reality domain can therefore provide a generalised guide for the design and architecture of Mobile Augmented Reality applications from an architectural level down to the ”nuts-and-bolts” implementation level. While there is a large body of research into the technical specialisations pertaining to Mobile Augmented Reality, there is a dearth of up-to-date literature covering Mobile Augmented Reality design. This thesis fills this vacuum by: 1. Providing architectural patterns that provide the spine on which the design of Mobile Augmented Reality artefacts can be based; 2. Documenting existing patterns within the context of Mobile Augmented Reality; 3. Identifying new patterns specific to Mobile Augmented Reality; and 4. Combining the patterns into Pattern Languages for Detection & Tracking, Rendering & Interaction and Data Access for Mobile Augmented Reality. The resulting Pattern Languages support design at multiple levels of complexity from an object-oriented framework down to specific one-off Augmented Reality applications. The practical contribution of this thesis is the specification of architectural patterns and Pattern Language that provide a unified design approach for both the overall architecture and the detailed design of Mobile Augmented Reality artefacts. The theoretical contribution is a design theory for Mobile Augmented Reality gleaned from the extraction of patterns and creation of a pattern language or languages

    Tracer and Timescale Methods for Passive and Reactive Transport in Fluid Flows

    Get PDF
    Geophysical, environmental, and urban fluid flows (i.e., flows developing in oceans, seas, estuaries, rivers, aquifers, reservoirs, etc.) exhibit a wide range of reactive and transport processes. Therefore, identifying key phenomena, understanding their relative importance, and establishing causal relationships between them is no trivial task. Analysis of primitive variables (e.g., velocity components, pressure, temperature, concentration) is not always conducive to the most fruitful interpretations. Examining auxiliary variables introduced for diagnostic purposes is an option worth considering. In this respect, tracer and timescale methods are proving to be very effective. Such methods can help address questions such as, "where does a fluid-born dissolved or particulate substance come from and where will it go?" or, "how fast are the transport and reaction phenomena controlling the appearance and disappearance such substances?" These issues have been dealt with since the 19th century, essentially by means of ad hoc approaches. However, over the past three decades, methods resting on solid theoretical foundations have been developed, which permit the evaluation of tracer concentrations and diagnostic timescales (age, residence/exposure time, etc.) across space and time and using numerical models and field data. This book comprises research and review articles, introducing state-of-the-art diagnostic theories and their applications to domains ranging from shallow human-made reservoirs to lakes, river networks, marine domains, and subsurface flow
    corecore