15 research outputs found

    Programmable Image-Based Light Capture for Previsualization

    Previsualization is a class of techniques for creating approximate previews of a movie sequence in order to visualize a scene prior to shooting it on the set. Often these techniques are used to convey the artistic direction of the story in terms of cinematic elements, such as camera movement, angle, lighting, dialogue, and character motion. Essentially, a movie director uses previsualization (previs) to convey movie visuals as he sees them in his mind's eye. Traditional methods for previs include hand-drawn sketches, storyboards, scaled models, and photographs, which are created by artists to convey how a scene or character might look or move. A recent trend has been to use 3D graphics applications such as video game engines to perform previs, which is called 3D previs. This type of previs is generally used prior to shooting a scene in order to choreograph camera or character movements. To visualize a scene while it is being recorded on set, directors and cinematographers use a technique called on-set previs, which provides a real-time view with little to no processing. Other types of previs, such as technical previs, emphasize accurately capturing scene properties but lack any interactive manipulation and are usually employed by visual effects crews rather than cinematographers or directors. This dissertation's focus is on creating a new method for interactive visualization that will automatically capture the on-set lighting and provide interactive manipulation of cinematic elements to facilitate the movie maker's artistic expression, validate cinematic choices, and provide guidance to production crews. Our method overcomes the drawbacks of all previous previs methods by combining photorealistic rendering with accurately captured scene details, interactively displayed on a mobile capture and rendering platform.
This dissertation describes a new hardware and software previs framework that enables interactive visualization of on-set post-production elements. The main contribution of this dissertation is a three-tiered framework: 1) a novel programmable camera architecture that provides programmability of low-level features and a visual programming interface, 2) new algorithms that analyze and decompose the scene photometrically, and 3) a previs interface that leverages the previous two tiers to perform interactive rendering and manipulation of the photometric and computer-generated elements. For this dissertation we implemented a programmable camera with a novel visual programming interface. We developed the photometric theory and implementation of our novel relighting technique, called Symmetric lighting, which can be used on our programmable camera to relight a scene containing multiple illuminants with respect to color, intensity, and location. We analyzed the performance of Symmetric lighting on synthetic and real scenes to evaluate its benefits and limitations with respect to the reflectance composition of the scene and the number and color of lights within the scene. We found that, since our method rests on a Lambertian reflectance assumption, it works well under that assumption, but scenes with large amounts of specular reflection can show higher relighting errors, and additional steps are required to mitigate this limitation. Also, scenes containing lights whose colors are too similar can lead to degenerate cases in terms of relighting. Despite these limitations, an important contribution of our work is that Symmetric lighting can also be leveraged as a solution for multi-illuminant white balancing and light color estimation within a scene with multiple illuminants, without limits on the color range or number of lights.
We compared our method to other white balance methods and show that our method is superior when at least one of the light colors is known a priori.
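The relighting idea can be illustrated with a toy two-light Lambertian mixing model. This is a sketch under stated assumptions only: the per-pixel mixing weight, the function names, and the specific light colors are illustrative inventions, not the dissertation's actual Symmetric lighting formulation.

```python
# Toy model: each RGB pixel is reflectance * (a*L1 + (1 - a)*L2), where a is
# a per-pixel weight mixing the colors of two lights (Lambertian assumption).

def mix(a, L1, L2):
    """Blend two RGB light colors with mixing weight a."""
    return [a * c1 + (1 - a) * c2 for c1, c2 in zip(L1, L2)]

def render(R, a, L1, L2):
    """Render an RGB pixel with reflectance R under the mixed lighting."""
    return [r * m for r, m in zip(R, mix(a, L1, L2))]

def relight(pixel, a, L1, L2, L1_new, L2_new):
    """Divide out the old light mix and apply a new one."""
    old, new = mix(a, L1, L2), mix(a, L1_new, L2_new)
    return [p / o * n for p, o, n in zip(pixel, old, new)]

R = [0.4, 0.6, 0.8]                        # surface reflectance (RGB)
a = 0.3                                    # mixing weight between the lights
L1, L2 = [1.0, 0.9, 0.7], [0.6, 0.7, 1.0]  # warm and cool light colors
pixel = render(R, a, L1, L2)

# Setting both lights to white acts as a multi-illuminant white balance:
# under this Lambertian model it recovers the reflectance exactly.
white = [1.0, 1.0, 1.0]
balanced = relight(pixel, a, L1, L2, white, white)
assert all(abs(b - r) < 1e-9 for b, r in zip(balanced, R))
```

Note how the degenerate case mentioned in the abstract shows up even in this sketch: when L1 and L2 are nearly equal, the mix is almost insensitive to the weight a, so the two lights cannot be separated reliably.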

    Data-driven approaches for interactive appearance editing

    This thesis proposes several techniques for interactive editing of digital content and fast rendering of virtual 3D scenes. Editing digital content, such as images or 3D scenes, is difficult and requires artistic talent and technical expertise. To alleviate these difficulties, we exploit data-driven approaches that use easily accessible Internet data (e.g., images, videos, materials) to develop new tools for digital content manipulation. Our proposed techniques allow casual users to achieve high-quality editing by interactively exploring manipulations without needing to understand the underlying physical models of appearance. First, the thesis presents a fast algorithm for realistic image synthesis of virtual 3D scenes. This serves as the core framework for a new method that allows artists to fine-tune the appearance of a rendered 3D scene. Here, artists directly paint the final appearance and the system automatically solves for the material parameters that best match the desired look. Along this line, an example-based material assignment approach is proposed, where the 3D models of a virtual scene can be "materialized" simply by giving a guidance source (image/video). Next, the thesis proposes shape and color subspaces of an object that are learned from a collection of exemplar images. These subspaces can be used to constrain image manipulations to valid shapes and colors, or to provide suggestions for manipulations. Finally, data-driven color manifolds, which contain the colors of a specific context, are proposed. Such color manifolds can be used to improve color-picking performance, color stylization, compression, or white balancing.
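The "paint the final appearance, solve for materials" idea can be read as an inverse problem: find the material parameters whose rendering best matches the painted target. Below is a minimal one-parameter sketch (a single diffuse albedo under known per-pixel irradiance); the setup and function names are hypothetical and stand in for the thesis' far richer renderer and optimizer.

```python
# Forward model: a diffuse pixel is albedo * irradiance (hypothetical,
# deliberately simplified renderer).
def render(albedo, irradiance):
    return [albedo * e for e in irradiance]

def fit_albedo(target, irradiance):
    """Closed-form least-squares fit of a single albedo parameter:
    argmin_a sum_i (a*e_i - t_i)^2  =>  a = (t . e) / (e . e)."""
    num = sum(t * e for t, e in zip(target, irradiance))
    den = sum(e * e for e in irradiance)
    return num / den

irr = [0.5, 1.0, 1.5, 2.0]        # known irradiance at four pixels
target = render(0.6, irr)         # the appearance the artist "paints"
recovered = fit_albedo(target, irr)
assert abs(recovered - 0.6) < 1e-9
```

With many coupled material parameters and a global-illumination renderer, the same idea requires iterative optimization rather than a closed form, but the least-squares objective is the common core.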

    Synthetic image generation and the use of virtual environments for image enhancement tasks

    Deep learning networks are often difficult to train when there are insufficient image samples, and gathering real-world images tailored to a specific task takes considerable effort. This dissertation explores techniques for synthetic image generation and the use of virtual environments for various image enhancement/correction/restoration tasks, specifically distortion correction, dehazing, shadow removal, and intrinsic image decomposition. First, given various image formation equations, such as those used in distortion correction and dehazing, synthetic image samples can be produced, provided that the equation is well-posed. Second, virtual environments can be used to train image models by simulating real-world effects that are otherwise difficult to gather or replicate, such as haze and shadows. Given synthetic images, one cannot train a network directly on them, as there is a possible gap between the synthetic and real domains. We have devised several techniques for generating synthetic images and formulated domain adaptation methods with which our trained deep-learning networks perform competitively in distortion correction, dehazing, and shadow removal. Additional studies and directions are provided for the intrinsic image decomposition problem and the exploration of procedural content generation, where a virtual Philippine city was created as an initial prototype. Keywords: image generation, image correction, image dehazing, shadow removal, intrinsic image decomposition, computer graphics, rendering, machine learning, neural networks, domain adaptation, procedural content generation.
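As an example of a well-posed image formation equation that supports synthetic data generation, the atmospheric-scattering model widely used in the dehazing literature is I = J·t + A·(1 − t), with transmission t = exp(−β·d). The sketch below uses this generic model with arbitrary parameter values; it is not necessarily the dissertation's exact pipeline.

```python
import math

# Atmospheric-scattering model: I = J*t + A*(1 - t), t = exp(-beta * d),
# where J is the clear-scene radiance, A the atmospheric light, beta the
# scattering coefficient, and d the scene depth along the ray.

def hazy(J, depth, A=0.9, beta=1.2):
    """Synthesize a hazy RGB pixel from a clear one; returns (I, t)."""
    t = math.exp(-beta * depth)
    return [j * t + A * (1 - t) for j in J], t

def dehaze(I, t, A=0.9):
    """Invert the model when t and A are known, as they are for
    synthetically generated training pairs."""
    return [(i - A * (1 - t)) / t for i in I]

J = [0.2, 0.5, 0.7]                # clear-scene radiance (RGB)
I, t = hazy(J, depth=2.0)          # synthetic hazy sample
restored = dehaze(I, t)
assert all(abs(x - y) < 1e-9 for x, y in zip(restored, J))
```

Because the forward model is invertible given t and A, every synthetic hazy image comes with a perfect ground-truth pair, which is exactly what makes such equations useful for generating training data.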

    Inverse rendering techniques for physically grounded image editing

    From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at material properties and where the light illuminating the scene is coming from. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, which regions are illuminated or in shadow, and so on. Remarkably little is known about how we make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people. This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc.) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.

    Text–to–Video: Image Semantics and NLP

    When aiming at automatically translating an arbitrary text into a visual story, the main challenge consists in finding a semantically close visual representation whose displayed meaning remains the same as in the given text. Besides, the appearance of an image itself largely influences how its meaningful information is conveyed to an observer. This thesis demonstrates that investigating both image semantics and the semantic relatedness between visual and textual sources enables us to tackle the challenging semantic gap and to find a semantically close translation from natural language to a corresponding visual representation. Within the last years, social networking has become highly popular, leading to an enormous and still increasing amount of online available data. Photo sharing sites like Flickr allow users to associate textual information with their uploaded imagery. Thus, this thesis exploits this huge knowledge source of user-generated data, which provides initial links between images, words, and other meaningful data. In order to approach visual semantics, this work presents various methods to analyze the visual structure as well as the appearance of images in terms of meaningful similarities, aesthetic appeal, and emotional effect on an observer. In detail, our GPU-based approach efficiently finds visual similarities between images in large datasets across visual domains and identifies various meanings for ambiguous words by exploring similarity in online search results. Further, we investigate the highly subjective aesthetic appeal of images and make use of deep learning to directly learn aesthetic rankings from a broad diversity of user reactions in social online behavior. To gain even deeper insights into the influence of visual appearance on an observer, we explore how simple image processing can actually change the emotional perception of an image, and derive a simple but effective image filter.
To identify meaningful connections between written text and visual representations, we employ methods from Natural Language Processing (NLP). Extensive textual processing allows us to create semantically relevant illustrations for simple text elements as well as complete storylines. More precisely, we present an approach that resolves dependencies in textual descriptions to arrange 3D models correctly. Further, we develop a method that finds semantically relevant illustrations for texts of different types, based on a novel hierarchical querying algorithm. Finally, we present an optimization-based framework that is capable of generating not only semantically relevant but also visually coherent picture stories in different styles.

    Simulator Networking Handbook: Distributed Interactive Simulation Testbed

    This report attempts to collect and organize a large body of knowledge regarding the design and development of simulation networks, particularly distributed interactive simulation.

    Virtual Heritage: new technologies for edutainment

    Cultural heritage represents an enormous amount of information and knowledge. Accessing this treasure chest allows one not only to discover the legacy of physical and intangible attributes of the past but also to gain a better understanding of the present. Museums and cultural institutions face the problem of providing access to and communicating these cultural contents to a wide and varied audience, meeting the expectations and interests of their end-users and relying on the most appropriate tools available. Given the large amount of existing tangible and intangible heritage and of artistic, historical, and cultural contents, what can be done to preserve and properly disseminate their heritage significance? How can these items be disseminated to the public in the proper way, taking into account their enormous heterogeneity? Answering this question also requires dealing with another aspect of the problem: the evolution of culture, literacy, and society during the last decades of the 20th century. Reflecting such transformations, this period witnessed a shift in museums' focus from the aesthetic value of museum artifacts to the historical and artistic information they encompass, and a change in the museums' role from a mere "container" of cultural objects to a "narrative space" able to explain, describe, and revive the historical material in order to attract and entertain visitors. These developments require creating novel exhibits, able to tell stories about the objects and to enable visitors to construct semantic meanings around them. The objective that museums presently pursue is reflected by the concept of Edutainment (Education + Entertainment). Nowadays, visitors are not satisfied with 'learning something', but would rather engage in an 'experience of learning', or 'learning for fun', being active actors and players in their own cultural experience.
    As a result, institutions face several new problems: the need to communicate with people of different age groups and cultural backgrounds; the change in people's attitudes due to the massive and unexpected diffusion of technology into everyday life; and the need to design the visit from a personal point of view, leading to a high level of customization that allows visitors to shape their path according to their characteristics and interests. In order to cope with these issues, I investigated several approaches. In particular, I focused on Virtual Learning Environments (VLE): real-time interactive virtual environments where visitors can experience a journey through time and space, being immersed in the original historical, cultural, and artistic context of the works of art on display. VLE can strongly help archivists and exhibit designers, making it possible to create new, interesting, and captivating ways to present cultural materials. In this dissertation I tackle many of the different dimensions related to the creation of a cultural virtual experience. During my research project, the entire pipeline involved in the development and deployment of VLE was investigated. The approach followed was to analyze the main sub-problems in detail, in order to better focus on specific issues. Therefore, I first analyzed different approaches to an effective recreation of the historical and cultural context of heritage contents, which is ultimately aimed at an effective transfer of knowledge to the end-users. In particular, I identified the enhancement of the users' sense of presence in VLE as one of the main tools to reach this objective. Presence is generally expressed as the perception of 'being there', i.e. the subjective belief of users that they are in a certain place, even if they know that the experience is mediated by the computer. Presence is related to the number of senses involved by the VLE and to the quality of the sensorial stimuli.
    But in a cultural scenario this is not sufficient, as cultural presence also plays a relevant role. Cultural presence is not just a feeling of 'being there' but of being, not only physically but also socially and culturally, 'there and then'. In other words, the VLE must be able to transfer not only the appearance but also all the significance and characteristics of the context that make it a place; both the environment and the context then become tools capable of transferring the cultural significance of a historic place. The attention that users pay to the mediated environment is another aspect that contributes to presence. Attention is related to users' focalization and concentration and to their interests. Thus, in order to improve users' involvement and capture their attention, I investigated the adoption of narratives and storytelling experiences, which can help people make sense of history and culture, and of gamification approaches, which explore the use of game thinking and game mechanics in cultural contexts, engaging users while disseminating cultural contents and, why not, letting them have fun in the process. Another dimension related to the effectiveness of any VLE is the quality of the user experience (UX). User interaction, with both the virtual environment and its digital contents, is one of the main elements affecting UX. With respect to this, I focused on one of the most recent and promising approaches: natural interaction, based on the idea that people should interact with technology in the same way they interact with the real world in everyday life. Then, I focused on the problem of presenting, displaying, and communicating contents. VLE represent an ideal presentation layer, being multi-platform hypermedia applications where users are free to interact with the virtual reconstructions by choosing their own visiting path.
    Cultural items, embedded into the environment, can be accessed by users according to their own curiosity and interests, with the support of narrative structures, which can guide them through the exploration of the virtual spaces, and conceptual maps, which help build meaningful connections between cultural items. Thus, VLE can even be seen as visual interfaces to databases of cultural contents. Users can navigate the virtual environment as if they were browsing the database contents, exploiting both text-based and visual-based queries, the latter provided by the re-contextualization of the objects into their original spaces, whose virtual exploration can provide new insights on specific elements and improve awareness of the relationships between objects in the database. Finally, I explored the mobile dimension, which has become particularly relevant in recent years. Nowadays, off-the-shelf consumer devices such as smartphones and tablets guarantee remarkable computing capabilities, support for rich multimedia contents, geo-localization, and high network bandwidth. Mobile devices can thus support users in mobility and detect the user context, making it possible to develop a plethora of location-based services, from way-finding to the contextualized communication of cultural contents, aimed at providing a meaningful exploration of exhibits and cultural or tourist sites according to visitors' personal interests and curiosity.

    Development of unsupervised learning methods with applications to life sciences data

    Machine Learning makes computers capable of performing tasks that typically require human intelligence. A domain where it is having a considerable impact is the life sciences, allowing researchers to devise new biological analysis protocols, develop patients' treatments more efficiently and quickly, and reduce healthcare costs. This thesis presents new Machine Learning methods and pipelines for the life sciences, focusing on the unsupervised field. At the methodological level, two methods are presented. The first, “Ab Initio Local Principal Path”, is a revised and improved version of a pre-existing algorithm in the manifold learning realm. The second contribution is an improvement of the Import Vector Domain Description (one-class learning) through the Kullback-Leibler divergence. It hybridizes kernel methods with Deep Learning, obtaining a scalable solution, an improved probabilistic model, and state-of-the-art performance. Both methods are tested through several experiments, with a central focus on their relevance to the life sciences. Results show that they improve on the performance achieved by their previous versions. At the applicative level, two pipelines are presented. The first is for the analysis of RNA-Seq datasets, both transcriptomic and single-cell data, and is aimed at identifying genes that may be involved in biological processes (e.g., the transition of tissues from normal to cancer). In this project, an R package is released on CRAN to make the pipeline accessible to the bioinformatics community through high-level APIs. The second pipeline is in the drug discovery domain and is useful for identifying druggable pockets, namely regions of a protein with a high probability of accepting a small molecule (a drug). Both pipelines achieve remarkable results. Lastly, a detour application is developed to identify the strengths and limitations of the “Principal Path” algorithm by analyzing the vector spaces induced by Convolutional Neural Networks. This application is conducted in the music and visual arts domains.
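For discrete distributions, the Kullback-Leibler divergence used above is D(P‖Q) = Σᵢ pᵢ log(pᵢ/qᵢ). A minimal sketch of the computation follows (illustration only; how the thesis plugs the divergence into the Import Vector Domain Description is not reproduced here):

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D(P || Q); terms with p_i == 0 contribute 0,
    following the convention 0 * log(0) = 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
d = kl_divergence(p, q)
assert d >= 0                    # KL divergence is non-negative (Gibbs' inequality)
assert kl_divergence(p, p) == 0  # and zero when the distributions coincide
```

Note that the divergence is asymmetric, D(P‖Q) ≠ D(Q‖P) in general, which matters when choosing which distribution plays the role of the model and which the data.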

    Virtual Reality for the Visually Impaired

    This thesis aims to illuminate the problems that the development of virtual reality poses for visually impaired people. After discussing how and why this is a problem, the thesis provides some possible solutions for developing virtual reality into a more accessible technology, specifically for the visually impaired. As the popularity of virtual reality increases in digital culture, especially with Facebook announcing its development of the Metaverse, there is a need for a future virtual reality environment that everyone can use, and it is in these early stages of development that the need to address the problem of inaccessibility arises. As virtual reality is a relatively new medium in digital culture, the research on its use by visually impaired people has significant gaps, and as relatively few researchers are exploring this topic, my research will hopefully lead to more activity in this important area. Therefore, my research questions aim to address the current limitations of virtual reality, filling in some of the most significant gaps in this research area. My thesis does this by conducting interviews and surveys to gather data that can further support and identify the crucial limitations of the visually impaired experience when trying to use virtual reality technology. The findings in this thesis further address the problem, creating a possible solution and emphasizing the importance of user accessibility for the visually impaired in the future development of virtual reality. If digital companies and developers address this problem now, we can have a future where visually impaired people are treated more equally, with technologies developed specifically for them to experience virtual worlds.
    Master's Thesis in Digital Culture (DIKULT350, MAHF-DIKU)