60 research outputs found
Videos in Context for Telecommunication and Spatial Browsing
The research presented in this thesis explores the use of videos embedded in panoramic imagery to transmit spatial and temporal information describing remote environments and their dynamics. Virtual environments (VEs) through which users can explore remote locations are rapidly emerging as a popular medium of presence and remote collaboration. However, capturing a visual representation of locations to be used in VEs is usually a tedious process that requires either manual modelling of environments or the use of specific hardware. Capturing environment dynamics is not straightforward either, and it is usually performed with specific tracking hardware. Similarly, browsing large unstructured video collections with available tools is difficult, as the abundance of spatial and temporal information makes them hard to comprehend. At the same time, panoramas lie between 3D VEs and 2D images on this spectrum: they offer the accessibility of 2D images while preserving the surrounding representation of 3D virtual environments. For this reason, panoramas are an attractive basis for videoconferencing and browsing tools, as they can relate several videos temporally and spatially. This research explores methods to acquire, fuse, render and stream data coming from heterogeneous cameras, with the help of panoramic imagery. Three distinct but interrelated questions are addressed. First, the thesis considers how spatially localised video can be used to increase the spatial information transmitted during video-mediated communication, and whether this improves the quality of communication. Second, the research asks whether videos in panoramic context can be used to convey spatial and temporal information about a remote place and the dynamics within it, and whether this improves users' performance in tasks that require spatio-temporal thinking. Finally, the thesis considers whether display type has an impact on reasoning about events within videos in panoramic context.
These research questions were investigated over three experiments, covering scenarios common to computer-supported cooperative work and video browsing. To support the investigation, two distinct video+context systems were developed. The first telecommunication experiment compared our videos-in-context interface with fully panoramic video and conventional webcam video conferencing in an object placement scenario. The second experiment investigated the impact of videos in panoramic context on the quality of spatio-temporal thinking during localization tasks. To support the experiment, a novel interface to video collections in panoramic context was developed and compared with common video-browsing tools. The final experimental study investigated the impact of display type on reasoning about events. The study explored three adaptations of our video-collection interface to three display types. The overall conclusion is that videos in panoramic context offer a valid solution to spatio-temporal exploration of remote locations. Our approach presents a richer visual representation in terms of space and time than standard tools, showing that providing panoramic contexts to video collections makes spatio-temporal tasks easier. To this end, videos in context are a suitable alternative to more difficult, and often expensive, solutions. These findings are beneficial to many applications, including teleconferencing, virtual tourism and remote assistance.
Virtual Field Trip via Digital Storytelling
Digital storytelling is a practice of combining digital content such as 3-dimensional
images, text, sound, images, and video to create a short story. It is the intersection
between the old art of storytelling and access to powerful technologies. This project is
a step towards experimenting with the development and effectiveness of digital storytelling. It will
hopefully ignite motivation and encourage others to tap into their interests
and skills to develop their own digital stories and expand ICT usage in this
country. School children look forward to traditional field trips. However, such trips are
costly. A virtual field trip (VFT) aims to reduce, if not eliminate, the constraints that traditional field trips
face, such as money, time, energy, resources, distance and inaccessible areas. To fit the
time frame, and because some areas are not suitable for taking panoramic pictures, the VFT
covers only small selected areas of the KL Bird Park, even though the park is not large.
The development of the VFT is adapted from QTVR Creation
Steps by Kitchens (2006). The procedure consists of defining the problem statements
and goals, reviewing the literature, creating image content by taking photos
at the site, transforming the photos into QTVR nodes through stitching, designing and
constructing a prototype, inserting interactivity such as hotspots, delivering the output, and
last but not least, evaluation. The final output of the project is the KL Bird Park Virtual
Field Trip, which consists of photo-based 3D panoramic images for each scene from
the site, linked to one another, together with hotspots placed on the
panoramic images that reveal information about the birds with one click. The
informal evaluation of the final output shows an overwhelming
response and acceptance. All of the respondents would like to see more of this type of
VFT in the future.
Spatial displays for visual awareness of remote locations
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. [113]-116). uCom enables remote users to be visually aware of each other using "spatial displays" - live views of a remote space assembled according to an estimate of the remote space's layout. The main elements of the system design are a 3D representation of each space and a multi-display physical setup. The 3D image-based representation of a space is composed of an aggregate of live video feeds acquired from multiple viewpoints and rendered in a graphical visualization resembling a 3D collage. Its navigation controls allow users to transition among the remote views, while maintaining a sense of how the images relate in 3D space. Additionally, the system uses a configurable set of displays to portray always-on visual connections with a remote site integrated into the local physical environment. The evaluation investigates to what extent the system improves users' understanding of the layout of a remote space. By Ana Luisa de Araujo Santos. S.M.
Computational immersive displays
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 77-79). Immersion is an oft-quoted but ill-defined term used to describe a viewer or participant's sense of engagement with a visual display system or participatory media. Traditionally, advances in immersive quality came at the high price of ever-escalating hardware requirements and computational budgets. But what if one could increase a participant's sense of immersion, instead, by taking advantage of perceptual cues, neuroprocessing, and emotional engagement while adding only a small, yet distinctly targeted, set of advancements to the display hardware? This thesis describes three systems that introduce small amounts of computation to the visual display of information in order to increase the viewer's sense of immersion and participation. It also describes the types of content used to evaluate the systems, as well as the results and conclusions gained from small user studies. The first system, Infinity-by-Nine, takes advantage of the dropoff in peripheral visual acuity to surround the viewer with an extended lightfield generated in realtime from existing video content. The system analyzes an input video stream and outpaints a low-resolution, pattern-matched lightfield that simulates a fully immersive environment in a computationally efficient way. The second system, the Narratarium, is a context-aware projector that applies pattern recognition and natural language processing to an input such as an audio stream or electronic text to generate images, colors, and textures appropriate to the narrative or emotional content. The system outputs interactive illustrations and audio projected into spaces such as children's rooms, retail settings, or entertainment venues.
The final system, the 3D Telepresence Chair, combines a 19th-century stage illusion known as Pepper's Ghost with an array of micro projectors and a holographic diffuser to create an autostereoscopic representation of a remote subject with full horizontal parallax. The 3D Telepresence Chair is a portable, self-contained apparatus meant to enhance the experience of teleconferencing. By Daniel E. Novy. S.M.
An investigation into web-based panoramic video virtual reality with reference to the virtual zoo.
Panoramic image Virtual Reality (VR) is a 360-degree image which has been interpreted as a kind of VR that allows users to navigate, view, hear and have remote access to a virtual environment. Panoramic Video VR builds on this, where filming is done in the real world to create a highly dynamic and immersive environment. This is proving to be a very attractive technology and has introduced many possible applications, but it still presents a number of challenges, which are considered in this research.
An initial literature survey identified limitations in panoramic video to date: these were the technology (e.g. filming and stitching) and the design of effective navigation methods. In particular, there is a tendency for users to become disoriented during way-finding. In addition, an effective interface design to embed contextual information is required.
The research identified the need for a controllable test environment in order to evaluate the production of the video and the optimal way of presenting and navigating within the scene. Computer Graphics (CG) simulation scenes were developed to establish a method of capturing, editing and stitching the video under controlled conditions. In addition, a novel navigation method, named the “image channel”, was proposed and integrated within this environment. This replaced hotspots: the traditional navigational jumps between locations. Initial user testing indicated that the production was appropriate and significantly improved user perception of position and orientation over jump-based navigation. The interface design combined with the environment view alone was sufficient for users to understand their location, without the need to augment the view with an on-screen map.
After establishing suitable methods for building and improving the technology, the research looked for a natural, complex, and dynamic real environment for testing. The web-based virtual zoo (World Association of Zoos and Aquariums) was selected as an ideal production: its purpose is to allow people to get close to animals in their natural habitat, and it created particular interest in developing a system for knowledge delivery, raising protection concerns, and entertaining visitors, all key roles of a zoo.
The design method established from CG was then used to develop a film rig and production unit for filming a real animal habitat: that of the Formosan rock monkey in Taiwan. A web-based panoramic video of this habitat was built and tested through user experience testing and expert interviews. The results were essentially identical to those of the testing done in the prototype environment, and validated the production. The site also succeeded in attracting users repeatedly.
The research has contributed new knowledge in improving the production process and in improving presentation and navigation within panoramic videos through the proposed Image Channel method. It has also demonstrated that, by using this technology, a web-based virtual zoo can be improved to help address the considerable pressure of animal extinction and animal habitat degradation that affects humans. Further studies are also addressed. The research was sponsored by Taiwan’s Government, and Twycross Zoo UK was a collaborator.
Camera positioning for 3D panoramic image rendering
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Virtual camera realisation and the proposition of a trapezoidal camera architecture are the two broad contributions of this thesis. Firstly, multiple cameras and their arrangement constitute a critical component which affects the integrity of visual content acquisition for multi-view video. Currently, linear, convergence, and divergence arrays are the prominent camera topologies adopted. However, the large number of cameras required and their synchronisation are two of the prominent challenges usually encountered. The use of virtual cameras can significantly reduce the number of physical cameras used with respect to any of the known camera structures, hence reducing some of the other implementation issues. This thesis explores the use of image-based rendering with and without geometry in the implementations leading to the realisation of virtual cameras. The virtual camera implementation was carried out from the perspective of a depth map (geometry) and the use of multiple image samples (no geometry). Prior to the virtual camera realisation, the generation of depth maps was investigated using region match measures widely known for solving the image point correspondence problem. The constructed depth maps have been compared with the ones generated using the dynamic programming approach. In both the geometry and no-geometry approaches, the virtual cameras lead to the rendering of views from a textured depth map, the construction of a 3D panoramic image of a scene by stitching multiple image samples and performing superposition on them, and the computation of a virtual scene from a stereo pair of panoramic images. The quality of these rendered images was assessed through the use of either objective or subjective analysis in Imatest software. Furthermore, metric reconstruction of a scene was performed by re-projection of the pixel points from multiple image samples with a single centre of projection. This was done using the sparse bundle adjustment algorithm. The statistical summary obtained after the application of this algorithm provides a gauge for the efficiency of the optimisation step. The optimised data was then visualised in the Meshlab software environment, hence providing the reconstructed scene. Secondly, with any of the well-established camera arrangements, all cameras are usually constrained to the same horizontal plane. Therefore, occlusion becomes an extremely challenging problem, and a robust camera set-up is required in order to robustly resolve the hidden parts of any scene objects. To adequately meet the visibility condition for scene objects, and given that occlusion of the same scene objects can occur, a multi-plane camera structure is highly desirable. Therefore, this thesis also explores a trapezoidal camera structure for image acquisition. The approach here is to assess the feasibility and potential of several physical cameras of the same model being sparsely arranged on the edge of an efficient trapezoid graph. This is implemented in both Matlab and Maya. The quality of the depth maps rendered in Matlab are better in Quality
Multimodality in VR: A Survey
Virtual reality has the potential to change the way we create and consume content in our everyday life. Entertainment, training, design and manufacturing, communication, or advertising are all applications that already benefit from this new medium reaching consumer level. VR is inherently different from traditional media: it offers a more immersive experience, and has the ability to elicit a sense of presence through the place and plausibility illusions. It also gives the user unprecedented capabilities to explore their environment, in contrast with traditional media. In VR, as in the real world, users integrate the multimodal sensory information they receive to create a unified perception of the virtual world. Therefore, the sensory cues that are available in a virtual environment can be leveraged to enhance the final experience. This may include increasing realism, or the sense of presence; predicting or guiding the attention of the user through the experience; or increasing their performance if the experience involves the completion of certain tasks. In this state-of-the-art report, we survey the body of work addressing multimodality in virtual reality, its role and benefits in the final user experience. The works reviewed here thus encompass several fields of research, including computer graphics, human-computer interaction, or psychology and perception. Additionally, we give an overview of different applications that leverage multimodal input in areas such as medicine, training and education, or entertainment; we include works in which the integration of multiple sensory information yields significant improvements, demonstrating how multimodality can play a fundamental role in the way VR systems are designed, and VR experiences created and consumed.