
MirrorForge: Rapid Prototyping of Complex Mirrors for Camera and Projector Systems

Author: Echtler Florian; Getschmann Christopher; Mthunzi Everett Mondliwethu
Publication venue: Association for Computing Machinery (ACM)
Publication date: 14/02/2022

Computational immersive displays

Author: Novy Daniel E. (Daniel Edward)
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2013

Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (p. 77-79). Immersion is an oft-quoted but ill-defined term used to describe a viewer or participant's sense of engagement with a visual display system or participatory media. Traditionally, advances in immersive quality came at the high price of ever-escalating hardware requirements and computational budgets. But what if one could increase a participant's sense of immersion, instead, by taking advantage of perceptual cues, neuroprocessing, and emotional engagement while adding only a small, yet distinctly targeted, set of advancements to the display hardware? This thesis describes three systems that introduce small amounts of computation to the visual display of information in order to increase the viewer's sense of immersion and participation. It also describes the types of content used to evaluate the systems, as well as the results and conclusions gained from small user studies. The first system, Infinity-by-Nine, takes advantage of the dropoff in peripheral visual acuity to surround the viewer with an extended lightfield generated in real time from existing video content. The system analyzes an input video stream and outpaints a low-resolution, pattern-matched lightfield that simulates a fully immersive environment in a computationally efficient way. The second system, the Narratarium, is a context-aware projector that applies pattern recognition and natural language processing to an input such as an audio stream or electronic text to generate images, colors, and textures appropriate to the narrative or emotional content. The system outputs interactive illustrations and audio projected into spaces such as children's rooms, retail settings, or entertainment venues.
The final system, the 3D Telepresence Chair, combines a 19th-century stage illusion known as Pepper's Ghost with an array of micro projectors and a holographic diffuser to create an autostereoscopic representation of a remote subject with full horizontal parallax. The 3D Telepresence Chair is a portable, self-contained apparatus meant to enhance the experience of teleconferencing. By Daniel E. Novy. S.M.

DSpace@MIT

Use of baited remote underwater video (BRUV) and motion analysis for studying the impacts of underwater noise upon free ranging fish and implications for marine energy management

Author: Elliott Michael; Pérez-Domínguez Rafael; Roberts Louise
Publication venue: Elsevier BV
Publication date: 13/09/2016

© 2016 Elsevier Ltd. Free-ranging individual fish were observed using a baited remote underwater video (BRUV) system during sound playback experiments. This paper reports on test trials exploring BRUV design parameters, image analysis and practical experimental designs. Three marine species were exposed to playback noise, provided as examples of behavioural responses to impulsive sound at 163–171 dB re 1 μPa (peak-to-peak SPL) and continuous sound of 142.7 dB re 1 μPa (RMS, SPL), exhibiting directional changes and accelerations. The methods described here indicate the efficacy of BRUV for examining the behaviour of free-ranging species in response to noise playback, rather than using confinement. Given the increasing concern about the effects of water-borne noise, for example its inclusion within the EU Marine Strategy Framework Directive, and the lack of empirical evidence in setting thresholds, this paper discusses the use of BRUV, and short-term behavioural changes, in supporting population-level marine noise management.

Repository@Hull - Worktribe

Towards Intelligent Telerobotics: Visualization and Control of Remote Robot

Author: Fu Bo
Publication venue: UKnowledge
Publication date: 01/01/2015

Human-machine cooperative robotics, or co-robotics, has been recognized as the next generation of robotics. In contrast to current systems that use limited-reasoning strategies or address problems in narrow contexts, new co-robot systems will be characterized by their flexibility, resourcefulness, varied modeling or reasoning approaches, and use of real-world data in real time, demonstrating a level of intelligence and adaptability seen in humans and animals. My research focuses on two sub-fields of co-robotics: teleoperation and telepresence. We first explore teleoperation using mixed reality techniques. I propose a new type of display, the hybrid-reality display (HRD) system, which uses a commodity projection device to project captured video frames onto a 3D replica of the actual target surface. It provides a direct alignment between the frame of reference of the human subject and that of the displayed image. The advantage of this approach is that users need no wearable device, which minimizes intrusiveness and lets the users' eyes accommodate naturally when focusing. The field of view is also significantly increased. From a user-centered design standpoint, the HRD is motivated by teleoperation accidents, incidents, and user research in areas such as military reconnaissance. Teleoperation in these environments is compromised by the keyhole effect, which results from the limited field of view. The technical contribution of the proposed HRD system is the multi-system calibration, which mainly involves a motion sensor, a projector, cameras, and a robotic arm. Given the purpose of the system, the calibration error must also be kept within the millimeter level. The follow-up research on the HRD focuses on high-accuracy 3D reconstruction of the replica with commodity devices, for better alignment of the video frame. Conventional 3D scanners either lack depth resolution or are very expensive.
We propose a 3D sensing system based on structured light scanning, with accuracy within 1 millimeter and robustness to global illumination and surface reflection. An extensive user study demonstrates the performance of our proposed algorithm. To compensate for the lack of synchronization between the local station and the remote station caused by latency in data sensing and communication, a 1-step-ahead predictive control algorithm is presented. The latency between human control and robot movement can be formulated as a system of linear equations with a smoothing coefficient ranging from 0 to 1. This predictive control algorithm can be further formulated as the optimization of a cost function. We then explore telepresence. Many hardware designs have been developed to allow a camera to be placed optically directly behind the screen. The purpose of such setups is to enable two-way video teleconferencing that maintains eye contact. However, the image from the see-through camera usually exhibits a number of imaging artifacts such as low signal-to-noise ratio, incorrect color balance, and loss of detail. We therefore develop a novel image enhancement framework that uses an auxiliary color+depth camera mounted on the side of the screen. By fusing the information from both cameras, we are able to significantly improve the quality of the see-through image. Experimental results demonstrate that our fusion method compares favorably against traditional image enhancement/warping methods that use only a single image.

University of Kentucky

Mental vision: a computer graphics platform for virtual reality, science and education

Author: Peternier Achille
Publication venue: Lausanne, EPFL
Publication date: 28/05/2009

Despite the large number of computer graphics frameworks and solutions available for virtual reality, it is still difficult to find one that fits the many constraints of research and educational contexts at the same time. Advanced functionality and user-friendliness, rendering speed and portability, or scalability and image quality are opposing characteristics rarely found in a single approach. Furthermore, the use of virtual-reality-specific devices such as CAVEs or wearable systems is limited by their cost and accessibility, since most of these innovations are reserved for institutions and specialists able to afford and manage them through strong background knowledge in programming. Finally, computer graphics and virtual reality are complex and difficult subjects to learn, owing to the heterogeneity of notions a developer needs to practice before attempting to implement a full virtual environment. In this thesis we describe our contributions to these topics, assembled in what we call the Mental Vision platform. Mental Vision is a framework composed of three main entities. First, a teaching- and research-oriented graphics engine that simplifies access to 2D/3D real-time rendering on mobile devices, personal computers and CAVE systems. Second, a series of pedagogical modules to introduce and practice computer graphics and virtual reality techniques. Third, two advanced VR systems: a wearable, lightweight and hands-free mixed reality setup, and a four-sided CAVE built from off-the-shelf hardware. In this dissertation we explain our conceptual, architectural and technical approach, pointing out how we managed to create a robust and coherent solution that reduces the complexity of cross-platform and multi-device 3D rendering while simultaneously answering the contradictory common needs of researchers and students in computer graphics and virtual reality.
A series of case studies evaluates how Mental Vision concretely satisfies these needs and achieves its goals on in vitro benchmarks and in vivo scientific and educational projects.

Infoscience - École polytechnique fédérale de Lausanne

360º hypervideo

Author: Neng Luís António da Rosa
Publication date: 01/01/2011

Master's thesis in Informatics, presented to the Universidade de Lisboa through the Faculdade de Ciências, 2011. In this dissertation we describe an approach to the design and development of an immersive and interactive interface for viewing and navigating 360º hypervideos over the internet. These hypervideos allow users to move around an axis to view the video content from different angles and to access it efficiently through hyperlinks. Challenges in presenting this type of hypervideo include: giving users a suitable interface for exploring 360º content on a normal screen, where the video must change perspective so that users feel they are looking around; and suitable forms of navigation so that they can easily understand the structure of the hypervideo, even when the hyperlinks lie outside the field of view. Devices for capturing 360º video, as well as the means of making it available on the Web, are increasingly common and accessible to the general public. In this context, it is pertinent to explore navigation forms and techniques for viewing and interacting with 360º hypervideos. Traditionally, to view the content of a video, the user is limited to the region the camera was pointed at during capture, which means that the resulting video has lateral limits. With 360º video recording these limits no longer exist, opening new directions to explore. A 360º hypervideo player will allow users to move around to view the rest of the content and to easily access the information provided by the hyperlinks. Video is a very rich type of information that presents an enormous amount of information changing over time. A 360º video presents even more information at once and adds challenges, because not everything is within our field of view.
It does, however, give the user a new and potentially immersive viewing experience. We explore navigation techniques to help users easily understand and navigate a 360º hypervideo space and to provide a viewing experience on another level, through an immersive hypermedia space. Hyperlinks take the user to other related hypermedia content, such as text, images and videos, or other pages on the Web. After the related content has finished playing or being viewed, the user can return to the previous position in the video. By using summarization techniques, we can also give users a summary of the entire video content so that they can view and understand it more efficiently and flexibly, without having to watch the whole video in sequence. Video has proven to be one of the most efficient forms of communication, allowing an enormous and varied range of information to be presented in a short period of time. 360º videos can provide even more information and can be mapped onto cylindrical or spherical projections. The cylindrical projection was invented in 1796 by the painter Robert Barker of Edinburgh, who obtained a patent for it. The use of video on the Web has essentially consisted of embedding it in pages, where it is viewed linearly, with interactions generally limited to play and pause, fast forward and reverse. In recent years, the most promising advances towards interactive video appear to come through hypervideo, which provides a true integration of video into hypermedia spaces, where content can be structured and navigated through hyperlinks defined in space and time and through flexible interactive navigation mechanisms. Extending the concept of hypervideo to 360º raises new challenges, mainly because a large part of the content is outside the field of view.
The 360º hypervideo player has to give users appropriate mechanisms to make the structure of the hypervideo easier to perceive, to navigate the 360º hypervideo space efficiently and, ideally, to provide an immersive experience. To navigate a 360º hypervideo space, we need new navigation mechanisms. We present the main mechanisms designed for viewing this type of hypervideo, and solutions to the main challenges in hypermedia, disorientation and cognitive overload, now in the 360º context. We focus essentially on the navigation mechanisms that help the user get oriented in the 360º space. We developed a drag-based interface for navigating the 360º video. This interface lets the user move the video to view the content from different angles. The user only needs to drag the cursor to the left or to the right to move the field of view. The user can, however, also keep moving in just one direction and go all the way around without any limitation. Perceiving the current location and viewing angle became a problem because of the lack of lateral limits. During our tests, many users felt lost in the 360º space, not knowing which angles they were viewing. In hypervideo, perceiving hyperlinks is more challenging than in traditional hypermedia because hyperlinks can have duration, can coexist in time and space, and the video changes over time. Special mechanisms are therefore needed to make them perceptible to users. In 360º hypervideo, a large part of the content is invisible to the user because it is not in the field of view, so new approaches and mechanisms for indicating the existence of hyperlinks need to be studied.
We created the Hotspots Availability and Location Indicators to let users know of the existence and location of each hyperlink. The position of the hotspot availability indicators on the vertical axis, at the lateral margins of the video, indicates the vertical position of each hyperlink. The size of the indicator indicates the distance of the hotspot from the viewing angle: the closer the hotspot, the larger the indicator. The indicators are semi-transparent and are positioned at the lateral margins to minimize their impact on the video content. The Mini Map also provides information about the existence and location of hotspots, which should carry some information about the destination content so that the user has some expectation of what will be shown after following the hyperlink. A text box styled as a comic-strip speech balloon can accommodate several pieces of relevant information. When users select a hotspot, they can be redirected to a predefined time in the video or to a page with additional information, or the selection can be memorized by the system and its content shown only when the user wishes, depending on the type of application. For example, if the purpose of the video is to support learning (e-learning), it may make more sense to open the hyperlink content immediately, since users are used to seeing that kind of information step by step. If the video is for entertainment, users probably do not like being interrupted by new content opening, and can choose to have the hyperlink memorized and access it later, whenever they want. In addition to the title and description of the video, the Image Map mode provides a global view of the video content.
The previews (thumbnails) correspond to the scenes of the video and are represented through a cylindrical projection, so that all the content over time can be viewed. The mode also shows, in a synchronized way, the current scene and offers the user the possibility of navigating to other scenes. The entire preview area is click-sensitive and determines the coordinates of the preview the user selected. A more condensed version provides only the preview of the central part of each scene. It allows a larger number of scenes to be presented simultaneously, but limits viewing and the flexibility to navigate more directly to the desired angle. Some features were also added to the timeline, or Progress Bar. In addition to the traditional Play, Pause and Video Time buttons, we extended the bar to adapt it to some characteristics of a Web page. Since this is a Player developed to work on the internet, we have to take into account that the video takes time to load. The bytes loaded bar shows the user the loading progress of the video and does not allow the user to access information that has not yet been loaded. The hyperspace is navigated in spatio-temporal contexts that the history records. The Memory Bar gives the user information about the parts of the video that have already been viewed. The Toggle Full Screen button switches the video viewing mode between full and standard screen. Full screen mode takes the user outside the limitations of the browser and maximizes the video content to the size of the screen. It is one more step towards an immersive viewing mode, for example in a 360º projection inside a Cave, which we are considering exploring in future work. In this dissertation, we present an approach to viewing and interacting with 360º videos.
Navigating a 360º video space is a new experience for most people, and there are not yet consistent intuitions about how this type of navigation behaves. Users will very probably experience the problem that initially arose with hypertext, where the user felt lost in hyperspace. The 360º Hypervideo Player therefore has to be as clear and effective as possible so that users can interact with it easily. The usability test was based on the USE questionnaire and on interviews with users, in order to assess usability and user experience from their comments, suggestions and concerns about the functionalities and the mechanisms provided for accessing or representing information. The test results and the comments obtained gave us more information about the usability of the player and allowed us to identify possible improvements. In summary, the users' comments were very positive and useful, and will help us continue our research on 360º Hypervideo. Future work consists of carrying out more usability tests and developing different versions of the 360º Hypervideo Player, with navigation mechanisms revised and extended on the basis of the evaluation results. The 360º Hypervideo Player should not be only a Web application; it should be able to integrate with multimedia kiosks or other immersive installations. New functionalities and types of navigation will probably be needed to adapt to different contexts. The example of the 360º Hypervideo Player presented in this article uses a Web browser and a mouse as the means of presentation and interaction. With the growth of 3D video, multi-touch and eye-tracking technologies, new ways of viewing and interacting with the 360º space may emerge.
These new forms bring new challenges but also increased potential for new experiences to explore.
In traditional video, the user is locked to the angle the camera was pointing at during the capture of the video. With 360º video recording these boundaries no longer exist, and 360º video capturing devices are becoming more common and affordable to the general public. Hypervideo stretches boundaries even further, allowing users to explore the video and to navigate to related information. Extending the hypervideo concept to 360º video, which we call 360º hypervideo, raises new challenges. Challenges in presenting this type of hypervideo include: providing users with an appropriate interface capable of exploring 360º content, where the video should change perspective so that users actually get the feeling of looking around; and providing the appropriate affordances to understand the hypervideo structure and to navigate it effectively in a 360º hypervideo space, even when link opportunities arise in places outside the current viewport. In this thesis, we describe an approach to the design and development of an immersive and interactive interface for the visualization and navigation of 360º hypervideos. Such an interface allows users to pan around to view the content from different angles and to effectively access related information through the hyperlinks. A user study was then conducted to evaluate the 360º Hypervideo Player's user interface and functionalities, collecting specific and global comments, concerns and suggestions for functionalities and access mechanisms that allowed us to gain more awareness of the player's usability and to identify directions for improvement. Finally, we draw some conclusions and open perspectives for future work.
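
The abstract states that a hotspot indicator grows as the hotspot gets closer to the current viewing angle, in a space with no lateral limits. A minimal sketch of that idea follows, in Python. It is not the thesis's implementation: the function names, the linear size mapping, and the pixel bounds `min_px` and `max_px` are all assumptions made for illustration.

```python
def angular_distance(view_deg: float, hotspot_deg: float) -> float:
    """Shortest angular distance in degrees on a 360-degree circle (0 to 180)."""
    diff = abs(view_deg - hotspot_deg) % 360.0
    # Going the other way around the circle may be shorter.
    return min(diff, 360.0 - diff)


def indicator_size(view_deg: float, hotspot_deg: float,
                   min_px: float = 8.0, max_px: float = 32.0) -> float:
    """Indicator size in pixels; a closer hotspot gives a larger indicator.

    The linear mapping and the pixel bounds are hypothetical choices,
    not values taken from the thesis.
    """
    d = angular_distance(view_deg, hotspot_deg)
    return max_px - (max_px - min_px) * (d / 180.0)


if __name__ == "__main__":
    # A view at 350 degrees and a hotspot at 10 degrees are 20 degrees apart,
    # because the panorama wraps around.
    print(angular_distance(350.0, 10.0))
    print(indicator_size(350.0, 10.0))
```

The wrap-around in `angular_distance` is the point of the sketch: without it, a hotspot just past the 0º seam would be treated as almost a full turn away and drawn with the smallest indicator.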

Universidade de Lisboa: Repositório.UL

Eyewear Computing – Augmenting the Human with Head-Mounted Wearable Assistants

Author: Bulling Andreas; Cucchiara Rita; Kunze Kai; Rehg James
Publication date: 01/01/2016

The seminar was composed of workshops and tutorials on head-mounted eye tracking, egocentric vision, optics, and head-mounted displays. The seminar welcomed 30 academic and industry researchers from Europe, the US, and Asia with a diverse background, including wearable and ubiquitous computing, computer vision, developmental psychology, optics, and human-computer interaction. In contrast to several previous Dagstuhl seminars, we used an ignite talk format to reduce the time of talks to one half-day and to leave the rest of the week for hands-on sessions, group work, general discussions, and socialising. The key results of this seminar are 1) the identification of key research challenges and summaries of breakout groups on multimodal eyewear computing, egocentric vision, security and privacy issues, skill augmentation and task guidance, eyewear computing for gaming, as well as prototyping of VR applications, 2) a list of datasets and research tools for eyewear computing, 3) three small-scale datasets recorded during the seminar, 4) an article in ACM Interactions entitled “Eyewear Computers for Human-Computer Interaction”, as well as 5) two follow-up workshops on “Egocentric Perception, Interaction, and Computing” at the European Conference on Computer Vision (ECCV) as well as “Eyewear Computing” at the ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp).

Archivio istituzionale della ricerca - Università di Modena e Reggio Emilia

Holoscopic 3D image depth estimation and segmentation techniques

Author: Alazawi Eman
Publication venue: Brunel University London
Publication date: 01/01/2015

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Today's 3D imaging techniques offer significant benefits over conventional 2D imaging techniques. The presence of natural depth information in the scene affords the observer an overall improved sense of reality and naturalness. A variety of systems attempting to reach this goal, such as stereoscopic and auto-stereoscopic systems, have been designed by many independent research groups. However, the images displayed by such systems tend to cause eye strain, fatigue and headaches after prolonged viewing, as users are required to focus on the screen plane (accommodation) while converging their eyes to a point in space in a different plane (convergence). Holoscopy is a 3D technology that aims to overcome the above limitations of current 3D technology and was recently developed at Brunel University. This work is part W4.1 of the 3D VIVANT project, which is funded by the EU under the ICT program and coordinated by Dr. Aman Aggoun at Brunel University, West London, UK. The objective of the work described in this thesis is to develop estimation and segmentation techniques that are capable of estimating precise 3D depth and are applicable to holoscopic 3D imaging systems. Particular emphasis is given to automatic techniques, i.e. the work favours algorithms with broad generalisation abilities, as no constraints are placed on the setting: algorithms that provide invariance to most appearance-based variation of objects in the scene (e.g. viewpoint changes, deformable objects, presence of noise and changes in lighting) and that can estimate depth information from both types of holoscopic 3D images, i.e. unidirectional and omnidirectional, which give horizontal parallax and full parallax (vertical and horizontal), respectively. The main aim of this research is to develop 3D depth estimation and 3D image segmentation techniques with great precision.
In particular, emphasis is placed on the automation of thresholding techniques and the identification of cues for the development of robust algorithms. A method for depth-through-disparity feature analysis has been built on the existing correlation between the pixels at a one micro-lens pitch, which has been exploited to extract the viewpoint images (VPIs). The corresponding displacement among the VPIs has been exploited to estimate the depth information map by setting and extracting reliable sets of local features. Feature-based-point and feature-based-edge are two novel automatic thresholding techniques for detecting and extracting features that have been used in this approach. These techniques offer a solution to the problem of setting and extracting reliable features automatically, improving the generalisation, speed and quality of the depth estimation. Due to the resolution limitation of the extracted VPIs, obtaining an accurate 3D depth map is challenging. Therefore, sub-pixel shift and integration, a novel interpolation technique, has been used in this approach to generate super-resolution VPIs. By shifting and integrating a set of up-sampled low-resolution VPIs, the new information contained in each viewpoint is exploited to obtain a super-resolution VPI. This produces a high-resolution perspective VPI with a wide field of view (FOV). This means that the holoscopic 3D image system can be converted into a multi-view 3D image pixel format. Both depth accuracy and a fast execution time have been achieved, which improved the 3D depth map. For a 3D object to be recognized, the related foreground regions and depth information map need to be identified. Two novel unsupervised segmentation methods that generate interactive depth maps from single viewpoint segmentation were developed.
Both techniques improve on existing methods because they are simple to use and fully automatic, producing the 3D interactive depth map without human interaction. The final contribution is a performance evaluation, which provides an equitable measurement of the success of the proposed techniques for foreground object segmentation, 3D interactive depth map creation and the generation of 2D super-resolution viewpoint images. No-reference image quality assessment metrics and their correlation with the human perception of quality are used, with the help of human participants assessing the results subjectively.

Brunel University Research Archive

Enhanced life-size holographic telepresence framework with real-time three-dimensional reconstruction for dynamic scene

Author: Fadzli Fazliaty Edora
Publication date: 01/01/2022

Three-dimensional (3D) reconstruction can capture and reproduce a 3D representation of a real object or scene. 3D telepresence allows the user to feel the presence of a remote user who is transferred as a digital representation. Holographic display is one alternative that removes the restriction of wearable hardware; it uses light diffraction to display 3D images to viewers. However, capturing a life-size or full-body human in real time is still challenging because it involves a dynamic scene. The remaining issue arises because the dynamic object to be reconstructed is always moving and changing shape, and requires multiple capturing views. The life-size data captured multiply exponentially when more depth cameras are used, which can lead to high computation time, especially for a dynamic scene. Transferring high-volume 3D images over a network in real time can also cause lag and latency issues. Hence, the aim of this research is to enhance a life-size holographic telepresence framework with real-time 3D reconstruction for dynamic scenes. The work was carried out in three stages. In the first stage, real-time 3D reconstruction with the Marching Square algorithm is combined with the data acquisition of dynamic scenes captured by a life-size setup of multiple Red Green Blue-Depth (RGB-D) cameras. The second stage transmits the data acquired from multiple RGB-D cameras in real time and performs double compression for the life-size holographic telepresence. The third stage evaluates the life-size holographic telepresence framework that has been integrated with the real-time 3D reconstruction of dynamic scenes. The findings show that enhancing the life-size holographic telepresence framework with real-time 3D reconstruction reduced the computation time and improved the 3D representation of the remote user in a dynamic scene.
With double compression running for the life-size holographic telepresence, the life-size 3D representation is smooth. The approach has been shown to minimize the delay or latency during synchronization of acquired frames in remote communications.

Universiti Teknologi Malaysia Institutional Repository