
    Perceptual modelling for 2D and 3D

    Deliverable D1.1 of the ANR PERSEE project. This report was produced within the framework of the ANR PERSEE project (no. ANR-09-BLAN-0170); specifically, it corresponds to deliverable D1.1 of the project.

    Saliency-aware Stereoscopic Video Retargeting

    Stereo video retargeting aims to resize a stereo video to a desired aspect ratio. The quality of retargeted videos depends heavily on preserving the spatial, temporal, and disparity coherence of the stereo video, all of which can be degraded by the retargeting process. Due to the lack of a publicly accessible annotated dataset, there is little research on deep learning-based methods for stereo video retargeting. This paper proposes an unsupervised deep learning-based stereo video retargeting network. Our model first detects the salient objects, then shifts and warps all objects so as to minimize distortion of the salient parts of the stereo frames. We use 1D convolutions to shift the salient objects and design a stereo video Transformer to assist the retargeting process. To train the network, we use the parallax attention mechanism to fuse the left and right views and feed the retargeted frames to a reconstruction module that reverses the retargeted frames back to the input frames; the network is therefore trained in an unsupervised manner. Extensive qualitative and quantitative experiments and ablation studies on the KITTI stereo 2012 and 2015 datasets demonstrate the efficiency of the proposed method over existing state-of-the-art methods. The code is available at https://github.com/z65451/SVR/. Comment: 8 pages excluding references. CVPRW conference
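
    As a rough illustration of the unsupervised retarget-and-reconstruct idea sketched in the abstract, the following PyTorch snippet shows per-column offsets predicted with 1D convolutions, a warp to the target width, and an L1 reconstruction loss against the input frame. Module names, tensor shapes, and the bilinear "reverse" step are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

        # Hypothetical sketch of saliency-guided column shifting + unsupervised
        # reconstruction loss; not the authors' implementation.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ColumnShiftRetargeter(nn.Module):
            """Predicts per-column horizontal offsets with 1D convolutions and
            resamples the frame to a narrower target width."""
            def __init__(self, in_channels=3, target_width=640):
                super().__init__()
                self.target_width = target_width
                self.offset_net = nn.Sequential(
                    nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
                    nn.ReLU(),
                    nn.Conv1d(32, 1, kernel_size=5, padding=2),
                )

            def forward(self, frame, saliency):
                # frame: (B, 3, H, W); saliency: (B, 1, H, W) in [0, 1]
                b, _, h, w = frame.shape
                # Summarize each column, weighted by saliency, and predict offsets.
                col_feat = (frame * saliency).mean(dim=2)            # (B, 3, W)
                offsets = torch.tanh(self.offset_net(col_feat))      # (B, 1, W)
                # Sampling positions for the narrower output, nudged by the offsets.
                xs = torch.linspace(-1, 1, self.target_width, device=frame.device)
                xs = xs.view(1, 1, -1).expand(b, 1, -1)
                off = F.interpolate(offsets, size=self.target_width,
                                    mode="linear", align_corners=True)
                xs = (xs + 0.1 * off).clamp(-1, 1)                   # (B, 1, W_out)
                ys = torch.linspace(-1, 1, h, device=frame.device)
                grid_y = ys.view(1, h, 1).expand(b, h, self.target_width)
                grid_x = xs.squeeze(1).unsqueeze(1).expand(b, h, self.target_width)
                grid = torch.stack([grid_x, grid_y], dim=-1)         # (B, H, W_out, 2)
                return F.grid_sample(frame, grid, align_corners=True)

        # Unsupervised cycle: retarget, reverse to the input width, compare to input.
        retargeter = ColumnShiftRetargeter(target_width=640)
        frame = torch.rand(1, 3, 256, 960)       # dummy left-view frame
        saliency = torch.rand(1, 1, 256, 960)    # dummy saliency map
        retargeted = retargeter(frame, saliency)              # (1, 3, 256, 640)
        reconstructed = F.interpolate(retargeted, size=(256, 960),
                                      mode="bilinear", align_corners=True)
        loss = F.l1_loss(reconstructed, frame)
        loss.backward()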

    Study of depth bias of observers in free viewing of still stereoscopic synthetic stimuli

    Observers’ fixations exhibit a marked bias towards certain areas of the screen when viewing scenes on computer monitors. For instance, there is a well-known “center bias”: fixations are drawn towards the center of the screen when viewing 2D still images. For 3D content, stereoscopic displays enhance depth perception by means of binocular parallax. This additional depth cue has a strong influence on guiding eye movements, yet relatively little is known about its impact on visual attention for 3D content displayed on a stereoscopic screen. Several studies have mentioned that people tend to look preferentially at objects located at certain positions in depth, but studies proving or quantifying this depth bias remain limited. In this paper, we conducted a binocular eye-tracking experiment by showing synthetic stimuli on a stereoscopic display. Observers performed a free-viewing task through passive polarized glasses. Gaze positions of both eyes were recorded and the depth of fixation was determined. The stimuli were designed so that the center bias and the depth bias affect eye movements independently. Results indicate the existence of a depth bias: objects closer to the viewer attract attention earlier than distant objects, and the number of fixations on an object varies as a function of its depth. The closest object in a scene always attracts the most fixations. The fixation distribution along depth also shows a convergent behavior as viewing time increases.
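
    The depth of a binocular fixation can be recovered from the on-screen gaze positions of the two eyes using standard stereoscopic geometry; since the abstract does not spell out the exact procedure used in the study, the short function below is only an assumed illustration, with placeholder viewing-distance and interocular values.

        # Hedged illustration: fixation depth from binocular screen parallax.
        def fixation_depth(x_left, x_right, viewing_distance_cm=90.0, iod_cm=6.5):
            """Perceived fixation distance (cm from the eyes) given the horizontal
            on-screen gaze positions of the left and right eye (cm, same axis).
            Positive parallax (right-eye point to the right) pushes the fixation
            behind the screen; negative (crossed) parallax pulls it in front."""
            parallax = x_right - x_left
            if parallax >= iod_cm:
                return float("inf")   # lines of sight no longer converge
            return viewing_distance_cm * iod_cm / (iod_cm - parallax)

        # 1 cm of crossed parallax on a screen viewed at 90 cm brings the
        # fixation to ~78 cm, i.e. about 12 cm in front of the screen plane.
        print(fixation_depth(x_left=0.5, x_right=-0.5))   # 78.0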

    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for registering multi-modal patient-specific data, both to enhance the surgeon’s navigation capabilities by observing beyond exposed tissue surfaces and to provide intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion about technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
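
    Many of the surveyed approaches, stereo endoscopy in particular, ultimately rest on triangulating depth from the disparity between two rectified views. The snippet below is a generic illustration of that relationship; the focal length, baseline, and disparity values are placeholders, not parameters of any specific system discussed in the paper.

        # Depth from disparity for a rectified stereo pair: Z = f * b / d.
        def depth_from_disparity(disparity_px, focal_length_px, baseline_mm):
            """Depth (mm) of a point with the given disparity (pixels), for a
            rectified stereo rig with focal length in pixels and baseline in mm."""
            if disparity_px <= 0:
                raise ValueError("disparity must be positive for a visible point")
            return focal_length_px * baseline_mm / disparity_px

        # Example: a 4.5 mm baseline and f = 570 px give ~107 mm of depth for a
        # tissue point observed at 24 px of disparity.
        print(depth_from_disparity(24, 570, 4.5))   # ~106.9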

    D-SAV360: A Dataset of Gaze Scanpaths on 360° Ambisonic Videos

    Understanding human visual behavior within virtual reality environments is crucial to fully leverage their potential. While previous research has provided rich visual data from human observers, existing gaze datasets often suffer from the absence of multimodal stimuli. Moreover, no dataset has yet gathered eye gaze trajectories (i.e., scanpaths) for dynamic content with directional ambisonic sound, which is a critical aspect of sound perception by humans. To address this gap, we introduce D-SAV360, a dataset of 4,609 head and eye scanpaths for 360° videos with first-order ambisonics. This dataset enables a more comprehensive study of multimodal interaction on visual behavior in virtual reality environments. We analyze our collected scanpaths from a total of 87 participants viewing 85 different videos and show that various factors such as viewing mode, content type, and gender significantly impact eye movement statistics. We demonstrate the potential of D-SAV360 as a benchmarking resource for state-of-the-art attention prediction models and discuss its possible applications in further research. By providing a comprehensive dataset of eye movement data for dynamic, multimodal virtual environments, our work can facilitate future investigations of visual behavior and attention in virtual reality
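
    One of the simplest eye-movement statistics such scanpath data supports is the amplitude of gaze shifts on the viewing sphere. The sketch below assumes gaze samples given as (yaw, pitch) angles in degrees; this layout is hypothetical and does not reflect the actual D-SAV360 file format.

        # Great-circle amplitude (degrees) between successive gaze directions.
        import math

        def angular_distance_deg(yaw1, pitch1, yaw2, pitch2):
            y1, p1, y2, p2 = map(math.radians, (yaw1, pitch1, yaw2, pitch2))
            cos_angle = (math.sin(p1) * math.sin(p2)
                         + math.cos(p1) * math.cos(p2) * math.cos(y1 - y2))
            return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

        # Example scanpath as (yaw, pitch) samples: two small steps, one large shift.
        scanpath = [(0, 0), (12, 3), (15, 2), (80, -5)]
        amplitudes = [angular_distance_deg(*a, *b)
                      for a, b in zip(scanpath, scanpath[1:])]
        print(amplitudes)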

    Predictive Model of Driver’s Eye Fixation for Maneuver Prediction in the Design of Advanced Driving Assistance Systems

    Over the last few years, Advanced Driver Assistance Systems (ADAS) have been shown to significantly reduce the number of vehicle accidents. According to the National Highway Traffic Safety Administration (NHTSA), driver errors contribute to 94% of road collisions. This research aims to develop a predictive model of driver eye fixation by analyzing the driver’s eye and head (cephalo-ocular) information for maneuver prediction in an Advanced Driving Assistance System (ADAS). Several ADASs have been developed to help drivers perform driving tasks in complex environments, and many studies have been conducted on improving automated systems. Some research has relied on the fact that the driver plays a crucial role in most driving scenarios, recognizing the driver’s role as the central element in ADASs. The way in which a driver monitors the surrounding environment is at least partially descriptive of the driver’s situation awareness. This thesis’s primary goal is the quantitative and qualitative analysis of driver behavior to determine the relationship between driver intent and actions. The RoadLab initiative provided an instrumented vehicle equipped with an on-board diagnostic system, an eye-gaze tracker, and a stereo vision system for the extraction of relevant features from the driver, the vehicle, and the environment. Several driver behavioral features are investigated to determine whether there is a relevant relationship between the driver’s eye fixations and the prediction of driving maneuvers.
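
    As a deliberately simplified picture of the kind of model the abstract describes, the sketch below maps a window of cephalo-ocular features to a maneuver label with an off-the-shelf classifier. The feature set, maneuver labels, and choice of a random forest are illustrative assumptions, not the thesis's actual pipeline.

        # Hypothetical maneuver prediction from gaze/head features (synthetic data).
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier

        rng = np.random.default_rng(0)
        # Per-window features: [mean gaze yaw, gaze yaw variance,
        # mirror-fixation count, mean head yaw, vehicle speed]
        X = rng.normal(size=(200, 5))
        y = rng.choice(["keep_lane", "turn_left", "turn_right"], size=200)

        clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
        window = np.array([[15.0, 4.0, 2.0, 10.0, 42.0]])   # one new feature window
        print(clf.predict(window))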

    Multimodal Stereoscopic Movie Summarization Conforming to Narrative Characteristics

    Video summarization is a timely and rapidly developing research field with broad commercial interest, due to the increasing availability of massive video data. Relevant algorithms face the challenge of achieving a careful balance between summary compactness, enjoyability, and content coverage. The specific case of stereoscopic 3D theatrical films has become more important over the past years, but has not received corresponding research attention. In this paper, a multi-stage, multimodal summarization process for such stereoscopic movies is proposed that is able to extract a short, representative video skim conforming to narrative characteristics from a 3D film. At the initial stage, a novel, low-level video frame description method is introduced (the frame moments descriptor) that compactly captures informative image statistics from luminance, color, optical flow, and stereoscopic disparity video data, at both a global and a local scale. Thus, scene texture, illumination, motion, and geometry properties can succinctly be contained within a single frame feature descriptor, which can subsequently be employed as a building block in any key-frame extraction scheme, e.g., for intra-shot frame clustering. The computed key-frames are then used to construct a movie summary in the form of a video skim, which is post-processed in a manner that also considers the audio modality. The next stage of the proposed summarization pipeline essentially performs shot pruning, controlled by a user-provided shot retention parameter, that removes segments from the skim based on the narrative prominence of movie characters in both the visual and the audio modalities. This novel process (multimodal shot pruning) is algebraically modeled as a multimodal matrix column subset selection problem, which is solved using an evolutionary computing approach. Subsequently, disorienting editing effects induced by summarization are dealt with through manipulation of the video skim. At the last step, the skim is suitably post-processed in order to reduce stereoscopic video defects that may cause visual fatigue.
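
    The frame moments descriptor can be pictured as low-order statistics of several per-pixel channels, gathered both globally and over a grid of local blocks and then concatenated; the channel set, grid size, and choice of moments below are assumptions for illustration rather than the paper's exact formulation.

        # Rough sketch of a global + local "frame moments" style descriptor.
        import numpy as np

        def moments(values):
            """Mean, variance, and skewness of a flattened array."""
            v = values.reshape(-1).astype(np.float64)
            mean, std = v.mean(), v.std() + 1e-8
            skew = np.mean(((v - mean) / std) ** 3)
            return [mean, std ** 2, skew]

        def frame_moments_descriptor(channels, grid=4):
            """channels: dict of name -> 2D array (e.g. luminance, disparity,
            flow magnitude). Returns global + per-block moments as one vector."""
            feats = []
            for ch in channels.values():
                feats += moments(ch)                              # global moments
                h, w = ch.shape
                for by in range(grid):                            # local moments
                    for bx in range(grid):
                        block = ch[by * h // grid:(by + 1) * h // grid,
                                   bx * w // grid:(bx + 1) * w // grid]
                        feats += moments(block)
            return np.array(feats)

        desc = frame_moments_descriptor({"luminance": np.random.rand(120, 160),
                                         "disparity": np.random.rand(120, 160)})
        print(desc.shape)   # 2 channels x (1 global + 16 blocks) x 3 moments = (102,)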

    Perceptually driven stereoscopic camera control in 3D virtual environments

    Ankara : The Department of Computer Engineering and the Graduate School of Engineering and Science of Bilkent University, 2013. Thesis (Master's) -- Bilkent University, 2013. Includes bibliographical references (leaves 56-59). The notion of depth and how depth is perceived have long been studied in the fields of psychology, physiology, and even art. Human visual perception makes it possible to perceive the spatial layout of the outside world by using visual depth cues. Binocular disparity, among these depth cues, is based on the separation between the two different views observed by the two eyes. The disparity concept constitutes the basis of stereoscopic vision. Emerging technologies try to replicate binocular disparity principles in order to provide a 3D illusion and stereoscopic vision. However, the complexity of applying the underlying principles of 3D perception confronts researchers with the problem of wrongly produced stereoscopic content. It is still a great challenge to provide a realistic but also comfortable 3D experience. In this work, we present a camera control mechanism: a novel approach for disparity control and a model for path generation. We try to address the challenges of stereoscopic 3D production by presenting a comfortable viewing experience to users. Therefore, our disparity system addresses the accommodation/convergence conflict problem, the best-known cause of visual fatigue in stereo systems, by taking objects' importance into consideration. Stereo camera parameters are calculated automatically through an optimization process. In the second part of our control mechanism, the camera path is constructed for a given 3D environment and scene elements. Moving around important regions of objects is a desired scene-exploration task; in this respect, object saliencies are used for viewpoint selection around scene elements. The path structure is generated using linked Bézier curves, which ensures that it passes through pre-determined viewpoints. Though there is a considerable amount of research in the field of stereo creation, we believe that approaching this problem from the scene-content aspect provides a uniquely promising experience. We validate our assumption with user studies in which our method and two other existing disparity control models are compared. The study results show that our method gives superior results in quality, depth, and comfort. Kevinç, Elif Bengü. M.S.
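
    The disparity-control part of the thesis can be pictured with the usual stereography relationship for parallel cameras with sensor shift, where the image-space disparity of a point at depth Z is d(Z) = f * b * (1/C - 1/Z) for interaxial b and convergence distance C. The closed-form helper below fits b and C to a comfortable disparity budget; it is a simplification for illustration, not the optimization described in the thesis, and all values are assumed.

        # Closed-form interaxial/convergence from a scene depth range and a
        # comfortable disparity budget (all lengths in consistent units).
        def stereo_camera_parameters(z_near, z_far, focal_length,
                                     d_neg_max, d_pos_max):
            """Return (interaxial b, convergence distance C) such that the nearest
            point maps to -d_neg_max (crossed) and the farthest to +d_pos_max
            (uncrossed) image disparity."""
            budget = d_neg_max + d_pos_max
            b = budget / (focal_length * (1.0 / z_near - 1.0 / z_far))
            inv_c = 1.0 / z_near - d_neg_max / (focal_length * b)
            return b, 1.0 / inv_c

        # Scene spanning 2..20 m, 35 mm focal length, +/-0.5 mm disparity on a
        # 36 mm-wide sensor (~1.4% of image width each way): interaxial ~6.3 cm,
        # convergence distance ~3.6 m.
        b, c = stereo_camera_parameters(2.0, 20.0, 0.035, 0.0005, 0.0005)
        print(round(b, 3), round(c, 2))   # 0.063 3.64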