82 research outputs found

    The Video Mesh: A Data Structure for Image-based Three-dimensional Video Editing

    This paper introduces the video mesh, a data structure for representing video as 2.5D “paper cutouts.” The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. The user rotoscopes occluding contours and we introduce an algorithm to cut the video mesh along them. Object boundaries are refined with per-pixel alpha values. Because the video mesh is at its core a set of texture-mapped triangles, we leverage graphics hardware to enable interactive editing and rendering of a variety of effects. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D to 3D video conversion.
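
    The abstract states that per-pixel information is obtained by interpolating over a triangulation of tracked points that each store a depth value. A minimal sketch of that idea, assuming plain barycentric interpolation inside one 2D triangle, is given below. The function names and the choice of barycentric weights are illustrative assumptions, not the authors' implementation.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if det == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_depth(p, vertices, depths):
    """Interpolate per-vertex depths at pixel p (hypothetical helper)."""
    wa, wb, wc = barycentric_weights(p, *vertices)
    return wa * depths[0] + wb * depths[1] + wc * depths[2]


# Three tracked points (pixel coordinates) with stored depth values.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
depths = [1.0, 2.0, 3.0]
mid_edge_depth = interpolate_depth((2.0, 0.0), triangle, depths)
```

    The same weights could be reused to interpolate other per-vertex quantities, such as motion vectors, which is why the sketch separates the weight computation from the depth lookup.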

    Subjective responses and eye fixations to visual displays of spatial sequences

    In selecting spatial sequences for this study, importance was given to an understanding of the architect's design intentions (Chapters IV and VIII). The displays of the Sydney Opera House have revealed an incomplete structure, and the contrast of some completed spaces with substantial environmental noise of scaffold and workmen's ramps has been an important part of the study. In many cases the provision of a temporary path through space has been accepted by a group of subjects, and its importance has been illustrated from eye fixations to the displays. In Displays 3, 4, 6, 7, 8, 9, 11 and 12 a conflict has not generally arisen between the designer's intentions of subjective experience and the responses of the subjects. But in Displays 1, 2, 5 and 10 the architect's design for the completed building and the recording of space in the incomplete building are at variance. The forty subjects who took part in the experiments represented a range of architectural training in addition to variations of personality and general conditioning. The group of subjects was not a sample of the general population, and for this reason the average creativity score was considerably higher than the average for the general population.

    Image-Based Rendering Of Real Environments For Virtual Reality


    Generative RGB-D face completion for head-mounted display removal

    Head-mounted displays (HMDs) are an essential display device for the observation of virtual reality (VR) environments. However, HMDs obstruct external capturing methods from recording the user's upper face. This severely impacts social VR applications, such as teleconferencing, which commonly rely on external RGB-D sensors to capture a volumetric representation of the user. In this paper, we introduce an HMD removal framework based on generative adversarial networks (GANs), capable of jointly filling in missing color and depth data in RGB-D face images. Our framework includes an RGB-based identity loss function for identity preservation and several components aimed at surface reproduction. Our results demonstrate that our framework is able to remove HMDs from synthetic RGB-D face images while preserving the subject's identity.

    Efficient and High-Quality Rendering of Higher-Order Geometric Data Representations

    In computer-aided design (CAD), industrial products are designed using a virtual 3D model. A CAD model typically consists of curves and surfaces in a parametric representation, in most cases non-uniform rational B-splines (NURBS). The same representation is also used for the analysis, optimization and presentation of the model. In each phase of this process, different visualizations are required to provide appropriate user feedback. Designers work with illustrative and realistic renderings, engineers need a comprehensible visualization of the simulation results, and usability studies or product presentations benefit from using a 3D display. However, the interactive visualization of NURBS models and corresponding physical simulations is a challenging task because of the computational complexity and the limited graphics hardware support. This thesis proposes four novel rendering approaches that improve the interactive visualization of CAD models and their analysis. The presented algorithms exploit the latest graphics hardware capabilities to advance the state of the art in terms of quality, efficiency and performance. In particular, two approaches describe the direct rendering of the parametric representation without precomputed approximations or time-consuming pre-processing steps. New data structures and algorithms are presented for the efficient partition, classification, tessellation, and rendering of trimmed NURBS surfaces, as well as the first direct isosurface ray-casting approach for NURBS-based isogeometric analysis. The other two approaches introduce the versatile concept of programmable order-independent semi-transparency for the illustrative and comprehensible visualization of depth-complex CAD models, and a novel method for the hybrid reprojection of opaque and semi-transparent image information to accelerate stereoscopic rendering. Both approaches are also applicable to standard polygonal geometry, which contributes to the computer graphics and virtual reality research communities. The evaluation is based on real-world NURBS-based models and simulation data. The results show that rendering can be performed directly on the underlying parametric representation with interactive frame rates and subpixel-precise image results. The computational costs of additional visualization effects, such as semi-transparency and stereoscopic rendering, are reduced to maintain interactive frame rates. The benefit of this performance gain was confirmed by quantitative measurements and a pilot user study.

    Tele-Robotics VR with Holographic Vision in Immersive Video

    We present a first-of-its-kind end-to-end tele-robotic VR system where the user operates a robot arm remotely, while being virtually immersed in the scene through force feedback and holographic vision. In contrast to stereoscopic head-mounted displays that only provide depth perception to the user, the holographic vision device projects a light field, additionally allowing the user to correctly accommodate his/her eyes to the perceived depth of the scene's objects. The highly improved immersive user experience results in less fatigue in the tele-operator's daily work, creating safer and/or longer working conditions. The core technology relies on recent advances in immersive video coding for audio-visual transmission developed within the MPEG standardization committee. Virtual viewpoints are synthesized for the tele-operator's viewing direction from a couple of fixed colour and depth video feeds. Besides the display hardware and its GPU-enabled view synthesis driver, the biggest challenge lies in obtaining high-quality and reliable depth images from low-cost depth sensing devices. Specialized depth refinement tools have been developed to run in real time at zero delay within the end-to-end tele-robotic immersive video pipeline, which must by its nature remain interactive. Various modules work asynchronously and efficiently at their own pace, with the acquisition devices typically being limited to 30 frames per second (fps), while the holographic headset updates its projected light field at up to 240 fps. Such a modular approach ensures high genericity over a wide range of free navigation VR/XR applications, also beyond the tele-robotic one presented in this paper.

    Automatic 2D to Stereoscopic Video Conversion for 3DTV

    In this thesis we address the problem of automatically converting a video filmed with a single camera to stereoscopic content tailored for viewing using 3D TVs. We present two techniques: (a) a non-parametric approach which does not require extensive training and produces good results for simple rigid scenes, and (b) a deep learning approach able to handle dynamic changes in the scene. The proposed solutions both include two stages: depth generation and rendering. For the first stage, the non-parametric approach uses an energy-based optimization, and the deep learning approach uses a multi-scale convolutional neural network to address the complex problem of depth estimation from a single image. Depth maps are generated based on the input RGB images. We reformulate and simplify the process of generating the virtual camera’s depth map and present how this can be used to render an anaglyph image. Anaglyph stereo was used for demonstration only because of the easy and wide availability of red/cyan glasses; however, this does not limit the applicability of the proposed technique to other stereo forms. Finally, we have extensively tested the proposed approaches and present the results.
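
    The rendering stage described above turns one view plus per-pixel depth into a red/cyan anaglyph. A generic sketch of that stage for a single scanline follows, assuming integer pixel disparities already derived from the depth map, a simple forward warp in which nearer pixels win, and naive neighbour-based hole filling. All names here are hypothetical, and this is not the reformulated method of the thesis.

```python
def warp_scanline(row, disparity):
    """Forward-warp one scanline: shift each pixel left by its disparity."""
    width = len(row)
    out = [None] * width
    nearest = [-1] * width  # largest disparity written so far per target pixel
    for x in range(width):
        target = x - disparity[x]
        # A larger disparity means a nearer surface, which occludes farther ones.
        if 0 <= target < width and disparity[x] > nearest[target]:
            out[target] = row[x]
            nearest[target] = disparity[x]
    # Fill disocclusion holes from the left neighbour, then from the right.
    for x in range(1, width):
        if out[x] is None:
            out[x] = out[x - 1]
    for x in range(width - 2, -1, -1):
        if out[x] is None:
            out[x] = out[x + 1]
    return out


def anaglyph_scanline(left, right):
    """Red channel from the left view, green and blue from the right view."""
    return [(l[0], r[1], r[2]) for l, r in zip(left, right)]


# One scanline of RGB pixels; the third pixel is nearer (disparity 1).
left_row = [(10, 10, 10), (20, 20, 20), (30, 30, 30), (40, 40, 40)]
disparity_row = [0, 0, 1, 0]
right_row = warp_scanline(left_row, disparity_row)
anaglyph_row = anaglyph_scanline(left_row, right_row)
```

    Only the final channel-mixing step is specific to anaglyph output; the warped view could equally be paired with the original for other stereo formats, consistent with the abstract's remark that the technique is not limited to anaglyph.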

    Deep Industrial Image Anomaly Detection: A Survey

    The recent rapid development of deep learning has laid a milestone in industrial Image Anomaly Detection (IAD). In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics and datasets. In addition, we extract a new setting from industrial manufacturing and review the current IAD approaches under our proposed new setting. Moreover, we highlight several open challenges for image anomaly detection. The merits and downsides of representative network architectures under varying supervision are discussed. Finally, we summarize the research findings and point out future research directions. More resources are available at https://github.com/M-3LAB/awesome-industrial-anomaly-detection

    Automatic 2D-to-3D conversion of single low depth-of-field images

    This research presents a novel approach to the automatic rendering of 3D stereoscopic disparity image pairs from single 2D low depth-of-field (LDOF) images. Initially a depth map is produced through the assignment of depth to every delineated object and region in the image. Subsequently the left and right disparity images are produced through depth-image-based rendering (DIBR). The objects and regions in the image are initially assigned to one of six proposed groups or labels. Labelling is performed in two stages. The first involves the delineation of the dominant object-of-interest (OOI). The second involves the global object and region grouping of the non-OOI regions. The matting of the OOI is also performed in two stages. Initially the in-focus foreground or region-of-interest (ROI) is separated from the out-of-focus background. This is achieved through the correlation of edge, gradient and higher-order statistics (HOS) saliencies. Refinement of the ROI is performed using k-means segmentation and CIEDE2000 colour-difference matching. Subsequently the OOI is extracted from within the ROI through analysis of the dominant gradients and edge saliencies together with k-means segmentation. Depth is assigned to each of the six labels by correlating Gestalt-based principles with vanishing point estimation, gradient plane approximation and depth from defocus (DfD). To minimise some of the dis-occlusions that are generated through the 3D warping sub-process within the DIBR process, the depth map is pre-smoothed using an asymmetric bilateral filter. Hole-filling of the remaining dis-occlusions is performed through nearest-neighbour horizontal interpolation, which incorporates depth as well as direction of warp. To minimise the effects of the lateral striations, specific directional Gaussian and circular averaging smoothing is applied independently to each view, with additional average filtering applied to the border transitions.
    Each stage of the proposed model is benchmarked against data from several significant publications. Novel contributions are made in the sub-speciality fields of ROI estimation, OOI matting, LDOF image classification, Gestalt-based region categorisation, vanishing point detection, relative depth assignment and hole-filling or inpainting. An important contribution is made towards the overall knowledge base of automatic 2D-to-3D conversion techniques, through the collation of existing information, expansion of existing methods and development of newer concepts.