
    Efficient multiview depth representation based on image segmentation

    The persistent improvements witnessed in multimedia production have considerably augmented users' demand for immersive 3D systems. Expedient implementation of this technology, however, entails a significant reduction in the amount of information required for representation. Depth image-based rendering algorithms have considerably reduced the number of images necessary for 3D scene reconstruction; nevertheless, the compression of depth maps still poses several challenges due to the peculiar nature of the data. To this end, this paper proposes a novel depth representation methodology that exploits the intrinsic correlation present between colour intensity and depth images of a natural scene. A segmentation-based approach is implemented which decreases the amount of information necessary for transmission by a factor of 24 with respect to conventional JPEG algorithms, whilst maintaining a quasi-identical reconstruction quality of the 3D views.
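    The abstract does not detail the representation itself, but the core idea it names (one depth value per colour-segmented region instead of a dense map) can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the function name and the use of a per-segment mean are assumptions.

```python
import numpy as np

def piecewise_depth(labels, depth):
    """Approximate a dense depth map by one value per segment.

    labels: integer label map from a colour-image segmentation
    depth:  the original dense depth map
    Returns the piecewise-constant approximation and the per-segment
    values that would actually need to be transmitted.
    """
    approx = np.zeros_like(depth, dtype=float)
    seg_values = {}
    for s in np.unique(labels):
        mask = labels == s
        seg_values[s] = float(depth[mask].mean())  # one value per region
        approx[mask] = seg_values[s]
    return approx, seg_values

# Toy example: two regions with near-constant depth inside each,
# so two transmitted values reconstruct the whole 2x3 map.
labels = np.array([[0, 0, 1], [0, 1, 1]])
depth = np.array([[10.0, 10.0, 50.0], [10.0, 50.0, 50.0]])
approx, vals = piecewise_depth(labels, depth)
```

    The compression gain comes from depth maps being approximately piecewise smooth within objects, so segment boundaries derived from the colour image tend to coincide with depth discontinuities.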

    Performance improvement of segmentation-based depth representation in 3D imagery by region merging

    The feasible implementation of immersive 3D video systems entails the need for a substantial reduction in the amount of image information necessary for representation. Multiview image rendering algorithms based on depth data have radically reduced the number of images required to reconstruct a 3D scene. Nonetheless, the compression of depth maps still poses several challenges due to the particular nature and characteristics of the data. To this end, this paper outlines a depth representation technique, developed in our earlier work, that exploits the correlation intrinsically present between colour intensity and depth images capturing a natural scene. In this technique, a segmentation-based algorithm that is backward compatible with conventional video coding systems is implemented. The effectiveness of our previous technique is enhanced in this contribution by a region merging process on the segmented regions, which decreases the amount of information necessary for transmission or storage of multiview image data by a factor of 20.5 with respect to the reference H.264/AVC coding methodology. This is furthermore achieved whilst maintaining a 3D image reconstruction and viewing quality that is quasi-identical to the reference approach.
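    The region merging step named above can be sketched as a greedy union of neighbouring segments whose representative depths are close, which reduces the number of values to transmit. This is a hedged illustration only; the merging criterion, threshold, and union-find bookkeeping are assumptions, not the paper's exact procedure.

```python
def merge_regions(seg_depth, adjacency, tol=2.0):
    """Greedily merge adjacent segments whose mean depths differ
    by less than `tol`; fewer regions means fewer values to code.

    seg_depth: {segment_id: mean depth of that segment}
    adjacency: iterable of (a, b) pairs of neighbouring segments
    Returns {segment_id: merged_region_id}.
    """
    parent = {s: s for s in seg_depth}

    def find(s):  # union-find with path compression
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in adjacency:
        ra, rb = find(a), find(b)
        if ra != rb and abs(seg_depth[ra] - seg_depth[rb]) < tol:
            parent[rb] = ra  # merge the two regions
    return {s: find(s) for s in seg_depth}

# Segments 0 and 1 are nearly coplanar in depth, so they merge;
# segment 2 lies much farther away and stays separate.
regions = merge_regions({0: 10.0, 1: 11.0, 2: 50.0}, [(0, 1), (1, 2)])
```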

    A Modular Approach for Synchronized Wireless Multimodal Multisensor Data Acquisition in Highly Dynamic Social Settings

    Existing data acquisition literature for human behavior research provides wired solutions, mainly for controlled laboratory setups. In uncontrolled free-standing conversation settings, where participants are free to walk around, these solutions are unsuitable. While wireless solutions are employed in the broadcasting industry, they can be prohibitively expensive. In this work, we propose a modular and cost-effective wireless approach for synchronized multisensor data acquisition of social human behavior. Our core idea involves a cost-accuracy trade-off by using Network Time Protocol (NTP) as a source reference for all sensors. While commonly used as a reference in ubiquitous computing, NTP is widely considered to be insufficiently accurate as a reference for video applications, where Precision Time Protocol (PTP) or Global Positioning System (GPS) based references are preferred. We argue and show, however, that the latency introduced by using NTP as a source reference is adequate for human behavior research, and the subsequent cost and modularity benefits are a desirable trade-off for applications in this domain. We also describe one instantiation of the approach deployed in a real-world experiment to demonstrate the practicality of our setup in the wild.
    9 pages, 8 figures. Proceedings of the 28th ACM International Conference on Multimedia (MM '20), October 12-16, 2020, Seattle, WA, USA. First two authors contributed equally.
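    The NTP-as-source-reference idea amounts to each device periodically measuring its clock's offset to the same NTP reference and subtracting that offset from its local timestamps, so all streams land on a common time axis. A minimal sketch of that correction step, with hypothetical names and made-up offsets (the actual measurement would come from an NTP exchange):

```python
def to_reference_time(local_ts, ntp_offset):
    """Map a device-local timestamp onto the shared NTP time base.

    ntp_offset is the (device clock - NTP reference) offset that the
    device measures periodically via an NTP query.
    """
    return local_ts - ntp_offset

# Two devices whose clocks disagree by 45 ms still produce
# timestamps on a common axis once their measured offsets
# are removed: both events below map to t = 100.000 s.
cam_ts = to_reference_time(100.030, ntp_offset=0.030)
mic_ts = to_reference_time(99.985, ntp_offset=-0.015)
```

    The residual error is then bounded by the NTP synchronisation error (typically a few milliseconds over the public internet, less on a LAN), which the paper argues is adequate for human behavior annotation, unlike frame-accurate broadcast use cases.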

    3D data presentation

    This bachelor's thesis presents methods for conveying a spatial (3D) impression to an observer. It surveys stereoscopic methods such as SIRDS, anaglyphs, and projection of stereo image pairs using two projectors with anaglyphic or polarising filters. The procedure for generating views of a 3D scene by means of so-called depth-image-based rendering (DIBR) techniques is analysed as well. The thesis also gives an overview of technologies and data formats for the projection and transmission of spatial images.
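    The DIBR view generation the thesis analyses boils down to warping each reference pixel horizontally by its disparity d = f·b/Z (focal length times baseline over depth). A one-dimensional sketch of that warp, simplified to a single row and an assumed constant f·b, with nearer pixels winning occlusion conflicts and unfilled positions left as disocclusion holes:

```python
import numpy as np

def dibr_shift(image_row, depth_row, f_b=100.0):
    """1-D sketch of depth-image-based rendering.

    Shifts each pixel horizontally by its disparity d = f_b / Z
    (f_b: focal length times camera baseline, assumed constant).
    Painting far-to-near lets near pixels overwrite far ones;
    positions no pixel maps to remain -1 (disocclusion holes).
    """
    w = len(image_row)
    out = np.full(w, -1, dtype=float)
    order = np.argsort(depth_row)[::-1]  # farthest first
    for x in order:
        d = int(round(f_b / depth_row[x]))
        nx = x - d
        if 0 <= nx < w:
            out[nx] = image_row[x]
    return out

# Near pixels (Z=50) shift by 2, far pixels (Z=100) by 1;
# the right side of the virtual view is left as holes.
row = dibr_shift(np.array([1.0, 2.0, 3.0, 4.0]),
                 np.array([100.0, 100.0, 50.0, 50.0]))
```

    Real DIBR pipelines follow this warp with hole filling (inpainting or background extrapolation), which is where most of the visible rendering artifacts originate.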

    Cost-effective solution to synchronised audio-visual data capture using multiple sensors

    Applications such as surveillance and human behaviour analysis require high-bandwidth recording from multiple cameras, as well as from other sensors. In turn, sensor fusion has increased the required accuracy of synchronisation between sensors. Using commercial off-the-shelf components may compromise quality and accuracy due to several challenges, such as dealing with the combined data rate from multiple sensors; unknown offset and rate discrepancies between independent hardware clocks; the absence of trigger inputs or outputs in the hardware; and the different methods for time-stamping the recorded data. To achieve accurate synchronisation, we centralise the synchronisation task by recording all trigger or timestamp signals with a multi-channel audio interface. For sensors that do not have an external trigger signal, we let the computer that captures the sensor data periodically generate timestamp signals from its serial port output. These signals can also be used as a common time base to synchronise multiple asynchronous audio interfaces. Furthermore, we show that a consumer PC can currently capture 8-bit video data with 1024 × 1024 spatial and 59.1 Hz temporal resolution from at least 14 cameras, together with 8 channels of 24-bit audio at 96 kHz. We thus improve the quality/cost ratio of multi-sensor data capture systems.
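    The "unknown offset and rate discrepancies between independent hardware clocks" that the abstract mentions are recoverable once the same timestamp signals appear on both clocks: a linear fit over the matched observations yields the rate and offset of one clock relative to the other. A minimal sketch of that estimation (the function name and the toy 100 ppm drift are assumptions):

```python
import numpy as np

def fit_clock_map(t_device, t_reference):
    """Estimate the rate and offset between two clocks from matched
    timestamp-signal observations (e.g. serial-port pulses recorded
    on the multi-channel audio interface), so that
    t_reference ~= rate * t_device + offset.
    """
    rate, offset = np.polyfit(t_device, t_reference, 1)
    return rate, offset

# Device clock runs 100 ppm fast and starts 0.25 s ahead of the
# reference; four matched pulses are enough to recover the map.
t_dev = np.array([0.0, 10.0, 20.0, 30.0])
t_ref = (t_dev - 0.25) / 1.0001
rate, offset = fit_clock_map(t_dev, t_ref)
```

    In practice one would fit over a sliding window, since oscillator drift is temperature-dependent and a single global line degrades over long recordings.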

    A Flexible Client-Driven 3DTV System for Real-Time Acquisition, Transmission, and Display of Dynamic Scenes

    3D experience and free-viewpoint navigation are expected to be two essential features of next generation television. In this paper, we present a flexible 3DTV system in which multiview video streams are captured, compressed, transmitted, and finally converted to high-quality 3D video in real time. Our system consists of an 8 × 8 camera array, 16 producer PCs, a streaming server, multiple clients, and several autostereoscopic displays. The whole system is implemented over an IP network to provide multiple users with interactive 2D/3D switching, viewpoint control, and view synthesis for dynamic scenes. In our approach, multiple video streams are first captured by a synchronized camera array. Then, we adopt a lengthened-B-field and region-of-interest (ROI) based coding scheme to guarantee seamless view switching for each user while saving per-user transmission bandwidth. Finally, a convenient rendering algorithm is used to synthesize a visually pleasing result by introducing a new metric called Clarity Degree (CD). Experiments on both synthetic and real-world data have verified the feasibility, flexibility, and good performance of our system.
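    The client-driven aspect implies each client requests only the camera streams needed for its current viewpoint rather than all 64. A simplified one-axis sketch of that selection, reduced to picking the two neighbouring cameras and a blend weight; this is an illustration of the general free-viewpoint pattern, not the paper's specific scheme, and all names are assumed:

```python
import math

def nearest_views(viewpoint, grid=8):
    """For a continuous viewpoint coordinate in [0, grid-1], pick
    the two neighbouring camera indices whose streams the client
    requests, plus the interpolation weight toward the right view.
    (Simplified to one axis of the 8 x 8 array.)
    """
    left = min(int(math.floor(viewpoint)), grid - 2)
    right = left + 1
    weight = viewpoint - left  # 0.0 = pure left view, 1.0 = pure right
    return left, right, weight

# A viewpoint a quarter of the way between cameras 3 and 4
# needs only those two streams, blended 75/25.
left, right, weight = nearest_views(3.25)
```

    Seamless switching then reduces to prefetching the neighbours of the current pair, which is what schemes like the paper's lengthened-B-field coding make cheap at the bitstream level.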

    Enhanced e-learning and simulation for obstetrics education

    Background: In medicine, new media technologies have been used in recent years to simulate situations and techniques that may not be common enough for students to experience in reality, or that may not be visible to the naked eye. Especially in areas of medicine focusing on important surgeries or procedures, these simulated designs could train students and ultimately prevent possible risk or morbidity. Aims: The aim of this thesis was to develop a multipurpose hybrid educational resource based on a physical/software-driven simulator platform, enabling the use of multimedia such as 3D content and video to enhance the training of obstetrics students through haptic interactions. All of this content was informed by the learning preferences of the obstetrics students involved. Method: The learning resource was developed using a combination of student learning preferences, online learning content, 3D content, video, human patient simulations, and sensor technology interaction, all interconnected to create a multipurpose resource. The learning preferences were collected through an online student survey, the results of which informed the creation of the other aspects of the finished resource. The interactive aspects were created using position and orientation sensors together with the 3D/video content, which localised the position and orientation of an object such as a fetal model relative to a human patient simulator. All of these methods, combined with added assessment contributions from obstetrics tutors, enabled the finalising of a prototype. Conclusion: This form of learning resource has a vital role in progressing higher-level education in the digital age. The result is a new type of joint simulator that allows students and practitioners to physically involve themselves in a series of procedures while assessing their own progress through real-time digital feedback in the form of video narrative and analytics. Due to time limitations, usability testing was conducted only on the video platform rather than on the full resource.