
    Object tracking and matting for a class of dynamic image-based representations

    Image-based rendering (IBR) is an emerging technology for photo-realistic rendering of scenes from a collection of densely sampled images and videos. Recently, an object-based approach was proposed for a class of dynamic image-based representations called plenoptic videos. This paper proposes an automatic object tracking approach using the level-set method. Our tracking method, which utilizes both local and global features of the image sequences instead of only the global features exploited in the previous approach, achieves better tracking results, especially for objects with non-uniform energy distributions. Because of possible segmentation errors around object boundaries, natural matting with a Bayesian approach is also incorporated into our system. Furthermore, an MPEG-4-like object-based algorithm is developed for compressing the plenoptic videos, which consist of the alpha maps, depth maps, and textures of the segmented image-based objects from the different plenoptic video streams. Experimental results show that satisfactory renderings can be obtained by the proposed approaches. © 2005 IEEE.
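
    Where the abstract incorporates Bayesian matting to soften segmentation errors around object boundaries, the underlying model is alpha compositing, I = αF + (1 − α)B. The sketch below is illustrative only, not the authors' implementation: the function names and the dilation-based trimap are assumptions about how a boundary band would be marked as "unknown" for a matting solver and how the resulting alpha is composited.

```python
# Minimal sketch (not the authors' code): alpha compositing and a border
# trimap of the kind a Bayesian matting step would refine.
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def composite(fg, bg, alpha):
    """Blend foreground over background: I = alpha*F + (1 - alpha)*B."""
    a = alpha[..., None]                      # broadcast alpha over RGB
    return a * fg + (1.0 - a) * bg

def border_trimap(mask, width=5):
    """Mark a band around the binary segmentation boundary as 'unknown'
    (0 = background, 0.5 = unknown, 1 = foreground)."""
    grown = binary_dilation(mask, iterations=width)
    shrunk = binary_erosion(mask, iterations=width)
    trimap = np.full(mask.shape, 0.5)
    trimap[shrunk] = 1.0
    trimap[~grown] = 0.0
    return trimap

if __name__ == "__main__":
    h, w = 120, 160
    fg = np.random.rand(h, w, 3)
    bg = np.random.rand(h, w, 3)
    mask = np.zeros((h, w), dtype=bool)
    mask[30:90, 40:120] = True                # toy segmentation result
    tri = border_trimap(mask)
    # A matting solver would estimate alpha inside the unknown band;
    # here the trimap itself stands in for the estimated alpha.
    out = composite(fg, bg, tri)
    print(out.shape, tri.min(), tri.max())
```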

    Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications

    Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained by the lack of essential content, i.e., stereoscopic videos. To alleviate this content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues such as focus blur, motion, and size, the quality of the resulting video may be poor, as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow based occlusion reasoning in determining depth ordinal, ii) object segmentation using improved region-growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-ordinal based regularization. Comprehensive experiments have validated the effectiveness of our proposed 2D-to-3D conversion method in generating stereoscopic videos of consistent depth measurements for 3D-TV applications.
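
    The final step of a 2D-to-3D pipeline of this kind is rendering a stereoscopic pair from the estimated per-pixel depth. The sketch below is a generic depth-image-based rendering pass, not the paper's method: pixels are forward-warped horizontally by a disparity derived from inverse depth, with a z-buffer so nearer pixels win; the disparity scaling and names are assumptions.

```python
import numpy as np

def render_stereo_view(image, depth, max_disparity=16):
    """Synthesize a second view by shifting pixels horizontally.
    Closer pixels (larger inverse depth) get larger disparity; simple
    forward warping with a z-buffer, holes left as zeros."""
    h, w = depth.shape
    inv = 1.0 / np.maximum(depth, 1e-6)
    inv = (inv - inv.min()) / (inv.max() - inv.min() + 1e-6)  # normalize to [0, 1]
    disparity = (inv * max_disparity).astype(int)

    view = np.zeros_like(image)
    zbuf = np.full((h, w), -np.inf)
    for y in range(h):
        for x in range(w):
            xt = x - disparity[y, x]                     # shift toward the left
            if 0 <= xt < w and inv[y, x] > zbuf[y, xt]:  # nearer pixel wins
                zbuf[y, xt] = inv[y, x]
                view[y, xt] = image[y, x]
    return view

if __name__ == "__main__":
    img = np.random.rand(60, 80, 3)
    depth = np.linspace(1.0, 5.0, 80)[None, :].repeat(60, axis=0)
    right = render_stereo_view(img, depth)
    print(right.shape)
```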

    A multi-camera approach to image-based rendering and 3-D/Multiview display of ancient Chinese artifacts

    Object-based coding for plenoptic videos

    A new object-based coding system for a class of dynamic image-based representations called plenoptic videos (PVs) is proposed. PVs are simplified dynamic light fields, where the videos are taken at regularly spaced locations along line segments instead of a 2-D plane. In the proposed object-based approach, objects at different depth values are segmented to improve the rendering quality. By encoding PVs at the object level, desirable functionalities such as scalability of contents, error resilience, and interactivity with an individual image-based rendering (IBR) object can be achieved. Besides supporting the coding of texture and binary shape maps for IBR objects with arbitrary shapes, the proposed system also supports the coding of grayscale alpha maps as well as depth maps (geometry information) to respectively facilitate the matting and rendering of the IBR objects. Both temporal and spatial redundancies among the streams in the PV are exploited to improve the coding performance, while avoiding excessive complexity in selective decoding of PVs to support fast rendering speed. Advanced spatial/temporal prediction methods such as global disparity-compensated prediction, as well as direct prediction and its extensions, are developed. A bit allocation and rate control scheme employing a new convex optimization-based approach is also introduced. Experimental results show that considerable improvements in coding performance are obtained for both synthetic and real scenes, while supporting the stated object-based functionalities. © 2006 IEEE.
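
    Among the prediction tools listed, global disparity-compensated prediction exploits the fact that neighbouring streams of a PV see almost the same scene from slightly shifted viewpoints. A minimal sketch, assuming a horizontally aligned linear camera array (the single-shift model and names are illustrative, not the paper's coder): estimate one global horizontal shift between streams by minimizing SAD and code only the residual against the shifted reference.

```python
import numpy as np

def estimate_global_disparity(ref, cur, max_shift=32):
    """Find the single horizontal shift of `ref` that best predicts `cur`
    (smallest mean absolute difference over the overlapping columns)."""
    best_shift, best_sad = 0, np.inf
    for d in range(-max_shift, max_shift + 1):
        if d >= 0:
            sad = np.abs(cur[:, d:] - ref[:, :ref.shape[1] - d]).mean()
        else:
            sad = np.abs(cur[:, :d] - ref[:, -d:]).mean()
        if sad < best_sad:
            best_sad, best_shift = sad, d
    return best_shift

def disparity_compensated_residual(ref, cur, shift):
    """Predict `cur` by shifting `ref` globally and return the residual
    that an encoder would actually code."""
    pred = np.roll(ref, shift, axis=1)        # simple wrap-around shift
    return cur - pred

if __name__ == "__main__":
    ref = np.random.rand(48, 64)                              # neighbouring stream
    cur = np.roll(ref, 5, axis=1) + 0.01 * np.random.rand(48, 64)
    d = estimate_global_disparity(ref, cur)
    res = disparity_compensated_residual(ref, cur, d)
    print("estimated shift:", d, "residual energy:", float((res ** 2).mean()))
```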

    Image-based rendering and synthesis

    Multiview imaging (MVI) is currently an active research focus, as it has a wide range of applications and opens up research in other topics and applications, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, multiview systems need a large amount of storage and are difficult to construct. Image-based rendering (IBR) is the concept behind allowing 3D scenes and objects to be visualized in a realistic way without full 3D model reconstruction. Using images as the primary substrate, IBR has many potential applications, including video games, virtual travel, and others. The technique creates new views of scenes reconstructed from a collection of densely sampled images or videos. IBR representations fall into different classes: in one, the 3D models and lighting conditions are known and views are rendered using conventional graphics techniques; in another, light field or lumigraph rendering relies on dense sampling with little or no geometry, so new views can be rendered without recovering exact 3D models.
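
    For the second class, a toy example of light field rendering along a linear camera array is sketched below (an illustration under assumed names, not code from the chapter): the requested viewpoint is synthesized by blending the two captured views that bracket it, weighted by proximity.

```python
import numpy as np

def render_view(images, positions, target):
    """Linearly blend the two captured views nearest to `target`.
    `images`: list of HxWx3 arrays; `positions`: sorted 1-D camera positions."""
    positions = np.asarray(positions, dtype=float)
    if target <= positions[0]:
        return images[0]
    if target >= positions[-1]:
        return images[-1]
    j = int(np.searchsorted(positions, target))  # first camera at or beyond target
    i = j - 1
    t = (target - positions[i]) / (positions[j] - positions[i])
    return (1.0 - t) * images[i] + t * images[j]

if __name__ == "__main__":
    cams = [np.full((4, 4, 3), v, dtype=float) for v in (0.0, 0.5, 1.0)]
    view = render_view(cams, positions=[0.0, 1.0, 2.0], target=1.5)
    print(view[0, 0])   # halfway between the second and third camera images
```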

    An object-based compression system for a class of dynamic image-based representations

    SPIE Conference on Visual Communications and Image Processing, Beijing, China, 12-15 July 2005. This paper proposes a new object-based compression system for a class of dynamic image-based representations called plenoptic videos (PVs). PVs are simplified dynamic light fields, where the videos are taken at regularly spaced locations along line segments instead of a 2-D plane. The proposed system employs an object-based approach, where objects at different depth values are segmented to improve the rendering quality, as in the pop-up light fields. Furthermore, by coding the plenoptic video at the object level, desirable functionalities such as scalability of contents, error resilience, and interactivity with individual IBR objects can be achieved. Besides supporting the coding of the texture and binary shape maps for IBR objects with arbitrary shapes, the proposed system also supports the coding of gray-scale alpha maps as well as geometry information in the form of depth maps to respectively facilitate the matting and rendering of the IBR objects. To improve the coding performance, the proposed compression system exploits both the temporal redundancy and spatial redundancy among the video object streams in the PV by employing disparity-compensated prediction or spatial prediction in its texture, shape, and depth coding processes. To demonstrate the principle and effectiveness of the proposed system, a multiple-video-camera system was built, and experimental results show that considerable improvements in coding performance are obtained for both synthetic and real scenes, while supporting the stated object-based functionalities.
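
    As in the journal version above, the encoder chooses between prediction within a stream and disparity-compensated prediction across streams. A hypothetical per-block mode decision is sketched below (illustrative only; the block size, search range, and names are assumptions): each block is matched against both references and the one with the smaller SAD is kept.

```python
import numpy as np

def best_block_prediction(block, temporal_ref, spatial_ref, y, x, search=8):
    """Search both references around (y, x) and return the mode and SAD of
    the better match ('temporal' = same stream, 'spatial' = neighbour stream)."""
    bs = block.shape[0]

    def best_sad(ref):
        best = np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy <= ref.shape[0] - bs and 0 <= xx <= ref.shape[1] - bs:
                    cand = ref[yy:yy + bs, xx:xx + bs]
                    best = min(best, float(np.abs(block - cand).sum()))
        return best

    sad_t, sad_s = best_sad(temporal_ref), best_sad(spatial_ref)
    return ("temporal", sad_t) if sad_t <= sad_s else ("spatial", sad_s)

if __name__ == "__main__":
    prev = np.random.rand(64, 64)                 # previous frame, same stream
    neighbour = np.random.rand(64, 64)            # co-temporal frame, next stream
    cur = neighbour.copy()                        # current frame matches the neighbour
    mode, sad = best_block_prediction(cur[16:24, 16:24], prev, neighbour, 16, 16)
    print(mode, sad)                              # expect 'spatial', SAD ~ 0
```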

    Improving motion compensation methods for moving dynamic objects in the video stream of a videoconferencing system

    Videoconferencing gives us the opportunity to work and communicate in real time, as well as to use collaborative applications and interactive information exchange. Videoconferencing systems are one of the basic components of the organization of management, ensuring the timeliness and the necessary quality of control over the solution of assigned tasks. The image quality and the transmission time of video information are, however, unsatisfactory for quality control of the troops. Ways to increase the efficiency of management and operational activities are considered, based on motion compensation methods and technologies that reduce the volume of video data in order to improve quality.
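
    The data reduction the abstract appeals to comes from block-based motion compensation: only motion vectors and a residual need to be transmitted instead of raw frames. A minimal sketch under those generic assumptions (not the paper's specific method):

```python
import numpy as np

def motion_compensate(prev, cur, block=8, search=4):
    """Return per-block motion vectors and the prediction residual:
    only the vectors and residual would need to be transmitted."""
    h, w = cur.shape
    vectors = {}
    residual = np.zeros_like(cur)
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            target = cur[by:by + block, bx:bx + block]
            best, best_mv = np.inf, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y, x = by + dy, bx + dx
                    if 0 <= y <= h - block and 0 <= x <= w - block:
                        cand = prev[y:y + block, x:x + block]
                        sad = float(np.abs(target - cand).sum())
                        if sad < best:
                            best, best_mv = sad, (dy, dx)
            dy, dx = best_mv
            vectors[(by, bx)] = best_mv
            residual[by:by + block, bx:bx + block] = (
                target - prev[by + dy:by + dy + block, bx + dx:bx + dx + block])
    return vectors, residual

if __name__ == "__main__":
    prev = np.random.rand(32, 32)
    cur = np.roll(prev, shift=(0, 2), axis=(0, 1))     # camera pan of 2 pixels
    mv, res = motion_compensate(prev, cur)
    print("mean |residual|:", float(np.abs(res).mean()))
```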

    Semi-Automatic Video Object Extraction Using Alpha Matting Based on Motion Estimation

    Object extraction is an important task in video editing applications, because independent objects are needed for compositing. Extraction is performed by image matting: manual scribbles are first defined to represent the foreground and background regions, and the unknown region is resolved by alpha estimation. Two problems arise: pixels in the unknown region do not belong unambiguously to either the foreground or the background, and, in the temporal domain, it is impractical to define scribbles independently for every frame. To address these problems, an object extraction method is proposed with three stages: adaptive threshold estimation for alpha matting, accuracy improvement for image matting, and temporal constraint estimation for scribble propagation. The Fuzzy C-Means (FCM) and Otsu algorithms are applied for adaptive threshold estimation. With FCM, evaluation using the Mean Squared Error (MSE) shows that the average pixel error per frame is reduced from 30,325.10 to 26,999.33, while Otsu yields 28,921.70. The matting quality degraded by intensity changes in compressed images is restored using the two-dimensional Discrete Cosine Transform (DCT-2D), which reduces the Root Mean Squared Error (RMSE) from 16.68 to 11.44. Temporal constraint estimation for scribble propagation is performed by predicting motion vectors from the current frame to the next. The exhaustive-search motion vector prediction is improved by defining a search matrix sized dynamically to the scribble; motion vectors are determined by the Sum of Absolute Differences (SAD) between the current and next frames. Applied in the RGB color space, this reduces the average pixel error per frame from 3,058.55 to 1,533.35, and to 1,662.83 in the HSV color space. The proposed framework, KiMoHar, comprises three contributions: first, image matting with an adaptive FCM threshold improves accuracy by 11.05%; second, matting quality improvement on compressed images using DCT-2D improves accuracy by 31.41%; and third, temporal constraint estimation improves accuracy by 56.30% in the RGB color space and by 52.61% in the HSV color space.
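
    The third stage, temporal constraint estimation, propagates scribbles with an exhaustive SAD search. A minimal sketch of that idea (the binary scribble mask, bounding-box matching, and names are assumptions, not the thesis code): estimate one motion vector for the scribble region and translate the mask into the next frame.

```python
import numpy as np

def propagate_scribble(cur, nxt, scribble, search=8):
    """Shift a binary scribble mask from `cur` to `nxt` using the motion
    vector that minimizes SAD over the scribble's bounding box."""
    ys, xs = np.nonzero(scribble)
    y0, y1, x0, x1 = ys.min(), ys.max() + 1, xs.min(), xs.max() + 1
    patch = cur[y0:y1, x0:x1]
    h, w = cur.shape

    best, best_mv = np.inf, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy0, xx0 = y0 + dy, x0 + dx
            yy1, xx1 = y1 + dy, x1 + dx
            if 0 <= yy0 and yy1 <= h and 0 <= xx0 and xx1 <= w:
                sad = float(np.abs(nxt[yy0:yy1, xx0:xx1] - patch).sum())
                if sad < best:
                    best, best_mv = sad, (dy, dx)

    dy, dx = best_mv
    moved = np.zeros_like(scribble)
    moved[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = scribble[y0:y1, x0:x1]
    return moved, best_mv

if __name__ == "__main__":
    cur = np.random.rand(64, 64)
    nxt = np.roll(cur, shift=(2, 3), axis=(0, 1))       # object moves by (2, 3)
    scribble = np.zeros((64, 64), dtype=bool)
    scribble[20:30, 20:40] = True
    moved, mv = propagate_scribble(cur, nxt, scribble)
    print("estimated motion vector:", mv)               # expect (2, 3)
```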

    Alternative clustering methods, sub-pixel accurate object extraction from still images, and generic video segmentation

    This paper presents a practical approach for object extraction from still images and video sequences that is both simple to use and easy to implement. Many image segmentation projects focus on special cases or rely on complicated heuristics and classifiers to cope with every special case. The presented approach focuses on typical pictures and videos taken from everyday life, working under the assumption that the foreground objects are sufficiently perceptually different from the background. The approach incorporates experience and user feedback from several projects that have already integrated the algorithm. The segmentation runs in real time for video, is robust to noise, and provides sub-pixel accuracy for still images.
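
    Given the paper's assumption that foreground objects are perceptually distinct from the background, a clustering-based extraction can be sketched in a few lines (a toy stand-in, not the authors' algorithm): cluster pixel colors with k-means and treat clusters that dominate the image border as background.

```python
import numpy as np

def extract_foreground(image, k=3, iters=10):
    """Toy clustering-based segmentation: k-means on RGB values, then call a
    cluster 'background' if it is over-represented along the image border."""
    h, w, _ = image.shape
    pixels = image.reshape(-1, 3).astype(float)

    # Deterministic init: pick pixels spread across the luminance range.
    order = np.argsort(pixels.mean(axis=1))
    centers = pixels[order[np.linspace(0, len(pixels) - 1, k).astype(int)]].copy()

    for _ in range(iters):                      # plain Lloyd iterations
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = pixels[labels == c].mean(axis=0)

    labels = labels.reshape(h, w)
    border = np.concatenate([labels[0], labels[-1], labels[:, 0], labels[:, -1]])
    background = [c for c in range(k) if np.mean(border == c) > np.mean(labels == c)]
    return ~np.isin(labels, background)         # True where 'foreground'

if __name__ == "__main__":
    img = np.zeros((40, 60, 3))
    img[:, :] = (0.1, 0.1, 0.1)                 # dark background
    img[10:30, 20:45] = (0.9, 0.2, 0.2)         # bright object
    mask = extract_foreground(img)
    print("foreground pixels:", int(mask.sum()))   # ~500 for the toy image
```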

    A Survey on Video-based Graphics and Video Visualization
