82 research outputs found

Surface Reconstruction and Evolution from Multiple Views

Author: Jana Soumya
M Sudhakar
Publication date: 01/01/2010

Applications like 3D telepresence necessitate faithful 3D surface reconstruction of the object and 3D data compression in both the spatial and temporal domains. Together these make users feel immersed in virtual environments, thereby making 3D telepresence a powerful tool in many applications. Hence 3D surface reconstruction and 3D compression are the two challenging problems addressed in this thesis.

Research Archive of Indian Institute of Technology Hyderabad

High-Level Synthesis Based VLSI Architectures for Video Coding

Author: Ahmad Waqar
Publication venue: Politecnico di Torino
Publication date: 01/01/2017

High Efficiency Video Coding (HEVC) is the state-of-the-art video coding standard. Emerging applications such as free-viewpoint video, 360-degree video, augmented reality, and 3D movies require standardized extensions of HEVC. These extensions include HEVC Scalable Video Coding (SHVC), HEVC Multiview Video Coding (MV-HEVC), MV-HEVC + Depth (3D-HEVC), and HEVC Screen Content Coding. 3D-HEVC is used for applications such as view synthesis and free-viewpoint video. The depth maps coded and transmitted in 3D-HEVC are used for virtual view synthesis by algorithms such as Depth Image Based Rendering (DIBR). As a first step, we profiled the 3D-HEVC standard and identified its computationally intensive parts for efficient hardware implementation. One of the computationally intensive parts of 3D-HEVC, HEVC, and H.264/AVC is the interpolation filtering used for Fractional Motion Estimation (FME). The hardware implementation of the interpolation filtering is carried out using High-Level Synthesis (HLS) tools; the Xilinx Vivado Design Suite is used for the HLS implementation of the interpolation filters of HEVC and H.264/AVC. The complexity of digital systems has greatly increased. High-Level Synthesis is a methodology that offers substantial benefits: late architectural or functional changes can be made without time-consuming rewriting of RTL code, algorithms can be tested and evaluated early in the design cycle, and accurate models can be developed against which the final hardware can be verified.
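
The interpolation filtering mentioned in the abstract can be illustrated with a minimal sketch. It applies the 8-tap half-sample luma filter defined in the HEVC standard to one row of 8-bit samples. The function name `half_sample_row`, the edge replication at row boundaries, and the single-stage rounding are assumptions of this sketch; it is a software illustration only, not the HLS hardware design described in the thesis.

```python
# Half-sample luma filter taps from the HEVC standard (they sum to 64).
HEVC_HALF_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)


def half_sample_row(row, bit_depth=8):
    """Return the half-sample value between row[i] and row[i + 1] for each i.

    Samples outside the row are replaced by the nearest edge sample
    (an assumption of this sketch for handling the boundary).
    """
    n = len(row)
    max_val = (1 << bit_depth) - 1
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(HEVC_HALF_TAPS):
            # The taps cover integer positions i-3 .. i+4; clamp to the row.
            j = min(max(i - 3 + k, 0), n - 1)
            acc += tap * row[j]
        # Round, normalise by 64, and clip to the valid sample range.
        out.append(min(max((acc + 32) >> 6, 0), max_val))
    return out


flat = half_sample_row([100] * 10)
ramp = half_sample_row(list(range(0, 100, 10)))
```

For comparison, H.264/AVC uses a shorter 6-tap half-sample luma filter, (1, -5, 20, 20, -5, 1) / 32. The fixed integer taps and shift-based normalisation are what make this kind of filter a natural candidate for hardware implementation.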

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

3D Medical Image Lossless Compressor Using Deep Learning Approaches

Author: Omniah Nagoor
Publication venue: Swansea University
Publication date: 01/01/2022

Accelerated information processing, communication, and storage are ever more important requirements of the big-data era. With the extensive rise in data availability, convenient information acquisition, and growing data rates, efficient data handling emerges as a critical challenge. Even with advanced hardware developments and the availability of multiple Graphics Processing Units (GPUs), there is still strong demand to utilise these technologies effectively. Healthcare systems are one of the domains yielding explosive data growth, especially considering the abilities of modern scanners, which each year produce higher-resolution and more densely sampled medical images with increasing requirements for massive storage capacity. The bottleneck in data transmission and storage would essentially be handled by an effective compression method. Since medical information is critical and plays an influential role in diagnostic accuracy, it is strongly encouraged to guarantee exact reconstruction with no loss in quality, which is the main objective of any lossless compression algorithm. The revolutionary impact of Deep Learning (DL) methods, which achieve state-of-the-art results in many tasks including data compression, opens tremendous opportunities for contributions. While considerable efforts have been made to address lossy performance using learning-based approaches, less attention has been paid to lossless compression. This PhD thesis investigates and proposes novel learning-based approaches for compressing 3D medical images losslessly. Firstly, we formulate the lossless compression task as a supervised sequential prediction problem, whereby a model learns a projection function to predict a target voxel given a sequence of samples from its spatially surrounding voxels. Using such 3D local sampling information in this prediction paradigm efficiently exploits spatial similarities and redundancies in a volumetric medical context. The proposed NN-based data predictor is trained to minimise the differences with the original data values, while the residual errors are encoded using arithmetic coding to allow lossless reconstruction. Following this, we explore the effectiveness of Recurrent Neural Networks (RNNs) as a 3D predictor for learning the mapping function from the spatial medical domain (16 bit-depths). We analyse Long Short-Term Memory (LSTM) models' generalisability and robustness in capturing the 3D spatial dependencies of a voxel's neighbourhood while utilising samples taken from various scanning settings. We evaluate our proposed MedZip models in compressing unseen Computerized Tomography (CT) and Magnetic Resonance Imaging (MRI) modalities losslessly, compared to other state-of-the-art lossless compression standards. This work investigates input configurations and sampling schemes for a many-to-one sequence prediction model, specifically for compressing 3D medical images (16 bit-depths) losslessly. The main objective is to determine the optimal practice for enabling the proposed LSTM model to achieve a high compression ratio and fast encoding-decoding performance. A solution to the problem of non-deterministic environments was also proposed, allowing models to run in parallel without much drop in compression performance. Experimental evaluations against well-known lossless codecs were carried out on datasets acquired by different hospitals, representing different body segments, and with distinct scanning modalities (i.e. CT and MRI). To conclude, we present a novel data-driven sampling scheme utilising weighted gradient scores for training LSTM prediction-based models. The objective is to determine whether some training samples are significantly more informative than others, specifically in medical domains where samples are available on a scale of billions. The effectiveness of models trained on the presented importance sampling scheme was evaluated compared to alternative strategies such as uniform, Gaussian, and sliced-based sampling.
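
The prediction-plus-residual idea in this abstract can be sketched in a few lines. The example below is a simplified illustration, not the thesis's method: it swaps the learned LSTM predictor for a fixed integer mean of three already-visited neighbours, and it omits the arithmetic coding stage, showing only that storing prediction residuals permits exact reconstruction. The function names `predict`, `encode`, and `decode` are hypothetical.

```python
def predict(vol, z, y, x):
    """Integer mean of the already-visited neighbours of voxel (z, y, x).

    Stand-in for a learned predictor: uses the left voxel, the voxel above,
    and the voxel in the previous slice, when they exist.
    """
    neighbours = []
    if x > 0:
        neighbours.append(vol[z][y][x - 1])
    if y > 0:
        neighbours.append(vol[z][y - 1][x])
    if z > 0:
        neighbours.append(vol[z - 1][y][x])
    return sum(neighbours) // len(neighbours) if neighbours else 0


def encode(vol):
    """Return the residual (actual minus predicted) for every voxel."""
    return [[[vol[z][y][x] - predict(vol, z, y, x)
              for x in range(len(vol[0][0]))]
             for y in range(len(vol[0]))]
            for z in range(len(vol))]


def decode(res):
    """Rebuild the volume in raster order from the residuals alone."""
    depth, height, width = len(res), len(res[0]), len(res[0][0])
    vol = [[[0] * width for _ in range(height)] for _ in range(depth)]
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                # Only voxels decoded earlier are read by predict().
                vol[z][y][x] = res[z][y][x] + predict(vol, z, y, x)
    return vol


volume = [[[1000, 1002], [1001, 1003]],
          [[1004, 1006], [1005, 1007]]]
residuals = encode(volume)
restored = decode(residuals)
```

Because the decoder repeats exactly the predictions the encoder made, `restored` equals `volume`. In a real codec the residuals, which cluster near zero when the predictor is good, would then be passed to an entropy coder.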

Cronfa at Swansea University

Pattern Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

A wealth of advanced pattern recognition algorithms is emerging at the intersection of effective visual feature technologies and the study of the human-brain cognition process. Effective visual features are made possible through rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. The present book collects representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology, and applications of pattern recognition.

Directory of Open Access Books (DOAB)

A survey on compact features for visual content analysis

Author: Baroffio Luca
Redondi Alessandro E. C
Tagliasacchi Marco
Tubaro Stefano
Publication venue: Cambridge University Press (CUP)
Publication date: 01/01/2016

Archivio istituzionale della ricerca - Politecnico di Milano

Deep Video Compression

Author: Ma Di
Publication date: 24/06/2021

Explore Bristol Research

Model based estimation of image depth and displacement

Author: Damour Kevin T.

Passive depth and displacement map determinations have become an important part of computer vision processing. Applications that make use of this type of information include autonomous navigation, robotic assembly, image sequence compression, structure identification, and 3-D motion estimation. Because such systems rely on visual image characteristics, it is clearly necessary to overcome image degradations such as random image-capture noise, motion, and quantization effects. Many depth and displacement estimation algorithms also introduce additional distortions due to the gradient operations performed on the noisy intensity images. These degradations can limit the accuracy and reliability of the displacement or depth information extracted from such sequences. In view of these conditions, a new method to model and estimate a restored depth or displacement field is presented. Once a model has been established, the field can be filtered using currently established multidimensional algorithms. In particular, the reduced order model Kalman filter (ROMKF), which has been shown to be an effective tool in the reduction of image intensity distortions, was applied to the computed displacement fields. Results of the application of this model show significant improvements in the restored field. Previous attempts at restoring the depth or displacement fields assumed homogeneous characteristics, which resulted in the smoothing of discontinuities. In these situations, edges were lost. An adaptive model parameter selection method is provided that maintains sharp edge boundaries in the restored field. This has been successfully applied to images representative of robotic scenarios. In order to accommodate image sequences, the standard 2-D ROMKF model is extended into 3-D by the incorporation of a deterministic component based on previously restored fields. The inclusion of past depth and displacement fields provides a means of incorporating temporal information into the restoration process. A summary of the conditions that indicate which type of filtering should be applied to a field is provided.

NASA Technical Reports Server

Video object segmentation.

Author: Wei Wei
Publication date: 01/01/2006

Wei Wei. Thesis submitted in: December 2005. Thesis (M.Phil.)--Chinese University of Hong Kong, 2006. Includes bibliographical references (leaves 112-122). Abstracts in English and Chinese.
Abstract
List of Abbreviations
Chapter 1: Introduction
1.1 Overview of Content-based Video Standard
1.2 Video Object Segmentation
1.2.1 Video Object Plane (VOP)
1.2.2 Object Segmentation
1.3 Problems of Video Object Segmentation
1.4 Objective of the research work
1.5 Organization of This Thesis
1.6 Notes on Publication
Chapter 2: Literature Review
2.1 What is segmentation?
2.1.1 Manual Segmentation
2.1.2 Automatic Segmentation
2.1.3 Semi-automatic segmentation
2.2 Segmentation Strategy
2.3 Segmentation of Moving Objects
2.3.1 Motion
2.3.2 Motion Field Representation
2.3.3 Video Object Segmentation
2.4 Summary
Chapter 3: Automatic Video Object Segmentation Algorithm
3.1 Spatial Segmentation
3.1.1 k-Medians Clustering Algorithm
3.1.2 Cluster Number Estimation
3.1.2 Region Merging
3.2 Foreground Detection
3.2.1 Global Motion Estimation
3.2.2 Detection of Moving Objects
3.3 Object Tracking and Extracting
3.3.1 Binary Model Tracking
3.3.1.2 Initial Model Extraction
3.3.2 Region Descriptor Tracking
3.4 Results and Discussions
3.4.1 Objective Evaluation
3.4.2 Subjective Evaluation
3.5 Conclusion
Chapter 4: Disparity Estimation and its Application in Video Object Segmentation
4.1 Disparity Estimation
4.1.1 Seed Selection
4.1.2 Edge-based Matching by Propagation
4.2 Remedy Matching Sparseness by Interpolation
4.2 Disparity Applications in Video Conference Segmentation
4.3 Conclusion
Chapter 5: Conclusion and Future Work
5.1 Conclusion and Contribution
5.2 Future work
Reference

CUHK Digital Repository

Model- and image-based scene representation.

Author: Lee Kam Sum
Publication date: 01/01/1999

Lee Kam Sum. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 97-101). Abstracts in English and Chinese.
Chapter 1: Introduction
1.1 Video representation using panorama mosaic and 3D face model
1.2 Mosaic-based Video Representation
1.3 3D Human Face modeling
Chapter 2: Background
2.1 Video Representation using Mosaic Image
2.1.1 Traditional Video Compression
2.2 3D Face model Reconstruction via Multiple Views
2.2.1 Shape from Silhouettes
2.2.2 Head and Face Model Reconstruction
2.2.3 Reconstruction using Generic Model
Chapter 3: System Overview
3.1 Panoramic Video Coding Process
3.2 3D Face model Reconstruction Process
Chapter 4: Panoramic Video Representation
4.1 Mosaic Construction
4.1.1 Cylindrical Panorama Mosaic
4.1.2 Cylindrical Projection of Mosaic Image
4.2 Foreground Segmentation and Registration
4.2.1 Segmentation Using Panorama Mosaic
4.2.2 Determination of Background by Local Processing
4.2.3 Segmentation from Frame-Mosaic Comparison
4.3 Compression of the Foreground Regions
4.3.1 MPEG-1 Compression
4.3.2 MPEG Coding Method: I/P/B Frames
4.4 Video Stream Reconstruction
Chapter 5: Three Dimensional Human Face modeling
5.1 Capturing Images for 3D Face modeling
5.2 Shape Estimation and Model Deformation
5.2.1 Head Shape Estimation and Model deformation
5.2.2 Face organs shaping and positioning
5.2.3 Reconstruction with both intrinsic and extrinsic parameters
5.2.4 Reconstruction with only Intrinsic Parameter
5.2.5 Essential Matrix
5.2.6 Estimation of Essential Matrix
5.2.7 Recovery of 3D Coordinates from Essential Matrix
5.3 Integration of Head Shape and Face Organs
5.4 Texture-Mapping
Chapter 6: Experimental Result & Discussion
6.1 Panoramic Video Representation
6.1.1 Compression Improvement from Foreground Extraction
6.1.2 Video Compression Performance
6.1.3 Quality of Reconstructed Video Sequence
6.2 3D Face model Reconstruction
Chapter 7: Conclusion and Future Direction
Bibliography

CUHK Digital Repository