19 research outputs found

Méthodes de tatouage robuste pour la protection de l imagerie numerique 3D

Author: CHAMMEM Afef
MITREA Mihai
PRETEUX Françoise
Publication date: 01/01/2013

The explosion in stereoscopic video distribution increases concerns over its copyright protection. Watermarking can be considered the most flexible property right protection technology. The practical challenge in watermarking is to reach a trade-off between transparency, robustness, data payload and computational cost. While the capture and display of 3D content are based solely on the left and right views, alternative representations such as disparity maps should also be considered during transmission and storage. A specific study of the optimal insertion domain (with respect to the above-mentioned properties) is therefore required. The present thesis tackles these challenges. First, a new disparity map (3D video-New Three Step Search, 3DV-NTSS) is designed. The performance of 3DV-NTSS was evaluated in terms of the visual quality of the reconstructed image and computational cost. Compared with state-of-the-art methods (NTSS and FS-MPEG), average gains of 2 dB in PSNR and 0.1 in SSIM are obtained. The computational cost is reduced by average factors between 1.3 and 13. Second, a comparative study of the main classes of watermarking methods inherited from 2D and of their related optimal insertion domains is carried out. Four insertion methods are considered; they belong to the SS, SI and hybrid (Fast-IProtect) families. The experiments showed that Fast-IProtect performed in the new disparity map domain (3DV-NTSS) would be generic enough to serve a large variety of applications. The statistical relevance of the results is given by the 95% confidence limits and their underlying relative errors er < 0.1.
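
The PSNR figure quoted in the abstract above can be made concrete with a small sketch. It assumes 8-bit images flattened into plain lists of pixel values; the function names `psnr` and `mse_ratio_for_gain` are illustrative and are not taken from the thesis.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumption: pixels are 8-bit values, so the default peak is 255.
    """
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (gain_db / 10.0)


if __name__ == "__main__":
    # Hypothetical 4-pixel example, for illustration only.
    print(psnr([10, 20, 30, 40], [12, 18, 33, 40]))
    print(mse_ratio_for_gain(2.0))
```

Because PSNR is logarithmic in the mean squared error, a 2 dB gain corresponds to the MSE shrinking by a factor of 10^0.2, roughly 1.58.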

OpenGrey Repository

A blind stereoscopic image quality evaluator with segmented stacked autoencoders considering the whole visual perception route

Author: Baihua Li (1253553)
Jiachen Yang (840978)
Kyohoon Sim (7168592)
Qinggang Meng (1257072)
Wen Lu (153883)
Xinbo Gao (709340)
Publication date: 01/01/2018

Most current blind stereoscopic image quality assessment (SIQA) algorithms do not achieve reliable accuracy. One reason is that they lack deep architectures; another is that they are designed on a relatively weak biological basis compared with findings on the human visual system (HVS). In this paper, we propose a Deep Edge and COlor Signal INtegrity Evaluator (DECOSINE) based on the whole visual perception route from the eyes to the frontal lobe, focusing especially on edge and color signal processing in the retinal ganglion cells (RGC) and lateral geniculate nucleus (LGN). Furthermore, to model the complex and deep structure of the visual cortex, a Segmented Stacked Auto-encoder (S-SAE) is used, which has not been utilized for SIQA before. The S-SAE compensates for a weakness of deep learning-based SIQA metrics, which require a very long training time. Experiments are conducted on popular SIQA databases, and the superiority of DECOSINE in terms of prediction accuracy and monotonicity is demonstrated. The experimental results show that our model of the whole visual perception route and the use of the S-SAE are effective for SIQA.

Loughborough University Institutional Repository

Reduced reference image and video quality assessments: review of methods

Author: Dost Shahi
Khan Muhammad Gufran
Lovstrom Benny
Saud Faryal
Shabbir Maham
Shahid Muhammad
Publication venue: New York, NY : Hindawi Publishing Corp.
Publication date: 01/01/2022

With the growing demand for image and video-based applications, the need for consistent image and video quality assessment metrics has increased. Different approaches have been proposed in the literature to estimate the perceptual quality of images and videos. These approaches can be divided into three main categories: full-reference (FR), reduced-reference (RR) and no-reference (NR). In RR methods, instead of providing the original image or video as a reference, only certain features of the original (e.g., texture, edges) are provided for quality assessment. During the last decade, RR-based quality assessment has been a popular research area for a variety of applications such as social media, online games, and video streaming. In this paper, we present a review and classification of the latest research on RR-based image and video quality assessment. We also summarize the databases used in the field of 2D and 3D image and video quality assessment. This paper will help specialists and researchers stay well informed about recent progress in RR-based image and video quality assessment. The review and classification presented here will also be useful for understanding multimedia quality assessment and the state-of-the-art approaches used for the analysis. In addition, it will help readers select appropriate quality assessment methods and parameters for their respective applications.

Institutionelles Repositorium der Leibniz Universität Hannover

Repositorium für Naturwissenschaften und Technik

Cubic-panorama image dataset analysis for storage and transmission

Publication venue: SPIE-Intl Soc Optical Eng

Crossref

Quality of Experience in Immersive Video Technologies

Author: Hanhart Philippe
Publication venue: Lausanne, EPFL
Publication date: 06/04/2016

Over the last decades, several technological revolutions have impacted the television industry, such as the shifts from black & white to color and from standard to high-definition. Nevertheless, further considerable improvements can still be achieved to provide a better multimedia experience, for example with ultra-high-definition, high dynamic range & wide color gamut, or 3D. These so-called immersive technologies aim at providing better, more realistic, and emotionally stronger experiences. To measure quality of experience (QoE), subjective evaluation is the ultimate means since it relies on a pool of human subjects. However, reliable and meaningful results can only be obtained if experiments are properly designed and conducted following a strict methodology. In this thesis, we build a rigorous framework for subjective evaluation of new types of image and video content. We propose different procedures and analysis tools for measuring QoE in immersive technologies. As immersive technologies capture more information than conventional technologies, they have the ability to provide more details, enhanced depth perception, as well as better color, contrast, and brightness. To measure the impact of immersive technologies on the viewers' QoE, we apply the proposed framework for designing experiments and analyzing collected subjects' ratings. We also analyze eye movements to study human visual attention during immersive content playback. Since immersive content carries more information than conventional content, efficient compression algorithms are needed for storage and transmission using existing infrastructures. To determine the required bandwidth for high-quality transmission of immersive content, we use the proposed framework to conduct meticulous evaluations of recent image and video codecs in the context of immersive technologies. Subjective evaluation is time-consuming, expensive, and not always feasible. Consequently, researchers have developed objective metrics to automatically predict quality. To measure the performance of objective metrics in assessing immersive content quality, we perform several in-depth benchmarks of state-of-the-art and commonly used objective metrics. For this purpose, we use ground truth quality scores, which are collected under our subjective evaluation framework. To improve QoE, we propose different systems for stereoscopic and autostereoscopic 3D displays in particular. The proposed systems can help reduce the artifacts generated at the visualization stage, which impact picture quality, depth quality, and visual comfort. To demonstrate the effectiveness of these systems, we use the proposed framework to measure viewers' preference between these systems and standard 2D & 3D modes. In summary, this thesis tackles the problems of measuring, predicting, and improving QoE in immersive technologies. To address these problems, we build a rigorous framework and apply it through several in-depth investigations. We put essential concepts of multimedia QoE under this framework. These concepts are not only of a fundamental nature, but have also shown their impact in very practical applications. In particular, the JPEG, MPEG, and VCEG standardization bodies have adopted these concepts to select technologies that were proposed for standardization and to validate the resulting standards in terms of compression efficiency.

Infoscience - École polytechnique fédérale de Lausanne

On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

Author: Antonisse Joey
Azzopardi George
Bennabhaktula Swaroop
Publication venue: Springer Science and Business Media LLC
Publication date: 31/10/2021

Deployed image classification pipelines typically depend on images captured in real-world environments. This means that images might be affected by different sources of perturbation (e.g. sensor noise in low-light environments). The main challenge arises from the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has hence attracted wide interest within the computer vision community. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before it is processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. The proposed CORF-augmented pipeline achieved results on noise-free images comparable to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Adaptive intra refresh for robust wireless multi-view video

Author: Lawan Sagir
Publication venue: Brunel University London
Publication date: 01/01/2016

This thesis was submitted for the award of PhD and was awarded by Brunel University London. Mobile wireless communication technology is a fast-developing field, and new mobile communication techniques and means become available every day. In this thesis, multi-view video (MVV) is also referred to as 3D video. 3D video signals transmitted through wireless communication are thus shaping the telecommunication industry and academia. However, wireless channels are prone to high levels of bit and burst errors that largely deteriorate the quality of service (QoS). Noise along the wireless transmission path can introduce distortion or make a compressed bitstream lose vital information. The errors caused by noise progressively spread to subsequent frames and among multiple views due to prediction. Such errors may compel the receiver to pause momentarily and wait for the subsequent INTRA picture to continue decoding. Pausing the video stream affects the user's Quality of Experience (QoE). Thus, an error resilience strategy is needed to protect the compressed bitstream against transmission errors. This thesis focuses on the Adaptive Intra Refresh (AIR) error resilience technique. The AIR method is developed to make the compressed 3D video more robust to channel errors. The process involves periodic injection of Intra-coded macroblocks in a cyclic pattern using the H.264/AVC standard. The algorithm takes into account individual features of each macroblock and the feedback information sent by the decoder about the channel condition in order to generate an MVV-AIR map. MVV-AIR map generation regulates the order of packet arrival and identifies the motion activity in each macroblock. Based on the level of motion activity contained in each macroblock, the MVV-AIR map classifies the macroblocks in each frame as high or low motion. A proxy MVV-AIR transcoder is used to validate the efficiency of the generated MVV-AIR map. The MVV-AIR transcoding algorithm uses a spatial and view downscaling scheme to convert from MVV to a single view. Various experimental results indicate that the proposed error-resilient MVV-AIR transcoder technique effectively improves the quality of reconstructed 3D video in wireless networks. A comparison of the MVV-AIR transcoder algorithm with some traditional error resilience techniques demonstrates that the MVV-AIR algorithm performs better in an error-prone channel. Simulation results revealed significant improvements in both objective and subjective quality. No additional computational complexity emanates from the scheme, while the QoS and QoE requirements are still fully met.
Tertiary Institution Trust Fund (TETFund) of Nigeri
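
The cyclic injection of intra-coded macroblocks mentioned in the abstract above can be illustrated with a minimal sketch. It shows only a plain, non-adaptive cyclic refresh schedule; the motion classification, decoder feedback and MVV-AIR map described in the thesis are not modelled, and the function names and parameters are hypothetical.

```python
import math


def cyclic_refresh_indices(frame_index, num_macroblocks, per_frame):
    """Macroblock indices forced to intra mode in the given frame.

    Plain cyclic schedule (an assumption, not the thesis's adaptive map):
    each frame refreshes the next `per_frame` macroblocks in raster order,
    wrapping around at the end of the frame.
    """
    if not 1 <= per_frame <= num_macroblocks:
        raise ValueError("per_frame must be between 1 and num_macroblocks")
    start = (frame_index * per_frame) % num_macroblocks
    return [(start + i) % num_macroblocks for i in range(per_frame)]


def frames_for_full_refresh(num_macroblocks, per_frame):
    """Number of frames needed until every macroblock has been refreshed once."""
    return math.ceil(num_macroblocks / per_frame)


if __name__ == "__main__":
    # A QCIF frame (176x144) contains 11 x 9 = 99 macroblocks of 16x16 pixels.
    for frame in range(3):
        print(frame, cyclic_refresh_indices(frame, 99, 11))
    print(frames_for_full_refresh(99, 11))
```

Refreshing more macroblocks per frame shortens the recovery time after a transmission error but raises the bit rate, which is the trade-off an adaptive scheme tries to balance.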

Brunel University Research Archive

3D-in-2D Displays for ATC.

Author: Amaldi P.
Boccalatte A.
Fields B.
Gaukrodger S.
Loomes M.
Martin P.
Rozzi S.
Wong B.
Publication venue: Eurocontrol
Publication date: 01/01/2007

This paper reports on the efforts and accomplishments of the 3D-in-2D Displays for ATC project at the end of Year 1. We describe the invention of 10 novel 3D/2D visualisations that were mostly implemented in the Augmented Reality ARToolkit. These prototype implementations of visualisation and interaction elements can be viewed on the accompanying video. We have identified six candidate design concepts which we will further research and develop. These designs correspond to the early feasibility studies stage of maturity as defined by the NASA Technology Readiness Level framework. We developed the Combination Display Framework from a review of the literature, and used it to analyse display designs in terms of the display techniques used and how they are combined. The insights we gained from this framework then guided our inventions and the human-centered innovation process we use to invent iteratively. Our designs are based on an understanding of user work practices. We also developed a simple ATC simulator that we used for rapid experimentation and evaluation of design ideas. We expect that, if this project continues, the effort in Years 2 and 3 will focus on maturing the concepts and deploying them in an operational laboratory setting.

Middlesex University Research Repository

Estimating Head Measurements from 3D Point Clouds

Author: Patiño Mejía Isabel Cristina
Publication venue: Universität Tübingen
Publication date: 01/01/2019

Human head measurements are valuable in ergonomics, acoustics, medicine, computer vision, and computer graphics, among other fields. Such measurements are usually obtained through entirely or partially manual tasks, which is a cumbersome practice since the level of accuracy depends on the expertise of the person who takes the measurements. Moreover, manually acquired measurements contain less information from which new measurements can be deduced when the subject is no longer accessible. To overcome these disadvantages, an approach to automatically estimate measurements from 3D point clouds, which are long-term representations of humans, has been developed and is described in this manuscript. The 3D point clouds were acquired using the Asus Xtion Pro Live RGB-D sensor and KinFu (an open-source implementation of KinectFusion). Qualitative and quantitative evaluations of the estimated measurements are presented. Furthermore, the feasibility of the developed approach was evaluated through a case study in which the estimated measurements were used to appraise the influence of anthropometric data on the computation of the interaural time difference. Considering the promising results obtained from the estimation of measurements from 3D models acquired with the Asus Xtion Pro Live sensor and KinFu (plus the results reported in the literature) and the development of new RGB-D sensors, a study of the influence of seven different RGB-D sensors on the reconstruction obtained with KinFu is also presented. This study contains qualitative and quantitative evaluations of reconstructions of four diverse objects captured at distances ranging from 40 cm to 120 cm. This range was established according to the operational range of the sensors. Furthermore, a collection of the obtained reconstructions is available as a dataset at http://uni-tuebingen.de/en/138898

Publikationsserver der Universität Tübingen