24 research outputs found

    Selected topics on distributed video coding

    Distributed Video Coding (DVC) is a new paradigm for video compression based on the information-theoretic results of Slepian and Wolf (SW), and Wyner and Ziv (WZ). While conventional coding has a rigid complexity allocation, as most of the complex tasks are performed at the encoder side, DVC enables a flexible complexity allocation between the encoder and the decoder. The most novel and interesting case is low complexity encoding and complex decoding, which is the opposite of conventional coding. While the latter is suitable for applications where the cost of the decoder is more critical than that of the encoder, DVC opens the door for a new range of applications where low complexity encoding is required and the decoder's complexity is not critical. This is of particular interest given the widespread deployment of small, battery-powered multimedia mobile devices in daily life. Further, since DVC operates as a reversed-complexity scheme when compared to conventional coding, DVC also enables the interesting scenario of low complexity encoding and decoding between two ends by transcoding between DVC and conventional coding. More specifically, low complexity encoding is possible by DVC at one end. Then, the resulting stream is decoded and conventionally re-encoded to enable low complexity decoding at the other end. Multiview video is attractive for a wide range of applications such as free viewpoint television, which is a system that allows viewing the scene from a viewpoint chosen by the viewer. Moreover, multiview can be beneficial for monitoring purposes in video surveillance. The increased use of multiview video systems is mainly due to the improvements in video technology and the reduced cost of cameras. While a conventional multiview codec tries to exploit the correlation among the different cameras at the encoder side, DVC allows for separate encoding of correlated video sources.
Therefore, DVC requires no communication between the cameras in a multiview scenario. This is an advantage since communication is time consuming (i.e. more delay) and requires complex networking. Another appealing feature of DVC is the fact that it is based on a statistical framework. Moreover, DVC behaves as a natural joint source-channel coding solution. This results in an improved error resilience performance when compared to conventional coding. Further, DVC-based scalable codecs do not require a deterministic knowledge of the lower layers. In other words, the enhancement layers are completely independent from the base layer codec. This is called the codec-independent scalability feature, which offers a high flexibility in the way the various layers are distributed in a network. This thesis addresses the following topics: First, the theoretical foundations of DVC as well as the practical DVC scheme used in this research are presented. The potential applications for DVC are also outlined. DVC-based schemes use conventional coding to compress parts of the data, while the rest is compressed in a distributed fashion. Thus, different conventional codecs are studied in this research as they are compared in terms of compression efficiency for a rich set of sequences. This includes fine tuning the compression parameters such that the best performance is achieved for each codec. Further, DVC tools for improved Side Information (SI) and Error Concealment (EC) are introduced for monoview DVC using a partially decoded frame. The improved SI results in a significant gain in reconstruction quality for video with high activity and motion. This is done by re-estimating the erroneous motion vectors using the partially decoded frame to improve the SI quality. The latter is then used to enhance the reconstruction of the finally decoded frame. 
Further, the introduced spatio-temporal EC improves the quality of decoded video in the case of erroneously received packets, outperforming both spatial and temporal EC. Moreover, it also outperforms error-concealed conventional coding in different modes. Then, multiview DVC is studied in terms of SI generation, which differentiates it from the monoview case. More specifically, different multiview prediction techniques for SI generation are described and compared in terms of prediction quality, complexity and compression efficiency. Further, a technique for iterative multiview SI is introduced, where the final SI is used in an enhanced reconstruction process. The iterative SI outperforms the other SI generation techniques, especially for high motion video content. Finally, fusion techniques for temporal and inter-view side information are introduced as well, which improve the performance of multiview DVC over monoview coding. DVC is also used to enable scalability for image and video coding. Since DVC is based on a statistical framework, the base and enhancement layers are completely independent, which is an interesting property called codec-independent scalability. Moreover, the introduced DVC scalable schemes show good robustness to errors, as the quality of decoded video decreases steadily as the error rate increases. Conventional coding, on the other hand, exhibits a cliff effect, as the performance drops dramatically after a certain error rate value. Further, the issue of privacy protection is addressed for DVC by transform domain scrambling, which is used to alter regions of interest in video such that the scene is still understood and privacy is preserved as well. The proposed scrambling techniques are shown to provide a good level of security without impairing the performance of the DVC scheme when compared to the one without scrambling. This is particularly attractive for video surveillance scenarios, which are among the most promising applications for DVC.
Finally, a practical DVC demonstrator built during this research is described, where the main requirements as well as the observed limitations are presented. Furthermore, it is defined in a setup that is as close as possible to a complete real application scenario. This shows that it is actually possible to implement a complete end-to-end practical DVC system relying only on realistic assumptions. Even though DVC is, for the moment, inferior in terms of compression efficiency to state-of-the-art conventional coding, the strengths of DVC reside in its good error resilience properties and the codec-independent scalability feature. Therefore, DVC offers promising possibilities for video compression when transmission over error-prone environments is required, as it significantly outperforms conventional coding in this case.

    Recent Advances in Signal Processing

    Signal processing is critical to the majority of new technological inventions and to challenges in a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily at students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories address, in order, image processing, speech processing, communication systems, time-series analysis, and educational packages. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    Schémas de tatouage d'images, schémas de tatouage conjoint à la compression, et schémas de dissimulation de données [Image watermarking schemes, joint watermarking and compression schemes, and data-hiding schemes]

    In this manuscript we address data hiding in images and videos. Specifically, we address robust watermarking for images, robust watermarking jointly with compression, and finally non-robust data hiding. The first part of the manuscript deals with high-rate robust watermarking. After briefly recalling the concept of informed watermarking, we study the two major watermarking families: trellis-based watermarking and quantization-based watermarking. We propose, first, to reduce the computational complexity of trellis-based watermarking with a rotation-based embedding and, second, to introduce a trellis-based quantization into a quantization-based watermarking system. The second part of the manuscript addresses the problem of watermarking jointly with a JPEG2000 or an H.264 compression step. The quantization step and the watermarking step are performed simultaneously, so that these two steps do not work against each other. Watermarking in JPEG2000 is achieved by repurposing the trellis quantization from Part 2 of the standard. Watermarking in H.264 is performed on the fly, after the quantization stage, by choosing the best prediction through the rate-distortion optimization process. We also propose to integrate a Tardos code to build a traitor-tracing application. The last part of the manuscript describes the different mechanisms for hiding color information in a grayscale image. We propose two approaches based on hiding a color palette in its index image. The first approach models the problem and optimizes an energy function to obtain a decomposition of the color image that allows an easy embedding. The second approach consists in quickly and reliably obtaining a larger color palette and then embedding it in a reversible way.

    Privacy-Friendly Photo Sharing and Relevant Applications Beyond

    The popularization of online photo sharing brings people great convenience, but has also raised privacy concerns. Researchers have proposed various approaches to enable image privacy, most of which focus on encrypting or distorting the visual content of images. In this thesis, we investigate novel solutions to protect image privacy, with a particular emphasis on online photo sharing. To this end, we propose not only algorithms to protect visual privacy in image content but also architectures for privacy-preserving photo sharing. Beyond privacy, we also explore additional impacts and potentials of employing daily images in three other relevant applications. First, we propose and study two image encoding algorithms to protect the visual content of images within a Secure JPEG framework. The first method scrambles a JPEG image by randomly changing the signs of its DCT coefficients based on a secret key. The second method, named JPEG Transmorphing, allows one to protect arbitrary image regions with any obfuscation, while secretly preserving the original image regions in application segments of the obfuscated JPEG image. Performance evaluations reveal favorable storage overhead and privacy protection capability for both methods, and in particular a good level of pleasantness for JPEG Transmorphing, if proper manipulations are applied. Second, we investigate the design of two architectures for privacy-preserving photo sharing. The first architecture, named ProShare, is built on a public key infrastructure (PKI) integrated with ciphertext-policy attribute-based encryption (CP-ABE), to enable secure and efficient access to user-posted photos protected by Secure JPEG. The second architecture is named ProShare S, in which a photo sharing service provider helps users make photo sharing decisions automatically, based on their past decisions, using machine learning.
The photo sharing service analyzes not only the content of a user's photo, but also context information about the image capture and a prospective requester, and finally decides whether or not to share a particular photo with the requester and, if so, at which granularity. A user study along with extensive evaluations was performed to validate the proposed architecture. Finally, we investigate three relevant topics regarding daily photos captured or shared by people, beyond their privacy implications. In the first study, inspired by JPEG Transmorphing, we propose an animated JPEG file format, named aJPEG. aJPEG preserves its animation frames as application markers in a JPEG image and provides smaller file size and better image quality than conventional GIF. In the second study, we attempt to understand the impact of popular image manipulations applied in online photo sharing on the emotions evoked in observers. The study reveals that image manipulations indeed influence people's emotions, but that this impact also depends on the image content. In the last study, we employ a deep convolutional neural network (CNN), the GoogLeNet model, to perform automatic food image detection and categorization. The promising results obtained provide meaningful insights into the design of automatic dietary assessment systems based on multimedia techniques, e.g. image analysis.
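
The key-based sign scrambling of DCT coefficients mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `scramble_signs`, the SHA-256 key derivation and the use of Python's non-cryptographic `random.Random` generator are assumptions made purely for illustration, and the example operates on a plain list of already-quantized coefficients rather than on a real JPEG bitstream.

```python
import hashlib
import random

def scramble_signs(coeffs, key):
    """Flip the sign of each coefficient wherever a key-seeded PRNG emits 1.

    Hypothetical helper: the key is hashed into a seed so that the same key
    always reproduces the same flip pattern. random.Random is NOT
    cryptographically secure and is used here only to show the principle.
    """
    seed = int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest(), "big")
    rng = random.Random(seed)
    return [-c if rng.getrandbits(1) else c for c in coeffs]

# A made-up run of quantized DCT coefficients.
block = [52, -13, 7, 0, -3, 2, 0, 1]

protected = scramble_signs(block, "secret")      # scrambled version
restored = scramble_signs(protected, "secret")   # same key undoes the flips
```

Because flipping a sign twice restores the original value, the same function serves for both scrambling and descrambling, and coefficient magnitudes are left untouched.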

    Biometric Systems

    Biometric authentication has been widely used for access control and security systems over the past few years. The purpose of this book is to present the life cycle of different biometric authentication systems, from their design and development to qualification and final application. The major systems discussed in this book include fingerprint identification, face recognition, iris segmentation and classification, signature verification, and other miscellaneous systems covering management policies of biometrics, reliability measures, pressure-based typing and signature verification, bio-chemical systems and behavioral characteristics. In summary, this book provides students and researchers with different approaches to developing biometric authentication systems and, at the same time, includes state-of-the-art approaches in their design and development. The approaches have been thoroughly tested on standard databases and in real-world applications.

    Compression et transmission d'images avec énergie minimale application aux capteurs sans fil [Minimum-energy image compression and transmission, with application to wireless sensors]

    A wireless image sensor network (WISN) is an ad hoc network formed by a set of autonomous nodes, each equipped with a small camera, that communicate with one another wirelessly, without an established infrastructure or centralized network management. Such networks appear highly useful in several fields, notably medicine and environmental monitoring. Designing a wireless compression and transmission chain for a WISN poses real challenges, which stem mainly from the sensors' limited resources (low battery capacity, limited processing power and memory). The objective of this thesis is to explore strategies for improving the energy efficiency of WISNs, particularly during image compression and transmission. Inevitably, applying common standards such as JPEG or JPEG2000 is energy-hungry and thus limits the lifetime of WISNs. These standards therefore need to be adapted to the constraints imposed by WISNs. To this end, we first analyzed the feasibility of adapting JPEG to a context where energy resources are very limited. The work carried out on this aspect leads us to propose three solutions. The first solution is based on the energy-compaction property of the Discrete Cosine Transform (DCT). This property makes it possible to eliminate redundancy in an image without degrading its quality too much, while saving energy. Reducing energy consumption by using regions of interest is the second solution explored in this thesis. Finally, we proposed a scheme based on progressive compression and transmission, which gives a general idea of the target image without sending its entire content. In addition, for energy-efficient transmission, we opted for the following solution: reliably send only the low frequencies and the regions of interest of an image.
The high frequencies and the regions of lesser interest are sent unreliably, because their loss only slightly degrades image quality. To this end, prioritization models were compared and then adapted to our needs. Second, we studied the wavelet approach. More precisely, we analyzed several wavelet filters and determined the wavelets best suited to ensuring low energy consumption while maintaining good quality of the image reconstructed at the base station. To estimate the energy consumed by a sensor during each compression step, a mathematical model is developed for each transform (DCT or wavelet). These models, which do not take implementation complexity into account, are based on the number of basic operations executed at each compression step.
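
The energy-compaction property of the DCT that the first solution relies on can be seen in a small worked example. The sketch below is only an illustration under stated assumptions: it uses a one-dimensional orthonormal DCT-II on a made-up row of eight smooth pixel values, and the helper names `dct_ii` and `energy_fraction` are hypothetical, not taken from the thesis.

```python
import math

def dct_ii(x):
    """Orthonormal 1-D DCT-II of a list of samples (direct O(n^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def energy_fraction(coeffs, keep):
    """Share of the total squared energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

# Made-up smooth image row: neighbouring pixels are highly correlated.
row = [100, 102, 104, 107, 110, 112, 115, 117]

coeffs = dct_ii(row)
frac = energy_fraction(coeffs, 2)  # energy in the two lowest frequencies
```

For smooth content like this row, almost all of the signal energy ends up in the lowest-frequency coefficients, which is why the remaining coefficients can be dropped or sent with lower priority at little cost in quality.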

    Efficient and Robust Video Steganography Algorithms for Secure Data Communication

    Over the last two decades, the science of secretly embedding and communicating data has gained tremendous significance due to technological advances in communication and digital content. Steganography is the art of concealing secret data in a carrier medium such as text, audio, image, or video data in order to build a covert communication channel between authorized parties. Nowadays, video steganography techniques are important in many video-sharing and social networking applications such as Livestreaming, YouTube, Twitter, and Facebook because of noteworthy developments in advanced video over the Internet. The performance of any steganography method ultimately relies on its imperceptibility, hiding capacity, and robustness against attacks. Although many video steganography methods exist, several of them lack preprocessing stages. In addition, weak security, low embedding capacity, poor imperceptibility, and limited robustness against attacks are other issues that affect these algorithms. This dissertation investigates and analyzes cutting-edge video steganography techniques in both compressed and raw domains. Moreover, it provides solutions for the aforementioned problems by proposing new and effective methods for digital video steganography. The key objectives of this research are to develop: 1) a highly secure video steganography algorithm based on error correcting codes (ECC); 2) an increased payload video steganography algorithm in the discrete wavelet domain based on ECC; 3) a novel video steganography algorithm based on Kanade-Lucas-Tomasi (KLT) tracking and ECC; 4) a robust video steganography algorithm in the wavelet domain based on KLT tracking and ECC; 5) a new video steganography algorithm based on multiple object tracking (MOT) and ECC; and 6) a robust and secure video steganography algorithm in the discrete wavelet and discrete cosine transform domains based on MOT and ECC.
The experimental results from our research demonstrate that our proposed algorithms achieve higher embedding capacity as well as better imperceptibility of stego videos. Furthermore, the preprocessing stages increase the security and robustness of the proposed algorithms against attacks when compared to state-of-the-art steganographic methods.

    Handbook of Digital Face Manipulation and Detection

    This open access book provides the first comprehensive collection of studies dealing with the hot topic of digital face manipulation such as DeepFakes, Face Morphing, or Reenactment. It combines the research fields of biometrics and media forensics, including contributions from academia and industry. Appealing to a broad readership, the introductory chapters provide a comprehensive overview of the topic, addressing readers who wish to gain a brief overview of the state of the art. Subsequent chapters, which delve deeper into various research challenges, are oriented towards advanced readers. Moreover, the book provides a good starting point for young researchers as well as a reference guide pointing to further literature. Hence, the primary readership is academic institutions and industry currently involved in digital face manipulation and detection. The book could easily be used as a recommended text for courses in image processing, machine learning, media forensics, biometrics, and the general security area.


    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography, to propose regions of interest in which to find objects, and recursive Bayesian filtering, to integrate observations over time. The proposal is evaluated on six virtual indoor environments, accounting for the detection of nine object classes over a total of about 7k frames. Results show that our proposal improves the recall and the F1-score by factors of 1.41 and 1.27, respectively, and achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of a small time overhead (120 ms) and a precision loss (0.92).
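
The idea of integrating observations over time with recursive Bayesian filtering, and its effect on categorization entropy, can be sketched as follows. This is a simplified illustration, not the paper's method: it shows only the measurement-update step on a static belief over three hypothetical object classes, the per-frame detector scores are invented, and the names `bayes_update` and `entropy` are assumptions.

```python
import math

def bayes_update(prior, likelihood):
    """One recursive Bayes step: posterior is proportional to prior * likelihood."""
    posterior = [p * l for p, l in zip(prior, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]

def entropy(dist):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

# Uniform initial belief over three hypothetical classes.
belief = [1 / 3, 1 / 3, 1 / 3]

# Invented per-frame detector scores for the same tracked object.
observations = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
]

history = [entropy(belief)]
for scores in observations:
    belief = bayes_update(belief, scores)
    history.append(entropy(belief))
```

Each consistent observation sharpens the belief toward the first class, so the entropy recorded in `history` falls from its initial uniform value as frames accumulate.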