42 research outputs found

    Incrustation d'un logo dans un ficher vidéo codé avec le standard MPEG-2

    Get PDF
    Ce mémoire constitue l'aboutissement du projet de recherche de Patrick Keroulas et aborde la notion de compression vidéo, domaine en pleine ébullition avec la démocratisation de l'équipement vidéo et des réseaux de télécommunication. La question initiale est de savoir s'il est possible de modifier le contenu de l'image directement dans un flux binaire provenant d'une séquence vidéo compressée. Un tel dispositif permettrait d'ajouter des modifications en n'importe quel point d'un réseau en évitant le décodage et recodage du flux de données, ces deux processus étant très coûteux en termes de calcul. Brièvement présentés dans la première partie, plusieurs travaux ont déjà proposé une gamme assez large de méthodes de filtrage, de débruitage, de redimensionnement de l'image, etc. Toutes les publications rencontrées à ce sujet se concentrent sur la transposition des traitements de l'image du domaine spatial vers le domaine fréquentiel. Il a été convenu de centrer la problématique sur une application potentiellement exploitable dans le domaine de la télédiffusion. Il s'agit d'incruster un logo ajustable en position et en opacité dans un fichier vidéo codé avec la norme MPEG-2, encore couramment utilisée. La transformée appliquée par cet algorithme de compression est la DCT (Discrete Cosine Transform). Un article publié en 1995 traitant de la composition vidéo en général est plus détaillé car il sert de base à cette étude. Certains outils proposés qui reposent sur la linéarité et l'orthogonalité de la transformée seront repris dans le cadre de ce projet, mais la démarche proposée pour résoudre les problèmes temporels est différente. Ensuite, les éléments essentiels de la norme MPEG-2 sont présentés pour en comprendre les mécanismes et également pour exposer la structure d'un fichier codé car, en pratique, ce serait la seule donnée accessible. Le quatrième chapitre de l'étude présente la solution technique mise en oeuvre via un article soumis à IEEE Transactions on Broadcasting. C'est dans cette partie que toutes les subtilités liées au codage sont traitées : la structure en blocs de pixel, la prédiction spatiale, la compensation de mouvement au demi-pixel près, la nécessité ou non de la quantification inverse. À la vue des résultats satisfaisants, la discussion finale porte sur la limite du système : le compromis entre son efficacité, ses degrés de liberté et le degré de décodage du flux

    A network transparent, retained mode multimedia processing framework for the Linux operating system environment

    Get PDF
    Die Arbeit präsentiert ein Multimedia-Framework für Linux, das im Unterschied zu früheren Arbeiten auf den Ideen "retained-mode processing" und "lazy evaluation" basiert: Statt Transformationen unmittelbar auszuführen, wird eine abstrakte Repräsentation aller Medienelemente aufgebaut. "renderer"-Treiber fungieren als Übersetzer, die diese Darstellung zur Laufzeit in konkrete Operationen umsetzen, wobei das Datenmodell zahlreiche Optimierungen zur Reduktion der Anzahl der Schritte oder der Minimierung von Kommunikation erlaubt. Dies erlaubt ein stark vereinfachtes Programmiermodell bei gleichzeitiger Effizienzsteigerung. "renderer"-Treiber können zur Ausführung von Transformationen den lokalen Prozessor verwenden, oder können die Operationen delegieren. In der Arbeit wird eine Erweiterung des X Window Systems um Mechanismen zur Medienverarbeitung vorgestellt, sowie ein "renderer"-Treiber, der diese zur Delegation der Verarbeitung nutzt

    Enhancing Mesh Deformation Realism: Dynamic Mesostructure Detailing and Procedural Microstructure Synthesis

    Get PDF
    Propomos uma solução para gerar dados de mapas de relevo dinâmicos para simular deformações em superfícies macias, com foco na pele humana. A solução incorpora a simulação de rugas ao nível mesoestrutural e utiliza texturas procedurais para adicionar detalhes de microestrutura estáticos. Oferece flexibilidade além da pele humana, permitindo a geração de padrões que imitam deformações em outros materiais macios, como couro, durante a animação. As soluções existentes para simular rugas e pistas de deformação frequentemente dependem de hardware especializado, que é dispendioso e de difícil acesso. Além disso, depender exclusivamente de dados capturados limita a direção artística e dificulta a adaptação a mudanças. Em contraste, a solução proposta permite a síntese dinâmica de texturas que se adaptam às deformações subjacentes da malha de forma fisicamente plausível. Vários métodos foram explorados para sintetizar rugas diretamente na geometria, mas sofrem de limitações como auto-interseções e maiores requisitos de armazenamento. A intervenção manual de artistas na criação de mapas de rugas e mapas de tensão permite controle, mas pode ser limitada em deformações complexas ou onde maior realismo seja necessário. O nosso trabalho destaca o potencial dos métodos procedimentais para aprimorar a geração de padrões de deformação dinâmica, incluindo rugas, com maior controle criativo e sem depender de dados capturados. A incorporação de padrões procedimentais estáticos melhora o realismo, e a abordagem pode ser estendida além da pele para outros materiais macios.We propose a solution for generating dynamic heightmap data to simulate deformations for soft surfaces, with a focus on human skin. The solution incorporates mesostructure-level wrinkles and utilizes procedural textures to add static microstructure details. It offers flexibility beyond human skin, enabling the generation of patterns mimicking deformations in other soft materials, such as leater, during animation. Existing solutions for simulating wrinkles and deformation cues often rely on specialized hardware, which is costly and not easily accessible. Moreover, relying solely on captured data limits artistic direction and hinders adaptability to changes. In contrast, our proposed solution provides dynamic texture synthesis that adapts to underlying mesh deformations. Various methods have been explored to synthesize wrinkles directly to the geometry, but they suffer from limitations such as self-intersections and increased storage requirements. Manual intervention by artists using wrinkle maps and tension maps provides control but may be limited to the physics-based simulations. Our research presents the potential of procedural methods to enhance the generation of dynamic deformation patterns, including wrinkles, with greater creative control and without reliance on captured data. Incorporating static procedural patterns improves realism, and the approach can be extended to other soft-materials beyond skin

    Image splicing detection scheme using adaptive threshold mean ternary pattern descriptor

    Get PDF
    The rapid growth of image editing applications has an impact on image forgery cases. Image forgery is a big challenge in authentic image identification. Images can be readily altered using post-processing effects, such as blurring shallow depth, JPEG compression, homogenous regions, and noise to forge the image. Besides, the process can be applied in the spliced image to produce a composite image. Thus, there is a need to develop a scheme of image forgery detection for image splicing. In this research, suitable features of the descriptors for the detection of spliced forgery are defined. These features will reduce the impact of blurring shallow depth, homogenous area, and noise attacks to improve the accuracy. Therefore, a technique to detect forgery at the image level of the image splicing was designed and developed. At this level, the technique involves four important steps. Firstly, convert colour image to three colour channels followed by partition of image into overlapping block and each block is partitioned into non-overlapping cells. Next, Adaptive Thresholding Mean Ternary Pattern Descriptor (ATMTP) is applied on each cell to produce six ATMTP codes and finally, the tested image is classified. In the next part of the scheme, detected forgery object in the spliced image involves five major steps. Initially, similarity among every neighbouring district is computed and the two most comparable areas are assembled together to the point that the entire picture turns into a single area. Secondly, merge similar regions according to specific state, which satisfies the condition of fewer than four pixels between similar regions that lead to obtaining the desired regions to represent objects that exist in the spliced image. Thirdly, select random blocks from the edge of the binary image based on the binary mask. Fourthly, for each block, the Gabor Filter feature is extracted to assess the edges extracted of the segmented image. Finally, the Support Vector Machine (SVM) is used to classify the images. Evaluation of the scheme was experimented using three sets of standard datasets, namely, the Institute of Automation, Chinese Academy of Sciences (CASIA) version TIDE 1.0 and 2.0, and Columbia University. The results showed that, the ATMTP achieved higher accuracy of 98.95%, 99.03% and 99.17% respectively for each set of datasets. Therefore, the findings of this research has proven the significant contribution of the scheme in improving image forgery detection. It is recommended that the scheme be further improved in the future by considering geometrical perspective

    The 1993 Space and Earth Science Data Compression Workshop

    Get PDF
    The Earth Observing System Data and Information System (EOSDIS) is described in terms of its data volume, data rate, and data distribution requirements. Opportunities for data compression in EOSDIS are discussed

    Adaptive video delivery using semantics

    Get PDF
    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of object's shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. At last, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications

    Video coding for compression and content-based functionality

    Get PDF
    The lifetime of this research project has seen two dramatic developments in the area of digital video coding. The first has been the progress of compression research leading to a factor of two improvement over existing standards, much wider deployment possibilities and the development of the new international ITU-T Recommendation H.263. The second has been a radical change in the approach to video content production with the introduction of the content-based coding concept and the addition of scene composition information to the encoded bit-stream. Content-based coding is central to the latest international standards efforts from the ISO/IEC MPEG working group. This thesis reports on extensions to existing compression techniques exploiting a priori knowledge about scene content. Existing, standardised, block-based compression coding techniques were extended with work on arithmetic entropy coding and intra-block prediction. These both form part of the H.263 and MPEG-4 specifications respectively. Object-based coding techniques were developed within a collaborative simulation model, known as SIMOC, then extended with ideas on grid motion vector modelling and vector accuracy confidence estimation. An improved confidence measure for encouraging motion smoothness is proposed. Object-based coding ideas, with those from other model and layer-based coding approaches, influenced the development of content-based coding within MPEG-4. This standard made considerable progress in this newly adopted content based video coding field defining normative techniques for arbitrary shape and texture coding. The means to generate this information, the analysis problem, for the content to be coded was intentionally not specified. Further research work in this area concentrated on video segmentation and analysis techniques to exploit the benefits of content based coding for generic frame based video. The work reported here introduces the use of a clustering algorithm on raw data features for providing initial segmentation of video data and subsequent tracking of those image regions through video sequences. Collaborative video analysis frameworks from COST 21 l qual and MPEG-4, combining results from many other segmentation schemes, are also introduced

    Digital Image Access & Retrieval

    Get PDF
    The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

    Statistical Tools for Digital Image Forensics

    Get PDF
    A digitally altered image, often leaving no visual clues of having been tampered with, can be indistinguishable from an authentic image. The tampering, however, may disturb some underlying statistical properties of the image. Under this assumption, we propose five techniques that quantify and detect statistical perturbations found in different forms of tampered images: (1) re-sampled images (e.g., scaled or rotated); (2) manipulated color filter array interpolated images; (3) double JPEG compressed images; (4) images with duplicated regions; and (5) images with inconsistent noise patterns. These techniques work in the absence of any embedded watermarks or signatures. For each technique we develop the theoretical foundation, show its effectiveness on credible forgeries, and analyze its sensitivity and robustness to simple counter-attacks
    corecore