383 research outputs found

    An approach to summarize video data in compressed domain

    Thesis (Master), Izmir Institute of Technology, Electronics and Communication Engineering, Izmir, 2007. Includes bibliographical references (leaves 54-56). Text in English; abstract in Turkish and English. x, 59 leaves. The need to represent digital video and images efficiently and feasibly has driven great efforts in research, development and standardization over the past 20 years. These efforts have targeted a vast range of applications such as video on demand, digital TV/HDTV broadcasting, multimedia video databases and surveillance. Moreover, these applications demand ever more efficient algorithms that enable lower bit rates at a quality acceptable for the application. Today, most video content, whether stored or transmitted, is in compressed form. The growth in the amount of shared video data has attracted researchers to the interrelated problems of video summarization, indexing and abstraction. In this study, scene cut detection in the emerging ISO/ITU H.264/AVC coded bit stream is realized by extracting spatio-temporal prediction information directly in the compressed domain. The syntax and semantics and the parsing and decoding processes of the ISO/ITU H.264/AVC bit stream are analyzed to detect scene information. Various video test data are constructed using the Joint Video Team's test model JM encoder, and the implementation is made in the JM decoder. The output of the study is the scene information to address video summarization, skimming, indexing applications that use the new generation ISO/ITU H264/AVC video

    On the data hiding theory and multimedia content security applications

    This dissertation is a comprehensive study of digital steganography for multimedia content protection. With the increasing development of Internet technology, the protection and enforcement of multimedia property rights has become a great concern to multimedia authors and distributors. Watermarking technologies provide a possible solution to this problem. The dissertation first briefly introduces current watermarking schemes, including their applications in video, image and audio. Most available embedding schemes are based on direct Spread Sequence (SS) modulation: a small-valued pseudo-random signature sequence is embedded into the host signal and the information is extracted via correlation. The correlation detection problem is discussed first. It is concluded that the correlator is not optimum in oblivious detection. The Maximum Likelihood detector is derived and some feasible suboptimal detectors are also analyzed. The calculated extraction Bit Error Rate (BER) reveals that the SS scheme is not very efficient because of its poor host noise suppression. The watermark domain selection problem is addressed next, and some implications for hiding capacity and reliability are studied. The last topic in the SS modulation scheme is sequence selection. The relationship between sequence bandwidth and synchronization requirement is detailed in the work. It is demonstrated that the white sequence commonly used in watermarking may not really boost watermark security. To address the host noise suppression problem, the hidden communication is modeled as a general hypothesis testing problem and a set partitioning scheme is proposed. Simulation studies and mathematical analysis confirm that it outperforms the SS schemes in host noise suppression. Data hiding in audio signals is explored next. Audio data hiding is believed to be a more challenging task because of human sensitivity to audio artifacts and the advanced features of current compression techniques. The human psychoacoustic model and human music understanding are also covered in the work. Then, as a typical perceptual audio compression scheme, the popular MP3 compression is examined at some length. Several schemes (amplitude modulation, phase modulation and noise substitution) are presented together with some experimental results. As a case study, a music bitstream encryption scheme is proposed. In all these applications, the human psychoacoustic model plays a very important role. A more advanced audio analysis model is introduced to reveal implications for music understanding. In the last part, conclusions and future research are presented
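
The spread-sequence embedding and correlation extraction summarized in this abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the ±1 signature alphabet, the `strength` parameter and the Gaussian host model are all assumptions made for illustration.

```python
import random


def make_signature(length, seed):
    """Pseudo-random +/-1 signature sequence (hypothetical key = seed)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_bit(host, signature, bit, strength=1.0):
    """Add the signature to the host, with its sign carrying one bit."""
    sign = 1.0 if bit else -1.0
    return [h + sign * strength * s for h, s in zip(host, signature)]


def detect_bit(received, signature):
    """Oblivious correlation detector: the host is not available here."""
    corr = sum(r * s for r, s in zip(received, signature))
    return 1 if corr > 0 else 0


if __name__ == "__main__":
    n = 4096
    rng = random.Random(7)
    # Assumed host model: zero-mean Gaussian samples, much stronger than the mark.
    host = [rng.gauss(0.0, 10.0) for _ in range(n)]
    sig = make_signature(n, seed=1234)
    for bit in (0, 1):
        marked = embed_bit(host, sig, bit, strength=2.0)
        print(bit, detect_bit(marked, sig))
```

In this sketch the correlation is the sum of a watermark term that grows with the sequence length and a host term that acts as noise at the detector. That host term is the interference the abstract calls poor host noise suppression, and it is why a long sequence or a larger embedding strength is needed for reliable extraction.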

    Study and Implementation of Watermarking Algorithms

    Watermarking is the process of embedding data, called a watermark, into a multimedia object such that the watermark can be detected or extracted later to make an assertion about the object. The object may be audio, an image or video. A copy of a digital image is identical to the original. This has, in many instances, led to the use of digital content with malicious intent. One way to protect multimedia data against illegal recording and retransmission is to embed a signal, called a digital signature, copyright label or watermark, that authenticates the owner of the data. Data hiding schemes, which embed secondary data in digital media, have made considerable progress in recent years and attracted attention from both academia and industry. Techniques have been proposed for a variety of applications, including ownership protection, authentication and access control. Imperceptibility, robustness against moderate processing such as compression, and the ability to hide many bits are the basic but rat..

    Factor Graph Based Detection Schemes for Mobile Terrestrial DVB Systems with Long OFDM Blocks

    This PhD dissertation analyzes the performance of second generation digital video broadcasting (DVB) systems in mobile terrestrial environments and proposes an iterative detection algorithm based on factor graphs (FG) to reduce the distortion caused by the time variation of the channel, providing error-free communication in very severe mobile conditions. The research work focuses on mobile scenarios where the intercarrier interference (ICI) is very high: high vehicular speeds when long orthogonal frequency-division multiplexing (OFDM) blocks are used. As a starting point, we provide the theoretical background on the main topics behind the transmission and reception of terrestrial digital television signals in mobile environments, along with a general overview of the main signal processing techniques included in last generation terrestrial DVB systems. The proposed FG-based detector design is then assessed over a simplified bit-interleaved coded modulation (BICM)-OFDM communication scheme for a wide variety of mobile environments. Extensive simulation results show the effectiveness of the proposed belief propagation (BP) algorithm over the channels of interest in this research work. Moreover, assuming that low density parity-check (LDPC) codes are decoded by means of FG-based algorithms, a high-order FG is defined in order to accomplish joint signal detection and decoding into the same FG framework, offering a fully parallel structure very suitable when long OFDM blocks are employed. Finally, the proposed algorithms are analyzed over the physical layer of the DVB-T2 specification. 
Two reception schemes are proposed which exploit the frequency and time diversity inherent in time-varying channels with the aim of achieving a reasonable trade-off among performance, complexity and latency.
Basque abstract (translated): This doctoral thesis analyzes the performance of second-generation digital television in mobile scenarios and proposes an iterative receiver based on factor graphs that mitigates the distortion caused by the time-varying channel and enables error-free reception of the signal. The proposed detector is evaluated over a general BICM-OFDM communication scheme, taking the conditions of the terrestrial broadcasting channel into account. Simulation results demonstrate the effectiveness of this algorithm at high Doppler frequencies. In the second part of the research work, the factor-graph-based detector is embedded in a broader turbo scheme together with an LDPC decoder. The main advantage of this receiver design is that it is well suited to long OFDM symbols. Finally, the proposed algorithm is implemented within the DVB-T2 chain, and two receiver schemes are proposed to exploit the time and frequency diversity of the signal, always maintaining a trade-off among performance, complexity and latency.
Spanish abstract (translated): This thesis analyzes the performance of second-generation digital terrestrial television in mobile scenarios and proposes an iterative algorithm based on factor graphs for signal detection and for reducing the distortion caused by the time variation of the channel, thereby allowing error-free reception of the signal. The proposed factor-graph-based detector is evaluated over a general BICM-OFDM communication scheme under transmission conditions typical of terrestrial broadcast channels. The simulation results presented show the efficiency of the proposed detection algorithm in the presence of very high Doppler frequencies. In a second part of the research work, the proposed detector is incorporated into a turbo scheme together with an LDPC decoder, yielding an iterative receiver whose characteristics are especially well suited to implementation in OFDM systems with long symbol lengths. Finally, the implementation of the proposed algorithm on the DVB-T2 reception chain is analyzed. Two reception schemes are presented that exploit the time and frequency diversity present in a signal affected by time-varying channels, achieving a reasonable trade-off among performance, complexity and latency

    Brain-Inspired Computational Intelligence via Predictive Coding

    Artificial intelligence (AI) is rapidly becoming one of the key technologies of this century. The majority of results in AI thus far have been achieved using deep neural networks trained with the error backpropagation learning algorithm. However, the ubiquitous adoption of this approach has highlighted some important limitations such as substantial computational cost, difficulty in quantifying uncertainty, lack of robustness, unreliability, and biological implausibility. It is possible that addressing these limitations may require schemes that are inspired and guided by neuroscience theories. One such theory, called predictive coding (PC), has shown promising performance in machine intelligence tasks, exhibiting exciting properties that make it potentially valuable for the machine learning community: PC can model information processing in different brain areas, can be used in cognitive control and robotics, and has a solid mathematical grounding in variational inference, offering a powerful inversion scheme for a specific class of continuous-state generative models. With the hope of foregrounding research in this direction, we survey the literature that has contributed to this perspective, highlighting the many ways that PC might play a role in the future of machine learning and computational intelligence at large. Comment: 37 pages, 9 figures

    Investigation of coding and equalization for the digital HDTV terrestrial broadcast channel

    Includes bibliographical references (p. 241-248). Supported by the Advanced Telecommunications Research Program. Julien J. Nicolas

    Cellular, Wide-Area, and Non-Terrestrial IoT: A Survey on 5G Advances and the Road Towards 6G

    The next wave of wireless technologies is proliferating in connecting things among themselves as well as to humans. In the era of the Internet of things (IoT), billions of sensors, machines, vehicles, drones, and robots will be connected, making the world around us smarter. The IoT will encompass devices that must wirelessly communicate a diverse set of data gathered from the environment for myriad new applications. The ultimate goal is to extract insights from this data and develop solutions that improve quality of life and generate new revenue. Providing large-scale, long-lasting, reliable, and near real-time connectivity is the major challenge in enabling a smart connected world. This paper provides a comprehensive survey on existing and emerging communication solutions for serving IoT applications in the context of cellular, wide-area, as well as non-terrestrial networks. Specifically, wireless technology enhancements for providing IoT access in fifth-generation (5G) and beyond cellular networks, and communication networks over the unlicensed spectrum are presented. Aligned with the main key performance indicators of 5G and beyond 5G networks, we investigate solutions and standards that enable energy efficiency, reliability, low latency, and scalability (connection density) of current and future IoT networks. The solutions include grant-free access and channel coding for short-packet communications, non-orthogonal multiple access, and on-device intelligence. Further, a vision of new paradigm shifts in communication networks in the 2030s is provided, and the integration of the associated new technologies like artificial intelligence, non-terrestrial networks, and new spectra is elaborated. Finally, future research directions toward beyond 5G IoT networks are pointed out. Comment: Submitted for review to IEEE CS&

    Méthodes de codage et d'estimation adaptative appliquées aux communications sans fil

    The research and contributions presented concern signal processing techniques applied to wireless communications. They are organized around the following points: (1) adaptive estimation of communication channels in different application contexts, (2) correction of impulsive noise and reduction of the PAPR (Peak to Average Power Ratio) level in a multi-carrier system, (3) optimization of transmission schemes for broadcasting over Gaussian channels with or without a security constraint, (4) analysis, interpretation and improvement of iterative decoding algorithms by means of optimization, game theory and statistical tools. Particular emphasis is placed on the last topic

    MPEG-SCORM : ontologia de metadados interoperáveis para integração de padrões multimídia e e-learning

    Advisor: Yuzo Iano. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. Resumo (translated from Portuguese): The convergence of digital media proposes an integration between ICT, focused on the multimedia domain (under the responsibility of the Moving Picture Experts Group, which constitutes subcommittee ISO/IEC JTC1 SC29), and ICTE (ICT for Education, managed by ISO/IEC JTC1 SC36), with emphasis on the MPEG standards, used as content and metadata for multimedia, and on ICTE applied to distance education, or e-Learning (electronic learning). In this context, the problem is posed of developing an interoperable correspondence between normative bases, thereby reaching an innovative proposal in the convergence between digital media and e-Learning applications, which are essentially multimedia. To this end, it is proposed to create and apply an ontology of interoperable metadata for the web, digital TV and extensions for mobile devices, based on the integration of the MPEG-21 and SCORM metadata standards, employing the XPath language. Abstract: The convergence of digital media offers an integration of ICT, focused on the telecommunications and multimedia domain (under the responsibility of the Moving Picture Experts Group, ISO/IEC JTC1 SC29), with ICTE (ICT for Education, managed by ISO/IEC JTC1 SC36), highlighting the MPEG formats, featured as content and as description metadata potentially applied to multimedia or digital TV, and as a technology applied to e-Learning. In this regard, the problem is presented of developing an interoperable matching of normative bases, achieving an innovative proposal in the convergence between digital telecommunications and applications for e-Learning, both essentially multimedia. 
To achieve this purpose, it is proposed to create an ontology for interoperability between educational applications and Digital TV environments, and vice versa, simultaneously facilitating the creation of learning objects based on metadata for Digital TV programs and providing multimedia video content as learning objects for distance education. This ontology is designed as interoperable metadata for the Web, Digital TV and e-Learning, built on the integration of the MPEG-21 and SCORM metadata standards, employing the XPath language. Doctorate. Telecommunications and Telematics. Doctor of Electrical Engineering. CAPE