13 research outputs found

    Modelação de redes sem fios para comunicações multimédia

    Master's degree in Electronics and Telecommunications Engineering. Over the last decade, techniques for coding, transporting and transmitting digital multimedia signals have evolved significantly, and a large number of studies and standards on these topics have been published. This thesis discusses the transport-layer standards and presents results on the quality of a packetized multimedia communication. At the network layer, IP (Internet Protocol) has served not only for addressing and routing but also for interconnecting telecommunications networks with heterogeneous technologies at the lower layers (link and physical). Above the network layer, the compressed source data is encapsulated and transported using UDP (User Datagram Protocol) and RTP (Real Time Transport Protocol). The purpose of these transport/session protocols is to signal errors in the source data and to attach synchronization and sequence information, which lets the receiver minimize the effects of communication errors and synchronize different media streams. The thesis presents the encapsulation techniques used in an audio-visual communication system over an HF (High Frequency) shortwave channel. These techniques were studied within the NAVIO (Navy Video) project, supported by the Portuguese Navy, whose main goal is to research and implement a shore-to-ship videoconferencing service. Packet-loss results are also presented that allow the quality of the signals reconstructed at the receiver to be evaluated. The HF communication system uses only 4 kbps, of which 2 kbps are used for audio coding and the other 2 kbps for coding a set of parameters that represent the human face. A similar study is presented for the IEEE 802.11g network, resulting from work carried out in the TRIVial (Transmissão Rádio Interactiva de Video Local) project, supported by Philips-RCS (Remote Control System) and Agência de Inovação. Drawing on the separate study of both networks, the thesis analyses the feasibility of a system made up of two wireless local area networks connected by an HF channel. This analysis makes it possible to predict the packet loss of a multimedia communication in the following scenario: a terminal in a wireless local network on shore communicates with a terminal in a wireless network aboard a ship, with the two networks connected by an HF channel.
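As a rough illustration of why encapsulation matters on the 4 kbps HF channel described above, the sketch below computes the share of each packet taken up by headers. It assumes uncompressed IPv4 (20 bytes), UDP (8 bytes) and fixed RTP (12 bytes) headers and a hypothetical 100 ms packetization interval; the abstract does not state the packet sizes or header format actually used in the thesis.

```python
# Standard minimum header sizes in bytes (no IP options, no RTP CSRC entries).
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
TOTAL_HEADERS = IPV4_HEADER + UDP_HEADER + RTP_HEADER  # 40 bytes


def payload_bytes(bitrate_bps, frame_ms):
    """Payload bytes produced by a source at bitrate_bps over frame_ms milliseconds."""
    return bitrate_bps * frame_ms / 8000


def header_overhead(payload, headers=TOTAL_HEADERS):
    """Fraction of each packet occupied by headers."""
    return headers / (headers + payload)


# Hypothetical case: the 2 kbps audio stream packetized every 100 ms.
audio_payload = payload_bytes(2000, 100)
print(audio_payload, round(header_overhead(audio_payload), 3))
```

Under these assumptions the headers outweigh the payload, which suggests why the choice of encapsulation is a central design issue at such low bit rates.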

    The development of speech coding and the first standard coder for public mobile telephony

    This thesis describes in its core chapter (Chapter 4) the original algorithmic and design features of the first coder for public mobile telephony, the GSM full-rate speech coder, as standardized in 1988. It has never been described in so much detail as presented here. The coder is put in a historical perspective by two preceding chapters on the history of speech production models and the development of speech coding techniques until the mid 1980s, respectively. In the epilogue a brief review is given of later developments in speech coding. The introductory Chapter 1 starts with some preliminaries. It is defined what speech coding is and the reader is introduced to speech coding standards and the standardization institutes which set them. Then, the attributes of a speech coder playing a role in standardization are explained. Subsequently, several applications of speech coders, including mobile telephony, will be discussed and the state of the art in speech coding will be illustrated on the basis of some worldwide recognized standards. Chapter 2 starts with a summary of the features of speech signals and their source, the human speech organ. Then, historical models of speech production which form the basis of different kinds of modern speech coders are discussed. Starting with a review of ancient mechanical models, we will arrive at the electrical source-filter model of the 1930s. Subsequently, the acoustic-tube models as they arose in the 1950s and 1960s are discussed. Finally the 1970s are reviewed which brought the discrete-time filter model on the basis of linear prediction. In a unique way the logical sequencing of these models is exposed, and the links are discussed. Whereas the historical models are discussed in a narrative style, the acoustic tube models and the linear prediction technique as applied to speech, are subject to more mathematical analysis in order to create a sound basis for the treatise of Chapter 4. This trend continues in Chapter 3, whenever instrumental in completing that basis. In Chapter 3 the reader is taken by the hand on a guided tour through time during which successive speech coding methods pass in review. In an original way special attention is paid to the evolutionary aspect. Specifically, for each newly proposed method it is discussed what it added to the known techniques of the time. After presenting the relevant predecessors starting with Pulse Code Modulation (PCM) and the early vocoders of the 1930s, we will arrive at Residual-Excited Linear Predictive (RELP) coders, Analysis-by-Synthesis systems and Regular-Pulse Excitation in 1984. The latter forms the basis of the GSM full-rate coder. In Chapter 4, which constitutes the core of this thesis, explicit forms of Multi-Pulse Excited (MPE) and Regular-Pulse Excited (RPE) analysis-by-synthesis coding systems are developed. Starting from current pulse-amplitude computation methods in 1984, which included solving sets of equations (typically of order 10-16) two hundred times a second, several explicit-form designs are considered by which solving sets of equations in real time is avoided. Then, the design of a specific explicit-form RPE coder and an associated efficient architecture are described. The explicit forms and the resulting architectural features have never been published in so much detail as presented here. Implementation of such a codec enabled real-time operation on a state-of-the-art single-chip digital signal processor of the time. This coder, at a bit rate of 13 kbit/s, has been selected as the Full-Rate GSM standard in 1988. Its performance is recapitulated. Chapter 5 is an epilogue briefly reviewing the major developments in speech coding technology after 1988. Many speech coding standards have been set, for mobile telephony as well as for other applications, since then. The chapter is concluded by an outlook

    MPEG-4's BIFS-Anim protocol: using MPEG-4 for streaming of 3D animations

    This thesis explores issues related to the generation and animation of synthetic objects within the context of MPEG-4. MPEG-4 was designed to provide a standard that will deliver rich multimedia content on many different platforms and networks. MPEG-4 should be viewed as a toolbox rather than as a monolithic standard, as each implementer of the standard will pick the tools adequate to their needs, likely a small subset of those available. The subset of MPEG-4 examined here is the set of tools relating to the generation of 3D scenes and to the animation of those scenes. A comparison with the most popular 3D standard, Virtual Reality Modeling Language (VRML), will be included. An overview of the MPEG-4 standard will be given, describing the basic concepts. MPEG-4 uses a scene description language called Binary Format for Scene (BIFS) for the composition of scenes; this language will be described. The potential for the technology used in BIFS to provide low bitrate streaming 3D animations will be analysed and some examples of the possible uses of this technology will be given. A tool for the encoding of streaming 3D animations will be described, and results will be presented showing that MPEG-4 encodes 3D data more efficiently than VRML. Finally, the future of 3D content on the Internet will be considered.

    Object coding of music using expressive MIDI

    PhD. Structured audio uses a high level representation of a signal to produce audio output. When it was first introduced in 1998, creating a structured audio representation from an audio signal was beyond the state-of-the-art. Inspired by object coding and structured audio, we present a system to reproduce audio using Expressive MIDI, high-level parameters being used to represent pitch expression from an audio signal. This allows a low bit-rate MIDI sketch of the original audio to be produced. We examine optimisation techniques which may be suitable for inferring Expressive MIDI parameters from estimated pitch trajectories, considering the effect of data codings on the difficulty of optimisation. We look at some less common Gray codes and examine their effect on algorithm performance on standard test problems. We build an expressive MIDI system, estimating parameters from audio and synthesising output from those parameters. When the parameter estimation succeeds, we find that the system produces note pitch trajectories which match source audio to within 10 pitch cents. We consider the quality of the system in terms of both parameter estimation and the final output, finding that improvements to core components (audio segmentation and pitch estimation, both active research fields) would produce a better system. We examine the current state-of-the-art in pitch estimation, and find that some estimators produce high precision estimates but are prone to harmonic errors, whilst other estimators produce fewer harmonic errors but are less precise. Inspired by this, we produce a novel pitch estimator combining the output of existing estimators
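For readers unfamiliar with the unit, the "within 10 pitch cents" figure can be made concrete with the standard definition of the cent (1200 cents per octave). The sketch below is only an illustration of that definition; the function names and the example frequencies are hypothetical and are not taken from the thesis.

```python
import math


def cents(f_est_hz, f_ref_hz):
    """Signed pitch distance from f_ref_hz to f_est_hz in cents (1200 per octave)."""
    return 1200.0 * math.log2(f_est_hz / f_ref_hz)


def within_tolerance(f_est_hz, f_ref_hz, tol_cents=10.0):
    """True if the estimated pitch lies within tol_cents of the reference pitch."""
    return abs(cents(f_est_hz, f_ref_hz)) <= tol_cents


# Hypothetical check of an estimate against a 440 Hz reference.
print(within_tolerance(441.0, 440.0), within_tolerance(450.0, 440.0))
```

An equal-tempered semitone spans 100 cents, so a 10-cent tolerance corresponds to one tenth of a semitone.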

    Scalable and perceptual audio compression

    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner and in a psychoacoustic manner. To measure the psychoacoustic scalability of the systems investigated in this thesis, the psychoacoustic parameters of the original signal are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is novel to this thesis, and it gives insight into the perceptual distortion introduced by any coder analyzed in this manner

    Novel Pitch Detection Algorithm With Application to Speech Coding

    This thesis introduces a novel method for accurate pitch detection and speech segmentation, named Multi-feature, Autocorrelation (ACR) and Wavelet Technique (MAWT). MAWT uses feature extraction, and ACR applied on Linear Predictive Coding (LPC) residuals, with a wavelet-based refinement step. MAWT opens the way for a unique approach to modeling: although speech is divided into segments, the success of voicing decisions is not crucial. Experiments demonstrate the superiority of MAWT in pitch period detection accuracy over existing methods, and illustrate its advantages for speech segmentation. These advantages are more pronounced for gain-varying and transitional speech, and under noisy conditions

    Codage de parole bas débit robuste sur réseau IP utilisant la notion de Qualité de Service

    Speech coding for real-time communications was designed to operate over circuit-switched networks, and it evolved according to those networks and alongside them. With the advent of voice transport over packet networks, integrators used the systems that were available, namely the coders from the circuit-switched world. Although robustness-enhancing methods exist to handle the specific problems of packet transport, these methods interact only slightly with the coding system itself. To maintain speech quality at the receiver, an adaptive coding system was built that takes into account the characteristics of the channel and the type of sound being encoded at a given moment. This system generates a stream that maintains speech quality at the receiver in a real-time communication when the network degrades. Source coding proper and channel coding are performed jointly, according to variations in several Quality of Service parameters, in what is called a packet coding system. The study is based on network data from the Internet, taken as an example of an IP network whose Quality of Service management mechanisms are still limited or nonexistent

    Very low bit rate parametric audio coding

    [no abstract]