36 research outputs found
Multiple description coding technique to improve the robustness of ACELP based coders AMR-WB
In this paper, a concealment method based on multiple-description coding (MDC) is presented, to improve speech quality deterioration caused by packet loss for algebraic code-excited linear prediction (ACELP) based coders. We apply to the ITU-T G.722.2 coder, a packet loss concealment (PLC) technique, which uses packetization schemes based on MDC. This latter is used with two new designed modes, which are modes 5 and 6 (18,25 and 19,85 kbps, respectively). We introduce our new second-order Markov chain model with four states in order to simulate network losses for different loss rates. The performance measures, with objective and subjective tests under various packet loss conditions, show a significant improvement of speech quality for ACELP based coders. The wideband perceptual evaluation of speech quality (WB-PESQ), enhanced modified bark spectral distortion (EMBSD), mean opinion score (MOS) tests and MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) for speech extracted from TIMIT database confirm the efficiency of our proposed approach and show a considerable enhancement in speech quality compared to the embedded algorithm in the standard ITU-T G.722.2
A novel approach to emergency management of wireless telecommunication system
The survivability concerns the service continuity when the components of a system are damaged. This concept is especially useful in the emergency management of the system, as often emergencies involve accidents or incident disasters which more or less damage the system. The overall objective of this thesis study is to develop a quantitative management approach to the emergency management of a wireless cellular telecommunication system in light of its service continuity in emergency situations â namely the survivability of the system. A particular wireless cellular telecommunication system, WCDMA, is taken as an example to ground this research.The thesis proposes an ontology-based paradigm for service management such that the management system contains three models: (1) the work domain model, (2) the dynamic model, and (3) the reconfiguration model. A powerful work domain modeling tool called Function-Behavior-Structure (FBS) is employed for developing the work domain model of the WCDMA system. Petri-Net theory, as well as its formalization, is applied to develop the dynamic model of the WCDMA system. A concept in engineering design called the general and specific function concept is applied to develop a new approach to system reconfiguration for the high survivability of the system. These models are implemented along with a user-interface which can be used by emergency management personnel. A demonstration of the effectiveness of this study approach is included.There are a couple of contributions with this thesis study. First, the proposed approach can be added to contemporary telecommunication management systems. Second, the Petri Net model of the WCDMA system is more comprehensive than any dynamic model of the telecommunication systems in literature. Furthermore, this model can be extended to any other telecommunication system. Third, the proposed system reconfiguration approach, based on the general and specific function concept, offers a unique way for the survivability of any service provider system.In conclusion, the ontology-based paradigm for a service system management provides a total solution to service continuity as well as its emergency management. This paradigm makes the complex mathematical modeling of the system transparent to the manager or managerial personnel and provides a feasible scenario of the human-in-the-loop management
A-Interface Over Internet Protocol For User-Plane Connection Optimization In GSM/EDGE Radio Access Network
This thesis will cover a detailed study about the main motivations and benefits from using IP as a transport protocol for specifically A-interface in GERAN for Circuit Switched User-Plane (CS-UP) connection, in addition to the required protocols.
The main study in this document will be around Real Time Protocol (RTP), Real Time Control Protocol (RTCP) negotiation for RTP packets multiplexing, for both cases, with and without RTP header compression. The focus will be about the communication between the Base Station Controller (BSC) and the Media GateWay (MGW), the bandwidth gain in accordance to the multiplexing delay for processing and buffering, the voice Quality of Service (QoS) and some other parameters
Utilizing DSP for IP telephony applications in mobile terminals
TÀssÀ diplomityössÀ etsitÀÀn ja mÀÀritellÀÀn optimaalinen ohjelmistoarkkitehtuuri reaaliaikaisen puheenkoodauksen mahdollistamiseksi mobiilin laitteen Internet-puheluohjelmistossa. Arkkitehtuurille asetettiin vaatimus, jonka mukaan puhelu ja siihen liittyvÀ puheen reaaliaikaisuus ei saa rajoittaa tai liikaa kuormittaa laitteen muuta toiminnallisuutta.
TyössÀ kÀytetty mobiili laite tarjoaa mahdollisuuden hyödyntÀÀ kahta prosessoria. Toinen prosessoreista on tarkoitettu yleisille kÀyttöjÀrjestelmille sekÀ ohjelmistoille ja toinen signaalinkÀsittelyoperaatioille. Suunniteltu arkkitehtuuri yhdistÀÀ nÀiden kahden prosessorin toiminnallisuuden ja mahdollistaa reaaliaikaisen puheenkoodauksen (sekÀ toisto ettÀ ÀÀnitys) mobiliisissa laitteessa.
Arkkitehtuuri toteutettiin ja sen suorituskykyÀ arvioitiin erilaisilla mittauksilla ja parametreilla. Havaittiin, ettÀ toteutus suoriutuu erinomaisesti sille asetetuista vaatimuksista. Todettiin myös, ettÀ kÀytettÀessÀ ainoastaan laitteen yhtÀ prosessoria reaaliaikavaatimus ei tÀyty. TÀmÀ johtuu puhekoodekin matemaattisesta kompleksisuudesta ja laitteen rajoitetuista ominaisuuksista.
Työn aikana jÀtettiin kaksi patenttihakemusta.In this thesis, an optimal software architecture is studied and defined for enabling a real-time speech coding scheme in an Internet telephony application of a mobile terminal. According to a requirement set for the architecture, a phone call and the related real-time speech coding shall not limit or overload other functionality of the terminal.
The mobile terminal utilized in this thesis provides a potential to take advantage of the efficiency of a dual core processor. One of the processors is designed for general purpose operating systems, and the other one for signal processing operations. The designed software architecture combines the functionality of these processors and enables real-time speech coding (both playback and capture) in the device.
The architecture was implemented and its performance was evaluated with different measurements and parameters. It was observed that the implementation outperforms the requirements set. It was also confirmed that the performance of the general purpose processor is inadequate for real-time operations with the chosen speech coder/decoder.
Two patent applications were filed by the author during the writing of this thesis
Improved Performance of Secured VoIP Via Enhanced Blowfish Encryption Algorithm
Both the development and the integration of efficient network, open source technology, and Voice over Internet Protocol (VoIP) applications have been increasingly important and gained quick popularity due to new rapidly emerging IP-based network technology. Nonetheless, security and privacy concerns have emerged as issues that need to be addressed. The privacy process ensures that encryption and decryption methods protect the data from being alternate and intercept, a privacy VoIP call will contribute to private and confidential conversation purposes such as telebanking, telepsychiatry, health, safety issues and many more. Hence, this study had quantified VoIP performance and voice quality under security implementation with the technique of IPSec and the enhancement of the Blowfish encryption algorithm. In fact, the primary objective of this study is to improve the performance of Blowfish encryption algorithm. The proposed algorithm was tested with varying network topologies and a variety of audio codecs, which contributed to the impact upon VoIP network. A network testbed with seven experiments and network configurations had been set up in two labs to determine its effects on network performance. Besides, an experimental work using OPNET simulations under 54 experiments of network scenarios were compared with the network testbed for validation and verification purposes. Next, an enhanced Blowfish algorithm for VoIP services had been designed and executed throughout this research. From the stance of VoIP session and services performance, the redesign of the Blowfish algorithm displayed several significant effects that improved both the performance of VoIP network and the quality of voice. This finding indicates some available opportunities that could enhance encrypted algorithm, data privacy, and integrity; where the balance between Quality of Services (QoS) and security techniques can be applied to boost network throughput, performance, and voice quality of existing VoIP services. With that, this study had executed and contributed to a threefold aspect, which refers to the redesign of the Blowfish algorithm that could minimize computational resources. In addition, the VoIP network performance was analysed and compared in terms of end-to-end delay, jitter, packet loss, and finally, sought improvement for voice quality in VoIP services, as well as the effect of the designed enhanced Blowfish algorithm upon voice quality, which had been quantified by using a variety of voice codecs
Recommended from our members
Multimedia delivery in the future internet
The term âNetworked Mediaâ implies that all kinds of media including text, image, 3D graphics, audio
and video are produced, distributed, shared, managed and consumed on-line through various networks,
like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white
paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the Networked
challenges of the Networked Media in the transition to the Future of the Internet.
Internet has evolved and changed the way we work and live. End users of the Internet have been confronted
with a bewildering range of media, services and applications and of technological innovations concerning
media formats, wireless networks, terminal types and capabilities. And there is little evidence that the pace
of this innovation is slowing. Today, over one billion of users access the Internet on regular basis, more
than 100 million users have downloaded at least one (multi)media file and over 47 millions of them do so
regularly, searching in more than 160 Exabytes1 of content. In the near future these numbers are expected
to exponentially rise. It is expected that the Internet content will be increased by at least a factor of 6, rising
to more than 990 Exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged
that in a near- to mid-term future, the Internet will provide the means to share and distribute (new)
multimedia content and services with superior quality and striking flexibility, in a trusted and personalized
way, improving citizensâ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer inthe
network adaptation, machine-to-machine communication (including RFIDs), rich 3D content as well as
community networks and the use of peer-to-peer (P2P) overlays are expected to generate new models of
interaction and cooperation, and be able to support enhanced perceived quality-of-experience (PQoE) and
innovative applications âon the moveâ, like virtual collaboration environments, personalised services/
media, virtual sport groups, on-line gaming, edutainment. In this context, the interaction with content
combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P
networks and the dynamic adaptation to the characteristics of diverse mobile terminals are expected to
contribute towards such a vision.
Based on work that has taken place in a number of EC co-funded projects, in Framework Program 6 (FP6)
and Framework Program 7 (FP7), a group of experts and technology visionaries have voluntarily
contributed in this white paper aiming to describe the status, the state-of-the art, the challenges and the way
ahead in the area of Content Aware media delivery platforms
Final report on the evaluation of RRM/CRRM algorithms
Deliverable public del projecte EVERESTThis deliverable provides a definition and a complete evaluation of the RRM/CRRM algorithms selected in D11 and D15, and evolved and refined on an iterative process. The evaluation will be carried out by means of simulations using the simulators provided at D07, and D14.Preprin
Apprentissage automatique pour le codage cognitif de la parole
Depuis les années 80, les codecs vocaux reposent sur des stratégies de codage à court terme qui fonctionnent au niveau de la sous-trame ou de la trame (généralement 5 à 20 ms). Les chercheurs ont essentiellement ajusté et combiné un nombre limité de technologies disponibles (transformation, prédiction linéaire, quantification) et de stratégies (suivi de forme d'onde, mise en forme du bruit) pour construire des architectures de codage de plus en plus complexes.
Dans cette thÚse, plutÎt que de s'appuyer sur des stratégies de codage à court terme, nous développons un cadre alternatif pour la compression de la parole en codant les attributs de la parole qui sont des caractéristiques perceptuellement importantes des signaux vocaux. Afin d'atteindre cet objectif, nous résolvons trois problÚmes de complexité croissante, à savoir la classification, la prédiction et l'apprentissage des représentations. La classification est un élément courant dans les conceptions de codecs modernes. Dans un premier temps, nous concevons un classifieur pour identifier les émotions, qui sont parmi les attributs à long terme les plus complexes de la parole. Dans une deuxiÚme étape, nous concevons un prédicteur d'échantillon de parole, qui est un autre élément commun dans les conceptions de codecs modernes, pour mettre en évidence les avantages du traitement du signal de parole à long terme et non linéaire. Ensuite, nous explorons les variables latentes, un espace de représentations de la parole, pour coder les attributs de la parole à court et à long terme. Enfin, nous proposons un réseau décodeur pour synthétiser les signaux de parole à partir de ces représentations, ce qui constitue notre derniÚre étape vers la construction d'une méthode complÚte de compression de la parole basée sur l'apprentissage automatique de bout en bout.
Bien que chaque étape de développement proposée dans cette thÚse puisse faire partie d'un codec à elle seule, chaque étape fournit également des informations et une base pour la prochaine étape de développement jusqu'à ce qu'un codec entiÚrement basé sur l'apprentissage automatique soit atteint.
Les deux premiĂšres Ă©tapes, la classification et la prĂ©diction, fournissent de nouveaux outils qui pourraient remplacer et amĂ©liorer des Ă©lĂ©ments des codecs existants. Dans la premiĂšre Ă©tape, nous utilisons une combinaison de modĂšle source-filtre et de machine Ă Ă©tat liquide (LSM), pour dĂ©montrer que les caractĂ©ristiques liĂ©es aux Ă©motions peuvent ĂȘtre facilement extraites et classĂ©es Ă l'aide d'un simple classificateur. Dans la deuxiĂšme Ă©tape, un seul rĂ©seau de bout en bout utilisant une longue mĂ©moire Ă court terme (LSTM) est utilisĂ© pour produire des trames vocales avec une qualitĂ© subjective Ă©levĂ©e pour les applications de masquage de perte de paquets (PLC).
Dans les derniÚres étapes, nous nous appuyons sur les résultats des étapes précédentes pour concevoir un codec entiÚrement basé sur l'apprentissage automatique. un réseau d'encodage, formulé à l'aide d'un réseau neuronal profond (DNN) et entraßné sur plusieurs bases de données publiques, extrait et encode les représentations de la parole en utilisant la prédiction dans un espace latent. Une approche d'apprentissage non supervisé basée sur plusieurs principes de cognition est proposée pour extraire des représentations à partir de trames de parole courtes et longues en utilisant l'information mutuelle et la perte contrastive. La capacité de ces représentations apprises à capturer divers attributs de la parole à court et à long terme est démontrée.
Enfin, une structure de décodage est proposée pour synthétiser des signaux de parole à partir de ces représentations. L'entraßnement contradictoire est utilisé comme une approximation des mesures subjectives de la qualité de la parole afin de synthétiser des échantillons de parole à consonance naturelle. La haute qualité perceptuelle de la parole synthétisée ainsi obtenue prouve que les représentations extraites sont efficaces pour préserver toutes sortes d'attributs de la parole et donc qu'une méthode de compression complÚte est démontrée avec l'approche proposée.Abstract: Since the 80s, speech codecs have relied on short-term coding strategies that operate at the subframe or frame level (typically 5 to 20ms). Researchers essentially adjusted and combined a limited number of available technologies (transform, linear prediction, quantization) and strategies (waveform matching, noise shaping) to build increasingly complex coding architectures. In this thesis, rather than relying on short-term coding strategies, we develop an alternative framework for speech compression by encoding speech attributes that are perceptually important characteristics of speech signals. In order to achieve this objective, we solve three problems of increasing complexity, namely classification, prediction and representation learning. Classification is a common element in modern codec designs. In a first step, we design a classifier to identify emotions, which are among the most complex long-term speech attributes. In a second step, we design a speech sample predictor, which is another common element in modern codec designs, to highlight the benefits of long-term and non-linear speech signal processing. Then, we explore latent variables, a space of speech representations, to encode both short-term and long-term speech attributes. Lastly, we propose a decoder network to synthesize speech signals from these representations, which constitutes our final step towards building a complete, end-to-end machine-learning based speech compression method. The first two steps, classification and prediction, provide new tools that could replace and improve elements of existing codecs. In the first step, we use a combination of source-filter model and liquid state machine (LSM), to demonstrate that features related to emotions can be easily extracted and classified using a simple classifier. In the second step, a single end-to-end network using long short-term memory (LSTM) is shown to produce speech frames with high subjective quality for packet loss concealment (PLC) applications. In the last steps, we build upon the results of previous steps to design a fully machine learning-based codec. An encoder network, formulated using a deep neural network (DNN) and trained on multiple public databases, extracts and encodes speech representations using prediction in a latent space. An unsupervised learning approach based on several principles of cognition is proposed to extract representations from both short and long frames of data using mutual information and contrastive loss. The ability of these learned representations to capture various short- and long-term speech attributes is demonstrated. Finally, a decoder structure is proposed to synthesize speech signals from these representations. Adversarial training is used as an approximation to subjective speech quality measures in order to synthesize natural-sounding speech samples. The high perceptual quality of synthesized speech thus achieved proves that the extracted representations are efficient at preserving all sorts of speech attributes and therefore that a complete compression method is demonstrated with the proposed approach
Radio Communications
In the last decades the restless evolution of information and communication technologies (ICT) brought to a deep transformation of our habits. The growth of the Internet and the advances in hardware and software implementations modiïŹed our way to communicate and to share information. In this book, an overview of the major issues faced today by researchers in the ïŹeld of radio communications is given through 35 high quality chapters written by specialists working in universities and research centers all over the world. Various aspects will be deeply discussed: channel modeling, beamforming, multiple antennas, cooperative networks, opportunistic scheduling, advanced admission control, handover management, systems performance assessment, routing issues in mobility conditions, localization, web security. Advanced techniques for the radio resource management will be discussed both in single and multiple radio technologies; either in infrastructure, mesh or ad hoc networks