
    A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations

    Many factors can affect users' quality of experience (QoE) in speech communication services. Impairment factors arise from physical phenomena in the transmission channels of wireless and wired networks, and monitoring users' QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models, and its main advantage is a reduced training time, which is very relevant for speech quality assessment methods. In the training phase of the proposed classifier, speech signals impaired by wired and wireless network degradation are used as input. In the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency, are implemented. Experimental results demonstrate that the proposed model achieves a significant reduction in training time, about 25% compared with another implementation based on a DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed Tree-CNN classifier outperforms both the standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method ViSQOL.
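
    As a rough illustration of the kind of non-intrusive classifier described above, the sketch below maps a spectrogram-like representation of a degraded speech signal to one of five quality classes with a small CNN. It is not the paper's Tree-CNN (which arranges several CNNs hierarchically); the architecture, feature shape, and class count are assumptions made purely for the example.

```python
# Minimal sketch (not the Tree-CNN itself): a flat CNN that maps a
# log-mel-spectrogram-like input of a degraded speech signal to one of
# five hypothetical quality classes. Shapes and hyperparameters are assumptions.
import torch
import torch.nn as nn

class SpeechQualityCNN(nn.Module):
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x):
        # x: (batch, 1, n_mel_bands, n_frames)
        h = self.features(x)
        return self.classifier(h.flatten(1))

# Example: a batch of 8 spectrograms with 64 mel bands and 128 frames.
logits = SpeechQualityCNN()(torch.randn(8, 1, 64, 128))
print(logits.shape)  # torch.Size([8, 5]) -> one score per quality class
```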

    Speech Quality Classifier Model based on DBN that Considers Atmospheric Phenomena

    Current implementations of 5G networks operate at higher frequency ranges than previous telecommunication networks, making it possible to offer higher data rates for different applications. On the other hand, atmospheric phenomena can have a more negative impact on transmission quality. Thus, studying the transmitted signal quality at high frequencies is relevant to guarantee the user's quality of experience. In this research, Recommendations ITU-R P.838-3 and ITU-R P.676-11, which are methodologies to estimate the signal degradation caused by rainfall and atmospheric gases, respectively, are implemented in a network scenario. Speech signals are encoded with the AMR-WB codec and transmitted, and the perceptual speech quality is evaluated using the algorithm described in ITU-T Rec. P.863, commonly known as POLQA. The novelty of this work is a non-intrusive speech quality classifier that considers atmospheric phenomena. The classifier is based on Deep Belief Networks (DBN) and uses a Support Vector Machine with a radial basis function kernel (RBF-SVM) as the classifier to identify five predefined speech quality classes. Experimental results show that the proposed speech quality classifier reaches an accuracy between 92% and 95% for each quality class, overcoming the results obtained by the only non-intrusive standard described in ITU-T Recommendation P.563. Furthermore, subjective tests are carried out to validate the proposed classifier's performance, and it reaches an accuracy of 94.8%.
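
    The final classification stage described above (an RBF-kernel SVM over DBN-derived features) can be sketched as follows. The feature vectors here are random stand-ins for DBN outputs, and the feature dimension, labels, and hyperparameters are illustrative assumptions rather than the paper's settings.

```python
# Minimal sketch of the classification stage: an RBF-kernel SVM trained on
# feature vectors (random stand-ins for DBN-extracted features) to predict
# one of five speech quality classes. Dimensions and parameters are assumed.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 64))      # stand-in for DBN feature vectors
y = rng.integers(0, 5, size=500)    # five quality classes (e.g. POLQA-score bins)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, clf.predict(X_te)))
```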

    Machine Learning for Multimedia Communications

    Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of learning-based architectures enables us to accurately model image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategies, and even user perception modeling have widely benefited from recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though only a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.

    Semantics-Empowered Communication: A Tutorial-cum-Survey

    Along with the rise of semantics-empowered communication (SemCom) research, there is now an unprecedentedly growing interest in a wide range of aspects (e.g., theories, applications, metrics, and implementations) in both academia and industry. In this work, we aim to provide a comprehensive survey of both the background and the research taxonomy, as well as a detailed technical tutorial. Specifically, we start by reviewing the literature and answering the "what" and "why" questions of semantic transmissions. Afterwards, we present the ecosystem of SemCom, including its history, theories, metrics, datasets, and toolkits, on top of which a taxonomy of research directions is presented. Furthermore, we propose to categorize the critical enabling techniques into explicit and implicit reasoning-based methods, and elaborate on how they evolve and contribute to modern content and channel semantics-empowered communications. Besides reviewing and summarizing the latest efforts in SemCom, we discuss its relation to other communication levels (e.g., conventional communications) from a holistic and unified viewpoint. Subsequently, in order to facilitate future developments and industrial applications, we also highlight advanced practical techniques for boosting semantic accuracy, robustness, and large-scale scalability, to name a few. Finally, we discuss the technical challenges that shed light on future research opportunities.

    Semantic and effective communications

    Shannon and Weaver categorized communications into three levels of problems: the technical problem, which asks "how accurately can the symbols of communication be transmitted?"; the semantic problem, which asks "how precisely do the transmitted symbols convey the desired meaning?"; and the effectiveness problem, which asks "how effectively does the received meaning affect conduct in the desired way?". Traditionally, communication technologies have mainly addressed the technical problem, ignoring the semantic and effectiveness problems. Recently, there has been increasing interest in addressing the higher-level semantic and effectiveness problems, with proposals ranging from semantic to goal-oriented communications. In this thesis, we propose to formulate the semantic problem as a joint source-channel coding (JSCC) problem and the effectiveness problem as a multi-agent partially observable Markov decision process (MA-POMDP). For the semantic problem, we propose DeepWiVe, the first end-to-end JSCC video transmission scheme that leverages deep neural networks (DNNs) to directly map video signals to channel symbols, combining the video compression, channel coding, and modulation steps into a single neural transform. We further show that it is possible to use predefined constellation designs and to secure the physical-layer communication against eavesdroppers in deep learning (DL) driven JSCC schemes, making such schemes much more viable for real-world deployment. For the effectiveness problem, we propose a novel formulation that considers multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a MA-POMDP in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy channel is modeled explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action it can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. Moreover, we show that this framework generalizes both the semantic and technical problems. In both instances, we show that the resulting communication scheme is superior to one in which communication is considered separately from the underlying semantics or goal of the problem.
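
    A minimal sketch of the JSCC idea underlying the scheme described above is shown below: an encoder network maps the source directly to a fixed number of channel symbols, an additive white Gaussian noise channel perturbs them, and a decoder reconstructs the source, with the whole chain trained end to end. This is not DeepWiVe itself; the fully connected architecture, dimensions, power normalization, and SNR are assumptions chosen to keep the example self-contained.

```python
# Minimal end-to-end JSCC sketch (not DeepWiVe): encoder -> power-normalized
# channel symbols -> AWGN channel -> decoder, trained on a distortion loss.
# All dimensions and the SNR value are illustrative assumptions.
import torch
import torch.nn as nn

class JSCCAutoencoder(nn.Module):
    def __init__(self, source_dim=256, channel_uses=32, snr_db=10.0):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(source_dim, 128), nn.ReLU(),
                                     nn.Linear(128, channel_uses))
        self.decoder = nn.Sequential(nn.Linear(channel_uses, 128), nn.ReLU(),
                                     nn.Linear(128, source_dim))
        # Noise standard deviation for unit-power symbols at the given SNR.
        self.noise_std = 10 ** (-snr_db / 20.0)

    def forward(self, x):
        z = self.encoder(x)
        # Normalize to unit average power per sample, then add channel noise.
        z = z / z.pow(2).mean(dim=1, keepdim=True).sqrt()
        y = z + self.noise_std * torch.randn_like(z)
        return self.decoder(y)

model = JSCCAutoencoder()
x = torch.randn(16, 256)                      # stand-in for source vectors
loss = nn.functional.mse_loss(model(x), x)    # end-to-end distortion objective
loss.backward()
print(float(loss))
```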

    Edge Intelligence: Empowering Intelligence to the Edge of Network

    Edge intelligence refers to a set of connected systems and devices for data collection, caching, processing, and analysis in proximity to where the data are captured, based on artificial intelligence. Edge intelligence aims at enhancing data processing while protecting the privacy and security of the data and users. Although it emerged only recently, spanning the period from 2011 to now, this field of research has shown explosive growth over the past five years. In this article, we present a thorough and comprehensive survey of the literature surrounding edge intelligence. We first identify four fundamental components of edge intelligence, i.e., edge caching, edge training, edge inference, and edge offloading, based on theoretical and practical results pertaining to proposed and deployed systems. We then aim for a systematic classification of the state of the art by examining research results and observations for each of the four components and present a taxonomy that includes practical problems, adopted techniques, and application goals. For each category, we elaborate on, compare, and analyze the literature from the perspectives of adopted techniques, objectives, performance, advantages and drawbacks, and so on. This article provides a comprehensive survey of edge intelligence and its application areas. In addition, we summarize the development of the emerging research fields and the current state of the art, and discuss important open issues and possible theoretical and technical directions.
