7,296 research outputs found

    A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations

    Many factors can affect the users’ quality of experience (QoE) in speech communication services. The impairment factors arise from physical phenomena that occur in the transmission channel of wireless and wired networks. Monitoring users’ QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models, and its main advantage is a shorter training time, which is highly relevant to speech quality assessment methods. In the training phase of the proposed classifier model, speech signals impaired by wired and wireless network degradation are used as input. In the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency, are also implemented. Experimental results demonstrated that the proposed model achieves a significant reduction in training time, reaching a 25% reduction relative to another implementation based on DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed classifier based on the Tree-CNN outperforms both the current standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method called ViSQOL.

    QoE Modelling, Measurement and Prediction: A Review

    In mobile computing systems, users can access network services anywhere and anytime using mobile devices such as tablets and smart phones. These devices connect to the Internet via network or telecommunications operators. Users usually have some expectations about the services provided to them by different operators. Users' expectations, along with additional factors such as cognitive and behavioural states, cost, and network quality of service (QoS), may determine their quality of experience (QoE). If users are not satisfied with their QoE, they may switch to different providers or may stop using a particular application or service. Thus, QoE measurement and prediction techniques may help users obtain personalized services from service providers. They can also help service providers achieve lower user-operator switchover. This paper presents a review of the state-of-the-art research in the area of QoE modelling, measurement and prediction. In particular, we investigate and discuss the strengths and shortcomings of existing techniques. Finally, we present future research directions for developing novel QoE measurement and prediction techniques.

    Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition

    In this paper we address the problem of automatic speech recognition (ASR) when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission errors. Whilst the first one has already received a lot of attention, the last two deserve further investigation in our opinion. We have found that band-pass filtering of the recognition features improves ASR performance when distortions due to these particular communication systems are present. Furthermore, we have evaluated two alternative configurations at different bit error rates (BER) typical of these channels: band-pass filtering the LP-MFCC parameters, and a modification of RASTA-PLP that uses a sharper low-pass section. These perform consistently better than LP-MFCC and RASTA-PLP, respectively.
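The idea of band-pass filtering feature trajectories can be sketched as follows. This is a minimal illustration only: the filter here is a difference of two moving averages, chosen for simplicity, and is not the filter used in the paper. The function names and the window widths are assumptions.

```python
def moving_average(track, width):
    """Centered moving average (window truncated at the edges); a simple low-pass."""
    half = width // 2
    out = []
    for i in range(len(track)):
        window = track[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def band_pass(track, short_width=3, long_width=11):
    """Short average minus long average: suppresses fast jitter and slow drift.

    The widths are illustrative assumptions, not values from the paper.
    """
    short = moving_average(track, short_width)
    long_ = moving_average(track, long_width)
    return [s - l for s, l in zip(short, long_)]


def filter_features(frames, short_width=3, long_width=11):
    """Filter each coefficient's time sequence independently.

    frames: list of feature vectors, one vector per analysis frame.
    """
    n_coeffs = len(frames[0])
    tracks = [
        band_pass([frame[c] for frame in frames], short_width, long_width)
        for c in range(n_coeffs)
    ]
    # Transpose back to one vector per frame.
    return [[tracks[c][t] for c in range(n_coeffs)] for t in range(len(frames))]
```

A constant offset in a coefficient (for example, a fixed channel bias) is removed by this filter, while the number of frames and coefficients is unchanged.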

    Speech Quality Classifier Model based on DBN that Considers Atmospheric Phenomena

    Current implementations of 5G networks operate in a higher frequency range than previous telecommunication networks, which makes it possible to offer higher data rates for different applications. On the other hand, atmospheric phenomena could have a more negative impact on the transmission quality. Thus, the study of the transmitted signal quality at high frequencies is relevant to guarantee the user's quality of experience. In this research, the recommendations ITU-R P.838-3 and ITU-R P.676-11, which are methodologies to estimate the signal degradations originated by rainfall and atmospheric gases, respectively, are implemented in a network scenario. Speech signals are encoded by the AMR-WB codec and transmitted, and the perceptual speech quality is evaluated using the algorithm described in ITU-T Rec. P.863, better known as POLQA. The novelty of this work is to propose a non-intrusive speech quality classifier that considers atmospheric phenomena. This classifier is based on a Deep Belief Network (DBN) that uses a Support Vector Machine (SVM) with a radial basis function kernel (RBF-SVM) as classifier to identify five predefined speech quality classes. Experimental results show that the proposed speech quality classifier reached an accuracy between 92% and 95% for each quality class, outperforming the results obtained by the sole non-intrusive standard described in ITU-T Recommendation P.563. Furthermore, subjective tests are carried out to validate the proposed classifier performance, and it reached an accuracy of 94.8%.
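For context on the rain model named in the abstract above: ITU-R P.838-3 expresses specific attenuation as a power law of rain rate, gamma_R = k * R^alpha (dB/km, with R in mm/h). The sketch below takes k and alpha as inputs because they depend on frequency and polarization and must be taken from the Recommendation. The function names and the uniform-rain path calculation are assumptions for illustration, not the paper's simulation setup.

```python
def specific_attenuation_db_per_km(k, alpha, rain_rate_mm_h):
    """Power-law rain attenuation: gamma_R = k * R**alpha, in dB/km.

    k and alpha are frequency- and polarization-dependent coefficients
    that the caller must look up; none are hard-coded here.
    """
    if rain_rate_mm_h < 0:
        raise ValueError("rain rate must be non-negative")
    return k * rain_rate_mm_h ** alpha


def path_attenuation_db(k, alpha, rain_rate_mm_h, path_km):
    """Total attenuation assuming uniform rain along the path (a simplification)."""
    return specific_attenuation_db_per_km(k, alpha, rain_rate_mm_h) * path_km
```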

    DeepVoCoder: A CNN model for compression and coding of narrow band speech

    This paper proposes a convolutional neural network (CNN)-based encoder model to compress and code speech signals directly from raw input speech. Although the model can synthesize wideband speech by implicit bandwidth extension, narrowband is preferred for IP telephony and telecommunications purposes. The model takes time domain speech samples as inputs and encodes them using a cascade of convolutional filters in multiple layers, where pooling is applied after some layers to downsample the encoded speech by half. The final bottleneck layer of the CNN encoder provides an abstract and compact representation of the speech signal. In this paper, it is demonstrated that this compact representation is sufficient to reconstruct the original speech signal in high quality using the CNN decoder. This paper also discusses the theoretical background of why and how CNN may be used for end-to-end speech compression and coding. The complexity, delay, memory requirements, and bit rate versus quality are discussed in the experimental results.
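The encoder structure described above (convolution layers with pooling that halves the sequence length) can be sketched in plain Python. This is an illustrative toy, not the DeepVoCoder architecture: it uses a single channel, max pooling after every layer, no nonlinearity, and hypothetical kernel weights, all of which are assumptions.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1-D filtering that keeps the input length (odd-length kernel)."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(w * padded[i + j] for j, w in enumerate(kernel))
        for i in range(len(signal))
    ]


def max_pool2(signal):
    """Max pooling with window 2 and stride 2: halves the sequence length."""
    return [max(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]


def encode(samples, kernels):
    """Apply one convolution and one pooling stage per kernel.

    Each stage halves the length, so n stages shrink the input by 2**n.
    """
    x = list(samples)
    for kernel in kernels:
        x = max_pool2(conv1d_same(x, kernel))
    return x
```

With three stages, a 16-sample frame is reduced to a 2-value bottleneck, which shows how stacked pooling produces the compact representation the abstract refers to.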

    Understanding user experience of mobile video: Framework, measurement, and optimization

    Since users have become the focus of product/service design in the last decade, the term User eXperience (UX) has been frequently used in the field of Human-Computer-Interaction (HCI). Research on UX facilitates a better understanding of the various aspects of the user’s interaction with the product or service. Mobile video, as a new and promising service and research field, has attracted great attention. Due to the significance of UX in the success of mobile video (Jordan, 2002), many researchers have centered on this area, examining users’ expectations, motivations, requirements, and usage context. As a result, many influencing factors have been explored (Buchinger, Kriglstein, Brandt & Hlavacs, 2011; Buchinger, Kriglstein & Hlavacs, 2009). However, a general framework specific to mobile video services, which could structure such a large number of factors, is lacking. To measure user experience of multimedia services such as mobile video, quality of experience (QoE) has recently become a prominent concept. In contrast to the traditionally used concept of quality of service (QoS), QoE not only involves objectively measuring the delivered service but also takes into account the user’s needs and desires when using the service, emphasizing the user’s overall acceptability of the service. Many QoE metrics are able to estimate the user-perceived quality or acceptability of mobile video, but may not be accurate enough for overall UX prediction due to the complexity of UX. Only a few QoE frameworks have addressed more aspects of UX for mobile multimedia applications, and these still need to be transformed into practical measures. The challenge of optimizing UX remains adapting to resource constraints (e.g., network conditions, mobile device capabilities, and heterogeneous usage contexts) while meeting complicated user requirements (e.g., usage purposes and personal preferences).
In this chapter, we investigate the important existing UX frameworks, compare their similarities and discuss some important features that fit the mobile video service. Based on the previous research, we propose a simple UX framework for mobile video applications by mapping a variety of influencing factors of UX onto a typical mobile video delivery system. Each component and its factors are explored with comprehensive literature reviews. The proposed framework may benefit user-centred design of mobile video by taking complete account of UX influences, and may help improve mobile video service quality by adjusting the values of certain factors to produce a positive user experience. It may also facilitate related research by locating important issues to study, clarifying research scopes, and setting up proper study procedures. We then review a great deal of research on UX measurement, including QoE metrics and QoE frameworks of mobile multimedia. Finally, we discuss how to achieve an optimal quality of user experience by focusing on the issues of various aspects of UX of mobile video. In the conclusion, we suggest some open issues for future study.

    VoIP Quality Assessment Technologies
