A Speech Quality Classifier based on Tree-CNN Algorithm that Considers Network Degradations
Many factors can affect users' quality of experience (QoE) in speech communication services. The impairment factors arise from physical phenomena in the transmission channel of wireless and wired networks. Monitoring users' QoE is important for service providers. In this context, a non-intrusive speech quality classifier based on the Tree Convolutional Neural Network (Tree-CNN) is proposed. The Tree-CNN is an adaptive network structure composed of hierarchical CNN models, and its main advantage is a shorter training time, which is highly relevant for speech quality assessment methods. In the training phase of the proposed classifier model, speech signals impaired by wired and wireless network degradation are used as input. In the network scenario, different modulation schemes and channel degradation intensities, such as packet loss rate, signal-to-noise ratio, and maximum Doppler shift frequency, are implemented. Experimental results demonstrate that the proposed model significantly reduces training time, reaching a 25% reduction relative to another implementation based on DRBM. The accuracy reached by the Tree-CNN model is almost 95% for each quality class. Performance assessment results show that the proposed classifier based on the Tree-CNN outperforms both the current standardized algorithm described in ITU-T Rec. P.563 and the speech quality assessment method called ViSQOL.
QoE Modelling, Measurement and Prediction: A Review
In mobile computing systems, users can access network services anywhere and
anytime using mobile devices such as tablets and smart phones. These devices
connect to the Internet via network or telecommunications operators. Users
usually have some expectations about the services provided to them by different
operators. Users' expectations along with additional factors such as cognitive
and behavioural states, cost, and network quality of service (QoS) may
determine their quality of experience (QoE). If users are not satisfied with
their QoE, they may switch to different providers or may stop using a
particular application or service. Thus, QoE measurement and prediction
techniques may help users obtain personalized services from service
providers. They can also help service providers reduce
user-operator switchover. This paper presents a review of state-of-the-art
research in the area of QoE modelling, measurement and prediction. In
particular, we investigate and discuss the strengths and shortcomings of
existing techniques. Finally, we present future research directions for
developing novel QoE measurement and prediction techniques.
Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition
In this paper we address the problem of automatic speech recognition (ASR) when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission errors. Whilst the first one has already received a lot of attention, the last two deserve further investigation in our opinion. We have found that band-pass filtering of the recognition features improves ASR performance when distortions due to these particular communication systems are present. Furthermore, we have evaluated two alternative configurations at different bit error rates (BER) typical of these channels: band-pass filtering the LP-MFCC parameters, and a modification of RASTA-PLP that uses a sharper low-pass section. These perform consistently better than LP-MFCC and RASTA-PLP, respectively.
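As a rough illustration of what band-pass filtering the time sequence of a spectral parameter means, the sketch below filters each feature coefficient across successive frames. The abstract does not give the authors' filter design, so the coefficients here are an assumption: they are the ones commonly cited for the classic RASTA filter (numerator 0.1·[2, 1, 0, -1, -2], pole 0.98). The function names are hypothetical.

```python
def bandpass_trajectory(values, pole=0.98):
    """Band-pass filter one feature trajectory (one coefficient over frames).

    Assumed filter: classic RASTA-style IIR, not the paper's own design.
    """
    numerator = [0.2, 0.1, 0.0, -0.1, -0.2]  # sums to zero, so DC is rejected
    out = []
    prev = 0.0
    for n in range(len(values)):
        # FIR part: weighted sum of the current and four previous frames.
        fir = sum(b * values[n - k] for k, b in enumerate(numerator) if n - k >= 0)
        # Single-pole recursive part, which limits the high-frequency side.
        prev = fir + pole * prev
        out.append(prev)
    return out


def bandpass_features(frames, pole=0.98):
    """Filter every coefficient of a list of per-frame feature vectors over time."""
    if not frames:
        return []
    n_coeffs = len(frames[0])
    columns = [
        bandpass_trajectory([frame[c] for frame in frames], pole)
        for c in range(n_coeffs)
    ]
    # Transpose back to one feature vector per frame.
    return [[columns[c][t] for c in range(n_coeffs)] for t in range(len(frames))]
```

A constant offset in a coefficient, such as a fixed channel colouration, decays toward zero after the initial transient, while slower and faster modulations outside the pass band are attenuated.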
Speech Quality Classifier Model based on DBN that Considers Atmospheric Phenomena
Current implementations of 5G networks operate in a higher frequency range than previous telecommunication networks, which makes it possible to offer higher data rates for different applications. On the other hand, atmospheric phenomena could have a more negative impact on the transmission quality. Thus, the study of the transmitted signal quality at high frequencies is relevant to guarantee the user's quality of experience. In this research, Recommendations ITU-R P.838-3 and ITU-R P.676-11, which are methodologies for estimating the signal degradation caused by rainfall and atmospheric gases, respectively, are implemented in a network scenario. Speech signals are encoded by the AMR-WB codec and transmitted, and the perceptual speech quality is evaluated using the algorithm described in ITU-T Rec. P.863, better known as POLQA. The novelty of this work is to propose a non-intrusive speech quality classifier that considers atmospheric phenomena. This classifier is based on a Deep Belief Network (DBN) that uses a Support Vector Machine (SVM) with a radial basis function kernel (RBF-SVM) as the classifier to identify five predefined speech quality classes. Experimental results show that the proposed speech quality classifier reached an accuracy between 92% and 95% for each quality class, outperforming the results obtained by the sole non-intrusive standard described in ITU-T Recommendation P.563. Furthermore, subjective tests were carried out to validate the proposed classifier's performance, and it reached an accuracy of 94.8%.
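For orientation, the rain model in ITU-R P.838-3 expresses the specific attenuation as a power law of the rain rate. The relation below is the general form only; the coefficients k and α depend on frequency and polarization and are tabulated in the Recommendation, so no values are assumed here, and how the authors combine it with path length is not stated in the abstract.

```latex
\gamma_R = k \, R^{\alpha}
```

Here γ_R is the specific attenuation in dB/km and R is the rain rate in mm/h.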
DeepVoCoder: A CNN model for compression and coding of narrow band speech
This paper proposes a convolutional neural network (CNN)-based encoder model to compress and code speech signals directly from raw input speech. Although the model can synthesize wideband speech by implicit bandwidth extension, narrowband is preferred for IP telephony and telecommunications purposes. The model takes time-domain speech samples as inputs and encodes them using a cascade of convolutional filters in multiple layers, where pooling is applied after some layers to downsample the encoded speech by half. The final bottleneck layer of the CNN encoder provides an abstract and compact representation of the speech signal. In this paper, it is demonstrated that this compact representation is sufficient to reconstruct the original speech signal in high quality using the CNN decoder. This paper also discusses the theoretical background of why and how a CNN may be used for end-to-end speech compression and coding. The complexity, delay, memory requirements, and bit rate versus quality are discussed in the experimental results.
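To make the "downsample by half" step concrete, here is a minimal sketch of how pooling shrinks the time axis on the way to the bottleneck. It is not the DeepVoCoder architecture: max pooling, the frame length, and the number of pooling stages are all assumptions chosen for illustration, and the function names are hypothetical.

```python
def max_pool_by_two(frame):
    """Halve a sequence by keeping the max of each adjacent pair.

    An odd trailing sample is dropped (assumed behaviour for this sketch).
    """
    return [max(frame[i], frame[i + 1]) for i in range(0, len(frame) - 1, 2)]


def encoded_length(n_samples, n_pool_layers):
    """Time-axis length after n_pool_layers stages that each halve the input."""
    length = n_samples
    for _ in range(n_pool_layers):
        length //= 2
    return length
```

For example, under these assumptions a 512-sample frame passed through three pooling stages leaves 64 time steps at the bottleneck; the actual bit rate also depends on the number of channels and the quantization, which the abstract does not specify.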
Understanding user experience of mobile video: Framework, measurement, and optimization
Since users have become the focus of product/service design in the last decade, the term User eXperience (UX) has been frequently used in the field of Human-Computer Interaction (HCI). Research on UX facilitates a better understanding of the various aspects of the user's interaction with the product or service. Mobile video, as a new and promising service and research field, has attracted great attention. Due to the significance of UX in the success of mobile video (Jordan, 2002), many researchers have centered on this area, examining users' expectations, motivations, requirements, and usage context. As a result, many influencing factors have been explored (Buchinger, Kriglstein, Brandt & Hlavacs, 2011; Buchinger, Kriglstein & Hlavacs, 2009). However, a general framework specific to mobile video service is lacking for structuring such a great number of factors. To measure user experience of multimedia services such as mobile video, quality of experience (QoE) has recently become a prominent concept. In contrast to the traditionally used concept of quality of service (QoS), QoE not only involves objectively measuring the delivered service but also takes into account the user's needs and desires when using the service, emphasizing the user's overall acceptability of the service. Many QoE metrics are able to estimate the user-perceived quality or acceptability of mobile video, but may not be accurate enough for overall UX prediction due to the complexity of UX. Only a few QoE frameworks have addressed more aspects of UX for mobile multimedia applications, and these need to be transformed into practical measures. The challenge of optimizing UX remains adapting to resource constraints (e.g., network conditions, mobile device capabilities, and heterogeneous usage contexts) as well as meeting complicated user requirements (e.g., usage purposes and personal preferences).
In this chapter, we investigate the important existing UX frameworks, compare their similarities, and discuss some important features that fit the mobile video service. Based on the previous research, we propose a simple UX framework for mobile video applications by mapping a variety of influencing factors of UX onto a typical mobile video delivery system. Each component and its factors are explored with comprehensive literature reviews. The proposed framework may benefit user-centred design of mobile video by giving complete consideration to UX influences, and may improve mobile video service quality by adjusting the values of certain factors to produce a positive user experience. It may also facilitate related research by locating important issues to study, clarifying research scopes, and setting up proper study procedures. We then review a great deal of research on UX measurement, including QoE metrics and QoE frameworks for mobile multimedia. Finally, we discuss how to achieve an optimal quality of user experience by focusing on the various aspects of UX of mobile video. In the conclusion, we suggest some open issues for future study.