Planning VoIP/PSTN networks using the PlanVoip software package
The paper describes the IP network as a basis for carrying voice services and the parameters that influence the quality-of-service level when voice is transferred over an IP network. Among these parameters, the emphasis is on the delay components, especially the variable ones. The paper then describes the application we developed for planning and analyzing VoIP/PSTN networks. Finally, simulation measurements are made with the developed application to analyze the influence of the chosen codec and the link capacity on the subjective voice quality score (MOS) and on the total delay of voice packets. To verify the developed application, the simulation results are compared with measurements on a real network.
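As a rough illustration of how the chosen codec and the link capacity feed into per-packet delay, the sketch below computes two fixed delay components for one voice packet: packetization delay and serialization delay. It is not taken from PlanVoip. The function name, the 40-byte IPv4/UDP/RTP header, the 20 ms frame interval and the 128 kbit/s link are assumptions chosen for illustration.

```python
def packet_delays_ms(codec_bitrate_bps, frame_ms, link_bps, header_bytes=40):
    """Return (packetization_ms, serialization_ms) for one voice packet.

    Assumes one frame interval of speech per packet and uncompressed
    IPv4 (20) + UDP (8) + RTP (12) headers; layer-2 overhead is ignored.
    """
    # Speech payload produced by the codec during one frame interval.
    payload_bytes = codec_bitrate_bps * frame_ms / 1000 / 8
    packet_bits = (payload_bytes + header_bytes) * 8
    # Time needed to clock the whole packet onto the link.
    serialization_ms = packet_bits * 1000 / link_bps
    # The sender must wait one full frame interval to fill the packet.
    return frame_ms, serialization_ms


# Hypothetical comparison: G.711 (64 kbit/s) vs. G.729 (8 kbit/s),
# 20 ms of speech per packet, sent over a 128 kbit/s link.
g711 = packet_delays_ms(64_000, 20, 128_000)
g729 = packet_delays_ms(8_000, 20, 128_000)
print("G.711:", g711)
print("G.729:", g729)
```

Under these assumptions the lower-rate codec produces a smaller packet and therefore a shorter serialization delay on the same link. Variable components such as queuing and jitter-buffer delay are not modeled here.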
Complexity management of H.264/AVC video compression.
The H.264/AVC video coding standard offers significantly improved compression efficiency and flexibility compared to previous standards. However, the high computational complexity of H.264/AVC is a problem for codecs running on low-power handheld devices and general-purpose computers. This thesis presents new techniques to reduce, control and manage the computational complexity of an H.264/AVC codec. A new complexity reduction algorithm for H.264/AVC is developed. This algorithm predicts "skipped" macroblocks prior to motion estimation by estimating a Lagrange rate-distortion cost function. Complexity savings are achieved by not processing the macroblocks that are predicted as "skipped". The Lagrange multiplier is adaptively modelled as a function of the quantisation parameter and video sequence statistics. Simulation results show that this algorithm achieves significant complexity savings with a negligible loss in rate-distortion performance. The complexity reduction algorithm is further developed to achieve complexity-scalable control of the encoding process. The Lagrangian cost estimation is extended to incorporate computational complexity. A target level of complexity is maintained by using a feedback algorithm to update the Lagrange multiplier associated with complexity. Results indicate that scalable complexity control of the encoding process can be achieved whilst maintaining near optimal complexity-rate-distortion performance. A complexity management framework is proposed for maximising the perceptual quality of coded video in a real-time processing-power constrained environment. A real-time frame-level control algorithm and a per-frame complexity control algorithm are combined in order to manage the encoding process such that a high frame rate is maintained without significantly losing frame quality.
Subjective evaluations show that the managed complexity approach results in higher perceptual quality compared to a reference encoder that drops frames in computationally constrained situations. These novel algorithms are likely to be useful in implementing real-time H.264/AVC standard encoders in computationally constrained environments such as low-power mobile devices and general-purpose computers.
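The skip prediction and the feedback control described in this abstract can be sketched in generic form as follows. The abstract does not give the thesis's actual cost estimators or update rule, so everything here is an assumption for illustration: the standard Lagrangian cost J = D + lambda * R, a zero-bit cost for a skipped macroblock, a simple proportional feedback step, and all function names and numeric values.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def predict_skip(skip_distortion, est_coded_distortion, est_coded_bits, lam):
    """Predict SKIP when its estimated cost does not exceed the cost of coding.

    A skipped macroblock is assumed to cost zero bits (a simplification).
    """
    j_skip = rd_cost(skip_distortion, 0, lam)
    j_coded = rd_cost(est_coded_distortion, est_coded_bits, lam)
    return j_skip <= j_coded


def update_complexity_multiplier(lam_c, measured, target, gain=0.1):
    """Proportional feedback on the complexity multiplier.

    Raises lam_c when measured complexity exceeds the target and lowers it
    otherwise, never letting it go below zero.
    """
    return max(0.0, lam_c + gain * (measured - target) / target)


# Hypothetical numbers: a larger lambda penalizes bits more, favoring SKIP.
print(predict_skip(100, 60, 10, lam=5))
print(predict_skip(100, 60, 10, lam=2))
print(update_complexity_multiplier(1.0, measured=120, target=100))
```

A macroblock predicted as skipped would bypass motion estimation entirely, which is where the complexity saving comes from. The feedback step only hints at how a target complexity could be tracked from frame to frame.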
Considering Bluetooth's Subband Codec (SBC) for Wideband Speech and Audio on the Internet
The Bluetooth Special Interest Group (SIG) has standardized the subband coding (SBC) audio codec to connect headphones via wireless Bluetooth links. SBC compresses audio at high fidelity while having an ultra-low algorithmic delay. To make SBC suitable for the Internet, we extend it by using a time and packet loss concealment (PLC) algorithm that is based on ITU's G.711 Appendix I. The design is novel in its interface between codec and speech receiver. We developed a new approach to distributing the functionality of a speech receiver between codec and application. Our approach leads to easier implementations of high quality VoIP applications.
We conducted subjective and objective listening tests of the audio quality of SBC and PLC in order to determine an optimal coding mode and the trade-off between coding mode and packet loss rate. More precisely, we conducted MUSHRA listening tests for selected sample items. These test results are then compared with the results of multiple objective assessment algorithms (ITU P.862 PESQ, ITU BS.1387-1 PEAQ, Creusere's algorithm). We found that a combination of the PEAQ basic and advanced values best matches the subjective MUSHRA results after third-order linear regression. The linear regression has a coefficient of determination of R²=0.907². By comparison, our individual human ratings show a correlation of about R=0.9 compared to our averaged human rating results.
Using the combination of both PEAQ algorithms, we calculate hundreds of thousands of objective audio quality ratings, varying the audio content and the algorithmic parameters of SBC and PLC. The results show which sets of parameter values are best suited to a bandwidth- and delay-constrained link. The transmission quality of SBC is enhanced significantly by selecting optimal encoding parameters as compared to the default parameter sets given in the standard.
Finally, we present preliminary objective test results comparing the audio codecs SBC, CELT, APT-X and ULD for coding speech and audio. They all allow mono and stereo transmission of music at ultra-low coding delays (<10ms), which is especially useful for distributed ensemble performances over the Internet.
Efficient compression of synthetic video
Streaming of on-line gaming video is a challenging problem because of the enormous
amounts of video data that need to be sent during game playing, especially within the
limitations of uplink capabilities. The encoding complexity is also a challenge because of
the time delay while on-line gamers are communicating.
The main goal of this research study is to propose an enhanced on-line game video
streaming system. First, the most common video coding techniques have been evaluated.
The evaluation study considers objective and subjective metrics. Three widespread video
coding techniques are selected and evaluated in the study: H.264, MPEG-4 Visual and VP8.
Diverse types of video sequences were used with different frame rates and resolutions.
The effects of changing frame rate and resolution on compression efficiency and viewers'
satisfaction are also presented. Results showed that the compression process and perceptual
satisfaction are severely affected by the nature of the compressed sequence. As a result,
H.264 showed higher compression efficiency for synthetic sequences and outperformed
other codecs in the subjective evaluation tests.
Second, a fast inter prediction technique to speed up the encoding process of H.264 has
been devised. On-line game streaming is a real-time application, so compression
complexity significantly affects the whole streaming process. Our comparative codec
study recommends H.264 for synthetic video coding, but it still suffers from high encoding
complexity. A low-complexity coding algorithm is therefore presented as a fast inter coding
model with a reference management technique. The proposed algorithm was compared with a
state-of-the-art method and achieved greater reductions in encoding time and bit rate with
negligible loss of fidelity.
Third, recommendations on the tradeoff between frame rate and resolution within given uplink
capabilities are provided for H.264 video coding. The recommended tradeoffs result from
extensive experiments using the Double Stimulus Impairment Scale (DSIS) subjective
evaluation method. Experiments showed that viewers' satisfaction is profoundly
affected by varying frame rates and resolutions. In addition, increasing the frame rate or
frame resolution does not always yield a corresponding improvement in perceptual quality. As a
result, tradeoffs are recommended to balance frame rate and resolution within
a given bit rate to guarantee the highest user satisfaction.
For system completeness and to facilitate the implementation of the proposed techniques,
an efficient game video streaming management system is proposed.
Compared to existing on-line live video service systems for games, the proposed system
provides improved coding efficiency, complexity reduction and better user satisfaction.
Coded Speech Quality Measurement by a Non-Intrusive PESQ-DNN
Wideband codecs such as AMR-WB or EVS are widely used in (mobile) speech
communication. Evaluation of coded speech quality is often performed
subjectively by an absolute category rating (ACR) listening test. However, the
ACR test is impractical for online monitoring of speech communication networks.
Perceptual evaluation of speech quality (PESQ) is one of the widely used
metrics instrumentally predicting the results of an ACR test. However, the PESQ
algorithm requires an original reference signal, which is usually unavailable
in network monitoring, thus limiting its applicability. NISQA is a new
non-intrusive neural-network-based speech quality measure, focusing on
super-wideband speech signals. In this work, however, we aim at predicting the
well-known PESQ metric using a non-intrusive PESQ-DNN model. We illustrate the
potential of this model by predicting the PESQ scores of wideband-coded speech
obtained from AMR-WB or EVS codecs operating at different bitrates in noisy,
tandeming, and error-prone transmission conditions. We compare our methods with
the state-of-the-art network topologies of QualityNet, WAWEnet, and DNSMOS --
all applied to PESQ prediction -- by measuring the mean absolute error (MAE)
and the linear correlation coefficient (LCC). The proposed PESQ-DNN offers the
best total MAE and LCC of 0.11 and 0.92, respectively, in conditions without
frame loss, and still is best when including frame loss. Note that our model
could be similarly used to non-intrusively predict POLQA or other (intrusive)
metrics. Upon article acceptance, code will be provided at GitHub.
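For readers unfamiliar with the two evaluation measures named in this abstract, the sketch below computes the mean absolute error (MAE) and the linear (Pearson) correlation coefficient (LCC) between predicted and reference scores. It is a generic illustration, not the authors' evaluation code; the function names and the sample scores are hypothetical.

```python
import math


def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference scores."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def linear_correlation(predicted, reference):
    """Pearson linear correlation coefficient between two score lists."""
    n = len(reference)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_p * var_r)


# Hypothetical scores: reference PESQ values and non-intrusive predictions.
reference_pesq = [1.8, 2.5, 3.1, 3.6, 4.2]
predicted_pesq = [1.9, 2.4, 3.3, 3.5, 4.1]
print("MAE:", mean_absolute_error(predicted_pesq, reference_pesq))
print("LCC:", linear_correlation(predicted_pesq, reference_pesq))
```

A lower MAE and an LCC closer to 1 both indicate that the predictions track the reference metric more closely. The two measures are complementary, since a constant offset raises the MAE without changing the LCC.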