METHOD FOR SELECTING OPTIMUM QUALITY LEVELS IN THE PROCEDURE OF CODING AND SEGMENTATION OF VIDEO SIGNALS FOR ADJUSTABLE VIDEO STREAMING

Vlaović, Jelena

Authors: Jelena Vlaović
Publication date: 30 June 2022
Publisher: Josip Juraj Strossmayer University of Osijek. Faculty of Electrical Engineering, Computer Science and Information Technology Osijek. Department of Communications. Chair of Multimedia Systems and Digital Television.

Abstract

The MPEG standard for Dynamic Adaptive Streaming over HTTP (MPEG DASH) defines the format of video segments and of the manifest file. It was developed to ensure interoperability between the various video streaming applications used to transfer video sequences from a server to a client, while providing the highest possible video quality under varying network conditions. Available research mostly focuses on improving adaptive streaming algorithms, and the methods proposed so far for preparing representation sets on the server side do not consider the spatial and temporal activity of the video content, require a large amount of computational power for trial encoding, or are proprietary. This dissertation therefore proposes a new method for selecting the optimal parameters of representation sets in the coding and segmentation procedure.

The first part of the research focuses on selecting parameters for coding and segmentation of video signals with a method that takes into account the spatial and temporal activity of the video content and does not require trial encoding of each video signal. Because some previously proposed solutions lack the information needed to reproduce their results, the dissertation presents the methodology used to develop the method and the mathematical expressions that allow optimal selection of bitrates and corresponding spatial resolutions for the representations of the segmented video signal on the server side. Compared to the segmentation available in the relevant literature, segmentation based on the proposed method achieves higher Structural Similarity Index Measure (SSIM) values in 92% of cases.

During development and testing of the proposed method, it was noticed that there is a lack of video signals encoded and segmented according to the MPEG DASH standard. The second part of the research therefore presents a database of six video signals segmented according to the proposed method for selecting the number and parameters of representations. In addition to segments with durations of 2, 6, and 10 seconds, the database includes the source video signals, initialization segments, and MPD files in five different profiles. A simulation framework was also developed, because no available solution could be used offline and without re-segmenting the video signals after every change in coding parameters. The framework can be used to compare adaptation algorithms on the basis of Quality of Experience (QoE) parameters and to investigate the influence of segmentation parameters on the resulting video quality. It also includes a module that generates synthetic network traces using the Nakagami distribution. A comparison of the parameters that affect the quality of the transmitted video, obtained with synthetic and with measured network traces, showed that the Nakagami distribution can generate synthetic traces that correspond to a wide range of measured traces for 3G and 4G networks.

The last part of the research used the proposed framework to examine how the difference in Mean Opinion Score (MOS) values used to determine the optimal number of representations affects the QoE parameters. The proposed segmentation achieves higher SSIM values regardless of network conditions and shows fewer and shallower quality level switches. The framework was also used to confirm the premise that this MOS difference should be larger for video signals with higher spatial information (SI) and temporal information (TI). The simulation results indicate that, on a MOS scale of 0 to 100, the MOS difference should be three for video signals with higher SI and TI values and one or two for video signals with lower SI and TI values, to ensure a compromise between the number and depth of quality level switches, the number and duration of playback stalls, and SSIM values.
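The abstract mentions a module that generates synthetic network traces from the Nakagami distribution but gives no implementation details. The following is only a minimal sketch of how such a trace could be drawn, not the dissertation's module. It assumes independent samples, one throughput value per time interval, and uses the standard relation that the square root of a Gamma(m, omega/m) variate follows a Nakagami(m, omega) distribution. The function names, the parameter values `m=1.5` and `omega=16.0`, and the Mbit/s unit are all hypothetical.

```python
import math
import random


def nakagami_sample(m, omega, rng):
    """Draw one Nakagami(m, omega) sample.

    If G ~ Gamma(shape=m, scale=omega/m), then sqrt(G) ~ Nakagami(m, omega).
    random.gammavariate(alpha, beta) takes beta as the scale parameter.
    """
    return math.sqrt(rng.gammavariate(m, omega / m))


def synthetic_trace(n_samples, m, omega, seed=0):
    """Return n_samples hypothetical throughput values, one per interval.

    Assumption: samples are independent and interpreted as Mbit/s.
    A real trace generator may need temporal correlation between samples.
    """
    if m < 0.5 or omega <= 0:
        raise ValueError("Nakagami requires m >= 0.5 and omega > 0")
    rng = random.Random(seed)  # seeded for reproducible traces
    return [nakagami_sample(m, omega, rng) for _ in range(n_samples)]


# Hypothetical usage: a 60-sample trace with illustrative parameters.
trace = synthetic_trace(60, m=1.5, omega=16.0)
print(min(trace), max(trace))
```

Here `omega` is the mean of the squared samples and `m` controls how strongly the values spread around it, so fitting the two parameters to a measured 3G or 4G trace would be the natural next step in such a sketch.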

Available Versions

Repository of Josip Juraj Strossmayer University of Osijek

oai:repozitorij.unios.hr:etfos...

Last time updated on 11/10/2022