Adaptive bitrate (ABR) algorithms adapt the video bitrate to network
conditions in order to improve the overall video quality of experience (QoE).
Recently, reinforcement learning (RL) methods such as asynchronous advantage
actor-critic (A3C) have been used to generate ABR algorithms, and they have
been shown to improve overall QoE compared to fixed-rule ABR algorithms.
However, a common issue with A3C methods is the lag between the behaviour
policy and the target policy: the two policies are no longer synchronized,
which results in suboptimal updates.
In this work, we present ALISA: An Actor-Learner Architecture with
Importance Sampling for efficient learning in ABR algorithms. ALISA
incorporates importance sampling weights that give more weight to relevant
experience, addressing the lag issues of existing A3C methods. We present
the design and implementation of ALISA, and compare its performance to
state-of-the-art video rate adaptation algorithms, including vanilla A3C as
implemented in the Pensieve framework, as well as fixed-rule schedulers such
as BB, BOLA, and RB. Our results show that ALISA achieves up to 25%-48%
higher average QoE than Pensieve, and even more when compared to fixed-rule
schedulers.

Comment: Number of pages: 10, Number of figures: 9, Conference name: 24th IEEE
International Symposium on a World of Wireless, Mobile and Multimedia
Networks (WoWMoM
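
To illustrate the general idea of correcting for policy lag with importance sampling, the following is a minimal sketch and not necessarily ALISA's exact formulation. It assumes a truncated ratio between the target policy's and the behaviour policy's action probabilities, a common choice in actor-learner architectures. The function names, the `clip` parameter, and the sample trajectory are hypothetical.

```python
import math


def truncated_is_weights(target_logp, behaviour_logp, clip=1.0):
    """Per-step weights min(clip, pi(a|s) / mu(a|s)), computed from log-probabilities.

    target_logp:    log pi(a_t | s_t) under the learner's (target) policy
    behaviour_logp: log mu(a_t | s_t) under the actor's (behaviour) policy
    clip:           assumed truncation level that bounds the variance of the weights
    """
    if len(target_logp) != len(behaviour_logp):
        raise ValueError("log-probability lists must have equal length")
    return [min(clip, math.exp(t - b)) for t, b in zip(target_logp, behaviour_logp)]


def weighted_policy_loss(target_logp, behaviour_logp, advantages, clip=1.0):
    """Negative importance-weighted policy-gradient objective, averaged over steps.

    In an autodiff framework the weights would be treated as constants
    (no gradient flows through them); here everything is a plain float.
    """
    weights = truncated_is_weights(target_logp, behaviour_logp, clip)
    terms = [w * lp * adv for w, lp, adv in zip(weights, target_logp, advantages)]
    return -sum(terms) / len(terms)


# Hypothetical three-step trajectory collected by an actor with a stale policy.
target_logp = [math.log(0.5), math.log(0.2), math.log(0.6)]
behaviour_logp = [math.log(0.25), math.log(0.4), math.log(0.6)]
weights = truncated_is_weights(target_logp, behaviour_logp)
```

In this sketch, steps whose actions the target policy now considers less likely than the behaviour policy did receive a weight below 1, so stale experience contributes less to the update. Steps on which the two policies agree keep a weight of 1.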