Implementation of an Adaptive Stream Scaling Media Proxy with the data path in the Linux Kernel
Streaming media is becoming increasingly common, and bandwidth demands are rising. Centralized distribution is becoming prohibitively hard to achieve, and geographically aware hierarchical distribution has emerged as a viable alternative, bringing with it an increased demand for CPU processing power.
There has also been an influx of different devices on which to receive media - from HD plasma screens to 2" cell phone displays - and the need for media to adapt to devices of differing capabilities is becoming obvious.
For this thesis we have implemented a media proxy using RTSP/RTP. It is implemented partially in kernel space, which has drastically reduced the CPU cost of relaying data from a server to multiple clients. This has been combined with a proof-of-concept implementation for adaptively scaling an SVC media stream according to available bandwidth.
End-to-end 3D video communication over heterogeneous networks
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Three-dimensional technology, more commonly referred to as 3D technology, has revolutionised many fields, including entertainment, medicine, and communications. In addition to 3D films, games, and sports channels, 3D perception has made tele-medicine a reality. Consumer electronics manufacturers predicted that, by 2015, 30% of all HD panels at home would be 3D enabled. Stereoscopic cameras, a comparatively mature technology among 3D systems, are now being used by ordinary citizens to produce 3D content and share it at the click of a button, just as they do with its 2D counterpart via sites like YouTube. But technical challenges still exist, including with autostereoscopic multiview displays. Because of its increased amount of data, 3D content requires many complex considerations for transmission or storage, including how to represent it and which compression format is best. Any decision must be taken in the light of the available bandwidth or storage capacity, quality, and user expectations. Free viewpoint navigation also remains partly unsolved. The most pressing issue standing in the way of widespread uptake of consumer 3D systems is the ability to deliver 3D content to heterogeneous consumer displays over heterogeneous networks. Optimising 3D video communication solutions must consider the entire pipeline, from optimisation at the video source through transmission to the end display. Multi-view offers the most compelling solution for 3D video, providing motion parallax and freedom from headgear. Optimising multi-view video for delivery and display could increase the demand for true 3D in the consumer market.
This thesis focuses on end-to-end quality optimisation in 3D video communication/transmission, offering solutions for optimisation at the compression, transmission, and decoder levels.
Brunel University - Isambard Research Scholarship
Multi-user conferencing using SSRC multiplexing and video packet retransmission in the BareSIP software
We present the design, implementation, and integration of a
multi-threaded mechanism for multi-user conferencing using SSRC multiplexing, together with an RTP video packet retransmission mechanism, in the open-source software BareSIP.
Because the default design of BareSIP does not support the simultaneous participation of multiple users, we initially use a server to set up the conferencing mechanism. This server multiplexes the media streams
of the different users based on the SSRC field of the RTP header of the packets
and sends one stream for each media type to every BareSIP client. In the BareSIP client, to enable efficient conferencing, we
developed a multi-threaded mechanism that demultiplexes the incoming media
stream based on the SSRC field of each participant and assigns the service of
each one to corresponding threads that are running in parallel. This mechanism
was designed in such a way as to require only two ports for the incoming media
streams and, for the simultaneous coexistence of N users, to require only one
encoding instance and N-1 decoding instances. Additionally, using the Scalable Video Coding (SVC) extension of the H.264/AVC video compression standard, which offers multiple
levels of fidelity through a pyramidal hierarchy (base layer data streams –
enhancement layer data streams), we present a communication model of high
robustness between the transmitter and the receiver, using RTP video packet
retransmission. This model ensures the immediate restoration of lost base
layer RTP video packets, so that the decoding routine of base layer video
packets will take place without interruptions and delays. We create a
transmitter that maintains duplicates of base layer RTP video packets, in order
to be able to retransmit those that are requested by the receiver. At the
receiver’s side, we implement an algorithm that detects the base layer packets
that were lost and requests for their retransmission, using RTCP NACK packets,
in order to instantly retrieve all the information of the base layer and to
maintain the integrity of its data flow. Finally, this algorithm makes appropriate decisions depending on the impact the losses would have on the system. Regarding our measurements, for the conferencing mechanism we present the delay incurred at the audio/video relaying, encoding, and decoding levels as the number of participants increases, as well as the differences that appear when the packet time and the jitter buffer delay are modified. These measurements showed that the total
audio delay remains almost constant regardless of the number of participants, while the total video delay shows a small increase of 2-4 msec. Finally, for the video packet retransmission mechanism, we present and compare the percentage of useful frames (effective frame rate) and bytes (effective byte rate) appearing in the output, with and without the mechanism, and we show that performance with the retransmission mechanism is clearly better, especially in cases of high losses, where it is as much as 265% better than without it.
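The SSRC demultiplexing and loss detection described above can be sketched as follows. The fixed-header layout is from RFC 3550; the class structure, names, and the simplified gap scan (which ignores 16-bit sequence-number wraparound) are illustrative assumptions, not the BareSIP implementation:

```python
import struct
from collections import defaultdict

def parse_rtp(packet):
    """Parse the 12-byte fixed RTP header (RFC 3550) and return
    (ssrc, seq, payload). CSRC lists and header extensions are ignored."""
    _vpxcc, _mpt, seq, _ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return ssrc, seq, packet[12:]

class Demultiplexer:
    """Per-SSRC dispatch with gap detection for RTCP NACK requests.
    In the mechanism above each SSRC would feed its own thread; here a
    plain list per SSRC stands in for that thread's queue."""
    def __init__(self):
        self.queues = defaultdict(list)   # one queue per participant SSRC
        self.last_seq = {}                # highest sequence number seen
        self.missing = defaultdict(set)   # sequence numbers to NACK

    def feed(self, packet):
        ssrc, seq, payload = parse_rtp(packet)
        prev = self.last_seq.get(ssrc)
        if prev is not None and seq > prev + 1:
            # record the gap (sketch only: ignores sequence wraparound)
            self.missing[ssrc].update(range(prev + 1, seq))
        self.last_seq[ssrc] = seq
        self.queues[ssrc].append(payload)
```

A receiver along these lines would turn each entry of `missing` into an RTCP NACK, and drop it once the retransmitted packet arrives.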
Content-Aware Multimedia Communications
The demands for fast, economic and reliable dissemination of multimedia
information are steadily growing within our society. While people and the
economy increasingly rely on communication technologies, engineers still
struggle with their growing complexity.
Complexity in multimedia communication originates from several sources. The
most prominent is the unreliability of packet networks like the Internet.
Recent advances in scheduling and error control mechanisms for streaming
protocols have shown that the quality and robustness of multimedia delivery
can be improved significantly when protocols are aware of the content they
deliver. However, the proposed mechanisms require close cooperation between
transport systems and application layers which increases the overall system
complexity. Current approaches also require expensive metrics and focus on
special encoding formats only. A general and efficient model is missing so
far.
This thesis presents efficient and format-independent solutions to support
cross-layer coordination in system architectures. In particular, the first
contribution of this work is a generic dependency model that enables
transport layers to access content-specific properties of media streams,
such as dependencies between data units and their importance. The second
contribution is the design of the Noja programming model for streaming
communication and its implementation as a middleware architecture. The
programming model hides the complexity of protocol stacks behind simple
programming abstractions, but exposes cross-layer control and monitoring
options to application programmers. For example, our interfaces allow
programmers to choose appropriate failure semantics at design time while
they can refine error protection and visibility of low-level errors at
run-time.
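As a rough illustration of what such a dependency model lets a transport layer do, the sketch below selects data units under a byte budget while respecting importance and decoding dependencies; the types, field names, and greedy policy are our own assumptions, not the thesis' model:

```python
from dataclasses import dataclass

@dataclass
class DataUnit:
    """A format-independent view of one media data unit, in the spirit of
    the generic dependency model described above (fields are our naming)."""
    uid: int
    size: int            # bytes on the wire
    importance: float    # contribution to presentation quality
    deps: tuple = ()     # uids this unit cannot be decoded without

def schedule(units, budget):
    """Greedy content-aware selection: schedule the most important units
    first, but only those whose decoding dependencies are already scheduled
    and that still fit in the byte budget."""
    chosen, spent = set(), 0
    for u in sorted(units, key=lambda u: -u.importance):
        if all(d in chosen for d in u.deps) and spent + u.size <= budget:
            chosen.add(u.uid)
            spent += u.size
    return chosen
```

Because the model is format-independent, the same scheduler works whether the units are H.264 NAL units, SVC layers, or audio frames, as long as something upstream fills in size, importance, and dependencies.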
Through several examples we show how our middleware simplifies the
integration of stream-based communication into large-scale application
architectures. An important result of this work is that despite cross-layer
cooperation, neither application nor transport protocol designers
experience an increase in complexity. Application programmers can even
reuse existing streaming protocols which effectively increases system
robustness.
A rate control algorithm for scalable video coding
This thesis proposes a rate control (RC) algorithm for H.264/scalable video coding
(SVC) specially designed for real-time variable bit rate (VBR) applications with
buffer constraints. The VBR controller assumes that consecutive pictures within the
same scene often exhibit similar degrees of complexity, and aims to prevent unnecessary
quantization parameter (QP) fluctuations by allowing for just an incremental
variation of QP with respect to that of the previous picture. In order to adapt this
idea to H.264/SVC, a rate controller is located at each dependency layer (spatial or
coarse grain scalability) so that each rate controller is responsible for determining
the proper QP increment. Indeed, one of the main contributions of the thesis is
a QP increment regression model that is based on Gaussian processes. This model
has been derived from some observations drawn from a discrete set of representative
encoding states. Two real-time application scenarios were simulated to assess the
performance of the VBR controller with respect to two well-known RC methods.
The experimental results show that our proposal achieves an excellent performance
in terms of quality consistency, buffer control, adjustment to the target bit rate, and computational complexity.
Moreover, unlike typical RC algorithms for SVC that only satisfy the hypothetical
reference decoder (HRD) constraints for the highest temporal resolution sub-stream
of each dependency layer, the proposed VBR controller also delivers HRD-compliant
sub-streams with lower temporal resolutions. To this end, a novel approach that uses a set of buffers (one per temporal resolution sub-stream) within a dependency layer has been built on top of the RC algorithm. The proposed approach aims to simultaneously control the buffer levels for overflow and underflow prevention, while maximizing the reconstructed video quality of the corresponding sub-streams. This in-layer multi-buffer framework for rate-controlled SVC does not require additional dependency layers to deliver different HRD-compliant temporal resolutions for a given video source, thus improving the coding efficiency when compared to typical SVC encoder configurations since, for the same target bit rate, fewer layers are encoded.
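The incremental-QP idea can be illustrated with a minimal feedback rule. The thesis derives the increment from a Gaussian-process regression model over representative encoding states, so the fixed thresholds below are only a stand-in:

```python
def next_qp(prev_qp, buffer_fullness, target=0.5, deadzone=0.1,
            qp_min=10, qp_max=51):
    """Incremental QP control: rather than recomputing QP from scratch for
    each picture, nudge the previous picture's QP by at most +/-1 based on
    buffer occupancy, preventing unnecessary QP fluctuations within a scene.
    (Illustrative rule, not the thesis' regression model.)"""
    if buffer_fullness > target + deadzone:
        inc = 1       # buffer filling up: coarser quantisation, fewer bits
    elif buffer_fullness < target - deadzone:
        inc = -1      # buffer draining: finer quantisation, more bits
    else:
        inc = 0       # within the deadzone: keep quality steady
    return min(qp_max, max(qp_min, prev_qp + inc))
```

In the thesis' architecture one such controller would run per dependency layer, each feeding back its own buffer state.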
Efficient delivery of scalable media streaming over lossy networks
Recent years have witnessed a rapid growth in the demand for streaming video over the Internet, exposing challenges in coping with heterogeneous device capabilities and varying network throughput. When we couple this rise in streaming with the growing number of portable devices (smartphones, tablets, laptops), we see an ever-increasing demand for high-definition video online while on the move. Wireless networks are inherently characterised by restricted shared bandwidth and relatively high error loss rates, thus presenting a challenge for the efficient delivery of high-quality video. Additionally, mobile devices can support and demand a range of video resolutions and qualities. This demand for mobile streaming highlights the need for adaptive video streaming schemes that can adjust to available bandwidth and heterogeneity, and can provide graceful changes in video quality, all while respecting viewing satisfaction. In this context, the use of well-known scalable media streaming techniques, commonly known as scalable coding, is an attractive solution and the focus of this thesis. In this thesis we investigate the transmission of existing scalable video models over a lossy network and determine how the variation in viewable quality is affected by packet loss. This work focuses on leveraging the benefits of scalable media while reducing the effects of data loss on achievable video quality. The overall approach is focused on the strategic packetisation of the underlying scalable video and how best to utilise error resiliency to maximise viewable quality. In particular, we examine the manner in which scalable video is packetised for transmission over lossy networks and propose new techniques that reduce the impact of packet loss on scalable video by selectively choosing how to packetise the data and which data to transmit.
We also exploit redundancy techniques, such as error resiliency, to enhance stream quality by ensuring smooth play-out with fewer changes in achievable video quality. The contributions of this thesis are in the creation of new segmentation and encapsulation techniques which increase the viewable quality of existing scalable models by fragmenting and re-allocating the video sub-streams based on user requirements, available bandwidth, and variations in loss rates. We offer new packetisation techniques which reduce the effects of packet loss on viewable quality by leveraging the increase in the number of frames per group of pictures (GOP) and by providing equality of data in every packet transmitted per GOP. These provide novel mechanisms for packetisation and error resiliency, as well as new applications for existing techniques such as Interleaving and Priority Encoded Transmission. We also introduce three new scalable coding models, which offer a balance between transmission cost and the consistency of viewable quality.
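The idea of giving every packet an equal share of a GOP's data can be sketched as a round-robin interleaver: a single packet loss then removes a small slice of many frames rather than a whole frame. The chunk size and packet framing below are illustrative assumptions, not the thesis' exact packetisation scheme:

```python
def interleave_gop(frames, n_packets):
    """Round-robin interleaving of GOP data across packets: consecutive
    chunks of each frame are rotated across the n_packets, so losing one
    packet costs a thin slice of many frames instead of one whole frame.
    (Sketch only: a real packetiser would also carry reassembly headers.)"""
    packets = [bytearray() for _ in range(n_packets)]
    chunk = 4  # bytes per slice; a real packetiser would size this to the MTU
    for fi, frame in enumerate(frames):
        for ci in range(0, len(frame), chunk):
            # offset by the frame index so no packet gets all of one frame
            packets[(ci // chunk + fi) % n_packets] += frame[ci:ci + chunk]
    return [bytes(p) for p in packets]
```

With equally sized frames the resulting packets are equal in length, which is the "equality of data in every packet per GOP" property the abstract describes.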