499 research outputs found
Recommended from our members
Multimedia delivery in the future internet
The term “Networked Media” implies that all kinds of media including text, image, 3D graphics, audio
and video are produced, distributed, shared, managed and consumed on-line through various networks,
like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white
paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the Networked
challenges of the Networked Media in the transition to the Future of the Internet.
Internet has evolved and changed the way we work and live. End users of the Internet have been confronted
with a bewildering range of media, services and applications and of technological innovations concerning
media formats, wireless networks, terminal types and capabilities. And there is little evidence that the pace
of this innovation is slowing. Today, over one billion of users access the Internet on regular basis, more
than 100 million users have downloaded at least one (multi)media file and over 47 millions of them do so
regularly, searching in more than 160 Exabytes1 of content. In the near future these numbers are expected
to exponentially rise. It is expected that the Internet content will be increased by at least a factor of 6, rising
to more than 990 Exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged
that in a near- to mid-term future, the Internet will provide the means to share and distribute (new)
multimedia content and services with superior quality and striking flexibility, in a trusted and personalized
way, improving citizens’ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer inthe
network adaptation, machine-to-machine communication (including RFIDs), rich 3D content as well as
community networks and the use of peer-to-peer (P2P) overlays are expected to generate new models of
interaction and cooperation, and be able to support enhanced perceived quality-of-experience (PQoE) and
innovative applications “on the move”, like virtual collaboration environments, personalised services/
media, virtual sport groups, on-line gaming, edutainment. In this context, the interaction with content
combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P
networks and the dynamic adaptation to the characteristics of diverse mobile terminals are expected to
contribute towards such a vision.
Based on work that has taken place in a number of EC co-funded projects, in Framework Program 6 (FP6)
and Framework Program 7 (FP7), a group of experts and technology visionaries have voluntarily
contributed in this white paper aiming to describe the status, the state-of-the art, the challenges and the way
ahead in the area of Content Aware media delivery platforms
Trick play on Audiovisual Information for Tape, Disk and Solid-State based Digital Recording Systems
Semantischer Schutz und Personalisierung von Videoinhalten. PIAF: MPEG-kompatibles Multimedia-Adaptierungs-Framework zur Bewahrung der vom Nutzer wahrgenommenen Qualität.
UME is the notion that a user should receive informative adapted content anytime and anywhere. Personalization of videos, which adapts their content according to user preferences, is a vital aspect of achieving the UME vision. User preferences can be translated into several types of constraints that must be considered by the adaptation process, including semantic constraints directly related to the content of the video. To deal with these semantic constraints, a fine-grained adaptation, which can go down to the level of video objects, is necessary. The overall goal of this adaptation process is to provide users with adapted content that maximizes their Quality of Experience (QoE). This QoE depends at the same time on the level of the user's satisfaction in perceiving the adapted content, the amount of knowledge assimilated by the user, and the adaptation execution time. In video adaptation frameworks, the Adaptation Decision Taking Engine (ADTE), which can be considered as the "brain" of the adaptation engine, is responsible for achieving this goal. The task of the ADTE is challenging as many adaptation operations can satisfy the same semantic constraint, and thus arising in several feasible adaptation plans. Indeed, for each entity undergoing the adaptation process, the ADTE must decide on the adequate adaptation operator that satisfies the user's preferences while maximizing his/her quality of experience. The first challenge to achieve in this is to objectively measure the quality of the adapted video, taking into consideration the multiple aspects of the QoE. The second challenge is to assess beforehand this quality in order to choose the most appropriate adaptation plan among all possible plans. The third challenge is to resolve conflicting or overlapping semantic constraints, in particular conflicts arising from constraints expressed by owner's intellectual property rights about the modification of the content. In this thesis, we tackled the aforementioned challenges by proposing a Utility Function (UF), which integrates semantic concerns with user's perceptual considerations. This UF models the relationships among adaptation operations, user preferences, and the quality of the video content. We integrated this UF into an ADTE. This ADTE performs a multi-level piecewise reasoning to choose the adaptation plan that maximizes the user-perceived quality. Furthermore, we included intellectual property rights in the adaptation process. Thereby, we modeled content owner constraints. We dealt with the problem of conflicting user and owner constraints by mapping it to a known optimization problem. Moreover, we developed the SVCAT, which produces structural and high-level semantic annotation according to an original object-based video content model. We modeled as well the user's preferences proposing extensions to MPEG-7 and MPEG-21. All the developed contributions were carried out as part of a coherent framework called PIAF. PIAF is a complete modular MPEG standard compliant framework that covers the whole process of semantic video adaptation. We validated this research with qualitative and quantitative evaluations, which assess the performance and the efficiency of the proposed adaptation decision-taking engine within PIAF. The experimental results show that the proposed UF has a high correlation with subjective video quality evaluation.Der Begriff "Universal Multimedia Experience" (UME) beschreibt die Vision, dass ein Nutzer nach seinen individuellen Vorlieben zugeschnittene Videoinhalte konsumieren kann. In dieser Dissertation werden im UME nun auch semantische Constraints berücksichtigt, welche direkt mit der Konsumierung der Videoinhalte verbunden sind. Dabei soll die Qualität der Videoerfahrung für den Nutzer maximiert werden. Diese Qualität ist in der Dissertation durch die Benutzerzufriedenheit bei der Wahrnehmung der Veränderung der Videos repräsentiert. Die Veränderung der Videos wird durch eine Videoadaptierung erzeugt, z.B. durch die Löschung oder Veränderung von Szenen, Objekten, welche einem semantischen Constraints nicht entsprechen. Kern der Videoadaptierung ist die "Adaptation Decision Taking Engine" (ADTE). Sie bestimmt die Operatoren, welche die semantischen Constraints auflösen, und berechnet dann mögliche Adaptierungspläne, die auf dem Video angewandt werden sollen. Weiterhin muss die ADTE für jeden Adaptierungsschritt anhand der Operatoren bestimmen, wie die Vorlieben des Nutzers berücksichtigt werden können. Die zweite Herausforderung ist die Beurteilung und Maximierung der Qualität eines adaptierten Videos. Die dritte Herausforderung ist die Berücksichtigung sich widersprechender semantischer Constraints. Dies betrifft insbesondere solche, die mit Urheberrechten in Verbindung stehen. In dieser Dissertation werden die oben genannten Herausforderungen mit Hilfe eines "Personalized video Adaptation Framework" (PIAF) gelöst, welche auf den "Moving Picture Expert Group" (MPEG)-Standard MPEG-7 und MPEG-21 basieren. PIAF ist ein Framework, welches den gesamten Prozess der Videoadaptierung umfasst. Es modelliert den Zusammenhang zwischen den Adaptierungsoperatoren, den Vorlieben der Nutzer und der Qualität der Videos. Weiterhin wird das Problem der optimalen Auswahl eines Adaptierungsplans für die maximale Qualität der Videos untersucht. Dafür wird eine Utility Funktion (UF) definiert und in der ADTE eingesetzt, welche die semantischen Constraints mit den vom Nutzer ausgedrückten Vorlieben vereint. Weiterhin ist das "Semantic Video Content Annotation Tool" (SVCAT) entwickelt worden, um strukturelle und semantische Annotationen durchzuführen. Ebenso sind die Vorlieben der Nutzer mit MPEG-7 und MPEG-21 Deskriptoren berücksichtigt worden. Die Entwicklung dieser Software-Werkzeuge und Algorithmen ist notwendig, um ein vollständiges und modulares Framework zu erhalten. Dadurch deckt PIAF den kompletten Bereich der semantischen Videoadaptierung ab. Das ADTE ist in qualitativen und quantitativen Evaluationen validiert worden. Die Ergebnisse der Evaluation zeigen unter anderem, dass die UF im Bereich Qualität eine hohe Korrelation mit der subjektiven Wahrnehmung von ausgewählten Nutzern aufweist
Adaptive video delivery using semantics
The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. The metadata-based representation of object's shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. At last, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications
Content-prioritised video coding for British Sign Language communication.
Video communication of British Sign Language (BSL) is important for remote interpersonal communication and for the equal provision of services for deaf people. However, the use of video telephony and video conferencing applications for BSL communication is limited by inadequate video quality. BSL is a highly structured, linguistically complete, natural language system that expresses vocabulary and grammar visually and spatially using a complex combination of facial expressions (such as eyebrow movements, eye blinks and mouth/lip shapes), hand gestures, body movements and finger-spelling that change in space and time. Accurate natural BSL communication places specific demands on visual media applications which must compress video image data for efficient transmission. Current video compression schemes apply methods to reduce statistical redundancy and perceptual irrelevance in video image data based on a general model of Human Visual System (HVS) sensitivities. This thesis presents novel video image coding methods developed to achieve the conflicting requirements for high image quality and efficient coding. Novel methods of prioritising visually important video image content for optimised video coding are developed to exploit the HVS spatial and temporal response mechanisms of BSL users (determined by Eye Movement Tracking) and the characteristics of BSL video image content. The methods implement an accurate model of HVS foveation, applied in the spatial and temporal domains, at the pre-processing stage of a current standard-based system (H.264). Comparison of the performance of the developed and standard coding systems, using methods of video quality evaluation developed for this thesis, demonstrates improved perceived quality at low bit rates. BSL users, broadcasters and service providers benefit from the perception of high quality video over a range of available transmission bandwidths. The research community benefits from a new approach to video coding optimisation and better understanding of the communication needs of deaf people
Recommended from our members
Scalable and network aware video coding for advanced communications over heterogeneous networks
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel UniversityThis work addresses the issues concerned with the provision of scalable video services over heterogeneous networks particularly with regards to dynamic adaptation and user’s acceptable quality of service.
In order to provide and sustain an adaptive and network friendly multimedia communication service, a suite of techniques that achieved automatic scalability and adaptation are developed. These techniques are evaluated objectively and subjectively to assess the Quality of Service (QoS) provided to diverse users with variable constraints and dynamic resources. The research ensured the consideration of various levels of user acceptable QoS The techniques are further evaluated with view to establish their performance against state of the art scalable and non-scalable techniques.
To further improve the adaptability of the designed techniques, several experiments and real time simulations are conducted with the aim of determining the optimum performance with various coding parameters and scenarios. The coding parameters and scenarios are evaluated and analyzed to determine their performance using various types of video content and formats. Several algorithms are developed to provide a dynamic adaptation of coding tools and parameters to specific video content type, format and bandwidth of transmission.
Due to the nature of heterogeneous networks where channel conditions, terminals, users capabilities and preferences etc are unpredictably changing, hence limiting the adaptability of a specific technique adopted, a Dynamic Scalability Decision Making Algorithm (SADMA) is developed. The algorithm autonomously selects one of the designed scalability techniques basing its decision on the monitored and reported channel conditions. Experiments were conducted using a purpose-built heterogeneous network simulator and the network-aware selection of the scalability techniques is based on real time simulation results. A technique with a minimum delay, low bit-rate, low frame rate and low quality is adopted as a reactive measure to a predicted bad channel condition. If the use of the techniques is not favoured due to deteriorating channel conditions reported, a reduced layered stream or base layer is used. If the network status does not allow the use of the base layer, then the stream uses parameter identifiers with high efficiency to improve the scalability and adaptation of the video service.
To further improve the flexibility and efficiency of the algorithm, a dynamic de-blocking filter and lambda value selection are analyzed and introduced in the algorithm. Various methods, interfaces and algorithms are defined for transcoding from one technique to another and extracting sub-streams when the network conditions do not allow for the transmission of the entire bit-stream
- …