10 research outputs found

    A selective approach for an energy-aware video content adaptation decision-taking engine in Android-based smartphones

    Rapid advances in technology and its increasing affordability have transformed mobile devices from a means of communication into tools for socialization, entertainment, work and learning. However, battery technology and capacity advance slowly compared to energy needs. Viewing content at a high quality of experience (QoE) consumes considerable power, and under a limited energy budget a conventional content adaptation system lowers the content quality, thereby reducing QoE. There is therefore a need to optimize QoE under limited available energy, and a suitably modified and improved content adaptation approach can address this. The key objective of this research is to propose a framework for an energy-aware video content adaptation system that enables video delivery over the Internet. To optimize QoE while viewing streaming video on a smartphone with limited available energy, an algorithm for an energy-aware video content adaptation decision-taking engine, named EnVADE, is proposed. EnVADE uses a selective mechanism: the video is segmented into scenes, and the adaptation process operates on selected scenes, so QoE can be improved. To evaluate the EnVADE algorithm in terms of energy efficiency, an experimental evaluation was carried out. A subjective evaluation with selected respondents was also conducted, using the Absolute Category Rating method recommended by the ITU, to assess EnVADE in terms of QoE. In both evaluations, comparisons with other methods were made. The results show that the proposed solution increases viewing time by about 14% compared to MPEG-DASH, an official international standard and widely used streaming method. In the subjective QoE test, the EnVADE algorithm's score surpasses those of the other video streaming methods. EnVADE has therefore proven its capability as an alternative technique for streaming video content with higher QoE and lower energy consumption.
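    The selective mechanism described above can be illustrated with a small sketch: scenes are ranked by importance, and the least important ones are downgraded first until the energy budget is met. The scene structure, importance scores and energy costs below are invented for illustration; they are not values or code from the thesis.

```python
# Hedged sketch of a selective, scene-based adaptation decision in the
# spirit of the EnVADE engine. All numbers and field names are assumptions.

def select_scene_qualities(scenes, energy_budget):
    """Assign each scene 'high' or 'low' quality so the total energy cost
    fits the budget, downgrading the least important scenes first.

    scenes: list of dicts with 'importance', 'high_cost', 'low_cost'.
    Returns one quality label per scene.
    """
    choice = ["high"] * len(scenes)
    total = sum(s["high_cost"] for s in scenes)
    # Downgrade scenes in ascending order of importance until we fit.
    for idx in sorted(range(len(scenes)), key=lambda i: scenes[i]["importance"]):
        if total <= energy_budget:
            break
        total -= scenes[idx]["high_cost"] - scenes[idx]["low_cost"]
        choice[idx] = "low"
    return choice

scenes = [
    {"importance": 0.9, "high_cost": 10, "low_cost": 4},
    {"importance": 0.2, "high_cost": 10, "low_cost": 4},
    {"importance": 0.6, "high_cost": 10, "low_cost": 4},
]
print(select_scene_qualities(scenes, energy_budget=24))  # → ['high', 'low', 'high']
```

    Because only the less important scenes lose quality, the perceived QoE of the whole video degrades less than under uniform downscaling.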

    MPEG-SCORM: an ontology of interoperable metadata for the integration of multimedia and e-learning standards

    Advisor: Yuzo Iano. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. The convergence of digital media calls for an integration of the ICT, focused on the telecommunications and multimedia domain (under the responsibility of the Moving Picture Experts Group, constituting subcommittee ISO/IEC JTC1 SC29), with the ICTE (ICT for Education, managed by ISO/IEC JTC1 SC36), highlighting the MPEG standards, used as content and as descriptive metadata for multimedia and Digital TV, and the ICTE technologies applied to Distance Education, or e-Learning. This raises the problem of developing an interoperable matching between these normative bases, leading to an innovative proposal for convergence between digital media and e-Learning applications, both essentially multimedia. To achieve this purpose, the thesis proposes creating and applying an ontology of interoperable metadata for the Web, Digital TV and mobile-device extensions, enabling interoperability between educational applications and Digital TV environments and vice versa: it simultaneously facilitates the creation of learning-metadata-based objects for Digital TV programs and the provision of multimedia video content as learning objects for Distance Education. The ontology is built on the integration of the MPEG-21 and SCORM metadata standards, employing the XPath language.
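    The kind of XPath-based mapping between metadata standards described above can be sketched as follows. The XML fragments and tag names here are invented stand-ins, not the actual SCORM or MPEG-21 schemas, and the example uses Python's limited built-in XPath support rather than a full XPath engine.

```python
# Minimal sketch: read fields from a SCORM-style manifest via XPath-style
# queries and re-expose them inside a (hypothetical) MPEG-21-style wrapper.
import xml.etree.ElementTree as ET

SCORM_XML = """
<manifest>
  <metadata>
    <general>
      <title>Intro to Digital TV</title>
      <language>en</language>
    </general>
  </metadata>
</manifest>
"""

def map_scorm_to_mpeg21(scorm_xml):
    """Extract fields with XPath-style paths and rebuild them in a
    simplified Digital-Item-like structure (illustrative only)."""
    root = ET.fromstring(scorm_xml)
    title = root.findtext("./metadata/general/title")
    lang = root.findtext("./metadata/general/language")
    item = ET.Element("DIDL")
    desc = ET.SubElement(item, "Descriptor")
    ET.SubElement(desc, "Title").text = title
    ET.SubElement(desc, "Language").text = lang
    return ET.tostring(item, encoding="unicode")

print(map_scorm_to_mpeg21(SCORM_XML))
```

    A real implementation would work against the normative SCORM and MPEG-21 DIDL schemas and namespaces; the point here is only the path-based field mapping.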

    User-centric power-friendly quality-based network selection strategy for heterogeneous wireless environments

    The ‘Always Best Connected’ vision is built around the scenario of a mobile user seamlessly roaming within a multi-operator multi-technology multi-terminal multi-application multi-user environment supported by the next generation of wireless networks. In this heterogeneous environment, users equipped with multi-mode wireless mobile devices will access rich media services via one or more access networks. All these access networks may differ in terms of technology, coverage range, available bandwidth, operator, monetary cost, energy usage etc. In this context, there is a need for a smart network selection decision to be made, to choose the best available network option to cater for the user’s current application and requirements. The decision is a difficult one, especially given the number and dynamics of the possible input parameters. What parameters are used and how those parameters model the application requirements and user needs is important. Also, game theory approaches can be used to model and analyze the cooperative or competitive interaction between the rational decision makers involved, which are users, seeking to get good service quality at good value prices, and/or the network operators, trying to increase their revenue. This thesis presents the roadmap towards an ‘Always Best Connected’ environment. The proposed solution includes an Adapt-or-Handover solution which makes use of a Signal Strength-based Adaptive Multimedia Delivery mechanism (SAMMy) and a Power-Friendly Access Network Selection Strategy (PoFANS) in order to help the user in taking decisions, and to improve the energy efficiency at the end-user mobile device. A Reputation-based System is proposed, which models the user-network interaction as a repeated cooperative game following the repeated Prisoner’s Dilemma game from Game Theory. It combines reputation-based systems, game theory and a network selection mechanism in order to create a reputation-based heterogeneous environment. 
    In this environment, the users keep track of their individual history with the visited networks. Every time a user connects to a network, the user-network interaction game is played. The outcome of the game is a network reputation factor which reflects the network's previous behavior in assuring service guarantees to the user. The network reputation factor will influence the user's decision the next time he/she has to decide whether or not to connect to that specific network. The performance of the proposed solutions was evaluated through in-depth analysis and both simulation-based and experimental testing. The results clearly show improved performance of the proposed solutions in comparison with other similar state-of-the-art solutions. An energy consumption study of a Google Nexus One streaming adaptive multimedia was performed, and a comprehensive survey of related Game Theory research is provided as part of the work.
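    The reputation mechanism sketched above can be illustrated with a toy model: each played game updates a per-network reputation factor, and the next selection prefers the network with the best factor. The exponential-moving-average update rule and its weight are illustrative assumptions, not the thesis's exact model.

```python
# Toy sketch of reputation-based network selection: networks that keep
# their service guarantees (cooperate) gain reputation; those that do not
# (defect) lose it. Update rule and weights are assumptions.

def update_reputation(reputation, kept_guarantee, weight=0.2):
    """Exponential moving average over past game outcomes
    (1 = network cooperated, 0 = network defected)."""
    outcome = 1.0 if kept_guarantee else 0.0
    return (1 - weight) * reputation + weight * outcome

def select_network(reputations):
    """Pick the network with the highest reputation factor."""
    return max(reputations, key=reputations.get)

reputations = {"WiFi-A": 0.5, "LTE-B": 0.5}
# WiFi-A keeps its guarantees twice; LTE-B breaks them twice.
for _ in range(2):
    reputations["WiFi-A"] = update_reputation(reputations["WiFi-A"], True)
    reputations["LTE-B"] = update_reputation(reputations["LTE-B"], False)
print(select_network(reputations))  # → WiFi-A
```

    In a fuller model the selection would also weigh bandwidth, monetary cost and energy usage; the reputation factor is just one input to that decision.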

    Architectural support for ubiquitous access to multimedia content

    Doctoral thesis. Electrical and Computer Engineering (Telecommunications). Faculdade de Engenharia, Universidade do Porto. 200

    Personalizing quality aspects for video communication in constrained heterogeneous environments

    The world of multimedia communication has been evolving drastically for several years. Advanced compression formats for audiovisual information are arising, new types of wired and wireless networks are being developed, and a broad range of devices capable of multimedia communication is appearing on the market. The era in which multimedia applications available on the Internet were the exclusive domain of PC users has passed. The next generation of multimedia applications will be characterized by heterogeneity: differences in networks, devices and user expectations. This heterogeneity poses new challenges: transparent consumption of multimedia content is needed in order to reach a broad audience. Recently, two important technologies have appeared that can assist in realizing such transparent Universal Multimedia Access. The first consists of new scalable or layered content representation schemes. Such schemes are needed to make it possible for a multimedia stream to be consumed by devices with different capabilities and transmitted over network connections with different characteristics. The second technology does not focus on the content representation itself, but rather on linking information about the content, so-called metadata, to the content itself. One possible use of metadata is the automatic selection and adaptation of multimedia presentations; this is one of the main goals of the MPEG-21 Multimedia Framework. Within the MPEG-21 standard, two formats were developed that can be used for bitstream descriptions. Such descriptions can act as an intermediate layer between a scalable bitstream and the adaptation process, so that format-independent bitstream adaptation engines can be built. Furthermore, it is straightforward to add metadata information to the bitstream description and use this information later on during the adaptation process.
    Because of the effort spent on bitstream descriptions during our research, a lot of attention is devoted to this topic in this thesis. We describe both frameworks for bitstream descriptions that were standardized by MPEG. Furthermore, we focus on our own contributions in this domain: we developed a number of bitstream schemas and transformation examples for different types of multimedia content. The most important objective of this thesis is to describe a content negotiation process that uses scalable bitstreams in a generic way. In order to express such an application, we felt the need for a better understanding of the data structures, in particular scalable bitstreams, on which this content negotiation process operates. Therefore, this thesis introduces a formal model we developed that is capable of describing the fundamental concepts of scalable bitstreams and their relations. Apart from defining the theoretical model itself, we demonstrate its correctness by applying it to a number of existing formats for scalable bitstreams. Furthermore, we formulate a content negotiation process as a constrained optimization problem, by means of the notations defined in the abstract model. In certain scenarios, the representation of a content negotiation process as a constrained optimization problem does not sufficiently reflect reality, especially when scalable bitstreams with multiple quality dimensions are involved. In such cases, several versions of the same original bitstream can meet all the constraints imposed by the system. Sometimes one version clearly offers better quality to the end user than another, but in some cases it is not possible to objectively compare two versions without additional information. In such a situation, a trade-off has to be made between the different quality aspects.
We use Pareto's theory of multi-criteria optimization for formally describing the characteristics of a content negotiation process for scalable bitstreams with multiple quality dimensions. This way, we can modify our definition of a content negotiation process into a multi-criteria optimization problem. One of the most important problems with multi-criteria optimization problems is that multiple candidate optimal solutions may exist. Additional information, e.g. user preferences, is needed if a single optimal solution has to be selected. Such multi-criteria optimization problems are not new. Unfortunately, existing solutions for selecting one optimal version are not suitable in a content negotiation scenario, because they expect detailed understanding of the problem from the decision maker, in our case the end user. In this thesis, we propose a scenario in which a so-called content negotiation agent would give some sample video sequences to the end user, asking him to select which sequence he liked the most. This information would be used for training the agent: a model would be built representing the preferences of the end user, and this model can be used later on for selecting one solution from a set of candidate optimal solutions. Based on a literature study, we propose two candidate algorithms in this thesis that can be used in such a content negotiation agent. It is possible to use these algorithms for constructing a model of the user's preferences by means of a number of examples, and to use this model when selecting an optimal version. The first algorithm considers the quality of a video sequence as a weighted sum of a number of independent quality aspects, and derives a system of linear inequalities from the example decisions. The second algorithm, called 1ARC, is actually a nearest-neighbor approach, where predictions are made based on the similarity with the example decisions entered by the user. 
    This thesis analyzes the strengths and weaknesses of both algorithms from multiple points of view: their computational complexity, the parameters that can influence their reliability, and the reliability itself. For measuring this kind of performance, we set up a test in which human subjects are asked to make a number of pairwise decisions between two versions of the same original video sequence. The reliability of the two proposed algorithms is tested by selecting a part of these decisions for training a model, and by observing whether this model is able to predict other decisions entered by the same user. We not only compare both algorithms, but also observe the effect of modifying several parameters on each of them. Ultimately, we conclude that the 1ARC algorithm has an acceptable performance, certainly if the training set is sufficiently large. Its reliability is better than what would be theoretically achievable by any algorithm that selects one optimal version from a set of candidate versions but does not try to capture the user's preferences. Still, the results we achieve are not as good as we initially hoped. One possible cause may be that the proposed algorithms currently do not take sequence characteristics (e.g. the amount of motion) into account. Other improvements may be possible through a more accurate description of the quality aspects that we take into account, in particular the spatial resolution, the amount of distortion and the smoothness of a video sequence. Despite the limitations of the proposed algorithms, in their performance as well as in their application area, we think that this thesis contains an initial and original contribution to the emerging objective of realizing Quality of Experience in multimedia applications.
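    The two-stage selection described in this abstract can be sketched in a few lines: first keep only the Pareto-optimal versions (no other version is at least as good on every quality axis and strictly better on one), then pick one of them with a weighted-sum preference model of the kind the first algorithm learns. The quality axes, scores and weights below are illustrative assumptions, not data from the thesis.

```python
# Sketch of Pareto filtering plus weighted-sum selection over candidate
# versions of a scalable bitstream. Higher scores are better on each axis.

def dominates(a, b):
    """True if version a is at least as good as b on every quality axis
    and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(versions):
    """Keep only versions that no other version dominates."""
    return [v for v in versions
            if not any(dominates(other, v) for other in versions if other != v)]

def pick_with_weights(front, weights):
    """Weighted-sum preference model over the quality aspects."""
    return max(front, key=lambda v: sum(w * x for w, x in zip(weights, v)))

# (spatial resolution score, smoothness score, fidelity score)
versions = [(0.9, 0.3, 0.5), (0.4, 0.9, 0.6), (0.3, 0.2, 0.4)]
front = pareto_front(versions)            # the third version is dominated
best = pick_with_weights(front, weights=(0.2, 0.5, 0.3))
print(front, best)
```

    The 1ARC alternative mentioned above would replace the weighted sum with a nearest-neighbor comparison against the user's earlier example decisions.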

    Investigation Report on Universal Multimedia Access

    Universal Multimedia Access (UMA) refers to the ability of any user to access the desired multimedia content over any type of network, with any device, from anywhere and at any time. UMA is a key framework for metadata-driven multimedia content delivery services. This investigation report analyzes the state-of-the-art technologies in UMA and tries to identify its key issues. The state of the art in multimedia content adaptation, an overview of the standards that support the UMA framework, potential privacy problems in UMA systems and some new UMA applications are presented in this report. The report also discusses the challenges that remain to be resolved in UMA, in order to clarify the potential key problems and determine which ones to solve.

    New Frontiers in Universal Multimedia Access

    Universal Multimedia Access (UMA) refers to the ability of any user to access the desired multimedia content over any type of network, with any device, from anywhere and at any time. UMA is a key framework for metadata-driven multimedia content delivery services. This report consists of three parts. The first part analyzes the state-of-the-art technologies in UMA, identifies the key issues and outlines the new challenges that remain to be resolved. The key issues in UMA include the adaptation of multimedia content to bridge the gap between content creation and consumption, standardized metadata descriptions that facilitate the adaptation (e.g. MPEG-7, MPEG-21 DIA, CC/PP), and the design of UMA systems with their target applications in mind. The second part introduces our approach to these challenges: how to jointly adapt multimedia content comprising different modalities and balance their presentation in an optimal way. A scheme for adapting audiovisual content and its metadata (text) to any screen is proposed to provide the best experience in browsing the desired content. The adaptation process is modeled as an optimization problem over the total value of the content provided to the user. The total content value is optimized by jointly controlling the balance between video and metadata presentation, the transformation of the video content, and the amount of metadata to be presented. Experimental results show that the proposed adaptation scheme enables users to browse audiovisual content with its metadata optimized to the screen size of their devices. The last part reports some potential UMA applications, focusing in particular on a universal access application for TV news archives as an example.
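    The joint optimization of total content value described above can be illustrated with a toy model: a fixed screen area is split between the video and its text metadata, and the split maximizing the combined value is searched. The concave value functions and their coefficients are invented stand-ins, not the report's actual model.

```python
# Toy sketch of balancing video vs. metadata presentation: square-root
# value functions give diminishing returns as either part grows, so an
# interior optimum exists. All value functions here are assumptions.

def total_value(video_share, video_value, text_value):
    """Value of a presentation where video gets `video_share` of the
    screen and the metadata text gets the rest."""
    text_share = 1.0 - video_share
    return video_value * video_share ** 0.5 + text_value * text_share ** 0.5

def best_split(video_value, text_value, steps=100):
    """Grid-search the video/text screen split with the highest value."""
    shares = [i / steps for i in range(steps + 1)]
    return max(shares, key=lambda s: total_value(s, video_value, text_value))

# When video carries most of the value, it should receive most of the screen.
split = best_split(video_value=3.0, text_value=1.0)
print(split)  # → 0.9
```

    With equal value coefficients the same search returns a 50/50 split, matching the intuition that the balance should track the relative importance of the modalities.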

    Scalable video compression with optimized visual performance and random accessibility

    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. 
This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video
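    The distortion-scaling idea above can be sketched with a toy embedding-order computation: each coding pass has a distortion reduction and a byte cost, and scaling the distortion of perceptually significant passes raises their distortion-length slope, moving them earlier in the codestream. The pass names, numbers and weights are invented for illustration.

```python
# Sketch of perceptually weighted distortion-length-slope ordering, in the
# spirit of post-compression rate-distortion optimization. Values assumed.

def embedding_order(passes):
    """Sort coding passes by distortion-length slope (steepest first),
    after applying a perceptual weight to the distortion term."""
    def slope(p):
        return (p["weight"] * p["distortion_drop"]) / p["bytes"]
    return sorted(passes, key=slope, reverse=True)

passes = [
    {"id": "background", "distortion_drop": 8.0, "bytes": 4, "weight": 1.0},
    {"id": "face",       "distortion_drop": 6.0, "bytes": 4, "weight": 2.0},
    {"id": "texture",    "distortion_drop": 2.0, "bytes": 4, "weight": 1.0},
]
order = [p["id"] for p in embedding_order(passes)]
print(order)  # → ['face', 'background', 'texture']
```

    Without the perceptual weight, the "background" pass would come first; the weighting is what lets visually sensitive sites reach higher fidelity at a given bit-rate.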

    Adaptive video delivery using semantics

    The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector produces the segmentation of moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedbacks between an object partition and a region partition are used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message. 
    The metadata-based representation of objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and their relevance to the observer, in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for the given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
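    A semantically weighted PSNR in the spirit of the SPSNR described above can be sketched as follows: pixel errors in semantically relevant regions (e.g. moving objects) weigh more than the same errors in the background. The weighting scheme and pixel data are illustrative assumptions, not the dissertation's exact definition.

```python
# Sketch of a semantic, weighted PSNR over flat pixel lists. `weights`
# encodes each pixel's semantic relevance (values assumed for illustration).
import math

def spsnr(original, adapted, weights, peak=255.0):
    """Weighted PSNR: the mean squared error is averaged with per-pixel
    semantic weights before the usual log transform."""
    num = sum(w * (o - a) ** 2 for o, a, w in zip(original, adapted, weights))
    wmse = num / sum(weights)
    return float("inf") if wmse == 0 else 10 * math.log10(peak ** 2 / wmse)

orig = [100, 120, 130, 140]
# The same absolute error, placed on a background pixel vs. an object pixel.
err_on_background = [102, 120, 130, 140]
err_on_object     = [100, 120, 130, 142]
w = [0.5, 0.5, 0.5, 2.0]  # the last pixel belongs to a semantic object
print(spsnr(orig, err_on_background, w) > spsnr(orig, err_on_object, w))  # → True
```

    The comparison shows the intended behavior: identical pixel-level damage scores worse when it falls on a region the observer attends to, which is exactly what lets the metric guide semantic adaptation choices.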