389 research outputs found
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in
multimedia search engines, we identified and analyzed gaps within the European research effort during our second year.
In this period we focused on three directions: technological issues, user-centred issues and use cases, and
socio-economic and legal aspects. These were assessed by two central studies: first, a concerted vision of the functional
breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion
of requirements for technological challenges. Both studies have been carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as
national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal
challenges.
QoE-Aware Resource Allocation For Crowdsourced Live Streaming: A Machine Learning Approach
In the last decade, empowered by the technological advancements of mobile devices
and the revolution of wireless mobile network access, the world has witnessed an
explosion in crowdsourced live streaming. Ensuring a stable, high-quality playback
experience is essential to maximize the viewers’ Quality of Experience (QoE) and the
content providers’ profits. This can be achieved by adopting a geo-distributed cloud
infrastructure to allocate the multimedia resources as close as possible to viewers, in
order to minimize the access delay and video stalls.
Additionally, because of unstable network conditions and the heterogeneity of
end-users’ capabilities, the original video must be transcoded into multiple bitrates.
Video transcoding is a computationally expensive process, where generally a
single cloud instance needs to be reserved to produce one single video bitrate
representation. Renting resources on demand or reserving too few of them
may delay video playback or force viewers to be served at a lower quality. On
the other hand, if far more resources are provisioned than required, the
extra resources will be wasted.
In this thesis, we introduce a prediction-driven resource allocation framework to
maximize the QoE of viewers and minimize the resource allocation cost. First, by
exploiting the viewers’ locations available in our unique dataset, we implement a
machine learning model to predict the number of viewers near each geo-distributed cloud
site. Second, based on the predicted results, which proved to be close to the actual values,
we formulate an optimization problem to proactively allocate resources in the viewers’
proximity. Additionally, we present a trade-off between the video access delay and
the cost of resource allocation.
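The trade-off between access delay and allocation cost can be illustrated with a minimal sketch. It is not the thesis's optimization model: the `Site` fields, the weight `alpha`, and the per-site decision rule are all hypothetical, chosen only to show how predicted viewer counts could drive a proactive allocation.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of one geo-distributed cloud site."""
    name: str
    predicted_viewers: int  # output of an assumed viewer-count predictor
    instance_cost: float    # cost of reserving one transcoding instance here
    local_delay: float      # per-viewer access delay when served locally
    remote_delay: float     # per-viewer access delay when served from elsewhere


def allocate(sites, alpha):
    """Return the names of the sites that receive a local instance.

    alpha in [0, 1] weights cost against delay: alpha = 1 considers only
    cost, alpha = 0 considers only delay. Each site is decided on its own,
    which is a simplifying assumption of this sketch.
    """
    chosen = set()
    for site in sites:
        # Total delay avoided by serving the predicted viewers locally.
        delay_saving = site.predicted_viewers * (site.remote_delay - site.local_delay)
        if (1 - alpha) * delay_saving > alpha * site.instance_cost:
            chosen.add(site.name)
    return chosen


# Illustrative values only; they do not come from the thesis dataset.
sites = [
    Site("site_a", predicted_viewers=1000, instance_cost=5.0,
         local_delay=0.02, remote_delay=0.10),
    Site("site_b", predicted_viewers=10, instance_cost=5.0,
         local_delay=0.02, remote_delay=0.10),
]
chosen = allocate(sites, alpha=0.5)
```

With these example values, only the site with many predicted viewers justifies the cost of a local instance. Moving `alpha` towards 0 allocates more sites, and moving it towards 1 allocates fewer.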
Because our offline optimization is too complex to respond to
the volume of viewing requests in real time, we further extend our work by introducing
a resource forecasting and reservation framework for geo-distributed cloud sites. First,
we formulate an offline optimization problem to allocate transcoding resources in the
viewers’ proximity, while creating a trade-off between the network cost and viewers’
QoE. Second, based on the optimizer’s resource allocation decisions on historical live
videos, we create our time series datasets containing historical records of the optimal
resources needed at each geo-distributed cloud site. Finally, we adopt machine learning
to build our distributed time series forecasting models to proactively forecast the exact
transcoding resources needed ahead of time at each geo-distributed cloud site.
The results showed that the predicted number of transcoding resources needed in each
cloud site is close to the optimal number of transcoding resources.
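As a rough illustration of per-site forecasting, the sketch below replaces the thesis's machine learning models with a plain moving average, which is an assumption made for brevity. The site names, the history values, and the function `forecast_next` are hypothetical.

```python
import math


def forecast_next(history, window=3):
    """Forecast the next interval's instance count for one site.

    Uses the mean of the last `window` observations, rounded up so that
    the reservation does not fall below the recent average demand.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    recent = history[-window:]
    return math.ceil(sum(recent) / len(recent))


# Hypothetical records of the optimal number of transcoding instances
# per time interval, one series per geo-distributed cloud site.
site_history = {
    "site_a": [4, 5, 6, 8],
    "site_b": [1, 1, 2],
}

# One independent forecast per site, mirroring the distributed setup.
forecasts = {site: forecast_next(series) for site, series in site_history.items()}
```

Each site is forecast independently from its own series, so a model of this shape could run at the site itself. A learned model would take the place of `forecast_next` without changing the surrounding structure.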
A Novel Production Workflow and Toolset for Opera Co-creation Towards Enhanced Societal Inclusion of People
Opera uses all the visual and performing arts to create extraordinary worlds of passion and sensibility. It is rightly recognised as a great achievement of European culture. And yet a form that once inspired social and artistic revolutions is often seen as the staid preserve of the elite. With rising inequality and social exclusion, many see opera, if they think of it at all, as symbolic of what is wrong in Europe today. This paper presents the technological and scientific approach of the European H2020 TRACTION project, which aims to use opera as a path for social and cultural inclusion, making it once again a force for radical transformation. TRACTION wants to define new forms of artistic creation through which the most marginalised groups (e.g. migrants, the rural poor, young offenders and others) can work with artists to tell the stories that matter now. By combining best practices in participatory art with media technology's innovations of language, form and process, the project is defining new approaches to co-creation and innovation, exploring audiovisual formats rooted in European cultural heritage, such as opera.
Energy-Aware Streaming Multimedia Adaptation: An Educational Perspective
As mobile devices become more powerful and more affordable, the use of online educational multimedia is becoming increasingly prevalent. Limited battery power is nevertheless a major restricting factor, as streaming multimedia drains the battery quickly. Many battery-efficient multimedia adaptation techniques have been proposed that save power by lowering the presentation quality of the entire multimedia content. Adaptation is usually done without considering any impact on the information content of the multimedia. In this paper, based on the results of an experimental study, we argue that adaptation which ignores this impact may harm the learning process. Some portions of the multimedia that require a higher visual quality to convey learning information may lose their learning effectiveness at the lower, adapted quality. We report results of our experimental study indicating that different parts of the same learning multimedia do not have the same minimum acceptable quality. This strengthens the position that power-saving adaptation techniques for educational multimedia must lower the quality of multimedia according to the needs of its individual fragments for successfully conveying learning information.
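The difference between uniform adaptation and fragment-aware adaptation can be sketched as follows. This is only an illustration of the position argued above, not a technique from the paper: the integer quality levels, the per-fragment minimums, and the function `adapt_fragments` are hypothetical.

```python
def adapt_fragments(min_quality, power_saving_level):
    """Pick a quality level per fragment (higher number = better quality).

    Each fragment is lowered to the power-saving level unless its own
    minimum acceptable quality is higher, in which case that minimum wins.
    """
    return [max(power_saving_level, minimum) for minimum in min_quality]


# Hypothetical minimum acceptable quality for four fragments of one lecture.
fragment_minimums = [1, 3, 2, 1]

# Uniform adaptation: every fragment is dropped to the same level.
uniform = [1] * len(fragment_minimums)

# Fragment-aware adaptation: respects each fragment's minimum.
aware = adapt_fragments(fragment_minimums, power_saving_level=1)

# Indices of fragments that uniform adaptation pushes below their minimum.
below_minimum = [
    index
    for index, (quality, minimum) in enumerate(zip(uniform, fragment_minimums))
    if quality < minimum
]
```

In this example, uniform adaptation degrades two fragments below their minimum, while the fragment-aware variant keeps them at the lowest level that still conveys their content.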
Investigation Report on Universal Multimedia Access
Universal Multimedia Access (UMA) refers to the ability of any user to access the desired multimedia content over any type of network, with any device, from anywhere and at any time. UMA is a key framework for multimedia content delivery services using metadata. This investigation report analyzes the state-of-the-art technologies in UMA and tries to identify its key issues. It presents the state of the art in multimedia content adaptation, an overview of the standards that support the UMA framework, potential privacy problems in UMA systems, and some new UMA applications. The report also describes the challenges that remain to be resolved in UMA, in order to clarify the potential key problems and determine which ones to solve.
Rich media content adaptation in e-learning systems
The wide use of e-technologies represents a great opportunity for underserved segments of the population, especially with the aim of reintegrating excluded individuals back into society through education. This is particularly true for people with different types of disabilities who may have difficulties while attending traditional on-site learning programs that are typically based on printed learning resources. The creation and provision of accessible e-learning contents may therefore become a key factor in enabling people with different access needs to enjoy quality learning experiences and services.
Another e-learning challenge is represented by m-learning (which stands for mobile learning), which is emerging as a consequence of mobile terminals diffusion and provides the opportunity to browse didactical materials everywhere, outside places that are traditionally devoted to education.
Both situations share the need to access materials under limited conditions, and both collide with the growing use of rich media in didactical content, which is designed to be enjoyed without any restriction. Nowadays, Web-based teaching makes great use of multimedia technologies, ranging from Flash animations to prerecorded video-lectures. Rich media in e-learning can offer significant potential in enhancing the learning environment, by helping to increase access to education, enhance the learning experience and support multiple learning styles. Moreover, they can often be used to improve the structure of Web-based courses. These highly variegated and structured contents may significantly improve the quality and the effectiveness of educational activities for learners. For example, rich media contents allow us to describe complex concepts and process flows. Audio and video elements may be utilized to add a “human touch” to distance-learning courses. Finally, real lectures may be recorded and distributed to supplement or enrich online materials. A confirmation of the advantages of these approaches can be seen in the exponential growth of video-lecture availability on the net, due to the ease of recording and delivering activities which take place in a traditional classroom.
Furthermore, the wide use of assistive technologies for learners with disabilities injects new life into e-learning systems. E-learning allows distance and flexible educational activities, thus helping disabled learners to access resources which would otherwise present significant barriers for them. For instance, students with visual impairments have difficulties in reading traditional visual materials, deaf learners have trouble in following traditional (spoken) lectures, and people with motion disabilities have problems in attending on-site programs. As already mentioned, the use of wireless technologies and pervasive computing may really enhance the educational learner experience by offering mobile e-learning services that can be accessed by handheld devices. This new paradigm of educational content distribution maximizes the benefits for learners since it enables users to overcome constraints imposed by the surrounding environment. While certainly helpful for users without disabilities, we believe that the use of new mobile technologies may also become a fundamental tool for impaired learners, since it frees them from sitting in front of a PC. In this way, educational activities can be enjoyed by all the users, without hindrance, thus increasing the social inclusion of non-typical learners.
While the provision of fully accessible and portable video-lectures may be extremely useful for students, it is widely recognized that structuring and managing rich media contents for mobile learning services are complex and expensive tasks. Indeed, major difficulties originate from the basic need to provide a textual equivalent for each media resource composing a rich media Learning Object (LO). Moreover, tests need to be carried out to establish whether a given LO is fully accessible to all kinds of learners. Unfortunately, both these tasks are truly time-consuming processes, depending on the type of contents the teacher is writing and on the authoring tool he/she is using. Due to these difficulties, online LOs are often distributed as partially accessible or totally inaccessible content.
Bearing this in mind, this thesis aims to discuss the key issues of a system we have developed to deliver accessible, customized or nomadic learning experiences to learners with different access needs and skills. To reduce the risk of excluding users with particular access capabilities, our system exploits Learning Objects (LOs) which are dynamically adapted and transcoded based on the specific needs of non-typical users and on the barriers that they can encounter in the environment. The basic idea is to dynamically adapt contents, by selecting them from a set of media resources packaged in SCORM-compliant LOs and stored in a self-adapting format. The system schedules and orchestrates a set of transcoding processes based on specific learner needs, so as to produce a customized LO that can be fully enjoyed by any (impaired or mobile) student.
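The idea of selecting media resources according to learner needs can be sketched in a few lines. This is not the system described in the thesis and does not use SCORM packaging: the dictionary layout, the modality names, and the function `adapt_learning_object` are hypothetical, introduced only to make the selection step concrete.

```python
def adapt_learning_object(resources, blocked_modalities):
    """Select, for each resource, a version the learner can use.

    resources: list of dicts with an "id", a "modality", and an optional
        "alternatives" mapping from modality to an equivalent resource id.
    blocked_modalities: modalities the learner or device cannot use.

    Returns (adapted, unresolved): the ids to deliver, and the ids of
    resources that have no usable equivalent.
    """
    adapted, unresolved = [], []
    for resource in resources:
        if resource["modality"] not in blocked_modalities:
            adapted.append(resource["id"])
            continue
        # Take the first listed alternative whose modality is usable.
        alternative = next(
            (alt_id
             for modality, alt_id in resource.get("alternatives", {}).items()
             if modality not in blocked_modalities),
            None,
        )
        if alternative is None:
            unresolved.append(resource["id"])
        else:
            adapted.append(alternative)
    return adapted, unresolved


# Hypothetical learning object with three media resources.
learning_object = [
    {"id": "lecture.mp4", "modality": "video",
     "alternatives": {"audio": "lecture.mp3", "text": "lecture.txt"}},
    {"id": "notes.html", "modality": "text"},
    {"id": "diagram.png", "modality": "image"},
]

# Example profile: a learner who cannot use visual media.
adapted, unresolved = adapt_learning_object(learning_object, {"video", "image"})
```

The `unresolved` list reflects the difficulty noted above: a resource without a textual or other equivalent remains inaccessible, however the selection is performed.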
Adaptive video delivery using semantics
The diffusion of network appliances such as cellular phones, personal digital assistants and hand-held computers has created the need to personalize the way media content is delivered to the end user. Moreover, recent devices, such as digital radio receivers with graphics displays, and new applications, such as intelligent visual surveillance, require novel forms of video analysis for content adaptation and summarization. To cope with these challenges, we propose an automatic method for the extraction of semantics from video, and we present a framework that exploits these semantics in order to provide adaptive video delivery. First, an algorithm that relies on motion information to extract multiple semantic video objects is proposed. The algorithm operates in two stages. In the first stage, a statistical change detector segments moving objects from the background. This process is robust with regard to camera noise and does not need manual tuning along a sequence or for different sequences. In the second stage, feedback between an object partition and a region partition is used to track individual objects along the frames. These interactions allow us to cope with multiple, deformable objects, occlusions, splitting, appearance and disappearance of objects, and complex motion. Subsequently, semantics are used to prioritize visual data in order to improve the performance of adaptive video delivery. The idea behind this approach is to organize the content so that a particular network or device does not inhibit the main content message. Specifically, we propose two new video adaptation strategies. The first strategy combines semantic analysis with a traditional frame-based video encoder. Background simplifications resulting from this approach do not penalize overall quality at low bitrates. The second strategy uses metadata to efficiently encode the main content message.
The metadata-based representation of the objects' shape and motion suffices to convey the meaning and action of a scene when the objects are familiar. The impact of different video adaptation strategies is then quantified with subjective experiments. We ask a panel of human observers to rate the quality of adapted video sequences on a normalized scale. From these results, we further derive an objective quality metric, the semantic peak signal-to-noise ratio (SPSNR), that accounts for different image areas and for their relevance to the observer in order to reflect the focus of attention of the human visual system. Finally, we determine the adaptation strategy that provides maximum value for the end user by maximizing the SPSNR for given client resources at the time of delivery. By combining semantic video analysis and adaptive delivery, the solution presented in this dissertation permits the distribution of video in complex media environments and supports a large variety of content-based applications.
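The abstract does not give the formula for the SPSNR, so the following sketch only illustrates the general idea of a relevance-weighted PSNR. The weighting scheme, the function name `weighted_psnr`, and the example pixel values are assumptions, not the dissertation's definition.

```python
import math


def weighted_psnr(reference, adapted, weights, peak=255.0):
    """PSNR computed from a relevance-weighted mean squared error.

    reference, adapted: flat sequences of pixel intensities, same length.
    weights: per-pixel relevance, e.g. high on semantic objects and low
        on the background (an assumption of this sketch).
    """
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_error = sum(
        w * (r - a) ** 2 for r, a, w in zip(reference, adapted, weights)
    )
    weighted_mse = weighted_error / total_weight
    if weighted_mse == 0:
        return math.inf  # identical images under this weighting
    return 10.0 * math.log10(peak ** 2 / weighted_mse)


# Four pixels: the first two belong to an object, the last two to the
# background. Only the background is distorted in the adapted version.
reference = [100, 100, 100, 100]
adapted = [100, 100, 110, 110]

uniform_score = weighted_psnr(reference, adapted, [1.0, 1.0, 1.0, 1.0])
semantic_score = weighted_psnr(reference, adapted, [1.0, 1.0, 0.1, 0.1])
```

Because the distortion falls on low-relevance pixels, the weighted score comes out higher than the uniform one, which matches the intuition that background simplification should be penalized less than damage to the main objects.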