41,070 research outputs found
Quality of experience driven control of interactive media stream parameters
In recent years, cloud computing has led to many new kinds of services. One of these popular services is cloud gaming, which provides the entire game experience to the users remotely from a server, but also other applications are provided in a similar manner. In this paper we focus on the option to render the application in the cloud, thereby delivering the graphical output of the application to the user as a video stream. In more general terms, an interactive media stream is set up over the network between the user's device and the cloud server. The main issue with this approach is situated at the network, that currently gives little guarantees on the quality of service in terms of parameters such as available bandwidth, latency or packet loss. However, for interactive media stream cases, the user is merely interested in the perceived quality, regardless of the underlaying network situation. In this paper, we present an adaptive control mechanism that optimizes the quality of experience for the use case of a race game, by trading off visual quality against frame rate in function of the available bandwidth. Practical experiments verify that QoE driven adaptation leads to improved user experience compared to systems solely taking network characteristics into account
Hardware acceleration architectures for MPEG-Based mobile video platforms: a brief overview
This paper presents a brief overview of past and current hardware acceleration (HwA) approaches that have been proposed for the most computationally intensive compression tools of the MPEG-4 standard. These approaches are classified based on their historical evolution and architectural approach. An analysis of both evolutionary and functional classifications is carried out in order to speculate on the possible trends of the HwA architectures to be employed in mobile video platforms
Block-Matching Optical Flow for Dynamic Vision Sensor- Algorithm and FPGA Implementation
Rapid and low power computation of optical flow (OF) is potentially useful in
robotics. The dynamic vision sensor (DVS) event camera produces quick and
sparse output, and has high dynamic range, but conventional OF algorithms are
frame-based and cannot be directly used with event-based cameras. Previous DVS
OF methods do not work well with dense textured input and are designed for
implementation in logic circuits. This paper proposes a new block-matching
based DVS OF algorithm which is inspired by motion estimation methods used for
MPEG video compression. The algorithm was implemented both in software and on
FPGA. For each event, it computes the motion direction as one of 9 directions.
The speed of the motion is set by the sample interval. Results show that the
Average Angular Error can be improved by 30\% compared with previous methods.
The OF can be calculated on FPGA with 50\,MHz clock in 0.2\,us per event (11
clock cycles), 20 times faster than a Java software implementation running on a
desktop PC. Sample data is shown that the method works on scenes dominated by
edges, sparse features, and dense texture.Comment: Published in ISCAS 201
Design of multimedia processor based on metric computation
Media-processing applications, such as signal processing, 2D and 3D graphics
rendering, and image compression, are the dominant workloads in many embedded
systems today. The real-time constraints of those media applications have
taxing demands on today's processor performances with low cost, low power and
reduced design delay. To satisfy those challenges, a fast and efficient
strategy consists in upgrading a low cost general purpose processor core. This
approach is based on the personalization of a general RISC processor core
according the target multimedia application requirements. Thus, if the extra
cost is justified, the general purpose processor GPP core can be enforced with
instruction level coprocessors, coarse grain dedicated hardware, ad hoc
memories or new GPP cores. In this way the final design solution is tailored to
the application requirements. The proposed approach is based on three main
steps: the first one is the analysis of the targeted application using
efficient metrics. The second step is the selection of the appropriate
architecture template according to the first step results and recommendations.
The third step is the architecture generation. This approach is experimented
using various image and video algorithms showing its feasibility
DRASIC: Distributed Recurrent Autoencoder for Scalable Image Compression
We propose a new architecture for distributed image compression from a group
of distributed data sources. The work is motivated by practical needs of
data-driven codec design, low power consumption, robustness, and data privacy.
The proposed architecture, which we refer to as Distributed Recurrent
Autoencoder for Scalable Image Compression (DRASIC), is able to train
distributed encoders and one joint decoder on correlated data sources. Its
compression capability is much better than the method of training codecs
separately. Meanwhile, the performance of our distributed system with 10
distributed sources is only within 2 dB peak signal-to-noise ratio (PSNR) of
the performance of a single codec trained with all data sources. We experiment
distributed sources with different correlations and show how our data-driven
methodology well matches the Slepian-Wolf Theorem in Distributed Source Coding
(DSC). To the best of our knowledge, this is the first data-driven DSC
framework for general distributed code design with deep learning
Asynchronous spiking neurons, the natural key to exploit temporal sparsity
Inference of Deep Neural Networks for stream signal (Video/Audio) processing in edge devices is still challenging. Unlike the most state of the art inference engines which are efficient for static signals, our brain is optimized for real-time dynamic signal processing. We believe one important feature of the brain (asynchronous state-full processing) is the key to its excellence in this domain. In this work, we show how asynchronous processing with state-full neurons allows exploitation of the existing sparsity in natural signals. This paper explains three different types of sparsity and proposes an inference algorithm which exploits all types of sparsities in the execution of already trained networks. Our experiments in three different applications (Handwritten digit recognition, Autonomous Steering and Hand-Gesture recognition) show that this model of inference reduces the number of required operations for sparse input data by a factor of one to two orders of magnitudes. Additionally, due to fully asynchronous processing this type of inference can be run on fully distributed and scalable neuromorphic hardware platforms
ToyArchitecture: Unsupervised Learning of Interpretable Models of the World
Research in Artificial Intelligence (AI) has focused mostly on two extremes:
either on small improvements in narrow AI domains, or on universal theoretical
frameworks which are usually uncomputable, incompatible with theories of
biological intelligence, or lack practical implementations. The goal of this
work is to combine the main advantages of the two: to follow a big picture
view, while providing a particular theory and its implementation. In contrast
with purely theoretical approaches, the resulting architecture should be usable
in realistic settings, but also form the core of a framework containing all the
basic mechanisms, into which it should be easier to integrate additional
required functionality.
In this paper, we present a novel, purposely simple, and interpretable
hierarchical architecture which combines multiple different mechanisms into one
system: unsupervised learning of a model of the world, learning the influence
of one's own actions on the world, model-based reinforcement learning,
hierarchical planning and plan execution, and symbolic/sub-symbolic integration
in general. The learned model is stored in the form of hierarchical
representations with the following properties: 1) they are increasingly more
abstract, but can retain details when needed, and 2) they are easy to
manipulate in their local and symbolic-like form, thus also allowing one to
observe the learning process at each level of abstraction. On all levels of the
system, the representation of the data can be interpreted in both a symbolic
and a sub-symbolic manner. This enables the architecture to learn efficiently
using sub-symbolic methods and to employ symbolic inference.Comment: Revision: changed the pdftitl
- …