103,704 research outputs found
Interaction between high-level and low-level image analysis for semantic video object extraction
Authors of articles published in EURASIP Journal on Advances in Signal Processing are the copyright holders of their articles and have granted to any third party, in advance and in perpetuity, the right to use, reproduce or disseminate the article, according to the SpringerOpen copyright and license agreement (http://www.springeropen.com/authors/license)
Recommended from our members
Multimedia delivery in the future internet
The term âNetworked Mediaâ implies that all kinds of media including text, image, 3D graphics, audio
and video are produced, distributed, shared, managed and consumed on-line through various networks,
like the Internet, Fiber, WiFi, WiMAX, GPRS, 3G and so on, in a convergent manner [1]. This white
paper is the contribution of the Media Delivery Platform (MDP) cluster and aims to cover the Networked
challenges of the Networked Media in the transition to the Future of the Internet.
Internet has evolved and changed the way we work and live. End users of the Internet have been confronted
with a bewildering range of media, services and applications and of technological innovations concerning
media formats, wireless networks, terminal types and capabilities. And there is little evidence that the pace
of this innovation is slowing. Today, over one billion of users access the Internet on regular basis, more
than 100 million users have downloaded at least one (multi)media file and over 47 millions of them do so
regularly, searching in more than 160 Exabytes1 of content. In the near future these numbers are expected
to exponentially rise. It is expected that the Internet content will be increased by at least a factor of 6, rising
to more than 990 Exabytes before 2012, fuelled mainly by the users themselves. Moreover, it is envisaged
that in a near- to mid-term future, the Internet will provide the means to share and distribute (new)
multimedia content and services with superior quality and striking flexibility, in a trusted and personalized
way, improving citizensâ quality of life, working conditions, edutainment and safety.
In this evolving environment, new transport protocols, new multimedia encoding schemes, cross-layer inthe
network adaptation, machine-to-machine communication (including RFIDs), rich 3D content as well as
community networks and the use of peer-to-peer (P2P) overlays are expected to generate new models of
interaction and cooperation, and be able to support enhanced perceived quality-of-experience (PQoE) and
innovative applications âon the moveâ, like virtual collaboration environments, personalised services/
media, virtual sport groups, on-line gaming, edutainment. In this context, the interaction with content
combined with interactive/multimedia search capabilities across distributed repositories, opportunistic P2P
networks and the dynamic adaptation to the characteristics of diverse mobile terminals are expected to
contribute towards such a vision.
Based on work that has taken place in a number of EC co-funded projects, in Framework Program 6 (FP6)
and Framework Program 7 (FP7), a group of experts and technology visionaries have voluntarily
contributed in this white paper aiming to describe the status, the state-of-the art, the challenges and the way
ahead in the area of Content Aware media delivery platforms
Navigation domain representation for interactive multiview imaging
Enabling users to interactively navigate through different viewpoints of a
static scene is a new interesting functionality in 3D streaming systems. While
it opens exciting perspectives towards rich multimedia applications, it
requires the design of novel representations and coding techniques in order to
solve the new challenges imposed by interactive navigation. Interactivity
clearly brings new design constraints: the encoder is unaware of the exact
decoding process, while the decoder has to reconstruct information from
incomplete subsets of data since the server can generally not transmit images
for all possible viewpoints due to resource constrains. In this paper, we
propose a novel multiview data representation that permits to satisfy bandwidth
and storage constraints in an interactive multiview streaming system. In
particular, we partition the multiview navigation domain into segments, each of
which is described by a reference image and some auxiliary information. The
auxiliary information enables the client to recreate any viewpoint in the
navigation segment via view synthesis. The decoder is then able to navigate
freely in the segment without further data request to the server; it requests
additional data only when it moves to a different segment. We discuss the
benefits of this novel representation in interactive navigation systems and
further propose a method to optimize the partitioning of the navigation domain
into independent segments, under bandwidth and storage constraints.
Experimental results confirm the potential of the proposed representation;
namely, our system leads to similar compression performance as classical
inter-view coding, while it provides the high level of flexibility that is
required for interactive streaming. Hence, our new framework represents a
promising solution for 3D data representation in novel interactive multimedia
services
Statistical framework for video decoding complexity modeling and prediction
Video decoding complexity modeling and prediction is an increasingly important issue for efficient resource utilization in a variety of applications, including task scheduling, receiver-driven complexity shaping, and adaptive dynamic voltage scaling. In this paper we present a novel view of this problem based on a statistical framework perspective. We explore the statistical structure (clustering) of the execution time required by each video decoder module (entropy decoding, motion compensation, etc.) in conjunction with complexity features that are easily extractable at encoding time (representing the properties of each module's input source data). For this purpose, we employ Gaussian mixture models (GMMs) and an expectation-maximization algorithm to estimate the joint execution-time - feature probability density function (PDF). A training set of typical video sequences is used for this purpose in an offline estimation process. The obtained GMM representation is used in conjunction with the complexity features of new video sequences to predict the execution time required for the decoding of these sequences. Several prediction approaches are discussed and compared. The potential mismatch between the training set and new video content is addressed by adaptive online joint-PDF re-estimation. An experimental comparison is performed to evaluate the different approaches and compare the proposed prediction scheme with related resource prediction schemes from the literature. The usefulness of the proposed complexity-prediction approaches is demonstrated in an application of rate-distortion-complexity optimized decoding
ToyArchitecture: Unsupervised Learning of Interpretable Models of the World
Research in Artificial Intelligence (AI) has focused mostly on two extremes:
either on small improvements in narrow AI domains, or on universal theoretical
frameworks which are usually uncomputable, incompatible with theories of
biological intelligence, or lack practical implementations. The goal of this
work is to combine the main advantages of the two: to follow a big picture
view, while providing a particular theory and its implementation. In contrast
with purely theoretical approaches, the resulting architecture should be usable
in realistic settings, but also form the core of a framework containing all the
basic mechanisms, into which it should be easier to integrate additional
required functionality.
In this paper, we present a novel, purposely simple, and interpretable
hierarchical architecture which combines multiple different mechanisms into one
system: unsupervised learning of a model of the world, learning the influence
of one's own actions on the world, model-based reinforcement learning,
hierarchical planning and plan execution, and symbolic/sub-symbolic integration
in general. The learned model is stored in the form of hierarchical
representations with the following properties: 1) they are increasingly more
abstract, but can retain details when needed, and 2) they are easy to
manipulate in their local and symbolic-like form, thus also allowing one to
observe the learning process at each level of abstraction. On all levels of the
system, the representation of the data can be interpreted in both a symbolic
and a sub-symbolic manner. This enables the architecture to learn efficiently
using sub-symbolic methods and to employ symbolic inference.Comment: Revision: changed the pdftitl
Digital Image Access & Retrieval
The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
- âŚ