Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system.
Multimedia
The now ubiquitous and effortless digital data capture and processing capabilities offered by the majority of devices lead to an unprecedented penetration of multimedia content in our everyday life. To make the most of this phenomenon, the rapidly increasing volume and usage of digitised content requires constant re-evaluation and adaptation of multimedia methodologies, in order to meet the relentless change of requirements from both the user and system perspectives. Advances in Multimedia provides readers with an overview of the ever-growing field of multimedia by bringing together various research studies and surveys from different subfields that point out such important aspects. Some of the main topics that this book deals with include: multimedia management in peer-to-peer structures and wireless networks, security characteristics in multimedia, semantic gap bridging for multimedia content, and novel multimedia applications.
SISTEMI PER LA MOBILITÀ DEGLI UTENTI E DEGLI APPLICATIVI IN RETI WIRED E WIRELESS
The words mobility and network are found together in many contexts. The issue alone of modeling geographical user mobility in wireless networks has countless applications. Depending on one’s background, the concept is investigated with very different tools and aims.
Moreover, the last decade also saw a growing interest in code mobility, i.e. the possibility for software applications (or parts thereof) to migrate and keep working in different devices and environments. A notable and successful real-life application is distributed computing, which under certain hypotheses can remove the need for expensive supercomputers. The general rationale is to split a very demanding computing task into a large number of independent sub-problems, each addressable by limited-power machines that are weakly connected (typically through the Internet, the quintessence of a wired network).
Following this line of thought, we organized this thesis in two distinct and independent parts:
Part I
It deals with audio fingerprinting, with special emphasis on the application of broadcast monitoring and on the implementation aspects. Although the problem is tackled from many sides, one of the most prominent difficulties is the high computing power required for the task. We thus devised and operated a distributed-computing solution, which is described in detail. Tests were conducted on the computing cluster available at the Department of Engineering of the University of Ferrara.
Part II
It focuses instead on wireless networks. Even if the approach is quite general, the stress is on WiFi networks. More specifically, we tried to evaluate how mobile users’ experience can be improved. Two tools are considered. First, we wrote a packet-level simulator and used it to estimate the impact of pricing strategies in allocating the bandwidth resource, finding a need for such solutions. Second, we developed a high-level simulator whose results strongly suggest further study of user cooperation for the selection of the “best” point of access when many are available. We also propose one such policy.
A quick search method for audio signals based on a piecewise linear representation of feature trajectories
This paper presents a new method for a quick similarity-based search through
long unlabeled audio streams to detect and locate audio clips provided by
users. The method involves feature-dimension reduction based on a piecewise
linear representation of a sequential feature trajectory extracted from a long
audio stream. Two techniques enable us to obtain a piecewise linear
representation: the dynamic segmentation of feature trajectories and the
segment-based Karhunen-Loève (KL) transform. The proposed search method
guarantees the same search results as the search method without the proposed
feature-dimension reduction method in principle. Experiment results indicate
significant improvements in search speed. For example, the proposed method
reduced the total search time to approximately 1/12 that of previous methods
and detected queries in approximately 0.3 seconds from a 200-hour audio
database. Comment: 20 pages, to appear in IEEE Transactions on Audio, Speech and
Language Processing
Multimedia: information representation and access
[About the book]
Information retrieval (IR) is a complex human activity supported by sophisticated systems. Information science has contributed much to the design and evaluation of previous generations of IR systems and to our general understanding of how such systems should be designed. Yet, due to the increasing success and diversity of IR systems, many recent textbooks concentrate on IR systems themselves and ignore the human side of searching for information. This book is the first text to provide an information science perspective on IR.
A Survey on Multimedia Content Protection Mechanisms
Cloud computing has emerged to influence how multimedia content providers like Disney render their multimedia services. When content providers use the public cloud, pirated copies may be made, leading to a loss in revenues. At the same time, technological advancements in content recording and hosting have made it easy to duplicate genuine multimedia objects. This problem has grown with the increased usage of cloud platforms for rendering multimedia content to users across the globe. It is therefore essential to have mechanisms to detect video copies, discover copyright infringement of multimedia content, and protect the interests of genuine content providers. This is a challenging and computationally expensive problem, considering the exponential growth of multimedia content over the internet. In this paper, we survey multimedia content protection mechanisms, shedding light on different kinds of multimedia, multimedia content modification methods, and techniques to protect intellectual property from abuse and copyright infringement. The survey also focuses on the challenges involved in protecting multimedia content and on the research gaps in the area of cloud-based multimedia content protection.
Listening to features
This work explores nonparametric methods which aim at synthesizing audio from
low-dimensional acoustic features typically used in MIR frameworks. Several
issues prevent this task from being achieved straightforwardly. Such features are
designed for analysis and not for synthesis, thus favoring high-level
description over easily inverted acoustic representation. Whereas some previous
studies already considered the problem of synthesizing audio from features such
as Mel-Frequency Cepstral Coefficients, they mainly relied on the explicit
formula used to compute those features in order to invert them. Here, we
instead adopt a simple blind approach, where arbitrary sets of features can be
used during synthesis and where reconstruction is exemplar-based. After testing
the approach on the problem of synthesizing speech from well-known features, we
apply it to the more complex task of inverting songs from the Million Song
Dataset. Two things make this task harder. First, the features are irregularly
spaced in the temporal domain according to an onset-based segmentation. Second,
the exact method used to compute these features is unknown, although the
features for new audio can be computed using their API as a black box. In this
paper, we detail these difficulties and present a framework that nonetheless
attempts such synthesis by concatenating audio samples from a training
dataset, whose features have been computed beforehand. Samples are selected at
the segment level, in the feature space with a simple nearest neighbor search.
Additional constraints can then be defined to enhance the synthesis
pertinence. Preliminary experiments are presented using RWC and GTZAN audio
datasets to synthesize tracks from the Million Song Dataset. Comment: Technical Report
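The exemplar-based selection step described in this abstract (a nearest neighbor search in feature space, followed by concatenation of the matching training audio) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, and the toy data are all assumptions made for illustration, and the additional constraints the abstract mentions are left out.

```python
import math

def nearest_index(target, candidates):
    """Return the index of the candidate feature vector closest to target.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    return min(range(len(candidates)),
               key=lambda i: math.dist(target, candidates[i]))

def synthesize(target_features, train_features, train_audio):
    """For each target segment, append the audio of its nearest training segment."""
    output = []
    for feat in target_features:
        output.extend(train_audio[nearest_index(feat, train_features)])
    return output

# Hypothetical toy data: three training segments with 2-D features,
# each paired with a short list of audio samples.
train_features = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
train_audio = [[0.1, 0.2], [0.3], [0.9, 0.8, 0.7]]

# Features of the segments to reconstruct (also hypothetical).
target_features = [(0.9, 1.2), (4.0, 6.0), (0.1, -0.1)]

result = synthesize(target_features, train_features, train_audio)
print(result)
```

Because segments have different lengths, the output length depends on which exemplars are chosen, which mirrors the irregular, onset-based segmentation the abstract describes.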