7,308 research outputs found
Activity-driven content adaptation for effective video summarisation
In this paper, we present a novel method for content adaptation and video summarization fully implemented in compressed-domain. Firstly, summarization of generic videos is modeled as the process of extracted human objects under various activities/events. Accordingly, frames are classified into five categories via fuzzy decision including shot changes (cut and gradual transitions), motion activities (camera motion and object motion) and others by using two inter-frame measurements. Secondly, human objects are detected using Haar-like features. With the detected human objects and attained frame categories, activity levels for each frame are determined to adapt with video contents. Continuous frames belonging to same category are grouped to form one activity entry as content of interest (COI) which will convert the original video into a series of activities. An overall adjustable quota is used to control the size of generated summarization for efficient streaming purpose. Upon this quota, the frames selected for summarization are determined by evenly sampling the accumulated activity levels for content adaptation. Quantitative evaluations have proved the effectiveness and efficiency of our proposed approach, which provides a more flexible and general solution for this topic as domain-specific tasks such as accurate recognition of objects can be avoided
Sequential Complexity as a Descriptor for Musical Similarity
We propose string compressibility as a descriptor of temporal structure in
audio, for the purpose of determining musical similarity. Our descriptors are
based on computing track-wise compression rates of quantised audio features,
using multiple temporal resolutions and quantisation granularities. To verify
that our descriptors capture musically relevant information, we incorporate our
descriptors into similarity rating prediction and song year prediction tasks.
We base our evaluation on a dataset of 15500 track excerpts of Western popular
music, for which we obtain 7800 web-sourced pairwise similarity ratings. To
assess the agreement among similarity ratings, we perform an evaluation under
controlled conditions, obtaining a rank correlation of 0.33 between intersected
sets of ratings. Combined with bag-of-features descriptors, we obtain
performance gains of 31.1% and 10.9% for similarity rating prediction and song
year prediction. For both tasks, analysis of selected descriptors reveals that
representing features at multiple time scales benefits prediction accuracy.Comment: 13 pages, 9 figures, 8 tables. Accepted versio
Beat histogram features for rhythm-based musical genre classification using multiple novelty functions
In this paper we present beat histogram features for multiple level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude, but also tonal and general spectral changes in the signal, aiming to represent as much rhythmic information as possible. The most and least informative features are identified through feature selection methods and are then tested using Support Vector Machines on five genre datasets concerning classification accuracy against a baseline feature set. Results show that the presented features provide comparable classification accuracy with respect to other genre classification approaches using periodicity histograms and display a performance close to that of much more elaborate up-to-date approaches for rhythm description. The use of bar boundary annotations for the texture frames has provided an improvement for the dance-oriented Ballroom dataset. The comparably small number of descriptors and the possibility of evaluating the influence of specific signal components to the general rhythmic content encourage the further use of the method in rhythm description tasks
Efficient Analysis in Multimedia Databases
The rapid progress of digital technology has led to a situation
where computers have become ubiquitous tools. Now we can find them
in almost every environment, be it industrial or even private. With
ever increasing performance computers assumed more and more vital
tasks in engineering, climate and environmental research, medicine
and the content industry. Previously, these tasks could only be
accomplished by spending enormous amounts of time and money. By
using digital sensor devices, like earth observation satellites,
genome sequencers or video cameras, the amount and complexity of
data with a spatial or temporal relation has gown enormously. This
has led to new challenges for the data analysis and requires the use
of modern multimedia databases.
This thesis aims at developing efficient techniques for the analysis
of complex multimedia objects such as CAD data, time series and
videos. It is assumed that the data is modeled by commonly used
representations. For example CAD data is represented as a set of
voxels, audio and video data is represented as multi-represented,
multi-dimensional time series.
The main part of this thesis focuses on finding efficient methods
for collision queries of complex spatial objects. One way to speed
up those queries is to employ a cost-based decompositioning,
which uses interval groups to approximate a spatial object. For
example, this technique can be used for the Digital Mock-Up (DMU)
process, which helps engineers to ensure short product cycles. This
thesis defines and discusses a new similarity measure for time
series called threshold-similarity. Two time series are
considered similar if they expose a similar behavior regarding the
transgression of a given threshold value. Another part of the thesis
is concerned with the efficient calculation of reverse
k-nearest neighbor (RkNN) queries in general metric spaces
using conservative and progressive approximations. The aim of such
RkNN queries is to determine the impact of single objects on the
whole database. At the end, the thesis deals with video
retrieval and hierarchical genre classification of music
using multiple representations. The practical relevance of the
discussed genre classification approach is highlighted with a
prototype tool that helps the user to organize large music
collections.
Both the efficiency and the effectiveness of the presented
techniques are thoroughly analyzed. The benefits over traditional
approaches are shown by evaluating the new methods on real-world
test datasets
Digital Image Access & Retrieval
The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio
- …