2,725 research outputs found
A Fingerprinting System for Musical Content
Driven by the recent advances in digital entertainment technologies, digital multimedia content (such as music and movies) is becoming a major part of the average computer user experience. Through daily interaction with digital multimedia content, large digital collections of music, audio and sound effects have emerged. Furthermore, these collections are produced/consumed by different groups of users such as the entertainment, music, movie and animation industries. Therefore, the need for identification and management of such content grows in proportion to the increasingly widespread availability of such media virtually "anytime and anywhere" over the Internet. In this paper, we propose a novel algorithm for robust perceptual hashing of musical content using balanced multiwavelets (BMW). The procedure for generating robust perceptual hash values (or fingerprints) is described in detail. The generated hash values are used for identifying, searching, and retrieving musical content from large musical databases. Furthermore, we illustrate, through extensive computer simulation, the robustness of the proposed framework to efficiently represent audio content and withstand several signal processing attacks and manipulations.
Fingerprinting Smart Devices Through Embedded Acoustic Components
The widespread use of smart devices gives rise to both security and privacy
concerns. Fingerprinting smart devices can assist in authenticating physical
devices, but it can also jeopardize privacy by allowing remote identification
without user awareness. We propose a novel fingerprinting approach that uses
the microphones and speakers of smart phones to uniquely identify an individual
device. During fabrication, subtle imperfections arise in device microphones
and speakers which induce anomalies in produced and received sounds. We exploit
this observation to fingerprint smart devices through playback and recording of
audio samples. We use audio-metric tools to explore different acoustic
features and analyze their ability to successfully fingerprint smart
devices. Our experiments show that it is even possible to fingerprint devices
that have the same vendor and model; we were able to accurately distinguish
over 93% of all recorded audio clips from 15 different units of the same model.
Our study identifies the prominent acoustic features capable of fingerprinting
devices with a high success rate and examines the effect of background noise and
other variables on fingerprinting accuracy.
Towards an All-Purpose Content-Based Multimedia Information Retrieval System
The growth of multimedia collections - in terms of size, heterogeneity, and
variety of media types - necessitates systems that are able to conjointly deal
with several forms of media, especially when it comes to searching for
particular objects. However, existing retrieval systems are organized in silos
and treat different media types separately. As a consequence, retrieval across
media types is either not supported at all or subject to major limitations. In
this paper, we present vitrivr, a content-based multimedia information
retrieval stack. As opposed to the keyword search approach implemented by most
media management systems, vitrivr makes direct use of the object's content to
facilitate different types of similarity search, such as Query-by-Example or
Query-by-Sketch, for and, most importantly, across different media types -
namely, images, audio, videos, and 3D models. Furthermore, we introduce a new
web-based user interface that enables easy-to-use, multimodal retrieval from
and browsing in mixed media collections. The effectiveness of vitrivr is shown
on the basis of a user study that involves different query and media types. To
the best of our knowledge, the full vitrivr stack is unique in that it is the
first multimedia retrieval system that seamlessly integrates support for four
different types of media. As such, it paves the way towards an all-purpose,
content-based multimedia information retrieval system.
Final Research Report for Sound Design and Audio Player
This deliverable describes the work on Task 4.3 Algorithms for sound design and feature developments for audio player. The audio player runs on the in-store player (ISP) and takes care of rendering the music playlists via beat-synchronous automatic DJ mixing, taking advantage of the rich musical content description extracted in T4.2 (beat markers, structural segmentation into intro and outro, musical and sound content classification).
The deliverable covers prototypes and final results on: (1) automatic beat-synchronous mixing by beat alignment and time stretching – we developed an algorithm for beat alignment and scheduling of time-stretched tracks; (2) compensation of play duration changes introduced by time stretching – in order to make the playlist generator independent of beat mixing, we chose to readjust the tempo of played tracks such that their stretched duration is the same as their original duration; (3) prospective research on the extraction of data from DJ mixes – to alleviate the lack of extensive ground truth databases of DJ mixing practices, we propose steps towards extracting this data from existing mixes by alignment and unmixing of the tracks in a mix. We also show how these methods can be evaluated even without labelled test data, and propose an open dataset for further research; (4) a description of the software player module, a GUI-less application to run on the ISP that performs streaming of tracks from disk and beat-synchronous mixing.
The estimation of cue points where tracks should cross-fade is now described in D4.7 Final Research Report on Auto-Tagging of Music.
EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC D
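Item (2) above, keeping a track's played duration equal to its original duration despite time stretching, can be illustrated with a small worked calculation. This is only a sketch under assumptions that are not taken from the deliverable: the track is split into a mix-in part, a middle part, and a mix-out part, each played at one constant stretch factor (played duration divided by original duration), and the middle part absorbs the difference. The function name and parameters are hypothetical.

```python
def compensating_stretch(total, mix_in, mix_out, r_in, r_out):
    """Return the stretch factor for the middle of a track so that the
    played duration equals the original duration `total` (seconds).

    Assumed model (not from the deliverable): the first `mix_in` seconds
    are stretched by `r_in`, the last `mix_out` seconds by `r_out`, and
    the remaining middle part by the returned factor.
    """
    middle = total - mix_in - mix_out
    if middle <= 0:
        raise ValueError("mix regions leave no middle part to adjust")
    # Time actually spent playing the two stretched mix regions.
    played_edges = mix_in * r_in + mix_out * r_out
    # Solve played_edges + middle * r_mid == total for r_mid.
    r_mid = (total - played_edges) / middle
    if r_mid <= 0:
        raise ValueError("mix regions alone exceed the original duration")
    return r_mid


if __name__ == "__main__":
    # A 200 s track whose 20 s intro and 20 s outro are both slowed by 10%.
    r = compensating_stretch(200.0, 20.0, 20.0, 1.1, 1.1)
    print(f"middle stretch factor: {r:.4f}")
```

In this model, a track whose mix regions are slowed down gets a middle factor below 1 (played slightly faster), so a playlist generator can keep planning with the original durations.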
A Review of Audio Features and Statistical Models Exploited for Voice Pattern Design
Audio fingerprinting, also known as audio hashing, is well known as a
powerful technique to perform audio identification and synchronization. It
basically involves two major steps: fingerprint (voice pattern) design and
matching search. While the first step concerns the derivation of a robust and
compact audio signature, the second step usually requires knowledge about
database and quick-search algorithms. Though this technique offers a wide range
of real-world applications, to the best of the authors' knowledge, the most recent
comprehensive survey of existing algorithms appeared more than eight years ago.
Thus, in this paper, we present a more up-to-date review and, to emphasize
the audio signal processing aspect, we focus our state-of-the-art survey on
the fingerprint design step, for which various audio features and their
tractable statistical models are discussed.
Comment: http://www.iaria.org/conferences2015/PATTERNS15.html ; Seventh
International Conferences on Pervasive Patterns and Applications (PATTERNS
2015), Mar 2015, Nice, France
DMCA Safe Harbors and the Future of New Digital Music Sharing Platforms
SoundCloud is an online service provider that allows users to upload, share, and download music that they have created. It is an innovative platform for both amateur and established producers and disc jockeys (DJs) to showcase their original tracks and remixes. Unfortunately, it is also a platform that lends itself to widespread copyright infringement. Looking toward potential litigation, several factors ought to be considered by SoundCloud and other similar providers. The Viacom v. YouTube case, decided in the Southern District of New York and currently on appeal in the Second Circuit, sheds light on the potential liability service providers like SoundCloud face. It draws out the Digital Millennium Copyright Act’s (DMCA) safe harbor provisions under which SoundCloud could potentially find protection. However, SoundCloud is unique among similar service providers because it provides users with a variety of viewing, sharing and downloading options that are built into the platform. These options could lead to infringement that would not fall under a DMCA safe harbor. This Issue Brief will discuss the various arguments to be made for and against SoundCloud’s liability, and examine whether the unique utility provided by the service to users could be sustained in the face of potential litigation. Ultimately, the safeguards used by SoundCloud to filter blatant infringement, combined with the DMCA § 512(c) safe harbor, should allow this innovative platform to maintain its current model without neutering its core functionality.
Listening to features
This work explores nonparametric methods which aim at synthesizing audio from
low-dimensional acoustic features typically used in MIR frameworks. Several
issues prevent this task from being achieved straightforwardly. Such features are
designed for analysis and not for synthesis, thus favoring high-level
description over easily inverted acoustic representation. Whereas some previous
studies already considered the problem of synthesizing audio from features such
as Mel-Frequency Cepstral Coefficients, they mainly relied on the explicit
formula used to compute those features in order to invert them. Here, we
instead adopt a simple blind approach, where arbitrary sets of features can be
used during synthesis and where reconstruction is exemplar-based. After testing
the approach on the problem of synthesizing speech from well-known features, we
apply it to the more complex task of inverting songs from the Million Song Dataset.
Two things make this task harder. First, the features are irregularly
spaced in the temporal domain according to an onset-based segmentation. Second,
the exact method used to compute these features is unknown, although the
features for new audio can be computed using their API as a black box. In this
paper, we detail these difficulties and present a framework that nonetheless
attempts such synthesis by concatenating audio samples from a training
dataset, whose features have been computed beforehand. Samples are selected at
the segment level, in the feature space, with a simple nearest neighbor search.
Additional constraints can then be defined to improve the relevance of the
synthesis. Preliminary experiments are presented using RWC and GTZAN audio
datasets to synthesize tracks from the Million Song Dataset.
Comment: Technical Report
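The segment-level selection described in this abstract (nearest neighbor search in feature space, then concatenation of the matching training samples) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Euclidean distance, the function names, and the toy data are assumptions, and the additional constraints mentioned in the abstract are left out.

```python
import math


def nearest_segment(target, training_features):
    """Return the index of the training feature vector closest to `target`.

    Euclidean distance is an assumption; the abstract only says
    "a simple nearest neighbor search".
    """
    best_index, best_dist = -1, math.inf
    for i, feat in enumerate(training_features):
        d = math.dist(target, feat)
        if d < best_dist:
            best_index, best_dist = i, d
    return best_index


def synthesize(target_features, training_features, training_audio):
    """Concatenate, for each target segment, the audio samples of the
    training segment whose features are nearest to it."""
    output = []
    for target in target_features:
        output.extend(training_audio[nearest_segment(target, training_features)])
    return output


if __name__ == "__main__":
    # Toy training set: one feature vector and one audio snippet per segment.
    feats = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    audio = [[0.1, 0.2], [0.3], [0.4, 0.5, 0.6]]
    print(synthesize([(0.9, 1.2), (4.0, 6.0)], feats, audio))
```

Because each target segment is replaced by a whole training segment, the segments may have different lengths, which fits features that are irregularly spaced in time.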